Computational Research Center
The University of North Dakota (UND) Computational Research Center (CRC) is dedicated to helping researchers affiliated with the University of North Dakota solve increasingly challenging problems in science and society.
By supporting high-performance computing hardware, software, and staff through the Division of Research, the University provides quality computing resources to researchers and faculty whose work requires intricate numerical modeling and data-intensive simulations.
Routine maintenance is scheduled on the second Thursday of every month.
The next monthly maintenance day is: Thursday, February 13.
- Estimated downtime: 5 a.m. - 5 p.m.
- Reason: Firmware and software updates
Outage notifications and maintenance reminders are published via the Computational Research Center Twitter account: @UNDCompResearch
Effective immediately, the Hodor and Arya clusters will be queues under the new Talon cluster. You will now log in to talon.und.edu for all job submissions. Hodor is becoming two queues, one for GPU jobs (hodor-gpu) and one for CPU jobs (hodor-cpu). To submit jobs to Hodor or Arya, simply add the following line to your submission script:
Replace queue_name with the queue you want to use. If you don't specify a queue, the job goes to talon, the default queue. The available queues are:
hoffmann - Arya (authorized departments only)
manu - Arya (authorized departments only)
hodor-cpu - Hodor CPU
hodor-gpu - Hodor GPU
talon-gpu - Talon GPU deep learning nodes (authorized users only)
talon-fat - Large memory
talon - Talon CPU, this is the default queue
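The exact directive line is not reproduced on this page. As a minimal sketch, assuming Talon is scheduled with Slurm (the page refers to sinfo) and that each queue is a Slurm partition, a submission script could select a queue as shown here. The job name is hypothetical, and the center's own documentation remains the authority on the required line.

```shell
#!/bin/bash
# Sketch only: assumes Talon runs Slurm and that the queues are Slurm partitions.
#SBATCH --job-name=queue-demo     # hypothetical job name
#SBATCH --partition=hodor-gpu     # queue_name from the list above; omit to use talon
#SBATCH --nodes=1

# Slurm sets SLURM_JOB_PARTITION inside a running job; fall back when run by hand.
partition_msg="Running in partition: ${SLURM_JOB_PARTITION:-unknown (not inside a Slurm job)}"
echo "$partition_msg"
```

Under these assumptions the script would be submitted from talon.und.edu with sbatch, and the echoed line in the job output confirms which queue the job landed in.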
For scheduling fairness and backfill scheduling efficiency, all Talon and Hodor queues now have default and maximum walltimes enforced. The default and maximum walltimes are listed below. The maximum walltimes can also be found by running sinfo. Maximum walltimes are subject to maintenance windows. If the default 2-hour walltime is not enough, set a longer walltime in your submission script. Application checkpointing is always strongly encouraged.
hodor-cpu - Default run time, 2 hours. Maximum run time, 8 days.
hodor-gpu - Default run time, 2 hours. Maximum run time, 8 days.
talon - Default run time, 2 hours. Maximum run time, 28 days.
talon-fat - Default run time, 2 hours. Maximum run time, unlimited.
talon-gpu - Default run time, 2 hours. Maximum run time, 28 days.
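As a hedged illustration of setting a walltime, the sketch below assumes Slurm's standard --time directive with its days-hours:minutes:seconds format and requests three days on the talon queue, well inside the 28-day maximum. The three-day value is only an example. The sinfo call uses the %P (partition) and %l (time limit) format fields to print the limits currently in force.

```shell
#!/bin/bash
# Sketch only: assumes Slurm; the 3-day request is an illustrative value.
#SBATCH --partition=talon
#SBATCH --time=3-00:00:00         # days-hours:minutes:seconds, overrides the 2-hour default

# Print each partition with its maximum walltime, if sinfo is available.
limit_format="%P %l"
if command -v sinfo >/dev/null 2>&1; then
    sinfo -o "$limit_format"
else
    echo "sinfo not found; run this on a Slurm login node"
fi
```

The same sinfo command can be run directly on the login node before submitting, which helps when picking a walltime that ends before the next maintenance window.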