Computational Research Center

Shale Submission Script


Cluster Scheduler

The Shale cluster uses a Job Scheduler to allocate resources for jobs.

To run your compiled code on the cluster, you submit it as a job described by a PBS script file.

QSUB

The "qsub" command submits a job to the cluster. If you have a job submission script named "submit.pbs", you would type the following at the command prompt:

qsub submit.pbs
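
If the submission succeeds, qsub prints the identifier the scheduler assigned to your job. The exact form can vary by installation, but it should resemble the Job IDs shown in the qstat output below:

11112.master0

Keep this ID handy; the qstat and qdel commands described below use it.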


QSTAT

The "qstat" command will report information about your submitted job.

Running the following command will report the status of your queued and running jobs: qstat -a

master0:
                                                                         Req'd  Req'd   Elap
Job ID               Username Queue    Jobname          SessID NDS   TSK Memory Time  S Time
-------------------- -------- -------- ---------------- ------ ----- --- ------ ----- - -----
11112.master0        aarontes batch    tscript             --      2  16    --  24:00 Q   --

Adding the -n flag will also list the nodes assigned to your job: qstat -n
master0: 
                                                                         Req'd  Req'd   Elap
Job ID               Username Queue    Jobname          SessID NDS   TSK Memory Time  S Time
-------------------- -------- -------- ---------------- ------ ----- --- ------ ----- - -----
11112.master0        aarontes batch    tscript            6208     2  16    --  24:00 C 00:00
   node23/7+node23/6+node23/5+node23/4+node23/3+node23/2+node23/1+node23/0
   +node22/7+node22/6+node22/5+node22/4+node22/3+node22/2+node22/1+node22/0

Users can also use the -f flag to see the full set of attributes recorded for a submitted job: qstat -f
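
The full listing is long, so only a sketch is shown here. For the example job above, the output might begin like the following (these field names are typical of Torque/PBS installations and may differ slightly on Shale):

Job Id: 11112.master0
    Job_Name = tscript
    Job_Owner = aarontes@master0
    job_state = Q
    queue = batch
    Resource_List.nodes = 2:ppn=8
    Resource_List.walltime = 24:00:00
    (remaining fields omitted)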


QDEL

The qdel command will kill a job that you have submitted to the cluster, whether it is waiting in the queue or running. You will need the job number to designate which job to kill, and users can only kill jobs that they themselves have submitted. Example usage: qdel 11112
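
For example, to kill the job shown in the qstat listings above:

qdel 11112

If you are unsure of the job number, run qstat -a first and read it from the Job ID column; as in the example above, the numeric portion before ".master0" is sufficient.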


PBS Submission Scripts

A submission script has three parts: 1) the PBS scheduler commands; 2) the Bash shell commands that prepare your job environment; and 3) the MPI command line that launches your application.

A properly formatted PBS script looks like the following (the line numbers on the left-hand side are not part of the PBS script):

 1 #PBS -S /bin/bash
 2 #PBS -N blah
 3 #PBS -o /home/abergstr/testme.o
 4 #PBS -e /home/abergstr/testme.e
 5 #PBS -m abe
 6 #PBS -M aaron.bergstrom@und.edu
 7 #PBS -l nodes=2:ppn=8
 8 #PBS -l walltime=24:00:00
 9 np=`wc -l < $PBS_NODEFILE`
10 cd ~/testcode
11 mpiexec -n $np ./hello.x

Lines 1-8 contain the PBS scheduler commands.
Lines 9-10 contain the Bash shell script commands.
Line 11 is the MPI command that launches your job.

Lines 1-8: PBS Scheduler Commands

#PBS and #
Lines that begin with #PBS tell the scheduler that the line should be treated as a scheduler command. Lines that begin with a # sign, but that are not followed by PBS, are considered comment lines.
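
For example, in the following two-line sketch (the job name is made up), the scheduler acts on the first line and skips the second:

#PBS -N examplejob
# a plain comment: the scheduler ignores this line

Once the shell actually runs the script, both lines are ordinary comments to Bash; only the scheduler treats #PBS lines specially.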

PBS command flags - A full list of PBS command flags can be found in the qsub manual page (man qsub).

-S designates the path to the command shell that you would 
like the scheduler to use when executing your shell commands
and the mpi program.
-N designates the name of your job. This is used to identify the
job when you use the "qstat" command.
-o designates the path to the text file where "standard out" is
redirected.
-e designates the path to the text file where "standard error" is
redirected.
-m tells the scheduler to send you email whenever your job (b)egins,
(a)borts, or (e)nds.
-M designates the email address to which you want your scheduler
email sent.
-l nodes=2:ppn=8
This line tells the scheduler that you want to request 8 cores on
each of 2 compute nodes for your computational work, for a total
of 16 compute cores.
-l walltime=24:00:00
This line tells the scheduler that you request these resources
for 24 hours, 0 minutes, and 0 seconds. The lower the number,
the sooner your job will run. However, your job will end
once its runtime has exceeded this number, whether or not your
computational work has completed. Therefore, you should request
a time allocation large enough to complete your work.
-l naccesspolicy=singlejob
Though not listed above, you can request that only your jobs
be allowed to run on your node allocation for the duration
of your walltime. You may need to do this if your job
requires a large amount of memory, but few compute cores.
See the example after this list.
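
For example, adding the following line to the scheduler-command block of the script above would request exclusive access to your allocated nodes. The syntax shown is the Moab/Torque-style resource request, so confirm it against Shale's scheduler before relying on it:

#PBS -l naccesspolicy=singlejob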

Lines 9-10: Bash Shell Commands

Line 9 is a Bash shell command that reads the number of node entries from the PBS node file and stores that number in the shell variable np (referenced as $np):
Example: np=`wc -l < $PBS_NODEFILE`

This variable is reused later as the core count supplied to the -n flag on the MPI command line.
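
To see why counting lines yields the core count: with nodes=2:ppn=8, the file named by $PBS_NODEFILE lists one hostname per allocated core, so it contains 16 lines. Using the node names from the qstat -n output above, it would look something like this:

node23
node23
(six more node23 entries)
node22
node22
(six more node22 entries)

Hence wc -l < $PBS_NODEFILE returns 16, the total core count.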

Line 10 changes the working directory to the "testcode" subdirectory of the user's home directory.
Example: cd ~/testcode

Neither of the above two lines is necessary if the user chooses to type the actual core count directly and gives the full path to the executable on the MPI command line, as in the sketch below.
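
A minimal sketch of that alternative, assuming the same nodes=2:ppn=8 request and a made-up home directory of /home/username, replaces lines 9-11 with a single line:

mpiexec -n 16 /home/username/testcode/hello.x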

 

Line 11: MPI Command

The MPI command is a standard command line executable. It has four parts: 1) the MPI launcher (either mpirun or mpiexec); 2) the -n flag designating the number of cores to be used; 3) the actual number of cores requested; and 4) the path to the user-compiled executable file.

The core count requested should always match the product of the nodes and ppn values on line 7. In other words, if "nodes=2:ppn=8" is used, the product is 2 x 8 = 16, so item 3 of the MPI command should be 16.

In the case of our example PBS script above, we use the $np shell variable to ensure that, no matter what nodes:ppn values we designate, we always request the correct number of compute cores.
Example:
mpiexec -n $np ./hello.x