Using srun
You can use the Slurm command srun to allocate an interactive job. This means you use specific options with srun on the command line to tell Slurm what resources you need to run your job, such as number of nodes, amount of memory, and amount of time. After you type your srun command and options on the command line and press Enter, Slurm finds and then allocates the resources you specified. Depending on what you specified, it can take a few minutes for Slurm to allocate those resources. You can view all of the srun options on the Slurm documentation website.
The following image shows an example of an srun command as run on a command line.
srun examples
This section details a few examples using srun. You should first review the Hardware overview and Partitions sections to be familiar with the available hardware and partition limits on Discovery. This way, you can tailor your request to fit both the needs of your job and the limits of the partitions. For example, if you specify --partition=debug and --time=01:00:00, you'll get an error because the time you've specified exceeds the limit for that partition. Also keep in mind that while these examples are all valid, general examples, they might not work for your particular job.
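The time-limit check described above can be sketched in a few lines of shell. This is only an illustration: the helper names to_seconds and fits_limit are made up for this example, and the 20-minute debug limit is a hypothetical value, so look up the real limit in the Partitions section before relying on it.

```shell
# Convert a Slurm-style HH:MM:SS time string to seconds.
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Succeed if the requested time ($1) does not exceed the limit ($2).
fits_limit() {
    [ "$(to_seconds "$1")" -le "$(to_seconds "$2")" ]
}

# Hypothetical limit, for illustration only; see the Partitions section.
debug_limit="00:20:00"

if fits_limit "01:00:00" "$debug_limit"; then
    echo "request fits"
else
    echo "request exceeds the partition limit"
fi
```

With the assumed limit, a one-hour request is reported as exceeding the partition limit, which mirrors the error case described above.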
A simple srun example moves you to a compute node after you first log in to Discovery:
srun --pty /bin/bash
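Once the shell starts, it can help to confirm that you really are on a compute node and inside a job. The sketch below prints the host name and a few environment variables that Slurm normally sets for a job. The function name show_job_env is just a label for this example, and any variable your site does not export prints as "not set".

```shell
# Print the current host and a few variables Slurm normally sets in a job.
show_job_env() {
    echo "Host:      $(hostname)"
    echo "Job ID:    ${SLURM_JOB_ID:-not set}"
    echo "Partition: ${SLURM_JOB_PARTITION:-not set}"
    echo "Nodes:     ${SLURM_JOB_NODELIST:-not set}"
}

show_job_env
```

If you run this on a login node instead, the Slurm variables will most likely show as "not set".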
To request one node and one task for 30 minutes with X11 forwarding on the short partition, type:
srun --partition=short --export=ALL --nodes=1 --ntasks=1 --x11 --mem=10G --time=00:30:00 --pty /bin/bash
To request one node, with 10 tasks and 2 CPUs per task (a total of 20 CPUs), 1GB of memory, for one hour on the express partition, type:
srun --partition=express --nodes 1 --ntasks 10 --cpus-per-task 2 --pty --export=ALL --mem=1G --time=01:00:00 /bin/bash
To request two nodes, each with 10 tasks and 2 CPUs per task (a total of 40 CPUs), 1GB of memory per node, for one hour on the express partition, type:
srun --partition=express --nodes=2 --ntasks-per-node=10 --cpus-per-task 2 --pty --export=ALL --mem=1G --time=01:00:00 /bin/bash
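The CPU totals quoted for the two express requests follow from multiplying nodes by tasks per node by CPUs per task. A small sketch of that arithmetic, using a helper name (total_cpus) invented for this example:

```shell
# Total CPUs = nodes * tasks per node * CPUs per task.
total_cpus() {
    # $1 = nodes, $2 = tasks per node, $3 = CPUs per task
    echo $(( $1 * $2 * $3 ))
}

total_cpus 1 10 2   # one node, 10 tasks, 2 CPUs per task: prints 20
total_cpus 2 10 2   # two nodes, 10 tasks on each, 2 CPUs per task: prints 40
```

Checking the total this way before you submit makes it easier to compare your request against the per-job CPU limits of the partition.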
To allocate a GPU node, you should specify the gpu partition and use the --gres option:
srun --partition=gpu --nodes=1 --ntasks=1 --export=ALL --gres=gpu:1 --mem=1G --time=01:00:00 --pty /bin/bash
For more information about working with GPUs, see Working with GPUs.
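After the GPU session starts, a quick check like the following can confirm that a GPU is visible. This is a sketch under assumptions: it presumes NVIDIA GPUs, that the nvidia-smi tool is installed on the node, and that the cluster is configured so Slurm sets CUDA_VISIBLE_DEVICES for GPU jobs. The function name check_gpu is made up for this example.

```shell
# Report which GPUs, if any, are visible in the current shell.
check_gpu() {
    echo "CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:-<not set>}"
    if command -v nvidia-smi >/dev/null 2>&1; then
        # List the GPUs the driver can see; report a failure instead of aborting.
        nvidia-smi --list-gpus || echo "nvidia-smi could not query a GPU"
    else
        echo "nvidia-smi not found on this node"
    fi
}

check_gpu
```

If the variable is not set or no GPU is listed, you are probably not on a GPU node, or the request did not include the --gres option.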
Monitor your jobs
You can monitor your jobs by using the Slurm scontrol command. Type scontrol show jobid -d <JOBID>, where JOBID is the number of your job.
In the figure at the top of the page, you can see that when you submit your srun command, Slurm displays the unique ID number of your job (job 12962519). This is the number you use with scontrol to monitor your job.
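If you want to pick the job ID out of the srun message rather than copy it by hand, a sketch like the one below may help. The exact wording of the message is an assumption here (only the job number comes from the example above), and job_id_from_message is a hypothetical helper name. The last line prints the scontrol command instead of running it, so you can review it first.

```shell
# Print the number that follows the word "job" in a line of text.
job_id_from_message() {
    echo "$1" | sed -n 's/.*job \([0-9][0-9]*\).*/\1/p'
}

# Assumed message text; the real wording may differ on your system.
msg="srun: job 12962519 queued and waiting for resources"

job_id=$(job_id_from_message "$msg")
echo "scontrol show jobid -d $job_id"
```

Run the printed command on a login node or inside the session to see the detailed state of the job.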