Running Jobs with job-assist¶
The `job-assist` command simplifies submitting jobs on the Discovery cluster using SLURM's `srun` and `sbatch` modes. It provides an intuitive interface for requesting resources and managing job submissions on the cluster.
Features¶
This command provides three modes of operation:
| Mode | Description |
|---|---|
| Default | Quickly start a job with the default settings. |
| Interactive | Submit a job interactively with customized resource requests. |
| Batch | Generate and save an `sbatch` script with your specific requirements. |
Getting Started with the job-assist Command¶
Load the job-assist Module¶
module load job-assist
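You can confirm that the module loaded in your current session with the standard `module list` command:
#List the modules currently loaded in your session
module list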
To launch an interactive session, use the following command:¶
job-assist
This command displays a menu with the different modes of operation.
#Simple Example of the Menu Options
SLURM Menu:
1. Default mode (srun --pty /bin/bash)
2. Interactive Mode
3. Batch Mode
4. Exit
Enter your option:
Default Mode¶
This mode runs a basic SLURM job with the command:
srun --pty /bin/bash
By default, the job is allocated 1 node, 1 task, 2 CPUs, and 2 GB of memory on the `short` partition for a duration of 4 hours.
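For reference, these defaults correspond roughly to the following explicit `srun` request (an assumption based on the stated defaults; `job-assist` may compose the command differently):
#Hypothetical explicit equivalent of the default allocation
srun --partition=short --nodes=1 --ntasks=1 --cpus-per-task=2 --mem=2G --time=04:00:00 --pty /bin/bash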
Default Mode Example
[rc.computing@login-00 ~]$ job-assist
SLURM Menu:
1. Default mode (srun --pty /bin/bash)
2. Interactive Mode
3. Batch Mode
4. Exit
Enter your option: 1
Running default job: srun --pty /bin/bash
srun: job 43910949 queued and waiting for resources
srun: job 43910949 has been allocated resources
[rc.computing@d1019 ~]$
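When you are done working on the compute node, type `exit` to end the job and release the allocation; your prompt returns to the login node.
#End the interactive session and return to login-00
exit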
Interactive Mode¶
In this mode, `job-assist` guides you through requesting resources interactively. You will be prompted to provide the following information:
- Partition Name: Choose from the available partitions, such as `debug`, `express`, `short`, or `gpu`.
- Number of Nodes: Specify the number of nodes required.
- Number of Tasks: Specify the number of tasks.
- CPUs per Task: Specify how many CPUs are needed per task.
- Time: Specify the maximum run time for the job in `HH:MM:SS` format.
- Memory: Specify the memory required per node in the correct format (e.g., `4G` for 4 GB).
If you select the `gpu` partition, you will also have the option to choose the type of GPU needed. This mode automatically submits the job to the scheduler and allocates the resources.
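The exact GPU flags `job-assist` emits are not shown here; in SLURM, a GPU request is typically expressed with `--gres`, so the resulting command plausibly resembles the sketch below (the `v100` GPU type is purely illustrative):
#Hypothetical srun command for the gpu partition; the GPU type name is an example
srun --partition=gpu --nodes=1 --ntasks=1 --cpus-per-task=2 --gres=gpu:v100:1 --time=01:00:00 --mem=4G --pty /bin/bash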
Interactive Mode Example
[rc.computing@login-00 ~]$ job-assist
SLURM Menu:
1. Default mode (srun --pty /bin/bash)
2. Interactive Mode
3. Batch Mode
4. Exit
Enter your option: 2
Partitions:
debug express short gpu
Enter partition name: short
Maximum number of nodes available for short is 2
Enter number of nodes (1-2): 1
Enter number of tasks: 2
Enter number of cpus per task: 4
Maximum time for short partition is 24:00:00
Enter time(HH:MM:SS): 01:00:00
Enter memory required per node (<size>[units]): 4G
Running command: srun --partition=short --nodes=1 --ntasks=2 --cpus-per-task=4 --time=01:00:00 --mem=4G --pty /bin/bash
srun: job 43911332 queued and waiting for resources
srun: job 43911332 has been allocated resources
[rc.computing@d1019 ~]$
Note
For the Default and Interactive modes, once the resources are allocated, the job transitions from the login node (`login-00`) to a compute node (e.g., `d1019`).
Batch Mode¶
This mode allows you to create and save an `sbatch` script with your specific requirements. By default, the batch script is saved in your `/home` directory; if you specify a directory when prompted, it is saved in that location instead.
Batch Mode Example
[rc.computing@login-00 ~]$ job-assist
SLURM Menu:
1. Default mode (srun --pty /bin/bash)
2. Interactive Mode
3. Batch Mode
4. Exit
Enter your option: 3
Partitions:
debug express short gpu
Enter partition name: debug
Enter number of nodes: 2
Enter number of tasks: 4
Enter number of cpus per task: 4
Maximum time for debug partition is 00:20:00
Enter time(HH:MM:SS): 00:20:00
Enter memory required per node (<size>[units]): 32M
Do you want to specify a directory to save the batch script [Defaults to home]? (y/n): n
Enter the filename for the batch script without any extension: batch-mode
Batch script saved to /home/<user-name>/batch-mode.sh
[rc.computing@login-00 ~]$
To view the content of the batch script:
nano /home/<user-name>/batch-mode.sh
#Replace <user-name> with your actual username in the path
This creates a basic batch script skeleton that SLURM can interpret. You can edit it to load software and environment modules and add the programs to execute.
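As a rough sketch, a script generated from the example values above would likely contain `#SBATCH` directives along these lines (the exact contents `job-assist` writes may differ, and the module and program names below are placeholders):
#!/bin/bash
#SBATCH --partition=debug
#SBATCH --nodes=2
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=4
#SBATCH --time=00:20:00
#SBATCH --mem=32M
#Load modules and add the commands to run, for example:
#module load <software>
#srun ./<your-program>
Once edited, submit the script to the scheduler with `sbatch`:
sbatch /home/<user-name>/batch-mode.sh
#Replace <user-name> with your actual username in the path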
Tip
Efficient Resource Usage: Requesting only the resources you need (e.g., nodes, memory, CPUs) can help your job start sooner. Be mindful that requesting more CPUs or longer run times may result in longer queue times.
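One way to right-size future requests is to check what a completed job actually used. SLURM's accounting command `sacct` reports this (field availability can vary with the cluster's accounting configuration):
#Show elapsed time, peak memory, and allocated CPUs for a finished job
sacct -j <jobid> --format=JobID,Elapsed,MaxRSS,AllocCPUS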