Quick Start Guide for H200s
Introduction
The Research Computing team is sharing this quick start guide to accessing the H200 GPUs on the Explorer HPC cluster. H200s are available to jobs submitted from the terminal with sbatch and srun, and through Open OnDemand, in the gpu-short, gpu, and multigpu partitions.
We have general documentation on using GPU resources on the Explorer HPC cluster; please see GPU Access. Only the --gres flag needs to change in your submission.
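For example, if an existing submission requests a generic GPU, pinning it to an H200 is a one-line change:

#SBATCH --gres=gpu:1         # generic request: any available GPU type
#SBATCH --gres=gpu:h200:1    # H200-specific request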
Using H200s in srun
srun --partition=gpu --nodes=1 --pty --gres=gpu:h200:1 --ntasks=1 --mem=4GB --time=01:00:00 /bin/bash
This example requests an interactive session on 1 node with 1 CPU core, 4 GB of CPU memory, and one H200 GPU for 1 hour.
Note
On the gpu partition, requesting more than 1 GPU (e.g., --gres=gpu:2) will cause your request to fail. Additionally, you cannot request all of the CPUs on a GPU node, as the remaining CPUs are reserved for jobs using the node's other GPUs.
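Once the interactive session starts, you can sanity-check the allocation; a minimal sketch, assuming nvidia-smi is on the node's default path:

# The name column should report an H200
nvidia-smi --query-gpu=name,memory.total --format=csv
# Slurm typically sets this to the GPU indices assigned to the job
echo $CUDA_VISIBLE_DEVICES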
Using H200s in sbatch
#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --nodes=1
#SBATCH --gres=gpu:h200:1
#SBATCH --time=01:00:00
#SBATCH --job-name=gpu_run
#SBATCH --mem=4GB
#SBATCH --ntasks=1
#SBATCH --output=myjob.%j.out
#SBATCH --error=myjob.%j.err
## <your code>
This example is an sbatch job requesting 1 node with 1 CPU core, 4 GB of CPU memory, and one H200 GPU for 1 hour.
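To run it, save the script under a name of your choosing (h200_job.sh below is just a placeholder) and submit it with the standard Slurm commands:

sbatch h200_job.sh    # submit; Slurm prints the assigned job ID
squeue -u $USER       # check the job's state while it is pending or running

Standard output and errors will be written to myjob.<jobid>.out and myjob.<jobid>.err, per the --output and --error directives above.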
Note
Requesting a specific type of GPU can result in longer wait times, depending on GPU availability at the time of submission.
Using H200s in Open OnDemand
In Open OnDemand, for the application you would like to launch, select gpu-short, gpu, or multigpu from the Partition drop-down menu. Then, from the GPU Type drop-down menu, select the h200 option.