The Discovery cluster provides you with access to over 24,000 CPU cores and over 200 GPUs. Discovery is connected to the university network over 10 Gbps Ethernet (GbE) for high-speed data transfer. Compute nodes are connected to each other with either 10 GbE or a high-performance HDR200 InfiniBand (IB) interconnect running at 200 Gbps (with some nodes running HDR100 IB, if HDR200 IB is not supported on those nodes).
CPU nodes and their feature names
As of February 2021, all previous CPU node feature names were updated to human-friendly names following the archspec microarchitecture specification (https://archspec.readthedocs.io/en/latest/index.html). This update applies only to CPU node feature names; GPU node feature names remain unchanged at this time. Table 1 shows the new feature names and their corresponding previous names. Note that the old feature names will be removed from use with jobs as of March 3, 2021; if your scripts use any of the old feature names, make sure to update them to the new ones.
Table 1: CPU Nodes
| Previous Feature Name | Current Node Feature Name | Number of Nodes |
|---|---|---|
If you are looking for information about GPUs, see Working with GPUs.
If you are looking for information about the partitions on Discovery, see Partitions.
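If you want to confirm which feature names are attached to each node before submitting a job, you can query Slurm directly from a Discovery login node. The following is a quick sketch using standard sinfo format options (the "ib" feature shown in the filter is one example; substitute whatever feature you care about):

```shell
# List every node with its advertised features (one line per node).
# "nodelist" and "features" are standard sinfo --Format field names.
sinfo --Node --Format=nodelist,features

# Show only the nodes that carry a specific feature, e.g. "ib":
# %N = node name, %f = feature list.
sinfo -N -o "%N %f" | grep -w ib
```

The feature strings printed by sinfo are exactly what the --constraint= flag described below matches against.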
When submitting a job with srun or sbatch, you can request specific hardware features by using the --constraint= flag. Currently, there are two supported options with this flag. For example, if you have a job that needs to use multiple nodes and you want to include only nodes connected by InfiniBand (IB), you can add --constraint=ib to your srun command or as a line in your sbatch script. Note that using a constraint can mean waiting longer for your job to start, because the scheduler (Slurm) must find and allocate the specific hardware that you requested. For more information about running jobs, see Using Slurm.
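As a concrete illustration, a minimal sbatch script using the ib constraint might look like the following. This is a sketch: the job name, node counts, and time limit are placeholder values, and you should substitute a partition and resources appropriate for your work.

```shell
#!/bin/bash
#SBATCH --job-name=ib_example        # placeholder job name
#SBATCH --nodes=2                    # multi-node job, where the IB interconnect matters
#SBATCH --ntasks-per-node=4          # placeholder task count
#SBATCH --time=00:30:00              # placeholder time limit
#SBATCH --constraint=ib              # only allocate InfiniBand-connected nodes

# Launch one task per allocated slot across the nodes.
srun hostname
```

The interactive equivalent is a single srun invocation, e.g. `srun --nodes=2 --constraint=ib hostname`.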