Course Guide¶
We support classroom education at Northeastern University by providing access to computing resources (CPU and GPU) and storage resources for instructors and their students.
We’ve supported courses from many disciplines, including biology, chemistry, civil engineering, machine learning, computer science, mathematics, and physics.
To gain access to HPC resources, instructors need to submit a classroom access form.
Important
Please submit classroom access requests prior to the beginning of each term (preferred), or at least one week before you plan to start using the HPC cluster for your class. If you are requesting a customized application, we require one month's time to complete it before you plan to use it.
Classroom setup¶
Once access is provided, each course will have a course-specific directory under /courses/, following the sample file tree shown below for the course BINF6430.202410:
/courses/
└── BINF6430.202410/
├── data/
├── shared/
├── staff/
└── students/
The staff/ sub-directory is populated with a folder for each instructor, co-instructor, and TA. The students/ sub-directory contains a folder for each student. The data/ and shared/ sub-directories can be populated by those in staff but are read-only for students. Students only have permission to access their own directories under students/ and cannot view another student's space.
All those in staff have read-write-execute permissions within the entirety of their course's directory, allowing them to store data and homework assignments, build conda environments, create new directories, etc., as they see fit.
Each course directory gets 1 TB of storage space by default. This amount can be increased in the initial classroom access application form, or requested at any time during an active course by contacting rchelp@northeastern.edu.
Once the course has ended and final grades have been submitted, all student personal directories will be deleted. The remaining course space, including all data and shared class files, will be archived for one year. Any students who had access to the HPC cluster only through the course will no longer have access once the course is completed.
Please see our page on getting-access if you would like an account that persists beyond your coursework.
Courses Partitions¶
We have two partitions dedicated to the use of students and instructors for the duration of their course.
| Name | Time Limit (default/max) | Running Jobs (max) | RAM Limit |
|---|---|---|---|
| courses | 4 hrs / 24 hrs | 50 | 256 GB |
| courses-gpu | 4 hrs / 8 hrs | 1 | 12 GB |
The resources available in the courses and courses-gpu partitions can be queried from the command line with the sinfo command. We adjust the resources in courses and courses-gpu each term based on the number of courses and the requested usage per course.
sinfo -p courses-gpu --Format=nodes,cpus,gres,statecompact
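A similar query works for the courses partition; for example (the memory field reports the RAM available per node):
sinfo -p courses --Format=nodes,cpus,memory,statecompact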
Important
The compute resources for courses are shared across all courses each term, and we monitor their usage daily. We send email notifications to users who have been idle on courses-gpu for one hour. We highly recommend ending jobs when your work has finished, as this frees up the resource for other students. When all students do this, it increases the availability of resources for everyone in courses.
These partitions can be used in the following ways:
sbatch script¶
An sbatch script can be submitted on the command line via the command sbatch scriptname.sh. Below are some examples of sbatch scripts using the courses and courses-gpu partitions. See slurm-running-jobs for more information on running sbatch scripts, or run man sbatch for additional sbatch parameters.
courses partition
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --time=4:00:00
#SBATCH --job-name=MyCPUJob
#SBATCH --partition=courses
#SBATCH --mail-type=ALL
#SBATCH [email protected]
# commands to execute
courses-gpu partition
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --time=4:00:00
#SBATCH --job-name=MyGPUJob
#SBATCH --partition=courses-gpu
#SBATCH --gres=gpu:1
#SBATCH --mail-type=ALL
#SBATCH [email protected]
# commands to execute for gpu
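Submitting either script (shown here with a hypothetical filename) prints the Slurm job ID assigned to the job, which you can use later to monitor or cancel it:
sbatch my_gpu_job.sh
# Submitted batch job 1234567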
srun interactive session¶
An interactive session can be run on the command line via the srun command, as shown in the examples below. We have more information on running jobs using srun, or you can run man srun in the command line to see additional parameters that can be set with srun.
courses partition
srun --time=4:00:00 --job-name=MyJob --partition=courses --pty /bin/bash
courses-gpu partition
srun --time=4:00:00 --job-name=MyJob --partition=courses-gpu --gres=gpu:1 --pty /bin/bash
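When you have finished working in an interactive session, type exit at the prompt to end the session and release the allocated resources back to the partition:
exit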
Open OnDemand¶
We have interactive versions of several widely used applications available on Open OnDemand, including JupyterLab Notebook, RStudio, MATLAB, GaussView, and more.
You can log in to the Open OnDemand website via the link below.
All of the applications under the “Courses” tab on the dashboard can be set to either the courses or courses-gpu partition via each application's pull-down menus.
Monitoring Jobs¶
Whichever way you choose to run your jobs, you can monitor their progress with the squeue command.
squeue -u username
You can also monitor jobs being run on either of the courses partitions.
squeue -p courses
squeue -p courses-gpu
Jobs can be canceled with the scancel command and the Slurm job ID that is assigned when your job is submitted to the scheduler.
scancel jobid
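You can also cancel all of your own running and pending jobs at once by passing your username to scancel:
scancel -u username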
Note
A cluster is a collection of shared resources. We highly recommend canceling any jobs that are still running in an interactive session (via Open OnDemand or srun) when you have completed your work. This frees up the resources for other classmates and instructors.
Software Applications¶
All courses have access to the command line.
We have many software applications installed system-wide as modules, which are available through the command line via the module command.
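For example, you can list the installed modules and then load one from the command line (the module name below is only an example; check the output of module avail for what is installed on the cluster):
module avail
module load anaconda3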
Professors should create custom conda environments for their course, which can be used in JupyterLab notebooks, in interactive sessions (srun), or in sbatch scripts on the command line.
Custom Course Applications¶
At Northeastern University, instructors have a great deal of flexibility in how they use the HPC cluster for their classroom, and this is most apparent in the use of software applications.
We encourage professors to perform local software installations via conda environments within the /courses directory for their class. These can be used by students to complete tutorials and homework assignments. Students can also create their own conda environments in their /courses/course.code/students/username directory to complete their own projects. Conda environments can be used to install a variety of research software and are not only useful for coding in Python.
For most courses, the instructor is able to create a shared conda environment in their /courses directory that can provide all the necessary packages for the class.
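As a sketch, a shared environment could be created and activated like this (the module name, environment path, and package list are examples only; activation details depend on the cluster's conda setup):
# load a conda distribution module (example name)
module load anaconda3
# create a shared environment under the course's shared/ directory
conda create --prefix /courses/BINF6430.202410/shared/envs/course-env python=3.11 numpy pandas
# students can then activate the environment by its path
conda activate /courses/BINF6430.202410/shared/envs/course-env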
In other cases, where specialized software is needed, please book a classroom consultation with one of the RC team members to discuss what is needed. Please allow at least one month for specialized app development and testing. In some cases we may be unable to provide the exact specifications requested. We will work with the instructor to find a suitable solution.