Using MATLAB¶
MATLAB is available on the cluster as a module and as an interactive app on Open OnDemand. You can also download MATLAB for your computer through the Northeastern portal on the MATLAB website. Note that the procedures detailed below pertain specifically to using MATLAB on the cluster, and not to using MATLAB on your computer.
Installing MATLAB toolboxes¶
Use the following procedure if you need to install a MATLAB toolbox:
Download the toolbox from its source website.
Connect to the cluster.
Create a directory in your /home directory. We recommend creating a directory called
matlab
by typing:mkdir /home/
/matlab #where is your username Go to the directory you just created by typing:
cd /home/
/matlab Unzip the toolbox file by typing:
unzip
Load MATLAB by typing:
module load matlab
Start MATLAB by typing:
matlab
Add the toolbox to your PATH by typing:
addpath(‘/home/
/matlab/ ’) #where is the name of the toolbox you just unzipped If this is a toolbox you want to use more than once, you should save it to your path by typing:
savepath()
You can now use the toolbox within MATLAB. When you are done, type
quit
.
Using MATLAB Parallel Server¶
Configuration of MATLAB client on the HPC
The cluster has MATLAB Parallel Server installed and this section details an example of how you can set up and use the MATLAB Parallel Computing Toolbox on the HPC. This walkthrough uses MATLAB R2023a launched as an interactive app on the Open OnDemand web portal. There are several parts to this walkthrough so we suggest that you read it through completely before starting. The parameters presented represent only one scenario and the results may differ when you run the examples.
Go to http://ood.discovery.neu.edu. If prompted, sign in with your cluster username and password.
Click Interactive Apps, and select MATLAB.
Select MATLAB version R2023a or newer, and set the time for the length of your job, the number of cpus to be 8, and the memory to at least 16 GB. Click Launch.
If necessary, adjust the Compression and Image Quality, and then click Launch MATLAB.
Once MATLAB is open, click on the Home tab, click Parallel > Discover Clusters… to discover the profile. This will open a window where you should select ‘on your network’ and click next and then select ‘Discovery’ and click finished. Note, this is valid for R2023a and newer. Create a cluster profile to run parallel jobs by running
configCluster
in the Command Window. Note:configCluster
should only be called once on the HPC.
Now MATLAB is configured to have jobs submitted to the HPC cluster and not run in the current session.
Installation and Configuration of MATLAB on a Local Machine
MATLAB can be configured on your local machine to submit jobs to the HPC cluster. To complete this, you will need to download the following directory from Github: MATLAB Desktop Parallel Toolbox Setup.
In your desktop MATLAB instance, run:
>> userpath
and move the compressed directory you downloaded from Github to that location and unzip the directory. To configure MATLAB to run parallel jobs on the cluster, run the following in the Command Window:
>> configCluster
Note
This only needs to be called once per version of MATLAB you are using to run jobs on the HPC cluster.
Submitting jobs to the HPC cluster will require SSH credentials and you will be prompted for your username and password or a private SSH key. The username and/or location of the private key will be saved for future session in MATLAB.
Jobs will now default to running on the HPC cluster rather than submitting to your local machine.
To submit jobs to your local machine instead of the HPC cluster, run the following:
>> % Get a handle to the local resources
>> c = parcluster('local');
Configuring Jobs¶
Prior to submitting jobs to the HPC, various parameters can be assigned and adjusted for your job such as partition, email, job walltime, etc.
To get a handle on the cluster and the current configurations please run the following command in the Command Window:
>> % Get a handle to the cluster
>> c = parcluster;
A required field that needs to be set is the memory per CPU core:
>> % Specify memory to use, per core (default: 4gb)
>> c.AdditionalProperties.MemPerCPU = '6gb';
You need to save the changes after modifying the AdditionalProperties:
>> c.saveProfile
To view the values of the currently saved configuration, use the following command in the Command Window:
>> % To view current properties
>> c.AdditionalProperties
If you need to unset a saved parameter, you can do the following in the Command Window:
>> % Turn off email notifications
>> c.AdditionalProperties.EmailAddress = '';
>> c.saveProfile
The following are optional fields that can be set and adjusted for the parcluster and include the following items.
Specifying a constaint such as a CPU constraint:
>> % Specify a constraint
>> c.AdditionalProperties.Constraint = 'feature-name';
Setting an email to receive updates on the SLURM job:
>> % Request email notification of job status
>> c.AdditionalProperties.EmailAddress = '[email protected]';
Setting the number of GPUs used and the type of GPU:
>> % Specify number of GPUs (default: 0)
>> c.AdditionalProperties.GPUsPerNode = 1;
>> c.AdditionalProperties.GPUCard = 'gpu-card';
Note
If you need a GPU, you need to make sure the partition you are submitting the job to is the gpu
or multigpu
partition or a private partition that has GPUs available.
Selecting the partition you want the job to run on the HPC cluster:
>> % Specify the partition
>> c.AdditionalProperties.Partition = 'partition-name';
Setting the number of cores for the job:
>> % Specify cores per node (default: 0)
>> c.AdditionalProperties.ProcsPerNode = 4;
Setting exclusive use of the node:
>> % Set node exclusivity (default: false)
>> c.AdditionalProperties.RequireExclusiveNode = true;
Note
The exclusive node option can increase your wait time in the queue so only use if required.
If you are running the job on a reservation that has been setup:
>> % Use reservation
>> c.AdditionalProperties.Reservation = 'reservation-name';
Setting the time for the job to run on the HPC cluster:
>> % Specify the wall time (e.g., 1 day, 5 hours, 30 minutes)
>> c.AdditionalProperties.WallTime = '1-05:30';
Note
You need to make sure the time fits within the max time allowed on the partition you are submitting to or the jobs will not run.
Using the MATLAB Client Interactively on the HPC Cluster¶
If you want to run an interactive pool job on the cluster we can use a parpool
like above:
>> % Get a handle to the cluster
>> c = parcluster;
>> % Open a pool of 64 workers on the cluster
>> pool = c.parpool(64);
Instead of the open instance of MATLAB running this code, this will be run across the HPC cluster. Run the following code:
>> % Run a parfor over 1000 iterations
>> parfor idx = 1:1000
a(idx) = rand;
end
When you no longer need the pool, you can delete it and free up the resources with:
>> % Delete the pool
>> pool.delete
Using the MATLAB with a Batch Job on the HPC Cluster¶
You can use the batch
command to submit passively running jobs to the HPC cluster from MATLAB. This command will return a job object that can be used to access the output of the submitted batch job.
>> % Get a handle to the cluster
>> c = parcluster;
>> % Submit job to query where MATLAB is running on the cluster
>> job = c.batch(@pwd, 1, {}, 'CurrentFolder','.');
>> % Query job for state
>> job.State
>> % If state is finished, fetch the results
>> job.fetchOutputs{:}
>> % Delete the job after results are no longer needed
>> job.delete
To see the jobs that have been completed or still are running, you can call parcluster
to return the cluster object that stores this information. Run the following command in the Command Window:
>> c = parcluster;
>> jobs = c.Jobs
>>
>> % Get a handle to the second job in the list
>> job2 = c.Jobs(2);
Once you locate the job, you can retrieve the results with fetchOutputs
if you are using it interactively in the Command Window or you can use load
if you are using it in a MATLAB script.
>> % Fetch all results from the second job in the list
>> job2.fetchOutputs{:}
Parallel MATLAB Batch Job¶
The batch
command can also submit parallel workflows to the HPC cluster. As an example, save the following code to a MATLAB script called parallel_example.m.
function [sim_t, A] = parallel_example(iter)
if nargin==0
iter = 8;
end
disp('Start sim')
t0 = tic;
parfor idx = 1:iter
A(idx) = idx;
pause(2)
idx
end
sim_t = toc(t0);
disp('Sim completed')
save RESULTS A
end
When you submit the job using the batch
command, you will need to also specify the MATLAB Pool argument:
>> % Get a handle to the cluster
>> c = parcluster;
>> % Submit a batch pool job using 4 workers for 16 simulations
>> job = c.batch(@parallel_example, 1, {16}, 'Pool',4, ...
'CurrentFolder','.');
>> % View current job status
>> job.State
>> % Fetch the results after a finished state is retrieved
>> job.fetchOutputs{:}
ans =
8.8872
This example took 8.89 seconds to run utilizing four workers. These jobs will always request N+1 number of CPUs for the task since 1 worker is required to manage the batch job and the pool of workers.
We can run the same simulation but increase the size of the Pool. We will retrieve the results later so we will keep the submitted job id as a reference.
Note
Be careful with increasing the numbers of workers for your job because there can be a point where the returns on performance will diminish.
>> % Get a handle to the cluster
>> c = parcluster;
>> % Submit a batch pool job using 8 workers for 16 simulations
>> job = c.batch(@parallel_example, 1, {16}, 'Pool',8, ...
'CurrentFolder','.');
>> % Get the job ID
>> id = job.ID
id =
4
>> % Clear job from workspace (as though MATLAB exited)
>> clear job
You need to get a handle of the cluster with the parcluster
command and then you can use the findJob
method to obtain information about the job:
>> % Get a handle to the cluster
>> c = parcluster;
>> % Find the old job
>> job = c.findJob('ID', 4);
>> % Retrieve the state of the job
>> job.State
ans =
finished
>> % Fetch the results
>> job.fetchOutputs{:};
ans =
4.7270
This job ran for 4.73 seconds using 8 workers in MATLAB. You can run the code with different numbers of works to determine the ideal number of works for the job.
You can also use the MATLAB GUI to obtain the results from the job using the Job Monitor found in the Parallel dropdown in the Home tab and select Monitor Jobs.
Function |
Description |
Applies Only to Desktop |
---|---|---|
clusterFeatures |
List of cluster features/constraints |
|
clusterGpuCards |
List of cluster GPU cards |
|
clusterPartitionNames |
List of cluster partition/queue names |
|
disableArchiving |
Modify file archiving to resolve file mirroring issue |
true |
fixConnection |
Reestablish cluster connection (e.g., after reconnection of VPN) |
true |
willRun |
Explain why job is queued |
Debugging¶
You can debug MATLAB errors with the following commands depending if it is a serial or parallel job. If the job is a serial submission, you can use the getDebugLog
method to view the error log that is created. In the Command Window:
>> c.getDebugLog(job.Tasks)
If you submit a Pool job, you need to specify the job object:
>> c.getDebugLog(job)
When troubleshooting a job, having the SLURM job ID can be helpful to understand what might have occurred in the job. You can obtain this by calling `getTaskSchedulerIDs() in the Command Window:
>> job.getTaskSchedulerIDs()
ans =
25539
To Learn More¶
To learn more about the MATLAB Parallel Computing Toolbox, checkout these resources:
Parallel Computing Documentation