Home Directory Storage Quota¶
There are strict quotas for each home directory (i.e., /home/<username>), and staying within the quota is vital for preventing issues on the HPC. This page provides some best practices for keeping within the quota. For more information about data storage on the HPC, see data-storage.
Important
All commands on this page should be run from a compute node on the short partition, because they are CPU-intensive. You can find more information on getting a job on a compute node in Interactive Jobs: srun Command.
Utilize /projects and /scratch¶
Use /projects for long-term storage. PIs can request a folder in /projects via New Storage Space Request and additional storage via Storage Space Extension Request. Use /scratch/<username> for temporary or intermediate files, and then move files from /scratch to /projects for persistent storage; this is the recommended workflow.
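For example, a typical run might write intermediate output to /scratch and then copy the finished results into /projects once it completes; a minimal sketch, where <results-dir> and <project-name> are illustrative placeholders:
# copy finished results from scratch into your PI's projects directory
rsync -a /scratch/<username>/<results-dir>/ /projects/<project-name>/<results-dir>/
# after verifying the copy, remove the scratch copy ahead of the purge policy
rm -r /scratch/<username>/<results-dir>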
Note
Please be mindful of the /scratch purge policy, which can be found on the Research Computing Policy Page. See data-storage for information on /projects and /scratch.
How To Check Your Quotas¶
You can see exactly how much of your quota is being used in your /projects, /scratch, or /home directories by running the check-quota script from any node in the short partition.
First, launch a job on a node in the short partition:
srun -p short --pty bash
Then run check-quota with the desired path as follows:
# check your home quota
check-quota /home/<username>
# check your scratch quota
check-quota /scratch/<username>
# check a projects directory quota
check-quota /projects/<directory>
The output will be of the following form (inode count refers to the number of files):
Directory <> has the following quota:
Path: <directory>
Used Disk Space: 0.0 Tib
Disk Space Soft Limit: 0.07 Tib
Disk Space Hard Limit: 0.10 Tib
Used Inodes: 0
Inodes Soft Limit: 2500000
Inodes Hard Limit: 5000000
Warning
You can only see quotas for directories to which you have access; checking quotas for directories you cannot access is not supported.
Analyze Disk Usage¶
To evaluate directory-level usage, you can use the du command. From a compute node, run the following command from your /home/<username> directory:
du -shc .[^.]* ~/*
This command will output the size of each file, directory, and hidden directory in your /home/<username> space, with the total for your /home directory on the last line of the output. After identifying large files and directories, you can move them to the appropriate location (e.g., /projects for research data) or back them up and delete them if they are no longer required. An example output would look like:
[<username>@<host> directory]$ du -shc .[^.]* ~/*
39M .git
106M examples
41K README.md
3.3M software-installation
147M total
Note
The du command can take a few minutes to run in /home/<username>.
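If the flat listing is hard to scan, you can optionally sort the same output by size so the largest items appear last (this assumes GNU sort, which supports the -h flag):
# list each item in your home directory, sorted from smallest to largest
du -sh .[^.]* ~/* 2>/dev/null | sort -h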
Cleaning Directories¶
Local¶
We advise against using pip install to install packages outside of a conda environment or Python virtual environment (for example, while in a JupyterLab notebook or interactive Python session). These installations are placed in your .local directory, adding to your /home quota. Additionally, the presence of different packages in .local can have a negative impact on the function of applications on Open OnDemand (OOD). Please ensure all the packages you need are installed in a conda environment or Python virtual environment.
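For example, you could create a Python virtual environment under your PI's /projects directory and install packages there rather than into ~/.local; the path below is only an illustrative placeholder, and you may need to load a Python or Anaconda module first:
# create a virtual environment in /projects instead of installing into ~/.local
python -m venv /projects/<project-name>/envs/<my-venv>
source /projects/<project-name>/envs/<my-venv>/bin/activate
# packages now install into the virtual environment, not your /home quota
pip install <package>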
If there are no actively running processes, the entire .local directory can be moved to .local-off, or individual packages can be removed, usually from within /home/<username>/.local/lib/pythonXX/site-packages.
You can check for running processes via:
squeue -u <username>
To move your .local to .local-off:
mv /home/<username>/.local /home/<username>/.local-off
Conda¶
Note
Conda environments are part of your research and should be stored in your PI's /projects directory.
Here are some suggestions to reduce the storage size of the environments for those using the /home/<username>/.conda directory.
Remove unused packages and clear Conda's caches by loading an Anaconda module and running the following:
source activate <your environment>
conda clean --all
This will only delete unused packages in your ~/.conda/pkgs directory.
To remove any unused conda environments, run:
conda env list
conda env remove --name <your environment>
Apptainer¶
If you have pulled any containers to the HPC using Apptainer, you can clean the container cache in your /home/<username> directory by running the following command from a compute node:
apptainer cache clean
To avoid your ~/.apptainer directory filling up, you can set a temporary directory for Apptainer to use when you pull a container, so that the associated files are stored in that location instead. An example of this procedure (where <project-name> is your PI's /projects directory) is the following:
mkdir /projects/<project-name>/apptainer_tmp
export APPTAINER_TMPDIR=/projects/<project-name>/apptainer_tmp
Then, pull the container using Apptainer as usual.
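APPTAINER_TMPDIR covers the temporary build space; if you also want the download cache itself kept out of ~/.apptainer, Apptainer also recognizes the APPTAINER_CACHEDIR environment variable. A minimal sketch, using the same placeholder project directory:
# optionally redirect the Apptainer cache out of your home directory as well
mkdir /projects/<project-name>/apptainer_cache
export APPTAINER_CACHEDIR=/projects/<project-name>/apptainer_cache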
Cache¶
The ~/.cache directory can become large with general use of the HPC and Open OnDemand. Make sure you are not running any processes or jobs at the time by running the following:
squeue -u <username>
which prints a table with the columns JOBID, PARTITION, NAME, USER, ST, TIME, NODES, and NODELIST(REASON). The table is empty when you have no running jobs, in which case it is safe to remove ~/.cache.
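Once squeue shows no jobs, a minimal way to clear the directory is:
# remove the cache directory; applications will recreate it as needed
rm -rf ~/.cache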
Storing research environments¶
Conda environments¶
Use conda environments for Python on the HPC. To create an environment in /projects, use the --prefix flag as follows, where <project-name> is your PI's /projects directory and <my conda env> is an empty directory to store your Conda environment:
conda create --prefix=/projects/<project-name>/<my conda env>
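To use an environment created with --prefix, activate it by its path rather than by a name; for example (with the same placeholder paths):
# activate the prefix-based environment by its full path
source activate /projects/<project-name>/<my conda env>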
Reuse the same conda environment to save storage space and time (i.e., avoid duplicate conda environments); everyone working on a project with access to the same /projects directory can easily share environments this way.
Apptainer containers¶
Containers pulled, built, and maintained for research work should be stored in your PI's /projects directory, not your /home/<username> directory.
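For example, apptainer pull accepts a destination file as its first argument, so an image can be written directly into /projects rather than under /home (the container name and URI below are placeholders):
# pull an image directly into your PI's projects directory
apptainer pull /projects/<project-name>/containers/<image>.sif docker://<image>:<tag>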