Home Directory Storage Quota

There are strict quotas for each home directory (i.e., /home/<username>), and staying within the quota is vital for preventing issues on the HPC. This page provides some best practices for keeping within the quota. For more information about data storage on the HPC, see data-storage.

Important

All commands on this page should be run from a compute node on the partition short, because they are CPU-intensive. You can find more information on getting a job on a compute node from Interactive Jobs: srun Command.

Utilize /projects and /scratch

Use /projects for long-term storage. PIs can request a folder in /projects via New Storage Space Request and additional storage via Storage Space Extension Request. Use /scratch/<username> for temporary or intermediate files, then move the files you need to keep from /scratch to /projects for persistent storage. This is the recommended workflow.

Note

Please be mindful of the /scratch purge policy, which can be found on the Research Computing Policy Page. See data-storage for information on /projects and /scratch.

How To Check Your Quotas

You can see exactly how much of your quota is being used in your /home, /scratch, or /projects directories by running the check-quota script from any node in the short partition.

First, launch a job on a node in the short partition.

srun -p short --pty bash

And then run check-quota with the desired path as follows:

# check your home quota
check-quota /home/<username>
# check your scratch quota
check-quota /scratch/<username>
# check a projects directory quota
check-quota /projects/<directory>

The output will be of the following form (inode count refers to the number of files):

Directory <> has the following quota:
	Path: <directory>
	Used Disk Space: 0.0 Tib
	Disk Space Soft Limit: 0.07 Tib
	Disk Space Hard Limit: 0.10 Tib
	Used Inodes: 0
	Inodes Soft Limit: 2500000
	Inodes Hard Limit: 5000000
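To see at a glance how close you are to the soft limit, the report can be piped through a small filter. The following is a minimal sketch, assuming check-quota writes its report to standard output in the format shown above; quota_percent is a hypothetical helper name, and the numbers in the sample call are made up for illustration.

```shell
# Sketch: report disk usage as a percentage of the soft limit.
# Reads text in the check-quota report format from standard input.
quota_percent() {
    awk -F': ' '
        /Used Disk Space/       { used = $2 + 0 }   # "0.035 Tib" -> 0.035
        /Disk Space Soft Limit/ { soft = $2 + 0 }
        END {
            if (soft > 0) printf "%.0f%% of soft limit used\n", 100 * used / soft
            else print "soft limit not found"
        }'
}

# On the cluster (assumed usage): check-quota /home/<username> | quota_percent
# Illustrative sample text standing in for a real report:
printf '\tUsed Disk Space: 0.035 Tib\n\tDisk Space Soft Limit: 0.07 Tib\n' | quota_percent
```

The same pattern could be extended to the inode lines if the file count is the limit you are approaching.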

Warning

You will only be able to see quotas of directories to which you have access; attempting to see quotas for directories that you don’t have access to is not supported.

Analyze Disk Usage

To evaluate directory-level usage, you can use the du command. From a compute node, run the following command from your /home/<username> directory:

du -shc .[^.]* ~/*

This command will output the size of each file, directory, and hidden directory in your /home/<username> space, with the total of your /home directory being the last line of the output. After identifying the large files and directories, you can move them to the appropriate location (e.g., /projects for research) or back up and delete them if they are no longer required. An example output would look like:

[<username>@<host> directory]$  du -shc .[^.]* ~/*
39M     .git
106M    examples
41K     README.md
3.3M    software-installation
147M    total

Note

The du command can take a few minutes to run in /home/<username>.
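Because the quota covers both disk space and inodes, it can help to rank the top-level entries by size and by file count. The following is a minimal sketch using standard du, find, and sort; largest_entries and inode_counts are hypothetical helper names, not site-provided tools.

```shell
# Sketch: rank the top-level entries of a directory by disk usage and by file count.

# Print the N largest entries (size in KiB, then path), largest first.
largest_entries() {
    local dir="${1:-$HOME}" n="${2:-10}"
    # -s summarizes each entry; -k reports KiB so sort can compare numerically.
    # The second glob picks up hidden entries while skipping "." and "..".
    du -sk -- "$dir"/* "$dir"/.[!.]* 2>/dev/null | sort -rn | head -n "$n"
}

# Print the N entries holding the most files and directories (count, then path).
inode_counts() {
    local dir="${1:-$HOME}" n="${2:-10}" entry
    for entry in "$dir"/* "$dir"/.[!.]*; do
        [ -e "$entry" ] || continue
        # find prints one line per file or directory at or below the entry
        printf '%s\t%s\n' "$(find "$entry" | wc -l | tr -d ' ')" "$entry"
    done | sort -rn | head -n "$n"
}

# Example (run from a compute node):
#   largest_entries /home/<username> 10
#   inode_counts /home/<username> 10
```

A directory with a small size but a very large file count (for example, a package cache) can exhaust the inode limit well before the disk space limit.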

Cleaning Directories

Local

We advise against using pip install to install packages outside of a conda environment or Python virtual environment (for example, while in a JupyterLab Notebook or interactive Python session). These installations are placed in your .local directory, adding to your /home quota. Additionally, the presence of different packages in .local can negatively affect applications on Open OnDemand (OOD). Please ensure all the packages you need are installed in a conda or Python virtual environment.

If there are no actively running processes, the entire .local directory can be moved to .local-off, or individual packages can be removed, usually from within /home/<username>/.local/lib/pythonXX/site-packages.

You can check for running processes via:

squeue -u <username>

To move your .local to .local-off:

mv /home/<username>/.local /home/<username>/.local-off

Conda

Note

Conda environments are part of your research and should be stored in your PI’s /projects directory.

Here are some suggestions to reduce the storage size of the environments for those using the /home/<username>/.conda directory.

Remove unused packages and clear caches of Conda by loading an Anaconda module and running the following:

source activate <your environment>
conda clean --all

This will only delete unused packages in your ~/.conda/pkgs directory.

To remove any unused conda environments, run:

conda env list
conda env remove --name <your environment>

Apptainer

If you have pulled any containers to the HPC using Apptainer, you can clean your container cache in your /home/<username> directory by running the following command from a compute node:

apptainer cache clean

To keep your ~/.apptainer directory from filling up, you can set a temporary directory for Apptainer to use when you pull a container. For example, where <project-name> is your PI’s /projects directory:

mkdir /projects/<project-name>/apptainer_tmp
export APPTAINER_TMPDIR=/projects/<project-name>/apptainer_tmp

Then, pull the container using Apptainer as usual.
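The two commands above can be wrapped in a small helper so that every session uses the same locations. The following is a minimal sketch; use_project_apptainer_dirs is a hypothetical helper name, and the apptainer_cache directory and the APPTAINER_CACHEDIR setting are additions not described in the text above (Apptainer reads APPTAINER_CACHEDIR to locate its cache, which otherwise lives under ~/.apptainer).

```shell
# Sketch: keep Apptainer's temporary files and cache out of /home.
use_project_apptainer_dirs() {
    local base="$1"   # e.g. /projects/<project-name>
    # -p creates missing directories and succeeds if they already exist
    mkdir -p "$base/apptainer_tmp" "$base/apptainer_cache" || return 1
    export APPTAINER_TMPDIR="$base/apptainer_tmp"
    # Assumption: relocating the cache as well is desired; not part of the original steps.
    export APPTAINER_CACHEDIR="$base/apptainer_cache"
}

# Example:
#   use_project_apptainer_dirs /projects/<project-name>
#   then pull the container as usual
```

The exports only last for the current shell, so the helper needs to be called again in each new session or job script.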

Cache

The ~/.cache directory can become large through general use of the HPC and Open OnDemand. Before removing it, make sure you are not running any processes or jobs by running the following:

squeue -u <username>

This prints a table with the columns JOBID, PARTITION, NAME, USER, ST, TIME, NODES, and NODELIST(REASON). The table is empty when no jobs are running, in which case it is safe to remove ~/.cache.
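The check and the removal can be combined so that ~/.cache is only deleted when squeue reports nothing. The following is a minimal sketch; clear_cache_if_idle is a hypothetical helper name, and it relies on squeue's -h option to suppress the header row so that any remaining output indicates a job.

```shell
# Sketch: remove the cache directory only when squeue lists no jobs for the user.
clear_cache_if_idle() {
    local user="${1:-$USER}" cache_dir="${2:-$HOME/.cache}" listing
    # -h drops the header, so empty output means no jobs are queued or running
    listing="$(squeue -h -u "$user")" || {
        echo "squeue failed; leaving $cache_dir alone" >&2
        return 1
    }
    if [ -n "$listing" ]; then
        echo "Jobs found for $user; leaving $cache_dir alone" >&2
        return 1
    fi
    du -sh "$cache_dir"        # show how much space is about to be freed
    rm -rf -- "$cache_dir"
}

# Example: clear_cache_if_idle <username>
```

Note that squeue reports batch and interactive jobs, not processes started outside the scheduler, so the check is only as complete as the job list.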

Storing research environments

Conda environments

Use conda environments for Python on the HPC. To create an environment in /projects, use the --prefix flag as follows, where <project-name> is your PI’s /projects directory and <my conda env> is an empty directory to store your Conda environment:

conda create --prefix=/projects/<project-name>/<my conda env>

Reuse the same conda environment where possible to save storage space and time (i.e., avoid duplicate conda environments). Because everyone on a project can access the same /projects directory, an environment stored there can easily be shared across the project.
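If an environment already exists under /home/<username>/.conda, one way to relocate it is to clone it to a /projects prefix and then remove the original. The following is a minimal sketch, assuming an Anaconda module is loaded; move_env_to_projects is a hypothetical helper name, and cloning is one possible approach rather than a procedure prescribed on this page.

```shell
# Sketch: copy a named conda environment to a /projects prefix, then drop the original.
move_env_to_projects() {
    local env_name="$1" target="$2"
    # --clone copies an existing environment; --prefix places the copy at the given path
    conda create --yes --prefix "$target" --clone "$env_name" || return 1
    # Remove the original only after the copy succeeded
    conda env remove --yes --name "$env_name"
}

# Example:
#   move_env_to_projects <your environment> "/projects/<project-name>/<my conda env>"
```

After the move, the environment is activated by its full path rather than by name.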

For more information, see creating custom Conda environments.

Apptainer containers

Containers pulled, built, and maintained for research work should be stored in your PI’s /projects directory, not your /home/<username> directory.