Running jobs with the SLURM System

Information

  • Job submission is done through the SLURM queuing system.
  • Ares and Athena share the configuration and usage patterns previously defined for the Prometheus cluster; see the Prometheus documentation for details.
  • Grants on Ares require appending a resource suffix to the grant name: -cpu, -cpu-bigmem, or -gpu.
  • Grants on Athena require appending the suffix -gpu-a100 to the grant name.
  • Jobs can be submitted both in batch mode and interactively.
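
For example, with a hypothetical grant named plgexample, a job would be charged with -A plgexample-cpu on Ares CPU partitions or -A plgexample-gpu-a100 on Athena (plgexample is a placeholder; substitute your own grant name):

sbatch -A plgexample-cpu script.sh        # Ares, CPU partitions
sbatch -A plgexample-gpu-a100 script.sh   # Athena, A100 GPU partition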

For detailed information on how to use the Ares and Athena clusters, please refer to their dedicated documentation pages.

Available Queues for Clusters

  • Ares

Name            | Timelimit | Resource type (account suffix) | Access requirements                                 | Description
plgrid          | 72h       | -cpu                           | Generally available                                 | Standard partition.
plgrid-testing  | 1h        | -cpu                           | Generally available                                 | High priority; testing jobs; limited to 3 running jobs.
plgrid-now      | 12h       | -cpu                           | Generally available                                 | The highest priority; interactive jobs; limited to 1 running or queued job.
plgrid-long     | 168h      | -cpu                           | Requires a grant with a maximum job runtime of 168h | Jobs with extended runtime.
plgrid-bigmem   | 72h       | -cpu-bigmem                    | Requires a grant with CPU-BIGMEM resources          | Jobs requiring an extended amount of memory.
plgrid-gpu-v100 | 48h       | -gpu                           | Requires a grant with GPGPU resources               | GPU partition.

  • Athena

Name            | Timelimit | Account suffix | Remarks
plgrid-gpu-a100 | 48h       | -gpu-a100      | GPU A100 partition.

  • Tryton

Name           | Timelimit
plgrid-testing | 1h
plgrid         | 72h
plgrid-long    | 168h

  • BEM2

Name         | Timelimit
plgrid-short | 24h
plgrid       | 72h
plgrid-long  | 168h
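
The partitions and time limits configured on a given cluster can also be checked directly with standard SLURM commands (output formats may vary slightly between SLURM versions):

sinfo -o "%P %l"                 # list partitions with their time limits
scontrol show partition plgrid   # show the full settings of one partition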

Batch Mode

The sbatch command is used to submit batch jobs. Usage: sbatch script.sh. Within the script, each option is given on a line starting with the #SBATCH directive (including the # symbol). Detailed information can be found in man sbatch and sbatch --help.
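
The same options can also be passed on the sbatch command line, where they take precedence over the corresponding #SBATCH directives in the script, for example (the grant name is a placeholder):

sbatch -p plgrid-testing --time=00:30:00 -A <grant_id>-cpu script.sh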

Sample script:

Warning

This is a sample script and you SHOULD NOT run it as-is. Before running it, customize it to your needs by changing, for example, the grant name, partition, and job duration.

#!/bin/bash -l
## Job Name
#SBATCH -J ADFtestjob
## Number of allocated nodes
#SBATCH -N 1
## Number of tasks per node (default is the number of allocated cores per node)
#SBATCH --ntasks-per-node=1
## Amount of memory per CPU core (default is 5GB per core)
#SBATCH --mem-per-cpu=1GB
## Maximum job duration (format HH:MM:SS)
#SBATCH --time=01:00:00 
## Grant name for resource usage accounting
#SBATCH -A <grant_id-suffix>
## Partition specification
#SBATCH -p plgrid-testing
## Standard output file
#SBATCH --output="output.out"
## Standard error file
#SBATCH --error="error.err"

## Change to the directory from which the sbatch command was invoked
cd "$SLURM_SUBMIT_DIR"

srun /bin/hostname
module load plgrid/apps/adf/2014.07 
adf input.adf
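
On success, sbatch prints the assigned job ID in the form "Submitted batch job <jobid>". The job can then be monitored or cancelled with standard SLURM commands:

squeue -u $USER    # list your pending and running jobs
scancel <jobid>    # cancel a job by its ID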

Interactive Mode

To start an interactive job with a shell, you can use a command like the following:

srun -p plgrid-testing -N 1 --ntasks-per-node=1 -n 1 -A <grant_id-suffix> --pty /bin/bash -l

The above command starts an interactive shell in the plgrid-testing partition on 1 node with 1 core allocated.
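
An analogous interactive session on a GPU partition would additionally request a GPU via --gres (a sketch, assuming a grant with GPU resources; adjust the partition and account suffix to the cluster in use):

srun -p plgrid-gpu-a100 -N 1 -n 1 --gres=gpu:1 -A <grant_id>-gpu-a100 --pty /bin/bash -l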


Last update: September 10, 2024