Running jobs with the SLURM System
Information
- Job submission is done through the SLURM queuing system.
- Ares and Athena share the configuration and usage patterns previously defined for the Prometheus cluster, which you can find here.
- Grants on Ares require appending a suffix to the grant name: -cpu, -cpu-bigmem, or -gpu (see the example below).
- Grants on Athena require appending the suffix -gpu-a100 to the grant name.
- Jobs can be submitted in both batch and interactive mode.
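As a minimal illustration, the suffix is appended to the grant name wherever the account is passed to SLURM. The grant name plgexamplegrant below is hypothetical; substitute your own grant:

## Hypothetical grant "plgexamplegrant"; append the suffix matching the requested resource type
sbatch -A plgexamplegrant-cpu -p plgrid job.sh                  # CPU job on Ares
sbatch -A plgexamplegrant-gpu-a100 -p plgrid-gpu-a100 job.sh    # GPU job on Athena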
For detailed information on how to use the Ares and Athena clusters, please refer to their dedicated documentation pages.
Available Queues for Clusters
- Ares
Name | Timelimit | Resource type (account suffix) | Access requirements | Description |
---|---|---|---|---|
plgrid | 72h | -cpu | Generally available | Standard partition. |
plgrid-testing | 1h | -cpu | Generally available | High priority, testing jobs, limited to 3 running jobs. |
plgrid-now | 12h | -cpu | Generally available | The highest priority, interactive jobs, limited to 1 running or queued job. |
plgrid-long | 168h | -cpu | Requires a grant with a maximum job runtime of 168h | Used for jobs with extended runtime. |
plgrid-bigmem | 72h | -cpu-bigmem | Requires a grant with CPU-BIGMEM resources | Resources used for jobs requiring an extended amount of memory. |
plgrid-gpu-v100 | 48h | -gpu | Requires a grant with GPGPU resources | GPU partition. |
- Athena
Name | Timelimit | Account suffix | Remarks |
---|---|---|---|
plgrid-gpu-a100 | 48h | -gpu-a100 | GPU A100 partition. |
- Tryton
Name | Timelimit |
---|---|
plgrid-testing | 1h |
plgrid | 72h |
plgrid-long | 168h |
- BEM2
Name | Timelimit |
---|---|
plgrid-short | 24h |
plgrid | 72h |
plgrid-long | 168h |
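The partition, time limit, and account suffix of a job must be consistent with the tables above. The directives below are a sketch for a GPU job on Ares, assuming a hypothetical grant plgexamplegrant with GPGPU resources; the GPUs themselves are requested with the standard SLURM --gres option:

## Grant with GPGPU resources (hypothetical name)
#SBATCH -A plgexamplegrant-gpu
## Ares GPU partition
#SBATCH -p plgrid-gpu-v100
## Must not exceed the 48h partition time limit
#SBATCH --time=48:00:00
## Standard SLURM option to request one GPU
#SBATCH --gres=gpu:1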
Batch Mode
The sbatch command is used for submitting batch mode jobs. Usage: sbatch script.sh. All options in the script should be preceded by the #SBATCH directive (including the # symbol). Detailed information can be found in man sbatch and in sbatch --help.
Sample script:
Warning
This is a sample script and you SHOULD NOT run it as-is. Before running it, customize it to your needs by changing, for example, the grant name, partition, and job duration.
#!/bin/bash -l
## Job Name
#SBATCH -J ADFtestjob
## Number of allocated nodes
#SBATCH -N 1
## Number of tasks per node (default is the number of allocated cores per node)
#SBATCH --ntasks-per-node=1
## Amount of memory per CPU core (default is 5GB per core)
#SBATCH --mem-per-cpu=1GB
## Maximum job duration (format HH:MM:SS)
#SBATCH --time=01:00:00
## Grant name for resource usage accounting
#SBATCH -A <grant_id-suffix>
## Partition specification
#SBATCH -p plgrid-testing
## Standard output file
#SBATCH --output="output.out"
## Standard error file
#SBATCH --error="error.err"
## Switching to the directory where the sbatch command was initiated
cd $SLURM_SUBMIT_DIR
srun /bin/hostname
module load plgrid/apps/adf/2014.07
adf input.adf
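Once adapted, the script (saved here under the illustrative name script.sh) is submitted and monitored with standard SLURM commands:

sbatch script.sh             # submit the job; prints the assigned job ID
squeue -u $USER              # list your queued and running jobs
scontrol show job <job_id>   # detailed information about a specific job
scancel <job_id>             # cancel a job if needed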
Interactive Mode
To submit jobs interactively with a shell, you can use the following command as an example:
srun -p plgrid-testing -N 1 --ntasks-per-node=1 -n 1 -A <grant_id-suffix> --pty /bin/bash -l
The above command will run the task in the plgrid-testing partition on 1 node with an allocation of 1 core.
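An interactive session on a GPU partition is requested analogously. The sketch below targets the Athena plgrid-gpu-a100 partition, assuming a hypothetical grant plgexamplegrant and using the standard SLURM --gres option to request a single GPU:

srun -p plgrid-gpu-a100 -N 1 --ntasks-per-node=1 -n 1 --gres=gpu:1 -A plgexamplegrant-gpu-a100 --pty /bin/bash -l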