Slurm Batch system usage
This is an introduction to the test configuration of Slurm - a modern job scheduler for Linux clusters - at T3.
Currently t3ui07 is the single login node for Slurm. Like any User Interface node, it should be used mostly for development and small, quick tests.
For intensive computational work one should use the Compute Nodes.
There are two types of Compute Nodes - Worker Nodes for CPU usage and GPU machines. All new hardware is equipped with 256 GB of RAM and a 10 GbE network:
| Compute Node | Processor Type | Computing Resources: Cores/GPUs |
| t3gpu0[1-2] | Intel Xeon E5-2630 v4 (2.20GHz) | 8 * GeForce GTX 1080 Ti |
| t3ui07 - login node | Intel Xeon Gold 6148 (2.40GHz) | 80 Cores |
| t3wn38 | Intel Xeon E5-2670 (2.6 GHz) | 16 Cores |
| t3wn48 | AMD Opteron 6272 (2.1GHz) | 32 Cores |
| t3wn58 | Intel Xeon E5-2698 (2.30GHz) | 64 Cores |
| t3wn60 | Intel Xeon Gold 6148 (2.40GHz) | 80 Cores |
Access to the Compute Nodes is controlled by Slurm.
Corresponding to these computing resources, two partitions (similar to SGE queues) are implemented: wn and gpu.
Here are a few useful commands to start working with Slurm:
sinfo # view information about Slurm nodes and partitions
sbatch # submit a batch script
squeue # view information about jobs in the scheduling queue
scancel # cancel a job
To submit a job to the wn partition, issue:
sbatch -p wn job.sh
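On success, sbatch prints the new job's ID ("Submitted batch job <id>"), which the other commands accept. A small sketch of capturing it for later monitoring or cancellation (the awk parsing assumes that standard message format):

```shell
# submit and capture the job ID from sbatch's "Submitted batch job <id>" message
JOBID=$(sbatch -p wn job.sh | awk '{print $4}')

# monitor just this job, then cancel it if needed
squeue -j "$JOBID"
scancel "$JOBID"
```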
One can create a shell script with the full set of directives, each starting with the
#SBATCH
string, as in the following examples.
GPU Example:
#!/bin/bash
#
#SBATCH --job-name=test_job
#SBATCH --account=gpu_gres # to access gpu resources
#SBATCH --partition=gpu
#SBATCH --nodes=1 # request to run job on single node
##SBATCH --ntasks=10 # request 10 CPUs (t3gpu01/02: balance between CPU and GPU: 5 CPUs per GPU)
#SBATCH --gres=gpu:2 # request two GPUs on the machine; this is the total number of GPUs for the job
##SBATCH --mem=4000M # memory (per node)
#SBATCH --time=0-00:30 # time in format DD-HH:MM
# Slurm reserves two GPUs (per the request above); their IDs are recorded in the shell variable CUDA_VISIBLE_DEVICES
echo CUDA_VISIBLE_DEVICES : $CUDA_VISIBLE_DEVICES
# the python program script.py should use the CUDA_VISIBLE_DEVICES variable (*NOT* hardcoded GPU numbers)
python script.py
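Since CUDA_VISIBLE_DEVICES holds a comma-separated list of device IDs, the script body can derive the GPU count instead of hardcoding it. A minimal sketch (the --num-gpus option of script.py is hypothetical, standing in for however your program takes a device count):

```shell
# CUDA_VISIBLE_DEVICES is a comma-separated list such as "0,3";
# count its entries to learn how many GPUs Slurm granted the job
NGPU=$(echo "$CUDA_VISIBLE_DEVICES" | awk -F',' '{print NF}')
echo "Job was granted $NGPU GPU(s): $CUDA_VISIBLE_DEVICES"

# pass the count to the program rather than hardcoding device numbers
# (--num-gpus is a hypothetical option of script.py)
python script.py --num-gpus "$NGPU"
```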
CPU Example:
#!/bin/bash
#
#SBATCH -p wn
#SBATCH --time 01:00:00
#SBATCH -w t3wn60
#SBATCH -e cn-test.err
#SBATCH -o cn-test.out # replaces the default slurm-<SLURM_JOB_ID>.out
echo HOME: $HOME
echo USER: $USER
echo SLURM_JOB_ID: $SLURM_JOB_ID
echo HOSTNAME: $HOSTNAME
# each worker node has local /scratch space to be used during job run
mkdir -p /scratch/$USER/${SLURM_JOB_ID}
sleep 10
# here comes a computation
rmdir /scratch/$USER/${SLURM_JOB_ID}
date
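If the computation aborts, the rmdir in the example above never runs and the scratch directory is left behind. A sketch of safer cleanup using a shell trap, under the same /scratch layout as the example:

```shell
#!/bin/bash
# create a per-job scratch directory, as in the example above
SCRATCHDIR=/scratch/$USER/${SLURM_JOB_ID}
mkdir -p "$SCRATCHDIR"

# the trap removes the directory when the script exits,
# whether the computation succeeds or fails
trap 'rm -rf "$SCRATCHDIR"' EXIT

cd "$SCRATCHDIR"
# ... computation writing temporary files here ...
```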
CPU Example for using multiple processors (threads) on a single physical computer:
#!/bin/bash
#SBATCH --job-name=smp_illustration # Job name
#SBATCH -p wn
#SBATCH -w t3wn58 # choose particular Compute Node from wn partition
#SBATCH --ntasks=1 # Run a single task
#SBATCH --cpus-per-task=2 # Number of CPU cores per task
#SBATCH --mem=3gb # Job memory request
##SBATCH --mem-per-cpu=3072 # example of memory request for one CPU core
#SBATCH --time=00:05:00 # Time limit hrs:min:sec
#SBATCH --output=smp_%j.log # Standard output and error log
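Inside such a job, Slurm exports SLURM_CPUS_PER_TASK to match the --cpus-per-task request, so the script body can forward it to threaded programs instead of hardcoding a thread count. A short sketch (OMP_NUM_THREADS is the conventional setting honoured by OpenMP-based programs):

```shell
# SLURM_CPUS_PER_TASK mirrors --cpus-per-task; default to 1 if it is unset
NTHREADS=${SLURM_CPUS_PER_TASK:-1}

# many threaded programs honour OMP_NUM_THREADS
export OMP_NUM_THREADS=$NTHREADS
echo "Running with $NTHREADS thread(s)"
```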
To start using Slurm, please ask the T3 Administrators [cms-tier3@lists.psi.ch] to add your user_id to the Slurm accounts.
--
NinaLoktionova - 2019-05-08