<!-- keep this as a security measure:
#uncomment if the subject should only be modifiable by the listed groups
# * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.CMSAdminGroup
# * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.CMSAdminGroup
#uncomment this if you want the page only be viewable by the listed groups
# * Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.CMSAdminGroup,Main.CMSAdminReaderGroup
-->

---+ Slurm Batch System Usage

This is an introduction to the test configuration of Slurm, a modern job scheduler for Linux clusters, at T3.

Currently t3ui07 is the single login node for Slurm. Like any User Interface node, it should be used mostly for development and small, quick tests. For intensive computational work one should use the Compute Nodes. There are two types of Compute Nodes: Worker Nodes for CPU usage and GPU machines. All new hardware is equipped with 256 GB of RAM and a 10GbE network:

| *Compute Node* | *Processor Type* | *Computing Resources: Cores/GPUs* |
| t3ui07 (login node) | Intel Xeon Gold 6148 (2.40GHz) | 80 Cores |
| t3gpu0[1-2] | Intel Xeon E5-2630 v4 (2.20GHz) | 8 * !GeForce GTX 1080 Ti |
| t3wn60 | Intel Xeon Gold 6148 (2.40GHz) | 80 Cores |
| t3wn48 | AMD Opteron 6272 (2.1GHz) | 32 Cores |

Access to the Compute Nodes is controlled by Slurm. Matching these computing resources, two partitions (similar to SGE queues) are implemented: *wn* and *gpu*.

Here are a few useful commands to start working with Slurm:
<pre>
sinfo    # view information about Slurm nodes and partitions
sbatch   # submit a batch script
squeue   # view information about jobs in the scheduling queue
scancel  # abort a job
</pre>

To submit a job to the *wn* partition, issue: =sbatch -p wn job.sh=

All job options can also be collected in the shell script itself as directives starting with the =#SBATCH= string, as in the following examples (a complete submit/monitor/cancel walk-through is sketched at the end of this topic).

*GPU Example*:
<pre>
#!/bin/bash
#
#SBATCH --job-name=test_job
#SBATCH --account=gpu_gres    # to access gpu resources
#SBATCH --partition=gpu
#SBATCH --nodes=1             # request to run the job on a single node
##SBATCH --ntasks=10          # request 10 CPUs (t3gpu01/02 balance between CPU and GPU: 5 CPUs / 1 GPU)
#SBATCH --gres=gpu:2          # request two GPUs on the machine; this is the total number of GPUs for the job
##SBATCH --mem=4000M          # memory (per node)
#SBATCH --time=0-00:30        # time in format DD-HH:MM

# Slurm reserves two GPUs (as requested above); their IDs are recorded
# in the shell variable CUDA_VISIBLE_DEVICES
echo CUDA_VISIBLE_DEVICES : $CUDA_VISIBLE_DEVICES

# the python program script.py should use the CUDA_VISIBLE_DEVICES variable
# (*NOT* hardcoded GPU numbers)
python script.py
</pre>

*CPU Example*:
<pre>
#!/bin/bash
#
#SBATCH -p wn
#SBATCH --time 01:00:00
#SBATCH -w t3wn60        # run on this specific worker node
#SBATCH -e cn-test.err
#SBATCH -o cn-test.out   # replaces the default slurm-SLURM_JOB_ID.out

echo HOME: $HOME
echo USER: $USER
echo SLURM_JOB_ID: $SLURM_JOB_ID
echo HOSTNAME: $HOSTNAME

# each worker node has local /scratch space to be used during the job run
mkdir -p /scratch/$USER/${SLURM_JOB_ID}
sleep 10                 # here comes the computation
rmdir /scratch/$USER/${SLURM_JOB_ID}

date
</pre>

To start using Slurm, please ask the T3 administrators [cms-tier3@lists.psi.ch] to add your user_id to the Slurm accounts.

-- Main.NinaLoktionova - 2019-05-08
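---++ Worked example: submit, monitor, cancel

As referenced above, here is a minimal sketch of the full job cycle using the commands introduced earlier. The script contents and time limit are placeholders only; =--parsable=, which makes =sbatch= print just the job id, is assumed to be available in the installed Slurm release.
<pre>
# create a trivial batch script for the wn partition (contents are illustrative)
cat > job.sh <<'EOF'
#!/bin/bash
#SBATCH -p wn
#SBATCH --time 00:10:00
echo "running on $HOSTNAME as $USER"
EOF

jobid=$(sbatch --parsable job.sh)   # submit; --parsable prints only the job id
squeue -j $jobid                    # check the state of this job in the queue
squeue -u $USER                     # or list all of your jobs
# scancel $jobid                    # abort the job if something went wrong
</pre>
Unless redirected with =-o=/=-e= as in the CPU example, the job's output lands in the default =slurm-SLURM_JOB_ID.out= file in the submission directory.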