---+ Slurm Batch system usage

---++ Simple job submission and the concept of queues (partitions)

On the Tier-3 you run jobs by submitting them to the Slurm job scheduler. The cluster's main computational resources, the worker nodes, are only accessible through this batch system. The login nodes t3ui01-03 (User Interface or UI nodes) provide an environment to compose, test, and submit batch jobs to Slurm.

We provide the following *partitions* (often also called *batch queues*) to which you can submit jobs:

| *PARTITION* | *NODES(A/I/O/T)* | *CPUS(A/I/O/T)* | *TIMELIMIT* | *DEFAULTTIME* | *GRES* | *NODELIST* |
| short | 0/32/0/32 | 0/1680/0/1680 | 1:00:00 | 45:00 | (null) | t3wn[30-33,35-36,38-44,46,48-54,56,58-63,70-73] |
| standard* | 0/32/0/32 | 0/1680/0/1680 | 12:00:00 | 12:00:00 | (null) | t3wn[30-33,35-36,38-44,46,48-54,56,58-63,70-73] |
| long | 0/24/0/24 | 0/848/0/848 | 7-00:00:00 | 1-00:00:00 | (null) | t3wn[30-33,35-36,38-44,46,48-54,56,58-59] |
| qgpu | 0/2/0/2 | 0/80/0/80 | 1:00:00 | 30:00 | gpu:8(S:0-1) | t3gpu[01-02] |
| gpu | 0/2/0/2 | 0/80/0/80 | 7-00:00:00 | 1-00:00:00 | gpu:8(S:0-1) | t3gpu[01-02] |

*A/I/O/T* stands for allocated/idle/other/total, so the fourth number is the total number of nodes or CPUs. You can also obtain this list by running the following command on one of the UI nodes:

<pre>
sinfo -o "%.12P %.16F %.16C %.14l %.16L %.12G %N"
</pre>

To launch a batch job, you first prepare a batch script (usually a normal bash script that launches your executable), e.g. =my-script.sh=, and then use the =sbatch= command to submit it to Slurm. Here we submit to the =short= queue and use the account =t3= for normal CPU jobs:

<pre>
sbatch -p short --account=t3 my-script.sh
</pre>

The =sbatch= command supports many additional configuration options (refer to its man page); e.g. you may want to specify the memory requirements of your job in MB:

<pre>
sbatch -p standard --account=t3 --mem=3000 job.py
</pre>

Instead of passing all these options on the command line, you can also put them inside the batch script (usually in the header part), on lines starting with the =#SBATCH= comment, e.g.

<pre>
#!/bin/bash
# This is my batch script
#SBATCH --mem=3000
#SBATCH --account=t3
#SBATCH --time=04:00:00
#SBATCH --partition=standard

# now start our executable
myexecutable
</pre>

---++ Example Job submission scripts

   * [[GPU Example][GPU Example]]
   * [[CPU Example][CPU Example]]
   * [[CPU Example for using multiple processors (threads) on a single physical computer][CPU Example for using multiple processors (threads) on a single physical computer]]

See also these [[SlurmMonitoringCommands][useful commands to check the status of Slurm jobs and nodes]] and the [[https://wiki.chipp.ch/twiki/bin/view/CmsTier3/SlurmUtilisation][T3 Slurm Monitoring]] page.

The detailed Slurm configuration can be examined on any Slurm node by listing the configuration file =/etc/slurm/slurm.conf=.
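Partition limits and other settings can also be queried without opening the file, using the standard =scontrol= command on a UI node. A minimal sketch (the =standard= partition name is taken from the table above; the output reflects whatever the administrators have currently configured):

<pre>
# print the complete running Slurm configuration
scontrol show config

# show the settings of a single partition, e.g. "standard"
scontrol show partition standard
</pre>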
Slurm itself calculates *priorities of jobs* taking into account:
   * *FairShare*: the user's past cluster usage (subject to a decay function, so recent usage counts more)
   * *Age of Job*: the time the job has been waiting in the queue
   * *Job Size*: the size of the resource request (CPUs, memory)

The default memory per job slot (=--mem-per-cpu=) is slightly below 2 GB per CPU, determined by the oldest nodes.
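To see how these factors combine for your own jobs, the standard Slurm tools =sprio= and =sshare= can be used on a UI node. This is only a sketch; the factor weights and the configured default memory are site settings and may change over time:

<pre>
# per-factor priority breakdown (age, fair-share, job size) of pending jobs
sprio -l

# your current fair-share usage and normalized share
sshare -u $USER

# the configured default memory per CPU (in MB)
scontrol show config | grep -i DefMemPerCPU
</pre>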