<!-- keep this as a security measure:
#uncomment if the subject should only be modifiable by the listed groups
#   * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.CMSAdminGroup
#   * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.CMSAdminGroup
#uncomment this if you want the page only be viewable by the listed groups
#   * Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.CMSAdminGroup,Main.CMSAdminReaderGroup
-->

%TOC%

---+ Useful Slurm commands

---++ Overview

|*command*|*description*|
|sinfo |show node and partition information; run sinfo --help for more options|
|sinfo -o "%C %P" |CPU counts per partition in the form allocated/idle/other/total|
|squeue |view information about jobs in the scheduling queue|
|scontrol show jobid !JobID |job status|
|scontrol show jobid -dd !JobID |more detailed job information, helpful for job troubleshooting|
|sstat -j !JobID |resource usage information for the running job !JobID|
|scancel !JobID |abort job !JobID|
|scancel -n !JobName |delete all jobs with job name !JobName|
|sprio -l |job priorities (long format)|
|sshare -a |share information about all users|
|sacct -j !JobID -o 'JobID,state,MaxVMSize,MaxRSS,Elapsed' |information on completed jobs (or specific job !JobID)|
|sacct --helpformat |list the format fields available for sacct|
|sacctmgr show user -s |user account information|
|sreport -tminper cluster utilization --tres="cpu,gres/gpu" start=2019-12-01 |check utilisation of resources|

---++ How to check your past and current jobs' memory requirements

To choose sensible memory requests, you need to understand how much memory your jobs actually use. The critical metric is the job's maximal *resident set size* (MaxRSS), i.e. the maximal amount of physical RAM that the job occupies on the node. This is the value on which to base Slurm request flags like =--mem-per-cpu=. You can use =sacct= in a line like the following to inspect your past and current jobs:

<pre>
sacct --format="JobID%16,User%12,State%16,partition,time,elapsed,ReqMem,MaxRss,MaxVMSize,ncpus,nnodes,reqcpus,reqnode,Start,End,NodeList"
</pre>

To see jobs that started before today, add a start time such as =-S 2021-05-25=. You can also list specific jobs by passing the job ID with the =-j= flag:

<pre>
sacct --format="JobID%16,User%12,State%16,partition,time,elapsed,ReqMem,MaxRss,MaxVMSize,ncpus,nnodes,reqcpus,reqnode,Start,End,NodeList" -j $YOUR_JOB_ID
</pre>

The total memory consumed by your job may be larger than its MaxRSS. This does not matter as long as the excess sits in virtual memory that has been staged out to disk and is rarely accessed. The situation changes if the staged-out memory must be read back continually, a condition known as *swapping*. The node is then so busy moving your job's memory between RAM and disk that it does almost no work for you in "user space" and spends most of its time in "kernel space". In tools like =top=, such jobs usually appear in the *D* state.
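
The MaxRSS values reported by the =sacct= commands above can be reduced to a single number per job. The following is only a minimal sketch: the function names =to_mib= and =peak_rss_mib= are hypothetical helpers, not part of Slurm, and the code assumes that =sacct= prints MaxRSS with a trailing K, M, G or T unit (a bare number is treated as bytes).

```shell
# Convert a sacct memory value such as 1500K, 512M or 2.5G into MiB,
# rounded up. A value without a unit letter is assumed to be bytes.
to_mib() {
    awk -v v="$1" 'BEGIN {
        n = v + 0                       # numeric part of the value
        u = substr(v, length(v), 1)     # last character: unit letter, if any
        if      (u == "K") m = n / 1024
        else if (u == "M") m = n
        else if (u == "G") m = n * 1024
        else if (u == "T") m = n * 1024 * 1024
        else               m = n / (1024 * 1024)
        r = int(m)
        if (r < m) r++                  # round up to the next whole MiB
        print r
    }'
}

# Read "JobID|MaxRSS" lines on stdin (one per job step) and print the
# largest MaxRSS found, in MiB. Steps without a MaxRSS value are skipped.
peak_rss_mib() {
    peak=0
    while IFS='|' read -r jobid rss; do
        if [ -n "$rss" ]; then
            mib=$(to_mib "$rss")
            if [ "$mib" -gt "$peak" ]; then
                peak=$mib
            fi
        fi
    done
    echo "$peak"
}
```

A possible use is =sacct -j $YOUR_JOB_ID --noheader --parsable2 --format=JobID,MaxRSS | peak_rss_mib=, which prints the peak resident memory over all steps of the job in MiB. How much headroom to add on top of that value when setting =--mem-per-cpu= is a judgment call that depends on your jobs.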
Topic revision: r3 - 2021-06-01 - DerekFeichtinger
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.