Commands to Monitor Slurm

command description
sinfo monitor nodes and partitions queue information; check more info options by sinfo --help
sinfo -o "%C %P" report of CPU usage as Idle, Active,... for a partition
squeue view information about jobs in the scheduling queue
scontrol show jobid JobID job status
scontrol show jobid -dd JobID helpful for job troubleshooting
sstat -j JobID information about running jobs (or specific job JobID)
scancel -j JobID abort job JobID
scancel -n JobID delete all jobs with job name JobID
sprio -l priority of your jobs
sshare -a share information about all users
sacct -j --format=JobID,JobName,MaxRSS,Elapsed information on completed jobs (or specific job JobID)
sacct --helpformat format options for sacct
sacctmgr show user -s user account information
sreport -tminper cluster utilization --tres="cpu,gres/gpu" start=2019-12-01 check utilisation of resources


This topic: CmsTier3 > WebHome > SlurmUsage > SlurmMonitoringCommands
Topic revision: r1 - 2020-03-31 - NinaLoktionova
 
This site is powered by the TWiki collaboration platform Powered by Perl This site is powered by the TWiki collaboration platformCopyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback