OS | UI Hostname | Users group | Notes |
---|---|---|---|
SL6 | t3ui01 | PSI | 132 GB RAM, 72 cores, 4 TB /scratch (RAID1+0) |
SL6 | t3ui02 | ETHZ | 132 GB RAM, 72 cores, 4 TB /scratch (RAID1+0) |
SL6 | t3ui03 | UNIZ | 132 GB RAM, 72 cores, 4 TB /scratch (RAID1+0) |
### `/shome` policies

Each `/shome/$USER` is a ZFS filesystem (it's not a simple dir) featuring snapshots under `/shome/$USER/.zfs/snapshot`; to recover a file, or a whole dir, simply use the `cp` command; no interaction with the T3 admins will be needed!

Keep in mind that the snapshots in `/shome/$USER/.zfs/snapshot` count against the `/shome/$USER` quota: once the quota is reached, deleting a file and then trying to download the same file again will immediately fail, reporting out of space, because the deleted data is still held by the snapshots. If a T3 user runs out of space, then only the T3 admins will be able to recover space, by serially deleting his/her oldest snapshots.

You can check your `/shome/$USER` usage via this URL:
```
$ lynx --dump --width=800 http://t3mon.psi.ch/PSIT3-custom/space.report | egrep "NAME|$USER"
NAME                       QUOTA  AVAIL  RESERV  USED   USEDDS  USEDSNAP  SSCOUNT  RATIO  CREATION
data01/shome/martinelli_f  800G   796G   10G     4.22G  4.22G   3.53M     46       1.25x  Mon Dec 7 18:49 2015
```
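Recovering from a snapshot is just a copy. A minimal sketch of the pattern, simulated here with a temporary directory standing in for `/shome/$USER` (the snapshot name `daily-2015-12-07` and the file names are illustrative):

```shell
# Stand-in for /shome/$USER; on the T3 the snapshots live under
# /shome/$USER/.zfs/snapshot/<snapshot-name> and can be listed with ls.
SHOME=$(mktemp -d)
SNAP="$SHOME/.zfs/snapshot/daily-2015-12-07"    # illustrative snapshot name
mkdir -p "$SNAP/mydir"
echo "old content" > "$SNAP/mydir/cuts.txt"     # data preserved by the snapshot

# The actual recovery is a plain cp -- no admin intervention needed:
cp    "$SNAP/mydir/cuts.txt" "$SHOME/cuts.txt"  # recover a single file
cp -r "$SNAP/mydir"          "$SHOME/mydir"     # recover a whole dir
cat "$SHOME/cuts.txt"                           # prints "old content"
```

On the cluster itself, replace the `mktemp` stand-in with your real `/shome/$USER` path and pick a snapshot name from `ls /shome/$USER/.zfs/snapshot/`.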
Queue run-time limits:

* `short.q` : 90 min
* `all.q` : 10 h (this is the default queue used by a `qsub` command)
* `long.q` : 96 h

Queue slot limits:

* `short.q` : can run on all the 1040 available job slots.
* `all.q` and `long.q` together: max 740 job slots.
* `long.q` : max 360 job slots.

Per-user job limits:

* `short.q` : max 460 jobs.
* `all.q` : max 400 jobs.
* `long.q` : max 340 jobs.
By default a job can use up to 3 GB of RAM on a `t3wn` server, and if the job uses more than 3 GB it will be killed; read about the `h_vmem` `qsub` option `-l h_vmem=nG`, with n <= 6 (6 GByte). The more RAM you request, the fewer jobs will run on a `t3wn` server, so check whether you really need that much RAM (in all the CMS worldwide Grid centres a max of 2 GB of RAM is tolerated!). By running `qstat -j JOBID` you will see the `h_vmem` RAM value that was requested at submission time, either by default or by you.
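As a sketch, a hypothetical job script (its name, queue choice and values are illustrative, not an official site template) that requests `long.q` and 4 GB of `h_vmem` via embedded `qsub` directives; the `#$` lines are read by `qsub` as submission options and ignored by bash itself:

```shell
#!/bin/bash
# myjob.sh -- hypothetical SGE job script
#$ -q long.q          # request the 96 h queue (the default would be all.q, 10 h)
#$ -l h_vmem=4G       # request 4 GB of RAM (here n <= 6)
#$ -o myjob.out       # file for the job's stdout
#$ -e myjob.err       # file for the job's stderr

MSG="running on $(hostname) in $PWD"
echo "$MSG"
```

Submitted with `qsub myjob.sh`; `qstat -j JOBID` will then report the `h_vmem` value requested here.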
`qquota` reports the batch system quota usage, either for a single user or for all users; the batch system policies are published on each `t3ui1*` server in `/gridware/sge_ce/tier3-policies/`; for instance, during the day they are:
```
$ grep -A 100000 -B 10000 --color TRUE /gridware/sge_ce/tier3-policies/day
{
   name         max_jobs_per_sun_host
   description  Allow maximally 8 jobs per bl6270 host
   enabled      TRUE
   limit        queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q hosts {@bl6270} to slots=8
}
{
   name         max_jobs_per_intel_host
   description  Allow maximally 16 jobs per intel host
   enabled      TRUE
   limit        queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q hosts {@wnintel} to slots=16
}
{
   name         max_jobs_per_intel2_host
   description  Allow maximally 64 jobs per intel2 host
   enabled      TRUE
   limit        queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q hosts {@wnintel2} to slots=64
}
{
   name         max_jobs_per_supermicro_host
   description  Allow maximally 32 jobs per supermicro host
   enabled      TRUE
   limit        queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q hosts {@wnsupermicro} to slots=32
}
{
   name         max_jobs_per_t3vm03
   description  NONE
   enabled      FALSE
   limit        queues all.q,short.q hosts t3vm03.psi.ch to slots=2
}
{
   name         test-rqs-admin2
   description  limit maximal number of jobs of a user in the admin queue
   enabled      FALSE
   limit        users {*} queues all.q.admin to slots=40
}
{
   name         test-rqs-admin
   description  limit admin queue to 30 slots total
   enabled      FALSE
   limit        queues all.q.admin to slots=30
}
{
   name         max_allq_jobs
   description  limit all.q and long.q to a maximal number of common slots
   enabled      TRUE
   limit        queues all.q,long.q to slots=740
}
{
   name         max_longq_jobs
   description  limit long.q to a maximal number of slots
   enabled      TRUE
   limit        queues long.q to slots=360
}
{
   name         max_sherpagen_jobs
   description  limit sherpa.gen.q to a maximal number of slots
   enabled      TRUE
   limit        queues sherpa.gen.q to slots=50
}
{
   name         max_sherpaintlong_jobs
   description  limit sherpa.int.long.q to a maximal number of slots
   enabled      TRUE
   limit        queues sherpa.int.long.q to slots=32
}
{
   name         max_sherpaintvlong_jobs
   description  limit sherpa.int.vlong.q to a maximal number of slots
   enabled      TRUE
   limit        queues sherpa.int.vlong.q to slots=32
}
{
   name         max_user_jobs_per_queue
   description  Limit a user to a maximal number of concurrent jobs in each queue
   enabled      TRUE
   limit        users {*} queues all.q to slots=400
   limit        users {*} queues short.q to slots=460
   limit        users {*} queues long.q to slots=340
   limit        users {*} queues sherpa.gen.q to slots=32
   limit        users {*} queues sherpa.int.long.q to slots=32
   limit        users {*} queues sherpa.int.vlong.q to slots=32
}
{
   name         max_jobs_per_user
   description  Limit the total number of concurrent jobs a user can run on the cluster
   enabled      TRUE
   limit        users {*} queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q to slots=500
}
```

All the current quotas:
```
$ qquota -u \*
resource quota rule            limit          filter
--------------------------------------------------------------------------------
max_jobs_per_sun_host/1        slots=2/8      queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q hosts t3wn17
max_jobs_per_sun_host/1        slots=1/8      queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q hosts t3wn25
max_jobs_per_sun_host/1        slots=1/8      queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q hosts t3wn15
max_jobs_per_intel_host/1      slots=11/16    queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q hosts t3wn39
max_jobs_per_intel_host/1      slots=13/16    queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q hosts t3wn37
max_jobs_per_intel_host/1      slots=12/16    queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q hosts t3wn30
max_jobs_per_intel_host/1      slots=13/16    queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q hosts t3wn38
max_jobs_per_intel_host/1      slots=12/16    queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q hosts t3wn40
max_jobs_per_intel_host/1      slots=13/16    queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q hosts t3wn34
max_jobs_per_intel_host/1      slots=12/16    queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q hosts t3wn36
max_jobs_per_intel_host/1      slots=12/16    queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q hosts t3wn33
max_jobs_per_intel_host/1      slots=12/16    queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q hosts t3wn35
max_jobs_per_intel_host/1      slots=12/16    queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q hosts t3wn32
max_jobs_per_intel_host/1      slots=12/16    queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q hosts t3wn31
max_jobs_per_intel2_host/1     slots=25/64    queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q hosts t3wn52
max_jobs_per_intel2_host/1     slots=25/64    queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q hosts t3wn51
max_jobs_per_intel2_host/1     slots=24/64    queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q hosts t3wn55
max_jobs_per_intel2_host/1     slots=25/64    queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q hosts t3wn54
max_jobs_per_intel2_host/1     slots=24/64    queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q hosts t3wn53
max_jobs_per_intel2_host/1     slots=24/64    queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q hosts t3wn56
max_jobs_per_intel2_host/1     slots=24/64    queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q hosts t3wn57
max_jobs_per_intel2_host/1     slots=24/64    queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q hosts t3wn58
max_jobs_per_intel2_host/1     slots=10/64    queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q hosts t3wn59
max_jobs_per_supermicro_host/1 slots=1/32     queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q hosts t3wn50
max_allq_jobs/1                slots=344/740  queues all.q,long.q
max_longq_jobs/1               slots=340/360  queues long.q
max_user_jobs_per_queue/1      slots=4/400    users ggiannin queues all.q
max_user_jobs_per_queue/3      slots=340/340  users wiederkehr_s queues long.q
max_jobs_per_user/1            slots=340/500  users wiederkehr_s queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q
max_jobs_per_user/1            slots=4/500    users ggiannin queues all.q,short.q,long.q,sherpa.gen.q,sherpa.int.long.q.sherpa.int.vlong.q
```
### `/tmp` and `/scratch` usage

Occasionally the `/tmp` or the `/scratch` partitions of the UIs and WNs fill up, either because some user filled them with big and later forgotten files/dirs, or simply because a job went crazy. There was a clear user requirement to manage this space themselves, so there is no automatic cleaning. Please clean up in a timely manner, and remember that `/scratch` is the least protected area: it is not meant to hold important data for a long time. If you discover an abuse, write to or call your colleague and invite him/her to clean up; otherwise your peers' work could be blocked. Administrators can help when group members are still interested in keeping an outdated user's files in `/scratch`; in that case the ownership has to be explicitly changed to a new responsible person.
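Before contacting a colleague, it helps to know who owns the space hogs. A minimal sketch (the 1 GB / 30-day thresholds are arbitrary, and `/scratch` is node-local, so run it on each UI/WN of interest):

```shell
# Print size (bytes), owner and path of files over 1 GB not touched for
# 30+ days under the given dir (default /scratch), biggest first.
# Uses GNU find's -printf, available on the SL6 nodes.
scan_scratch() {
    local dir="${1:-/scratch}"
    find "$dir" -xdev -type f -size +1G -mtime +30 \
         -printf '%s\t%u\t%p\n' 2>/dev/null | sort -rn
}

scan_scratch /scratch
```

The `-xdev` flag keeps `find` from wandering into other mounted filesystems, and the owner column (`%u`) tells you whom to contact.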