<!-- keep this as a security measure:
   * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.LCGAdminGroup
   * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.LCGAdminGroup
#uncomment this if you want the page only be viewable by the internal people
#* Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.LCGAdminGroup,Main.ChippComputingBoardGroup
-->

---+ Fair-share Meeting on 2018-11-13

   * *Date and time*: 13 November 2018, 14:00-15:00 (UTC+01:00) Belgrade, Bratislava, Budapest, Ljubljana, Prague
   * *Place*: CSCS Meeting Room, 1st Floor (F1)
   * *External link*:
      * Web portal address: [[https://vcmeeting.ethz.ch/][https://vcmeeting.ethz.ch]]
      * SCOPIA meeting ID: 6708365
      * SCOPIA via phone: +41 43 244 89 30 | 6708365#

---++ Agenda

   * *Fair-share problem introduction (ATLAS)*
      * Share issue first flagged on Piz Daint during the LHConCray commissioning project (one year ago):
         * <u>Not enough job pressure from CMS</u>
         * <u>Relative shares between ATLAS and LHCb skewed in favour of LHCb</u>
      * Raised again at the f2f meeting on 21 June in ZH
      * In September we realised that the issue has shown up on Phoenix too, since ~May 2018
         * Hard to keep track of, since the monitoring dashboards cannot be accessed
      * Did some investigations with Dino and discussed further f2f with Pablo & Dino again
   * *What is the fair-share problem?*
      * <u>ATLAS multi-core (MC) jobs wait too long in the queue compared to single-core (SC) jobs</u>
      * ATLAS: ~80% MC, ~20% SC
         * 1 job = 1 payload
         * internal fair-share done at the factory level, passed to the sites as an ARC job option => lowers priority
         * walltime request passed to the sites as an ARC job option (tuned to the payload to be executed)
      * CMS: 100% MC
         * 8-core (configurable) pilots sent to the sites
         * internal fair-share done at the factory level; 8-core pilots pull multiple MC and SC payloads
         * walltime request configured at the factory level (arbitrary number)
      * LHCb: 100% SC
         * 1-core pilots sent to the sites
         * internal fair-share done at the factory level; 1-core pilots pull SC payloads
         * walltime request configured at CSCS. NOTE: this can be done at the factory level (arbitrary number)
   * *Why does that happen?*
      * Common problem at large shared sites: with mixed SC vs MC scheduling, node fragmentation and backfill favour SC jobs
      * SC slots are held for a long time due to the long configured walltime
      * SLURM is _not_ an HTC scheduler; under the conditions shown above it is hard to judge whether it makes the right scheduling decisions according to its target settings
      * Factors that have an impact:
         * SC vs MC imbalance (per user)
         * cputime imbalance (per user)
         * backfill (although this should favour shorter jobs, these are MC jobs)
         * ATLAS job nice-ing (currently turned off on Daint, but we _need_ it)
         * number of queued jobs (per user): is this balanced?
   * *Impact on ATLAS*
      * relative shares between experiments skewed to ATLAS's disadvantage
      * CPU delivery for ATLAS is very bumpy [1] [2]
      * jobs often wait too long and/or are cancelled by the experiment and redirected somewhere else
      * this harms several workflows, specifically those that have higher (internal) priority
      * if we host data, we should have an adequate amount of resources available at any time for processing (~40% of the total as a baseline)
      * we need to turn internal fair-share between the workloads back on
   * *Options / proposals*
      * Sites have in general invested large efforts in the past and cooked their own recipes (but I know of no shared site using SLURM)
      * Option 1:
         * Track and fix the fair share. For such an effort to be optimised, we need access to the relevant debugging dashboards
         * Might be a labour-intensive task
         * Needs changes to the current shared model; very likely compromises between job length and MC vs SC balance
         * Might not satisfy every experiment's requirements (e.g., long jobs, job nice-ing, etc.)
         * Suggestion: pack the nodes with single-core jobs first, rather than distributing them across the nodes
         * ...
      * Option 2:
         * Split resources according to the fair-share quotas and allow each experiment to submit to the other partitions on a pre-emptable basis. NOTE: <u>pre-emptable means the job is KILLED, not checkpointed</u>
         * Each experiment has its own quota, and we delegate to each of them the claiming of any resource not used by another experiment
         * Each experiment can shape their jobs as they wish
         * ...
      * Option 3:
         * ...
   * *CSCS view*
   * *Experiment views*
      * CMS
      * LHCb
   * *Next step(s)*
   * *AOB*

[1] http://dashb-atlas-job.cern.ch/dashboard/request.py/resourceutilization_individual?sites=CSCS-LCG2&sitesCat=All%20Countries&resourcetype=All&sitesSort=2&sitesCatSort=0&start=2018-01-01&end=2018-10-31&timeRange=daily&granularity=Daily&generic=0&sortBy=20&diag1=0&diag2=0&diag3=0&diag4=0&diag5=0&diag6=0&diag7=0&diag8=0&diagT=0&diag8pl=0&series=All&type=a

[2] http://dashb-atlas-job.cern.ch/dashboard/request.py/resourceutilization_individual?sites=CSCS-LCG2&sites=UNIBE-LHEP&sitesCat=All%20Countries&resourcetype=All&sitesSort=2&sitesCatSort=0&start=2018-01-01&end=2018-10-31&timeRange=daily&granularity=Daily&generic=0&sortBy=0&diag1=0&diag2=0&diag3=0&diag4=0&diag5=0&diag6=0&diag7=0&diag8=0&diagT=0&diag8pl=0&series=All&type=a

---++ Attendants

   * Roland
   * Christoph
   * Gianfranco
   * Thomas
   * Stefano
   * Nicholas
   * Dino
   * Gianni
   * Miguel

---++ Minutes

   * item

---++ Action items

   * item
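The Option 1 suggestion to pack single-core jobs onto as few nodes as possible (instead of letting them fragment every node) maps onto existing SLURM knobs. A minimal sketch, assuming the consumable-resources selector; this is not the CSCS production configuration:

```
# Hypothetical slurm.conf fragment (illustrative, not the CSCS config):
# keep serial (single-core) jobs packed at one end of the node range,
# so whole nodes stay drained of SC jobs and free for MC allocations.
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory
SchedulerType=sched/backfill
SchedulerParameters=pack_serial_at_end,bf_continue
```

Whether this helps in practice depends on the walltime distribution: a packed node still cannot be handed to an 8-core pilot until its longest SC job finishes.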
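Option 2 (hard quotas plus pre-emptable scavenging of the other experiments' idle nodes) can be expressed with SLURM's partition-based pre-emption. A sketch under assumed partition names and node ranges, again hypothetical rather than the real CSCS layout; note `PreemptMode=CANCEL` matches the agenda's warning that pre-empted jobs are killed, not checkpointed:

```
# Hypothetical slurm.conf fragment for Option 2 (illustrative names):
PreemptType=preempt/partition_prio
PreemptMode=CANCEL    # pre-empted jobs are killed, not checkpointed

# Each experiment owns a high-priority partition sized to its quota.
PartitionName=atlas PriorityTier=2 Nodes=nid[0001-0040]
PartitionName=cms   PriorityTier=2 Nodes=nid[0041-0080]
PartitionName=lhcb  PriorityTier=2 Nodes=nid[0081-0100]

# A low-priority overlay partition spans all nodes; jobs submitted
# here scavenge idle resources and are cancelled when the owning
# experiment's partition needs the nodes back.
PartitionName=scavenger PriorityTier=1 Nodes=nid[0001-0100]
```

This shifts the fair-share problem to the experiments themselves: each shapes its own jobs inside its quota, at the cost of lost work whenever a scavenger job is cancelled.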
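The fair-share discussion above revolves around how SLURM's multifactor priority plugin weighs recent usage against configured shares. As a rough illustration, the classic fair-share formula documented for SLURM is F = 2^(-usage/share): an account that has used nothing gets F = 1, one whose usage exactly matches its share gets F = 0.5, and over-users decay towards 0. The share and usage numbers below are made up for illustration and are not the real CSCS quotas:

```python
import math  # not strictly needed; exponentiation uses the ** operator

def fairshare_factor(norm_shares: float, effective_usage: float) -> float:
    """Classic SLURM fair-share factor: F = 2**(-usage/shares).

    norm_shares:     account's share of the machine, normalised to [0, 1]
    effective_usage: account's decayed recent usage, normalised to [0, 1]
    """
    return 2.0 ** (-effective_usage / norm_shares)

# Illustrative (hypothetical) shares and recent-usage fractions
# for the three experiments -- NOT the real CSCS configuration.
accounts = {
    "atlas": (0.40, 0.25),   # under its share -> factor rises above 0.5
    "cms":   (0.40, 0.45),   # over its share  -> factor falls below 0.5
    "lhcb":  (0.20, 0.30),
}

for name, (share, usage) in accounts.items():
    print(f"{name}: fair-share factor = {fairshare_factor(share, usage):.3f}")
```

This factor is only one term in the final job priority; the imbalances listed above (SC vs MC mix, walltime, backfill) act on top of it, which is why a "correct" fair-share factor alone does not guarantee balanced CPU delivery.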
---++ Attachments

| *Attachment* | *History* | *Size* | *Date* | *Who* | *Comment* |
| CHIPP_Job_Analysis.pdf | r1 | 8426.7 K | 2018-11-13 - 15:08 | NickCardo | |
Topic revision: r4 - 2018-11-13 - GianfrancoSciacca