(Box = CHIPP allocated nodes at CSCS)
IMPLEMENT THE “INTERNAL DYNAMIC ALLOCATION” (easy to implement - but the idle cost goes back to the VOs)
Fair share + optimized priority with reservations
When a VO comes back, it takes a higher priority until it is back at its target, then returns to normal priority
Align the boundaries at the node level (see [1] below in item 3.)
IMPLEMENT ON MON 20, TEST UNTIL 29 JAN (MAINTENANCE).
PRELIMINARY NUMBERS to seize the shared resources:
CMS 50%
ATLAS 50%
LHCb 50%
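As an illustration of the fair-share behaviour noted above (a VO that has been away gets a higher priority until it is back at its target), here is a minimal sketch. The decay formula, the function names and the target/usage numbers are all assumptions made for this example; they are not the CSCS scheduler configuration and not the figures agreed in the meeting.

```python
# Minimal sketch of a fair-share factor. All names and numbers are hypothetical.

def fairshare_factor(target_share, recent_usage_share):
    """Return a factor in (0, 1]: 1.0 for no recent usage, 0.5 at target.

    The exponential form is an assumption chosen for illustration.
    """
    if target_share <= 0:
        raise ValueError("target_share must be positive")
    return 2.0 ** (-recent_usage_share / target_share)


def rank_vos(targets, usage):
    """Order VOs from highest to lowest fair-share factor."""
    return sorted(
        targets,
        key=lambda vo: fairshare_factor(targets[vo], usage.get(vo, 0.0)),
        reverse=True,
    )


# Hypothetical target shares and recent usage (fractions of the box).
targets = {"ATLAS": 0.4, "CMS": 0.4, "LHCb": 0.2}
usage = {"ATLAS": 0.5, "CMS": 0.1, "LHCb": 0.4}

ranking = rank_vos(targets, usage)
print(ranking)
```

In this sketch the VO furthest below its target is ranked first, and its advantage shrinks as its recent usage approaches the target, which is the "then returns to normal" part of the note.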
START THE DISCUSSION TO GO FOR THE “DYNAMIC ALLOCATION”
- forced draining of nodes already in use “capped”
START THE DISCUSSION TO GO FOR THE “opportunistic”
- use only idle nodes
- issue: there are very few idle nodes
- (jobs have to be already in the queue - it cannot be detected)
START THE DISCUSSION TO GO FOR THE “opportunistic”
- use only idle nodes
- with short jobs “backfilling”
- (jobs have to be already in the queue - it cannot be detected)
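To make the "short jobs backfilling" option above more concrete, the following is a rough sketch of the selection logic: only jobs already in the queue are considered, and a job is picked only if it fits on the idle nodes and finishes within the available time window. The Job fields, the function name and the example queue are hypothetical and only serve the discussion; a real batch scheduler does this internally.

```python
# Rough sketch of backfill selection. Names and values are hypothetical.
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes: int         # number of nodes requested
    walltime_min: int  # requested wall time in minutes


def backfill_candidates(queue, idle_nodes, window_min):
    """Pick queued jobs that fit on the idle nodes and end inside the window."""
    picked = []
    free = idle_nodes
    for job in queue:  # only jobs already queued can be considered
        if job.nodes <= free and job.walltime_min <= window_min:
            picked.append(job)
            free -= job.nodes
    return picked


queue = [
    Job("long", 2, 600),
    Job("short-a", 1, 30),
    Job("wide", 4, 20),
    Job("short-b", 1, 45),
]
picked_names = [j.name for j in backfill_candidates(queue, idle_nodes=2, window_min=60)]
print(picked_names)
```

With an empty queue nothing can be backfilled, which matches the note that jobs have to be in the queue already.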
IMPLEMENT A SAFE SITE LOG FOR CHIPP RESOURCES. Both CSCS and the experiments to compile it
TRY TO SET IT TO A LOW VALUE AND TEST → SCHEDULED AFTER THE TEST OF [1]
→ Give Gianfranco login access to Daint so he can use sprior
→ input from VOreps (provide the API call) to CSCS and then put on the dashboard
ATLAS full transition timescale: 18 months. Prepare a plan for the transition, follow up in monthly ops meetings.