
Action Items

Action items (9)

Legend: <number.> title (added: date, done: date)

1. Reduce ATLAS dips within the box: (added: 16.01.2020, done:)

(Box = CHIPP allocated nodes at CSCS)

IMPLEMENT THE “INTERNAL DYNAMIC ALLOCATION” (easy to implement - but the idle cost goes back to the VOs)
- Fair share + optimized priority with reservations
- When a VO comes back, it takes a higher priority until it is back at its target, then returns to normal priority
- Align the boundaries at the node level (see [1] below in item 3.)
- IMPLEMENT ON MON 20, TEST UNTIL 29 JAN (MAINTENANCE).
- PRELIMINARY NUMBERS to seize the shared resources:
  - CMS 50%
  - ATLAS 50%
  - LHCb 50%
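
The "higher priority until it gets back to its target" rule can be sketched as follows. This is only an illustration, not the scheduler configuration: the function name, the boost factor, the equal targets and the usage figures are all hypothetical and are not taken from the actual setup.

```python
def effective_priority(base_priority, target_share, current_share, boost=2.0):
    """Boost a VO's priority while it runs below its target share.

    The multiplicative boost is an assumed rule chosen for illustration.
    """
    if current_share < target_share:
        return base_priority * boost
    return base_priority


# Hypothetical targets and current usage, as fractions of the box.
targets = {"CMS": 1 / 3, "ATLAS": 1 / 3, "LHCb": 1 / 3}
usage = {"CMS": 0.45, "ATLAS": 0.10, "LHCb": 0.45}

priorities = {
    vo: effective_priority(1.0, targets[vo], usage[vo]) for vo in targets
}
```

In this made-up snapshot ATLAS is below its target, so it is the only VO that gets the boosted priority; once its usage reaches the target, the same call returns the base priority again.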

2. Reduce ATLAS dips outside the box: (added: 16.01.2020, done:)

- Discussion to be started with M. DeLorenzi and CSCS CTO*

START THE DISCUSSION TO GO FOR THE “DYNAMIC ALLOCATION”

- forced draining of nodes already in use “capped”

START THE DISCUSSION TO GO FOR THE “opportunistic”

- use only idle nodes

- issue: there are very few idle nodes

- (jobs have to be already in the queue - it cannot be detected)

START THE DISCUSSION TO GO FOR THE “opportunistic”

- use only idle nodes

- with short jobs “backfilling”

- (jobs have to be already in the queue - it cannot be detected)
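
As a rough sketch of the backfilling idea (short jobs filling idle nodes), assuming each queued job declares a walltime and that the length of the idle window on a node is known; the job fields and function name are hypothetical:

```python
def pick_backfill_jobs(queue, idle_window_min):
    """Return the queued jobs, in queue order, that fit in the idle window.

    Only jobs already in the queue can be considered, so an empty queue
    yields an empty selection.
    """
    return [job for job in queue if job["walltime_min"] <= idle_window_min]


# Hypothetical queue: one short job and one long job.
queue = [
    {"name": "short", "walltime_min": 30},
    {"name": "long", "walltime_min": 240},
]
selected = pick_backfill_jobs(queue, idle_window_min=60)
```

With a 60-minute idle window only the short job is selected, which is why this option depends on short jobs already waiting in the queue.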

3. Help reduce cache occupancy (added: 16.01.2020, done:)

At the moment we run all VOs in one node, i.e. 3 stacks of software in one node

- go for user segregation: the portion not in the shared resource band can be done in the reservation test by drawing the boundaries at the node level [1] (see item 1)