We don't over-subscribe memory anymore: nodes don't starve and crash
Memory usage is properly accounted for in 15.08 (PSS): no jobs are killed for an (artificial) over-limit on "vmem" (now the full address space reserved by a process, not what's allocated or used)
Comparing job fail rates between ce01 and ce02 (still on old SGE) has convinced me to rush the re-installation of ce02 (started earlier today)
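To illustrate the PSS vs "vmem" distinction noted above, here is a minimal sketch that reads both figures for a single process. It assumes a Linux node exposing /proc/<pid>/status and /proc/<pid>/smaps; the function names parse_kb_fields and memory_summary are made up for this example, and this is not how the batch system itself does its accounting.

```python
import re


def parse_kb_fields(text, names):
    """Sum the kB values of the requested fields in /proc-style 'Name:  123 kB' text."""
    totals = {name: 0 for name in names}
    for line in text.splitlines():
        # Matches lines such as 'Pss:      12 kB'; mapping headers and flag lines are skipped.
        m = re.match(r"(\w+):\s+(\d+)\s*kB", line)
        if m and m.group(1) in totals:
            totals[m.group(1)] += int(m.group(2))
    return totals


def memory_summary(pid="self"):
    """Return vmem (VmSize), RSS and PSS in kB for one process (assumes Linux /proc)."""
    with open(f"/proc/{pid}/status") as f:
        status = parse_kb_fields(f.read(), ["VmSize", "VmRSS"])
    with open(f"/proc/{pid}/smaps") as f:
        # smaps has one 'Pss:' line per mapping, so the values are summed.
        smaps = parse_kb_fields(f.read(), ["Pss"])
    return {
        "vmem_kb": status["VmSize"],  # address space reserved, not necessarily used
        "rss_kb": status["VmRSS"],    # resident pages, shared pages counted in full
        "pss_kb": smaps["Pss"],       # resident pages, shared pages split among users
    }
```

Calling memory_summary() on a running job's PID would typically show vmem_kb well above pss_kb, which is why a limit on "vmem" can kill jobs that use little real memory.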
ATLAS specific operations
Stable workflows by ATLAS (very large improvement since the beginning of Run II)
Stuck with the implementation of monthly dumps of the namespace on the DPM SE:
head node on SLC5: the dump script does not work, and generating a valid proxy is also problematic
decided to push the re-deployment of the head node on SLC6
legacy config tool (YAIM) no longer supported
Puppet-based configuration; got the right docs at the DPM workshop earlier this week at CERN
tests ongoing on a pps VM
also complicated by the fact that my site-bdii is still co-located with the DPM head node
this will likely be the first task for 2016
UNIBE-ID
Xxx
UNIGE
Xxx
NGI_CH
Xxx
Other topics
Proposal to add to this meeting: T2 monthly pledge review (CSCS, UNIBE); GGUS open ticket review