Swiss Grid Operations Meeting on 2016-12-06 at 14:00

Site status



  • Started planning the transition of services to Piz Daint. Question for Gianfranco about the new ARC CE version
  • Issues with some nodes and the Slurm subnet; reconfiguration needed.
  • Grafana dashboard is ready to be exposed (data sync in progress)
  • Transition to IPv6 / dual-stack completed. It required considerable effort
  • We need to add dual-stack to CMS02 too
  • Updated to 3.2.40
  • Planning next year's data migration (due to a complete storage renewal)
  • Planning head nodes (re)installation


  • Stable operations
  • Much improved IO between (Daint) DVS nodes and storage nodes since the last DVS upgrade
  • Outperforming SSD cache nominal performance (see attachment "ssd-cache-perf.png")
  • We should have a further improvement with the new software



  • Stable operation, slightly delivery for LHEP (dying nodes). Pledged: 18k, delivered 20.6k
  • Ubelix back on 8th November
  • Running an average of >2100 slots (<1900 last month, 2500 typical); Ubelix back to 23% (typical)
  • Accounting numbers (from scheduler) for November, with October in parentheses, LHEP only

VO       Job Type  Produced WC core-hours
ATLAS    Any       1102994 (1157991 in Oct)
ops      Any       46 (44 in Oct)
t2k.org  Any       0
uboone   Any       0

6-month history Unibe (pledge: 18 kHS06)

  • Swiss ATLAS statistics

    • HC availability
    • could not retrieve data
    • Running slots
    • CSCS: 3000 (3300 October) ; UniBe: 2150 (1850 October)
    • Accounting numbers from ATLAS dashboard (November) CSCS+UniBe

Cluster  Job Type  Produced WC core-hours           Good vs Bad WC %  CPU eff good jobs %
CSCS     Any       2397014; 63% (Oct: 2901550 69%)  0.79 (Oct: 0.71)  0.85 (Oct: 0.89)
UniBe    Any       1411100; 37% (Oct: 1266896 31%)  0.84 (Oct: 0.85)  0.83 (Oct: 0.85)
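
The November percentage shares in the table above can be re-derived from the WC core-hours. The following is only a minimal sketch: it assumes the shares are each cluster's fraction of the CSCS+UniBe total, rounded to the nearest integer, and the function name `shares` is made up for illustration.

```python
# November WC core-hours, copied from the ATLAS dashboard table above.
wc_core_hours = {"CSCS": 2397014, "UniBe": 1411100}


def shares(hours):
    """Return each cluster's share of the total, as a rounded percentage.

    Assumption: the share is cluster / (sum over all listed clusters).
    """
    total = sum(hours.values())
    return {name: round(100 * value / total) for name, value in hours.items()}


november_total = sum(wc_core_hours.values())
november_shares = shares(wc_core_hours)

print(november_total)   # 3808114
print(november_shares)  # {'CSCS': 63, 'UniBe': 37}
```

Under that assumption the result matches the 63% / 37% split reported for November.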


Stable operations in November after the stuck a-rex issue in October
During the mid-December maintenance downtime: decommissioning of the nodes that comprise the el6legacy partition
Setup of a subordinate partition for preemptable jobs of the ATLAS experiment at the same time

No sysadmin from UNIBE-ID can join this afternoon


Discussing this week how to revive the ARC CE @UniGe


NGI-CH Open Tickets review

Ticket-ID  Type  VO    Site        Priority     Resp. Unit  Status                      Last Update  Subject                                   Scope
133695           lhcb  CSCS-LCG2   urgent       NGI_CH      assigned waiting for reply  2018-11-30   Data access problem at CSCS-LCG2          WLCG
138592           cms   CSCS-LCG2   urgent       NGI_CH      waiting for reply           2018-12-06   Transfers failing from T2_CH_CSCS to ...  WLCG
138296           cms   CSCS-LCG2   urgent       NGI_CH      waiting for reply           2018-12-05   Transfers failing from T2_CH_CSCS         WLCG
131965           none  UNIBE-LHEP  less urgent  NGI_CH      assigned on hold            2018-11-15   IPv6 deployment at WLCG Tier-2 sites      EGI
131948           none  CSCS-LCG2   less urgent  NGI_CH      assigned in progress        2018-12-03   IPv6 deployment at WLCG Tier-2 sites      EGI

Other topics


Next meeting date:




Action items


Topic attachments
Attachment          History  Size     Date                Who            Comment
ssd-cache-perf.png  r1       165.0 K  2018-12-06 - 09:28  DarioPetrusic

