<!-- keep this as a security measure:
* Set ALLOWTOPICCHANGE = TWikiAdminGroup,Main.LCGAdminGroup,Main.EgiGroup
* Set ALLOWTOPICRENAME = TWikiAdminGroup,Main.LCGAdminGroup
#uncomment this if you want the page only be viewable by the internal people
#* Set ALLOWTOPICVIEW = TWikiAdminGroup,Main.LCGAdminGroup,Main.ChippComputingBoardGroup
-->
Swiss Grid Operations Meeting on 2017-03-02 at 14:00
Site status
CSCS
Systems
- stable operations
- working on the monitoring dashboard
- cms02 puppet code fixed to correctly manage proxy certificates
- cleaning up internal email notifications
Storage
- dCache
- "gmetric" python script decommissioned, replaced by new bash script
- bought two new servers: 2x E5-2630 v4 | 256GB RAM | 2x240GB SSD | Mellanox ConnectX-3 FDR IB DP | QLogic 16Gbps DP FC HBA
- GPFS
- stable operation, some occasional high load peaks
- bought eight new servers: 1x E5-1680 v4 | 128GB RAM | 2x240GB | Mellanox ConnectX-3 FDR IB DP | QLogic 16Gbps DP FC HBA
- general
- building many new monitoring plots with Grafana (moving away from Ganglia)
- preparing for the April maintenance (new dCache servers + storage, new GPFS scratch + optimized cluster layout for Cray integration)
- had some problems with TWiki
PSI
UNIBE-LHEP
- Some ARC troubles (hitting the NDGF-T1 as well) due to nasty jobs (some with 1k input files) and possible ARC bugs
- I/O volume has increased considerably since week 5 of 2017
- Gateway to LAN on the older cluster is now 10GB => it no longer struggles with ARC downloads and uploads
- Ongoing issue with (mostly) MCORE jobs on Ubelix failing heavily
- Issue not yet understood; it seems to be related to where files get written (scratch/tmpdir/sessiondir...)
- Security: CVE-2017-6074 "DCCP" - Mitigated as suggested in the EGI advisory
- HammerCloud status
http://dashb-atlas-ssb.cern.ch/dashboard/request.py/siteviewhistorywithstatistics?columnid=562&view=Shifter%20view#time=720&start_date=&end_date=&use_downtimes=false&merge_colors=false&sites=multiple&clouds=all&site=ANALY_CSCS,ANALY_CSCS-HPC,ANALY_UNIBE-LHEP,ANALY_UNIBE-LHEP-UBELIX,CSCS-LCG2,CSCS-LCG2-HPC,CSCS-LCG2-HPC_MCORE,CSCS-LCG2_MCORE,UNIBE-LHEP,UNIBE-LHEP-UBELIX,UNIBE-LHEP-UBELIX_MCORE,UNIBE-LHEP_CLOUD,UNIBE-LHEP_CLOUD_MCORE,UNIBE-LHEP_MCORE,UNIGE-DPNC,UNIGE-DPNC_MCORE
- Accounting numbers (from scheduler) from last month
ATLAS: 792945 (was 816207 for Nov 2016); UBOONE: 552; T2K: 6; OPS: 6
- Accounting numbers from ATLAS dashboard from last month (core-hours Feb 2017) [1]
CSCS / UNIBE: 58% / 42% (was 65% / 35% for Nov 2016)
- Efficiency WT ok/fail [2] - Ubelix issue: [3]
CSCS / UNIBE: 93% / 75% (was 81.57% / 64.07% for Nov 2016)
- CPU/WT efficiency [4]:
CSCS / UNIBE: 90% / 83% (was 67% / 71% for Nov 2016)
[1]
http://dashb-atlas-ddm-acc.cern.ch/dashboard/request.py/dailysummary#button=cpuconsumption&sites%5B%5D=CSCS-LCG2&sites%5B%5D=UNIBE-LHEP&sitesCat%5B%5D=All+Countries&resourcetype=All&sitesSort=2&sitesCatSort=0&start=null&end=null&timerange=lastMonth&granularity=8+Hours&generic=0&sortby=0&series=All
[2]
http://dashb-atlas-ddm-acc.cern.ch/dashboard/request.py/dailysummary#button=successfailures&sites%5B%5D=CSCS-LCG2&sites%5B%5D=UNIBE-LHEP&sitesCat%5B%5D=All+Countries&resourcetype=All&sitesSort=2&sitesCatSort=0&start=null&end=null&timerange=lastMonth&granularity=8+Hours&generic=0&sortby=0&series=All
[3]
http://dashb-atlas-job.cern.ch/dashboard/request.py/terminatedjobsstatus_individual?sites=CSCS-LCG2&sites=UNIBE-LHEP&sitesCat=CH-CHIPP-CSCS&resourcetype=All&activities=all&sitesSort=2&sitesCatSort=2&start=null&end=null&timeRange=lastMonth&sortBy=16&granularity=8%20Hours&generic=0&series=30&type=qbwc
[4]
http://dashb-atlas-ddm-acc.cern.ch/dashboard/request.py/dailysummary#button=cpuefficiency&sites%5B%5D=CSCS-LCG2&sites%5B%5D=UNIBE-LHEP&sitesCat%5B%5D=All+Countries&resourcetype=All&sitesSort=2&sitesCatSort=0&start=null&end=null&timerange=lastMonth&granularity=8+Hours&generic=0&sortby=0&series=All
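As a rough illustration of how the accounting figures above relate to each other, the following Python sketch computes each VO's share of the scheduler total and shows the ratio behind a CPU/WT efficiency. The scheduler figures are the ones quoted in these minutes (their unit is not stated here). The function names and the CPU/wall-time inputs in the last example are hypothetical and are not taken from the dashboards.

```python
# Scheduler accounting figures as quoted in the minutes (unit not stated there).
scheduler_usage = {"ATLAS": 792945, "UBOONE": 552, "T2K": 6, "OPS": 6}


def percent_shares(usage):
    """Return each entry's share of the total, in percent."""
    total = sum(usage.values())
    return {name: 100.0 * value / total for name, value in usage.items()}


def efficiency_percent(cpu_time, wall_time):
    """CPU/WT efficiency: CPU time divided by wall time, in percent."""
    if wall_time <= 0:
        raise ValueError("wall_time must be positive")
    return 100.0 * cpu_time / wall_time


shares = percent_shares(scheduler_usage)
for vo, share in shares.items():
    print(f"{vo}: {share:.2f}%")

# Hypothetical inputs, for illustration only (not from the minutes):
# 90 CPU-hours consumed over 100 wall-clock hours.
example_efficiency = efficiency_percent(90.0, 100.0)
print(f"example CPU/WT efficiency: {example_efficiency:.0f}%")
```

The same `percent_shares` helper could be applied to per-site core-hours to obtain a CSCS / UNIBE split like the one reported above, provided the raw core-hour totals are exported from the dashboard.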
UNIBE-ID
UNIGE
- Xxx
- Accounting numbers (from scheduler) from last month
NGI_CH
Other topics
Next meeting date:
A.O.B.
Attendants
- CSCS:
- CMS:
- ATLAS:
- LHCb:
- EGI:
Action items
* Item1
Topic revision: r5 - 2017-03-02 - GianfrancoSciacca