<!-- keep this as a security measure:
* Set ALLOWTOPICCHANGE =
TWikiAdminGroup,Main.LCGAdminGroup,Main.EgiGroup
* Set ALLOWTOPICRENAME =
TWikiAdminGroup,Main.LCGAdminGroup
#uncomment this if you want the page only be viewable by the internal people
#* Set ALLOWTOPICVIEW =
TWikiAdminGroup,Main.LCGAdminGroup,Main.ChippComputingBoardGroup
-->
Swiss Grid Operations Meeting on 2016-04-20 at 14:00
- Place: Vidyo (room: Swiss_Grid_Operations_Meeting, extension: 10537598)
- External link: https://vidyoportal.cern.ch/flex.html?roomdirect.html&key=FAEn4zjAba7BqoQ11TGZu66VSDE
- Phone gate: From Switzerland: 0227671400 (portal) + 10537598 (extension) + # (pound sign)
- IRC chat: irc:gridchat.cscs.ch:994#lcg (ask pw via email)
- Switch Vidyo SIP IP: 137.138.248.204
Site status
CSCS
Systems
- Phoenix working fine, good usage since last F2F
- Some events:
- cms06 issue (more details under Storage)
- Swapping issue
- Main focus: LHConCray knowledge transfer from Miguel
Storage
GPFS
- Inode incident last week
- cms06 account reached its inode quota
- no impact on filesystem
- some impact on cms jobs (related to the cms06 account)
- Stable operations, some peaks of load
- New servers arrived; they will be installed, configured and integrated by next week
- Will split GPFS in three clusters
- Server-cluster
- WN-cluster (remote)
- LHConCRAY-cluster (remote)
- Will plan a maintenance to apply the new configuration in production
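The inode incident above suggests a simple headroom check. What follows is only a minimal sketch, assuming per-account inode usage and limits have already been collected by some other means (for example from the filesystem's quota report). The function name, the 90% threshold, the account name `atlas01` and all sample values are hypothetical and not taken from the CSCS setup.

```python
def accounts_near_inode_quota(usage, threshold=0.9):
    """Return accounts whose inode usage is at or above threshold * limit.

    usage maps account name -> (inodes_used, inode_limit).
    Accounts with a limit of 0 are treated as unlimited and skipped.
    """
    flagged = []
    for account, (used, limit) in sorted(usage.items()):
        if limit > 0 and used / limit >= threshold:
            flagged.append(account)
    return flagged


# Hypothetical sample values, for illustration only.
sample = {
    "cms06": (1_000_000, 1_000_000),
    "atlas01": (200_000, 1_000_000),
}
print(accounts_near_inode_quota(sample))
```

A check of this kind could run periodically and warn before an account hits its limit, rather than after jobs start failing.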
dCache
- New port parameters applied to all the pools (in the layout file); this will prevent any kind of port shortage/contention
- Had some problems with 2 CMS and 2 ATLAS pools that were unable to restart correctly.
- The twiki Phoenix Monitor Overview / Storage view has been updated. More updated plots from Grafana will follow
- New servers arrived
- 1.5PB of new space to be allocated
- 1.1PB will be migrated and decommissioned
- dCache will be updated from v2.13.50 to v2.13.57 (xrootd and checksums improvements)
PSI
UNIBE-LHEP
- ARC (5.2.1) instabilities (a-rex hangs, crashes, etc.) ironed out by switching on the watchdog service, complemented by a rescue cron job running every hour
- Services occasionally still have to be restarted by hand (symptom: ARC CE network rates drop to zero while the services appear to be up and running)
- Microboone resumed running on our resources; we are working on optimising the size of the glideins in order not to waste resources
- Microboone also now supported on the SE. Will keep a replica of the experiment raw data
- Ubelix CPU/WC efficiency largely recovered; will continue to watch it, as at times it degrades considerably
- HammerCloud status
http://dashb-atlas-ssb.cern.ch/dashboard/request.py/siteviewhistorywithstatistics?columnid=562&view=Shifter%20view#time=720&start_date=&end_date=&use_downtimes=false&merge_colors=false&sites=multiple&clouds=all&site=ANALY_CSCS,ANALY_CSCS-HPC,ANALY_UNIBE-LHEP,ANALY_UNIBE-LHEP-UBELIX,CSCS-LCG2,CSCS-LCG2-HPC,CSCS-LCG2-HPC_MCORE,CSCS-LCG2_MCORE,UNIBE-LHEP,UNIBE-LHEP-UBELIX,UNIBE-LHEP-UBELIX_MCORE,UNIBE-LHEP_CLOUD,UNIBE-LHEP_CLOUD_MCORE,UNIBE-LHEP_MCORE,UNIGE-DPNC,UNIGE-DPNC_MCORE
- Accounting numbers (from scheduler) from MARCH 2017
ATLAS: 942829 (was 792945 in Feb 2017); UBOONE: 16665 (was 6 in Feb 2017); UBOONE: 5083 (was 552 in Feb 2017); OPS:16
- Accounting numbers from ATLAS dashboard from last month (core-hours MARCH 2017) [1]
CSCS / UNIBE: 60% / 40% (stable)
- Efficiency WT ok/fail [2]
CSCS 93% (was 93%) - UNIBE 81% (was 75%) (Ubelix getting better)
- CPU/WT efficiency [3]
CSCS 70% (was 90%) - UNIBE 81% (was 83%) (big dip for CSCS in the middle and at the end of the month, correlating with reprocessing jobs [4]; compare Bern [5])
[1]
http://dashb-atlas-job.cern.ch/dashboard/request.py/dailysummary#button=cpuconsumption&sites%5B%5D=CSCS-LCG2&sites%5B%5D=UNIBE-LHEP&sitesCat%5B%5D=CH-CHIPP-CSCS&resourcetype=All&sitesSort=2&sitesCatSort=2&start=2017-03-01&end=2017-03-31&timerange=daily&granularity=8+Hours&generic=0&sortby=0&series=30&activities%5B%5D=all
[2]
http://dashb-atlas-job.cern.ch/dashboard/request.py/dailysummary#button=successfailures&sites%5B%5D=CSCS-LCG2&sites%5B%5D=UNIBE-LHEP&sitesCat%5B%5D=CH-CHIPP-CSCS&resourcetype=All&sitesSort=2&sitesCatSort=2&start=2017-03-01&end=2017-03-31&timerange=daily&granularity=8+Hours&generic=0&sortby=0&series=30&activities%5B%5D=all
[3]
http://dashb-atlas-job.cern.ch/dashboard/request.py/dailysummary#button=cpuefficiency&sites%5B%5D=CSCS-LCG2&sites%5B%5D=UNIBE-LHEP&sitesCat%5B%5D=CH-CHIPP-CSCS&resourcetype=All&sitesSort=2&sitesCatSort=2&start=2017-03-01&end=2017-03-31&timerange=daily&granularity=8+Hours&generic=0&sortby=0&series=30&activities%5B%5D=all
[4]
http://dashb-atlas-job.cern.ch/dashboard/request.py/dailysummary#button=activities&sites%5B%5D=CSCS-LCG2&sitesCat%5B%5D=CH-CHIPP-CSCS&resourcetype=All&sitesSort=2&sitesCatSort=2&start=2017-03-01&end=2017-03-31&timerange=daily&granularity=8+Hours&generic=0&sortby=0&series=30&activities%5B%5D=all
[5]
http://dashb-atlas-job.cern.ch/dashboard/request.py/dailysummary#button=activities&sites%5B%5D=UNIBE-LHEP&sitesCat%5B%5D=CH-CHIPP-CSCS&resourcetype=All&sitesSort=2&sitesCatSort=2&start=2017-03-01&end=2017-03-31&timerange=daily&granularity=8+Hours&generic=0&sortby=0&series=30&activities%5B%5D=all
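As a rough illustration of how the month-on-month accounting figures above can be compared, the sketch below computes the relative change between two values. The function name `relative_change` is hypothetical. The inputs are the ATLAS scheduler numbers quoted above, whose unit is not stated in these minutes.

```python
def relative_change(current: float, previous: float) -> float:
    """Return the fractional change from previous to current."""
    if previous == 0:
        raise ValueError("previous value must be non-zero")
    return (current - previous) / previous


# ATLAS scheduler accounting as quoted in the minutes (unit not stated there).
atlas_change = relative_change(942829, 792945)
print(f"ATLAS: {atlas_change:+.1%}")
```

Note that the efficiency figures are already percentages, so a move such as 90% to 70% is better read as a difference in percentage points than fed through this function.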
UNIBE-ID
UNIGE
- Xxx
- Accounting numbers (from scheduler) from last month
NGI_CH
- Xxx
- NGI-CH Open Tickets review
Other topics
Next meeting date:
A.O.B.
Attendants
- CSCS:
- CMS:
- ATLAS:
- LHCb:
- EGI:
Action items