<!-- keep this as a security measure:
* Set ALLOWTOPICCHANGE = TWikiAdminGroup,Main.LCGAdminGroup,Main.EgiGroup
* Set ALLOWTOPICRENAME = TWikiAdminGroup,Main.LCGAdminGroup
#uncomment this if you want the page only be viewable by the internal people
#* Set ALLOWTOPICVIEW = TWikiAdminGroup,Main.LCGAdminGroup,Main.ChippComputingBoardGroup
-->

Swiss Grid Operations Meeting on 2017-03-02 at 14:00

Site status

CSCS

Systems

  • stable operations
  • working on the monitoring dashboard
  • cms02 puppet code fixed to correctly manage proxy certificates
  • cleaning up internal email notifications

Storage

  • dCache
    • "gmetric" Python script decommissioned, replaced by a new bash script
    • bought two new servers: 2x E5-2630 v4 | 256GB RAM | 2x240GB SSD | Mellanox ConnectX-3 FDR IB DP | QLogic 16Gbps DP FC HBA
  • GPFS
    • stable operation, some occasional high load peaks
    • bought eight new servers: 1x E5-1680 v4 | 128GB RAM | 2x240GB | Mellanox ConnectX-3 FDR IB DP | QLogic 16Gbps DP FC HBA
  • general
    • building a lot of new monitoring plots with Grafana (moving away from Ganglia)
    • preparing for the April maintenance (new dCache servers + storage, new GPFS scratch + optimized cluster layout for Cray integration)
    • had some problems with TWiki

PSI

UNIBE-LHEP

  • Some ARC troubles (hitting the NDGF-T1 as well) due to nasty jobs (some with 1k input files) and possible ARC bugs
  • I/O volume stepped up considerably from week 5 2017
  • Gateway to LAN on the older cluster is now 10Gb/s => it no longer struggles with ARC downloads and uploads
  • Ongoing issue with (mostly) MCORE jobs on Ubelix failing heavily
    • Not yet understood; seems to be related to where files get written (scratch/tmpdir/sessiondir...)
  • Security: CVE-2017-6074 "DCCP" - Mitigated as suggested in the EGI advisory

  • Accounting numbers from the ATLAS dashboard for last month (core-hours, Feb 2017) [1]
    CSCS / UNIBE: 58% / 42% (was 65% / 35% in Nov 2016)

    • Wall-time (WT) efficiency, ok/fail [2] (Ubelix issue: [3])
      CSCS / UNIBE: 93% / 75% (was 81.57% / 64.07% in Nov 2016)

    • CPU/WT efficiency [4]
      CSCS / UNIBE: 90% / 83% (was 67% / 71% in Nov 2016)
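
For reference, the arithmetic behind the share and efficiency figures above can be sketched as follows. This is only an illustration: the function names and the core-hour totals are hypothetical (the totals are chosen so that the shares come out as 58% / 42%), and reading "WT ok/fail" as successful wall time over total wall time is an assumption about how the dashboard defines it.

```python
def share_percent(core_hours):
    """Return each site's share of the summed core-hours, in percent."""
    total = sum(core_hours.values())
    if total <= 0:
        raise ValueError("total core-hours must be positive")
    return {site: 100.0 * hours / total for site, hours in core_hours.items()}


def efficiency_percent(numerator, denominator):
    """Return a ratio as a percentage.

    Usable for CPU time / wall time, or (assumed definition)
    successful wall time / total wall time.
    """
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * numerator / denominator


# Hypothetical core-hour totals, NOT taken from the dashboard.
example = {"CSCS-LCG2": 580_000, "UNIBE-LHEP": 420_000}
shares = share_percent(example)
for site, pct in shares.items():
    print(f"{site}: {pct:.0f}%")
```

The same helper converts a fractional efficiency such as 0.67 into 67%, which keeps the Nov 2016 and Feb 2017 values comparable.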

[1] http://dashb-atlas-ddm-acc.cern.ch/dashboard/request.py/dailysummary#button=cpuconsumption&sites%5B%5D=CSCS-LCG2&sites%5B%5D=UNIBE-LHEP&sitesCat%5B%5D=All+Countries&resourcetype=All&sitesSort=2&sitesCatSort=0&start=null&end=null&timerange=lastMonth&granularity=8+Hours&generic=0&sortby=0&series=All

[2] http://dashb-atlas-ddm-acc.cern.ch/dashboard/request.py/dailysummary#button=successfailures&sites%5B%5D=CSCS-LCG2&sites%5B%5D=UNIBE-LHEP&sitesCat%5B%5D=All+Countries&resourcetype=All&sitesSort=2&sitesCatSort=0&start=null&end=null&timerange=lastMonth&granularity=8+Hours&generic=0&sortby=0&series=All

[3] http://dashb-atlas-job.cern.ch/dashboard/request.py/terminatedjobsstatus_individual?sites=CSCS-LCG2&sites=UNIBE-LHEP&sitesCat=CH-CHIPP-CSCS&resourcetype=All&activities=all&sitesSort=2&sitesCatSort=2&start=null&end=null&timeRange=lastMonth&sortBy=16&granularity=8%20Hours&generic=0&series=30&type=qbwc

[4] http://dashb-atlas-ddm-acc.cern.ch/dashboard/request.py/dailysummary#button=cpuefficiency&sites%5B%5D=CSCS-LCG2&sites%5B%5D=UNIBE-LHEP&sitesCat%5B%5D=All+Countries&resourcetype=All&sitesSort=2&sitesCatSort=0&start=null&end=null&timerange=lastMonth&granularity=8+Hours&generic=0&sortby=0&series=All

UNIBE-ID

  • Xxx

UNIGE

  • Xxx
  • Accounting numbers (from scheduler) from last month

NGI_CH

Other topics

  • Topic1
  • Topic2
Next meeting date:

A.O.B.

Attendants

  • CSCS:
  • CMS:
  • ATLAS:
  • LHCb:
  • EGI:

Action items

* Item1 * Item1

Topic revision: r5 - 2017-03-02 - GianfrancoSciacca