<!-- keep this as a security measure:
   * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.LCGAdminGroup,Main.EgiGroup
   * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.LCGAdminGroup
   #uncomment this if you want the page to be viewable only by the internal people
   #* Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.LCGAdminGroup,Main.ChippComputingBoardGroup
-->

---+ Swiss Grid Operations Meeting on 2017-03-02 at 14:00

   * *Place*: Vidyo (room: Swiss_Grid_Operations_Meeting, extension: 10537598)
   * *External link*: https://vidyoportal.cern.ch/flex.html?roomdirect.html&key=FAEn4zjAba7BqoQ11TGZu66VSDE
   * *Phone gate*: From Switzerland: 0227671400 (portal) + 10537598 (extension) + # (pound sign)
   * *IRC chat*: irc:gridchat.cscs.ch:994#lcg (ask for the password via email)
   * *Switch Vidyo SIP IP*: 137.138.248.204

%TOC%

---++ Site status

---+++ CSCS

<b>Systems</b>

   * stable operations
   * working on the monitoring dashboard
   * cms02 puppet code fixed to correctly manage proxy certificates
   * cleaning up internal email notifications

<b>Storage</b>

   * dcache
      * "gmetric" python script decommissioned, replaced by new bash script
      * bought two new servers: 2xE5 2630v4 | 256GB RAM | 2x240GB SSD | Mellanox ConnectX-3 FDR IB DP | QLogic 16Gbps DP FC HBA
   * gpfs
      * stable operation, some occasional high load peaks
      * bought eight new servers: 1xE5-1680 v4 | 128GB RAM | 2x240GB | Mellanox ConnectX-3 FDR IB DP | QLogic 16Gbps DP FC HBA
   * general
      * building a lot of new monitoring plots with grafana (moving away from ganglia)
      * preparing for the April maintenance (new dcache servers + storage, new gpfs scratch + optimized cluster layout for CRAY integration)
      * had some problems with TWiki

---+++ PSI

   * Xxx
   * [[http://t3mon.psi.ch/ganglia/PSIT3-custom/accounting.txt][Accounting numbers (from scheduler) from last month]]

---+++ UNIBE-LHEP

   * Some ARC troubles (hitting the NDGF-T1 as well) due to nasty jobs (some with 1k input files) and possible ARC bugs
   * I/O volume stepped up considerably from week 5 2017
   * Gateway to LAN on the older cluster now 10GB => no longer struggles with ARC downloads and uploads
   * Ongoing issue with (mostly) MCORE jobs on Ubelix failing heavily
   * Not yet understood issue that seems to be related to where files get written (scratch/tmpdir/sessiondir...)
   * Security: CVE-2017-6074 "DCCP" - Mitigated as suggested in the EGI advisory
   * <b>HammerCloud status</b><br /> http://dashb-atlas-ssb.cern.ch/dashboard/request.py/siteviewhistorywithstatistics?columnid=562&view=Shifter%20view#time=720&start_date=&end_date=&use_downtimes=false&merge_colors=false&sites=multiple&clouds=all&site=ANALY_CSCS,ANALY_CSCS-HPC,ANALY_UNIBE-LHEP,ANALY_UNIBE-LHEP-UBELIX,CSCS-LCG2,CSCS-LCG2-HPC,CSCS-LCG2-HPC_MCORE,CSCS-LCG2_MCORE,UNIBE-LHEP,UNIBE-LHEP-UBELIX,UNIBE-LHEP-UBELIX_MCORE,UNIBE-LHEP_CLOUD,UNIBE-LHEP_CLOUD_MCORE,UNIBE-LHEP_MCORE,UNIGE-DPNC,UNIGE-DPNC_MCORE <br /> [[http://dashb-atlas-ssb.cern.ch/dashboard/request.py/siteviewhistorywithstatistics#time=720&start_date=&end_date=&use_downtimes=false&merge_colors=false&sites=multiple&clouds=all&site=ANALY_CSCS,ANALY_UNIBE-LHEP,ANALY_UNIBE-LHEP-UBELIX,CSCS-LCG2,CSCS-LCG2_MCORE,UNIBE-LHEP,UNIBE-LHEP-UBELIX,UNIBE-LHEP-UBELIX_MCORE,UNIBE-LHEP_MCORE,UNIGE-DPNC,UNIGE-DPNC_MCORE?columnid=562&view=Shifter%20view][<br />]]
   * *Accounting numbers* (from scheduler) from last month<br />ATLAS: 792945 <i>(was 816207 for Nov 2016)</i> ; UBOONE: 552 ; T2k: 6 ; OPS: 6
   * *Accounting numbers from ATLAS dashboard* from last month (core-hours Feb 2017) [1]<br />CSCS / UNIBE 58% / 42% ( was 65% / 35% Nov 2016 )
   * *Efficiency WT ok/fail* [2] - Ubelix issue: [3]<br />CSCS/UNIBE: 93% / 75% ( was 81.57/64.07 Nov 2016 )
   * *CPU/WT efficiency* [4]:<br />CSCS/UNIBE 90% / 83% ( was 0.67/0.71 Nov 2016 )

[1] http://dashb-atlas-ddm-acc.cern.ch/dashboard/request.py/dailysummary#button=cpuconsumption&sites%5B%5D=CSCS-LCG2&sites%5B%5D=UNIBE-LHEP&sitesCat%5B%5D=All+Countries&resourcetype=All&sitesSort=2&sitesCatSort=0&start=null&end=null&timerange=lastMonth&granularity=8+Hours&generic=0&sortby=0&series=All

[2] http://dashb-atlas-ddm-acc.cern.ch/dashboard/request.py/dailysummary#button=successfailures&sites%5B%5D=CSCS-LCG2&sites%5B%5D=UNIBE-LHEP&sitesCat%5B%5D=All+Countries&resourcetype=All&sitesSort=2&sitesCatSort=0&start=null&end=null&timerange=lastMonth&granularity=8+Hours&generic=0&sortby=0&series=All

[3] http://dashb-atlas-job.cern.ch/dashboard/request.py/terminatedjobsstatus_individual?sites=CSCS-LCG2&sites=UNIBE-LHEP&sitesCat=CH-CHIPP-CSCS&resourcetype=All&activities=all&sitesSort=2&sitesCatSort=2&start=null&end=null&timeRange=lastMonth&sortBy=16&granularity=8%20Hours&generic=0&series=30&type=qbwc

[4] http://dashb-atlas-ddm-acc.cern.ch/dashboard/request.py/dailysummary#button=cpuefficiency&sites%5B%5D=CSCS-LCG2&sites%5B%5D=UNIBE-LHEP&sitesCat%5B%5D=All+Countries&resourcetype=All&sitesSort=2&sitesCatSort=0&start=null&end=null&timerange=lastMonth&granularity=8+Hours&generic=0&sortby=0&series=All

---+++ UNIBE-ID

   * Xxx

---+++ UNIGE

   * Xxx
   * Accounting numbers (from scheduler) from last month

---+++ NGI_CH

   * Xxx
   * <b>NGI-CH Open Tickets review</b><br /> [[https://ggus.eu/index.php?mode=ticket_search&supportunit=NGI_CH&status=open&timeframe=any&orderticketsby=REQUEST_ID&orderhow=desc&search_submit=GO][https://ggus.eu/index.php?mode=ticket_search&supportunit=NGI_CH&status=open&timeframe=any&orderticketsby=REQUEST_ID&orderhow=desc&search_submit=GO]]
   * AFS:
      * https://ggus.eu/index.php?mode=ticket_info&ticket_id=124815 - (UZH) Roland is looking after it
   * CSCS CMS
      * https://ggus.eu/index.php?mode=ticket_info&ticket_id=126888 - SAM stopped running
      * https://ggus.eu/index.php?mode=ticket_info&ticket_id=126883 - EOS space for T2_CH_CERN -> reassigned it to CERN
   * CSCS
      * https://ggus.eu/index.php?mode=ticket_info&ticket_id=125479 -> LogicalCPUs published in GLUE2 not correct -> ARC developer involved
      * https://ggus.eu/index.php?mode=ticket_info&ticket_id=126419 -> providing a site for testing the storage accounting
   * UNIBE ATLAS
      * https://ggus.eu/index.php?mode=ticket_info&ticket_id=124518 -> UBELIX MCORE job failures
      * https://ggus.eu/index.php?mode=ticket_info&ticket_id=126844 -> Deletion via https fails on the SE
      * https://ggus.eu/index.php?mode=ticket_info&ticket_id=117899 -> Storage consistency checks (STALLED)

---++ Other topics

   * Topic1
   * Topic2

Next meeting date:

---++ A.O.B.

---++ Attendants

   * CSCS:
   * CMS:
   * ATLAS:
   * LHCb:
   * EGI:

---++ Action items

   * Item1
Topic revision: r5 - 2017-03-02 - GianfrancoSciacca
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.