<!-- keep this as a security measure:
* Set ALLOWTOPICCHANGE = TWikiAdminGroup,Main.LCGAdminGroup,Main.EgiGroup
* Set ALLOWTOPICRENAME = TWikiAdminGroup,Main.LCGAdminGroup
#uncomment this if you want the page to be viewable only by internal people
#* Set ALLOWTOPICVIEW = TWikiAdminGroup,Main.LCGAdminGroup,Main.ChippComputingBoardGroup
-->
Swiss Grid Operations Meeting on 2019-12-05 at 14:00
Check calendar invitation for CSCS Zoom details.
Action items
Site status
CSCS
PSI
UNIBE-LHEP
- Xxx
- Accounting numbers (from scheduler) from last month
UNIBE-ID
- Some job errors due to storage problems. The cause was bad IB cables that were mechanically damaged during the server room reconstruction.
- Some cables have been replaced; the rest will be replaced during the next downtime on 2019-12-12
- ARC CE otherwise running smoothly
UNIGE
- Xxx
- Accounting numbers (from scheduler) from last month
NGI_CH
- Report on this ticket:
REFERENCE LINK: https://ggus.eu/index.php?mode=ticket_info&ticket_id=144342
SUBJECT: NGI_CH - November 2019 - RP/RC OLA performance
Such tickets use a “standard formulation”. We have received many of them in the past, affecting all sites, because ops probe failures inevitably go undetected when they do not affect the production experiments. In this specific case, it is the first time the ticket has also been notified to the site. In the past, it was assigned only to NGI_CH, so only I would receive the notification. I would then do some investigation with the site and report on the ticket. In some cases, as Dario and Dino might remember, we never found the cause of some errors that appeared and went away on their own.
I also see issues affecting the ARC CEs during that period, but these went away spontaneously, and it is no longer easy to investigate what happened at the time of the failures.
To mitigate this in the future: as mentioned in the past, it is possible to turn on notifications at the site/service level in GOCDB. These will trigger an email to the GOCDB site contact when ops probes fail. Each site should choose its own matrix of notifications. There are two independent levels: site level (can be turned on by editing the main site page) and service level (can be turned on by editing each service page).
- NGI-CH Open Tickets review
Other topics
Next meeting date:
A.O.B.
Attendants
- CSCS:
- CMS:
- ATLAS:
- LHCb:
- EGI: