Swiss WLCG Operations Meeting on 2011-05-19
Agenda
Status
- CSCS:
- Implemented a penalty system for find/du/ls: a job is put to sleep for 30 seconds after more than 10 bad operations per minute.
- PSI (reports Fabio):
- UNIBE (reports Gianfranco):
- Experiencing random, spontaneous reboots of X8440 blades. One module refuses to power back up. Network upgrade still pending (tentatively next Wednesday).
- UNIGE (reports Szymon):
- In general rather quiet (I can do other things)
- Occasional slow response of NFS, under investigation since early March:
- It is not as simple as overloaded file servers.
- Slow response happens on one client at a time, not on all clients at the same time.
- Have a look at the attached slides: UniGE.NFS-monitoring-results.pdf.
- It would be good to understand what the limit is. Thanks in advance for any ideas of things to check etc.
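For reference, the penalty rule reported by CSCS in the status above (30 seconds of sleep after more than 10 bad ops per minute) could be sketched roughly as follows. This is only an illustration, not the CSCS implementation: the class name `MetadataOpThrottle`, the per-job sliding window, and the reset of the counter after a penalty are all assumptions.

```python
from collections import deque


class MetadataOpThrottle:
    """Hypothetical per-job counter for find/du/ls calls (assumed design)."""

    def __init__(self, max_ops=10, window_s=60.0, penalty_s=30.0):
        self.max_ops = max_ops      # allowed ops per window (10 in the minutes)
        self.window_s = window_s    # window length in seconds (one minute)
        self.penalty_s = penalty_s  # sleep imposed on the job (30 s)
        self._stamps = deque()      # timestamps of recent ops, oldest first

    def record(self, now):
        """Register one op at time `now` (seconds); return the sleep to impose."""
        # Forget ops that have fallen out of the sliding window.
        while self._stamps and now - self._stamps[0] >= self.window_s:
            self._stamps.popleft()
        self._stamps.append(now)
        if len(self._stamps) > self.max_ops:
            # Assumption: the counter starts from zero after a penalty.
            self._stamps.clear()
            return self.penalty_s
        return 0.0


throttle = MetadataOpThrottle()
# Eleven ops within eleven seconds: only the last one triggers the penalty.
delays = [throttle.record(float(t)) for t in range(11)]
```

A wrapper around find/du/ls would call `record(time.monotonic())` and sleep for the returned number of seconds; that wrapper is left out here because the minutes do not say how the ops are intercepted.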
Other topics
- Change of CSCS Maintenance day: "First working Wednesday of the month". Next one: Wednesday 8th of June.
- Agreed to talk about technical things (like Szymon's) on the lcg at cscs list
AOB
Attendees
- CSCS: Pablo
- CMS: Derek, Fabio
- ATLAS: Szymon, Marc, Gianfranco
- LHCb:
- EGI: Andres
Action items
- Non-trivial example of SGE Accounting and Reporting Console usage:
Topic revision: r6 - 2011-05-19 - FabioMartinelli