
CHIPP + CSCS Face to Face Meeting on 2016-09-01

  • Date and time: Thursday 1st of September at 10:00
  • Place: CERN (40-R-B10)
  • External link / EVO: No

Agenda

Attendees

  • Christoph, Derek, Gianfranco, Fabio, Luis, Dino, Dario, Gianni, Miguel, Stefano, Pablo

Minutes and action items

  • The attached documents show the information presented during the meeting. In addition, it was agreed that:
  • ATLAS currently sees a big duplication of effort (GF has to explain the issue, then CSCS has to understand it). It would be more efficient if the same person did both.
  • (action on Gianfranco) ATLAS should communicate the average efficiency of all ATLAS sites during the whole period, so as to understand where some of those 'dips' come from
  • (action on CSCS) CSCS should try to explain where some of those efficiency dips come from during the last year, so as to understand if there is a good explanation (such as known incidents) or not.
  • Gianfranco's dedication to ATLAS exclusively amounts to an average of 0.4 FTE during the last 4 years
  • In the "options to move forward" page, for case A (continue as we are, with improvements), the manpower at the site needs a good dashboard, a good connection with the VO in order to understand the logs and to know how to compare the site with others, and a better ability to move around VO internals.
  • (action on ALL) it was agreed that the VO Reps and CSCS would sit down together in a 2-day monitoring hackathon to improve the discovery of and response to issues by developing a dashboard that shows the important metrics as seen by CSCS and by all VOs. The availability window does not open until the second half of October.
  • (action on ALL) we need to add the main sysadmins @ CSCS to all VOs. Fabio : sent howto DONE
  • (action on Fabio and CSCS) it would be interesting for CSCS to read the minutes from the CMS weekly operations call (Fabio will attend and send the minutes). Minutes Archive. DONE
  • (action on Fabio) CSCS mailing list grid-list@cscs.ch needs to receive the CMS Nagios warnings. DONE
  • Fabio is doing the first level support for CMS, but having CSCS look into it should improve not only CMS but the rest of the VOs. Fabio's T2 checks DONE
  • CMS does not see the efficiency problems that ATLAS reports.
  • (action on Fabio and CSCS) we need to match the CMS and ATLAS efficiencies to see where they match (probably a site-wide problem) and where they differ (probably a VO-specific problem)
  • (action on ALL) The official Availability/Reliability (A/R) metrics from WLCG do not reflect reality. For next time, we need to agree on an A/R figure that does.
  • (action on CSCS) per-VO CPU accounting data should be added to the plot
  • VO-Reps can now log in to servers/nodes from login.lcg.cscs.ch (from ela.cscs.ch) as their own user to do certain operations with sudo (such as cat /var/log/*). If they see a need for more operations, they should open a ticket.
  • (action on Derek) we should expect an answer for the VO-Box proposal before the end of September
  • (action on CSCS) it was agreed that we should have a bi-weekly call to review new and old tickets. All issues have to be on the table and shared with everyone (ticket or not).
  • (action on Gianfranco and CSCS) we need to look into the ARC configuration (and the VO queue config) to see if it is correct, since usage of the scratch file system in phoenix has increased more than 10x since April 2016
  • (action on CSCS) we need to understand the impact of not imposing memory limits (e.g. do nodes swap?)
  • (action on CSCS) authentication on Kibana (for graphs/logs sharing) should be done with Grid certificates
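
The comparison of CMS and ATLAS efficiencies agreed above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name classify_dips, the 0.7 threshold, the monthly granularity and all the numbers are made up for the example and do not come from the meeting or from any real accounting data. A period where both VOs fall below the threshold is flagged as probably site-wide; a period where only one does is flagged as probably VO-specific.

```python
from typing import Dict, List


def classify_dips(
    atlas: Dict[str, float],
    cms: Dict[str, float],
    threshold: float = 0.7,  # hypothetical cut-off for what counts as a "dip"
) -> Dict[str, List[str]]:
    """Classify each period present in both series by which VO dips below threshold."""
    result: Dict[str, List[str]] = {"site_wide": [], "atlas_only": [], "cms_only": []}
    # Only periods reported by both VOs can be compared.
    for period in sorted(atlas.keys() & cms.keys()):
        atlas_dip = atlas[period] < threshold
        cms_dip = cms[period] < threshold
        if atlas_dip and cms_dip:
            result["site_wide"].append(period)    # both VOs affected
        elif atlas_dip:
            result["atlas_only"].append(period)   # probably ATLAS-specific
        elif cms_dip:
            result["cms_only"].append(period)     # probably CMS-specific
    return result


# Illustrative, invented efficiencies per month (not real measurements).
atlas = {"2016-05": 0.85, "2016-06": 0.55, "2016-07": 0.60, "2016-08": 0.88}
cms = {"2016-05": 0.90, "2016-06": 0.58, "2016-07": 0.87, "2016-08": 0.65}

print(classify_dips(atlas, cms))
```

Periods flagged as site-wide would then be the natural ones to check against known incidents at the site.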

Other

Topic attachments
  • 20160901_F2F_Stefano.pptx (r1, 2303.8 K, 2016-09-01 10:26, StefanoGorini)
  • CSCS-ATLASreport-20160901.pdf (r1, 1879.4 K, 2016-09-01 07:42, GianfrancoSciacca): CSCS-ATLASreport
  • Grid_FTF_2016_Sept_1.pdf (r1, 3887.4 K, 2016-09-01 07:56, LuisMarch): UNIGE Tier-3 ATLAS Cluster
  • UNIBE-LHEP-20160901.pdf (r1, 1859.9 K, 2016-09-01 10:14, GianfrancoSciacca): UNIBE-LHEP T2 site report
Topic revision: r20 - 2016-09-07 - FabioMartinelli
 