Swiss Grid Operations Meeting on 2021-05-20 at 14:00

Next meeting: 17.06 @ 14h00 (as usual, the date can be changed to maximize attendance, in particular that of the VO reps)


Minutes:

ATLAS report

Nick: The numbers presented are computed assuming 6910 expected slots, while the number should read 5963. Can ATLAS explain where 6910 comes from? ATLAS pledge = 74240 HS06. Cores = Pledge / (HS06 per core) = 74240 / 12.45 = 5963 cores. April kHS06 pledge hours = Pledge * HoursInDay * DaysInMonth / 1000 = 74240 * 24 * 30 / 1000 = 53452.8. Per CRIC, generated = 57757.977. This means the pledge for April was exceeded.
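For reference, the arithmetic in Nick's comment can be reproduced with a short sketch. All input values are taken from the minutes; the function and variable names are hypothetical and chosen only for illustration, and the sketch does not claim to reproduce how CRIC or ATLAS compute their own figures.

```python
# Sketch of the pledge arithmetic quoted in the minutes.
# Input numbers come from the minutes; names are illustrative assumptions.

ATLAS_PLEDGE_HS06 = 74240               # ATLAS pledge in HS06
HS06_PER_CORE = 12.45                   # HS06 coefficient per core
CRIC_GENERATED_KHS06_HOURS = 57757.977  # April figure quoted from CRIC


def pledged_cores(pledge_hs06, hs06_per_core):
    """Number of cores corresponding to a pledge expressed in HS06."""
    return pledge_hs06 / hs06_per_core


def pledge_khs06_hours(pledge_hs06, days_in_month, hours_in_day=24):
    """Pledge integrated over a month, in kHS06 hours."""
    return pledge_hs06 * hours_in_day * days_in_month / 1000


def pledge_fraction(generated_khs06_hours, pledged_khs06_hours):
    """Generated work as a fraction of the pledge (1.0 means exactly on pledge)."""
    return generated_khs06_hours / pledged_khs06_hours


if __name__ == "__main__":
    cores = pledged_cores(ATLAS_PLEDGE_HS06, HS06_PER_CORE)
    april = pledge_khs06_hours(ATLAS_PLEDGE_HS06, days_in_month=30)
    fraction = pledge_fraction(CRIC_GENERATED_KHS06_HOURS, april)
    print(f"cores: {cores:.0f}")
    print(f"April pledge: {april:.1f} kHS06 hours")
    print(f"generated / pledge: {fraction:.1%}")
```

Under these inputs the ratio comes out above 1, which is the basis for the statement that the April pledge was exceeded. A different slot count or HS06 coefficient would change the result, which is the point under discussion.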

Mauro: the numbers reported from Gianfranco and from Nick are constantly off. We need to converge once and for all on a common source and stick to that to avoid wasting everybody's time and energy in trying to match them.

Gianfranco: Simple and not obscure MATH, I am surprised questions like this arise and the chair lets them arise and bugs me about them (but I have seen even worse): CRIC pledge / HS06 coefficient => Number of cores. Simple MATH. Please NOTE: no private pledge numbers have any role in ATLAS/WLCG. Do your own private scaling among yourselves please, my time is as much wasted as it is yours, or even more to be dealing with such petty issues. CSCS is at 95% (with the usual overestimation error folded in) of pledge for April 2021. That is not tragic, but is NOT above 100%. In order to have numbers match, it is sufficient not to have private versions of the relevant metrics.

CSCS report

New monitoring up from mid-April (spikes are coming from http timeouts - fixing it)

All VOs are above 100%: we are using 10 extra nodes to cope with possible downtimes. Extrapolating from the load we have so far, May should still be above 100% in spite of the problems that occurred when coming out of the maintenance period.

ATLAS DPM migration

From the answer in the minutes of the 11.03.2021 meeting:

  • What is the plan and timeline to move from test phase to production? (Gianfranco: will follow WLCG/ATLAS additional recommendations. More solutions need to be evaluated)
  • How much of the ATLAS workload is using DPM at CSCS? (Gianfranco: the DPM capacity at CSCS is 11% of the ATLAS storage for UNIBE-LHEP)
This still doesn't answer how much of the workload is using it. Is anybody using it?

Gianfranco: ATLAS is using it. It has been reported MULTIPLE times. If there is not an understanding about how experiments use storage at sites, you could set a workshop up for that. Then you could also perhaps report "how much of the workload is using" the ATLAS storage at CSCS.

dCache hardware has started to arrive: access to the test DPM will be turned off on June 2nd

Mauro: I would like to understand what happens on the ATLAS workflows when the test-HW is removed

Gianfranco: Please note: CSCS have insisted for years that all communication about R&D projects must occur via CHIPP. This is no exception. As such, I have written to the CHIPP chair asking for an official and recorded communication. Should CSCS want to end the ongoing production project despite its success: send an official communication to ATLAS CH (e.g. me), including a brief motivation so that we can pass that upstream. Following a handshake, we will arrange for the ATLAS data migration away from CSCS. This must be scheduled.

In addition, NOTE: Arbitrarily removing access to data will "have consequences". Outside of the private version of WLCG that is being showcased here with such random shoutouts (and has no precedents in the history of the LHC experiments), ATLAS data belong to ATLAS. And ATLAS service providers are bound to adhere to the rules and code of conduct, not to mention MoU of the official WLCG. Not to an arbitrary and private CHIPP version of it.

CMS report

nothing major, now waiting for the system to come back

GPU work is scheduled to begin next week

LHCb report

nothing major, waiting for the system to come back

Followup from previous Action Items

Action items

ATLAS

CMS

LHCb

T2 Sites reports

CSCS

UniBe

T3 Sites reports

PSI

UniGe

EGI / WLCG

  • EGI monthly report for April OK for NGI-CH

  • The following issue reported in March and April has not yet been solved:

    CSCS squids (cvmfs and cvmfs1) are not working as needed. This should be fixed: in the current configuration it is as if they were not there in the first place (no caching)
    (email by Ilija.Vukotic@cern.ch on 16th March 2021):

    "If I look at wlcg squid monitoring:

    http://wlcg-squid-monitor.cern.ch/snmpstats/mrtgatlas2/indexatlas2.html

Review of open tickets

https://ggus.eu/index.php?mode=ticket_search&supportunit=NGI_CH&status=open&timeframe=any&orderticketsby=REQUEST_ID&orderhow=desc&search_submit=GO

| Ticket-ID | Type | VO | Site | Priority | Resp. Unit | Status | Last Update | Subject | Scope |
| 152076 | | atlas | UNIBE-LHEP | less urgent | NGI_CH | assigned | 2021-05-20 | Job failures at UNIBE-LHEP | WLCG |
| 152070 | | cms | CSCS-LCG2 | urgent | NGI_CH | assigned | 2021-05-20 | SAM tests failing at T2_CH_CSCS | WLCG |
| 152033 | | cms | CSCS-LCG2 | urgent | NGI_CH | in progress | 2021-05-20 | Erroneous consistency check endpoint at ... | WLCG |
| 151997 | | cms | CSCS-LCG2 | urgent | NGI_CH | assigned | 2021-05-14 | WebDAV protocol deployed (T2_CH_CSCS) | WLCG |
| 151265 | | cms | CSCS-LCG2 | less urgent | NGI_CH | on hold | 2021-04-09 | Enabling WebDAV on Production ... | WLCG |
| 150373 | | dune | UNIBE-LHEP | less urgent | NGI_CH | in progress | 2021-05-14 | Enable DUNE queue for CPU and future ... | EGI |
| 144485 | | none | CSCS-LCG2 | less urgent | NGI_CH | assigned in progress | 2021-04-14 | Upgrade to recent dCache release | EGI |

a.o.b
  • Attendees
  • CSCS: Colin, Dario, Nick, Pablo
  • CMS: Derek, Mauro
  • ATLAS:
  • LHCb: Roland
  • EGI:
Topic attachments
| Attachment | History | Size | Date | Who | Comment |
| ATLAST2ReportApr2021.pdf | r1 | 429.7 K | 2021-05-20 - 10:27 | GianfrancoSciacca | ATLAS CH Tier2 report |
| CH-ATLAST2Report.pdf | r1 | 139.0 K | 2021-05-20 - 10:27 | GianfrancoSciacca | CH-ATLAS Tier2 report |
| CHIPPreportMay2021.pdf | r2 r1 | 1009.7 K | 2021-05-20 - 13:23 | NickCardo | CSCS Site Report |
| wlcg-swissops-cms-20210520.pdf | r1 | 3575.1 K | 2021-05-20 - 11:59 | DerekFeichtinger | CMS report |
Topic revision: r9 - 2021-06-18 - MauroDonega
 