<!-- keep this as a security measure:
* Set ALLOWTOPICCHANGE =
TWikiAdminGroup,Main.LCGAdminGroup,Main.EgiGroup
* Set ALLOWTOPICRENAME =
TWikiAdminGroup,Main.LCGAdminGroup
#uncomment this if you want the page only be viewable by the internal people
#* Set ALLOWTOPICVIEW =
TWikiAdminGroup,Main.LCGAdminGroup,Main.ChippComputingBoardGroup
-->
Swiss Grid Operations Meeting on 2021-01-21 at 14:00
Next meeting: 11 Feb 2021 at 15h00 (note the unusual time)
Minutes:
1- Waiting for final confirmation about running 2 ARCs to the same partition:
- ATLAS: OK (Gianfranco)
- CMS: OK (Derek confirmed after the meeting)
- LHCb: OK (Roland confirmed during the meeting)
- If further details need to be discussed, the discussion will continue on Slack
New configuration expected to be deployed on Feb 10.
2- Log sharing: everybody agrees to share the logs
3- Mauro has a hard time reconciling the plots provided by ATLAS (Gianfranco) and CSCS (Nick's charts).
Nick asked again for a time interval to be defined as the denominator for the pledged/delivered core-hours charts. Mauro and the people present agreed to use a month, so that we are in sync with the OPS meeting.
Mauro: I understand it's not only a matter of the integrated delivered resources, but also of how they are delivered ("flat efficiency" instead of waves).
On the other hand all present agreed that if CSCS is unable to deliver resources (e.g. security issues) they will try to catch up by temporarily adding some nodes (as they have already done in the past), while if experiments are not sending jobs, those cannot be recovered ("use it or lose it" as if it was on a standard cluster).
Mauro: while the ATLAS message is clear, I don't understand the meaning of the horizontal line "CSCS Pledge 64K" in Gianfranco's slides. It sits at ~105k?
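A minimal sketch of the monthly pledged/delivered calculation agreed under point 3, assuming the denominator is a constant pledge (in cores) multiplied by the hours in the calendar month. The function names, the pledge of 6000 cores, and the treatment of one running slot as one core are illustrative assumptions, not values or definitions taken from the CSCS or ATLAS charts.

```python
import calendar


def pledged_core_hours(pledged_cores: float, year: int, month: int) -> float:
    """Core hours a constant pledge corresponds to over one calendar month."""
    days_in_month = calendar.monthrange(year, month)[1]
    return pledged_cores * days_in_month * 24


def delivered_fraction(delivered_core_hours: float, pledged_cores: float,
                       year: int, month: int) -> float:
    """Delivered core hours divided by the monthly pledged core hours."""
    return delivered_core_hours / pledged_core_hours(pledged_cores, year, month)


# Illustrative inputs only.
avg_running_slots = 2620       # average running slots quoted in the ATLAS report
hypothetical_pledge = 6000     # ASSUMPTION: the minutes only say "at least 6k expected"
hours_in_december = 31 * 24

# ASSUMPTION: one running slot counts as one core for the whole month.
delivered = avg_running_slots * hours_in_december
fraction = delivered_fraction(delivered, hypothetical_pledge, 2020, 12)
print(f"Delivered fraction of pledge: {fraction:.1%}")
```

With a different pledge value or slot-to-core mapping the ratio changes, which is one reason the two sets of plots need a common definition of the denominator before they can be compared.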
4- CSCS Nick report
100% availability from WLCG monitoring
slide 3: around 10-14 December ATLAS had no pending jobs. Gianfranco, can you check what happened?
Activities in progress:
- Access to dCache data via Macaroons (edited by Derek): Elia adapted the configuration to allow certificate-less access to the DAV door. He needed an answer from CMS central Ops on whether they were now able to do macaroon-authenticated/authorized transfers. They could, so this is solved.
- ATLAS DPM Migration: What is the plan and timeline to move from test phase to production? (Gianfranco, sorry couldn't find it in your last report)
5- CMS:
nothing to report
6- LHCb:
nothing to report (problem transferring files / open ticket should be OK)
Followup from previous Action Items
Action items
To be cross-checked:
https://wlcg-docs.web.cern.ch/reporting/accounting/Tier-2/2020/WLCG-T2accounting_12-20.pdf
VO reports
ATLAS
- ATLAST2ReportDec2020.pdf: ATLAS CH Tier2 report
- Highlights:
- 4th month in a row of underdelivery: 41% of pledge in December
- 2620 avg running slots (at least 6k expected)
CMS
LHCb
T2 Sites reports
CSCS
T3 Sites reports
PSI
EGI / WLCG
- A/R report for December for NGI_CH OK: https://argo.egi.eu/egi/report-ar/Critical/NGI?month=2020-12
- ARC CE disclosure of job owner names [EGI-SVG-2020-17013] does not affect LHC experiments
- All jobs submitted with robot certificates
- But if the site accepts jobs from outside the experiment frameworks, then a patch is needed
- Retirement of Site-BDII checks from the availability profile is being discussed (deprecated service)
Review of open tickets
9 of 9 Tickets
| Ticket-ID | Type | VO | Site | Priority | Resp. Unit | Status | Last Update | Subject | Scope |
| 150267 | | lhcb | CSCS-LCG2 | urgent | NGI_CH | waiting for reply | 2021-01-21 | Enable xroot and https with third-party ... | WLCG |
| 150263 | | atlas | UNIBE-LHEP | less urgent | NGI_CH | in progress | 2021-01-19 | UNIBE-LHEP: "Transfer failed: ... | WLCG |
| 150221 | | atlas | CSCS-LCG2 | less urgent | NGI_CH involved | in progress | 2021-01-19 | CSCS-LCG2: transfer error with ... | WLCG |
| 150220 | | lhcb | CSCS-LCG2 | very urgent | NGI_CH | in progress | 2021-01-20 | FTS3 transfers Failed to CSCS-LCG2 | WLCG |
| 150086 | | cms | CSCS-LCG2 | urgent | NGI_CH | assigned | 2021-01-05 | Lost files at T2_CH_CSCS | EGI |
| 150061 | | lhcb | CSCS-LCG2 | very urgent | NGI_CH | assigned | 2021-01-01 | Jobs Failed at CSCS-LCG2 | WLCG |
| 149166 | | cms | CSCS-LCG2 | less urgent | NGI_CH assigned | assigned | 2020-12-17 | CVMFS squids at CSCS | EGI |
| 147830 | | cms | CSCS-LCG2 | less urgent | NGI_CH involved | waiting for reply | 2021-01-21 | Enabling TPC transfers over davs | WLCG |
| 144485 | | none | CSCS-LCG2 | less urgent | NGI_CH assigned | in progress | 2020-12-18 | Upgrade to recent dCache release | EGI |
A.O.B.
- Attendees
- CSCS:
- CMS:
- ATLAS:
- LHCb:
- EGI: