Swiss Grid Operations Meeting on 2019-07-04 at 14:00
Site status
CSCS
- Systems:
- Issue in one cabinet, and the network did an emergency throttling; one DVS dead and some jobs failed. Still restoring nodes
- Prepared pallet with cables and some switches for Gianfranco
- Still working on ARC v6, just got the certificate
Storage: dCache
Scratch
PSI
UNIBE-LHEP
- Went into downtime on 19th June for cluster re-deployment
- Monthly summary: Pledged: 42k, delivered 18k (24k last month)
- Ubelix contributing ~46% (23% typical)
- Running an average >2.2k slots (2.5k typical during the previous pledge period)
WC UNIBE-LHEP:
6month_UNIBE-LHEP:
- Accounting numbers (from scheduler) from last month
Swiss ATLAS statistics
- Hammercloud availability
ATLAS_HC_last-month:
- ANALY_CSCS-HPC: 94% (85% last month)
- CSCS-LCG2-HPC_MCORE: 95% (99% last month)
- ANALY_UNIBE-LHEP: 56% (100% last month)
- ANALY_UNIBE-LHEP-UBELIX: 100% (94% last month)
- UNIBE-*UBELIX* : 100% (100% last month)
- UNIBE-LHEP_MCORE: 56% (100% last month)
- Slots used
ATLAS_slots:
- CSCS: 4.65k (4.2k last month)
- UNIBE: 1.58k (2.2k last month)
- CPU consumption
ATLAS_CPU:
- CSCS 75% (64% last month)
- UNIBE 25% (36% last month)
- Accounting Numbers from the ATLAS dashboard (May 2019) CSCS+UNIBE
Cluster | Job Type | Produced WC core-hours | Good vs Bad WC % | CPU eff good jobs % |
CSCS | Any | 3'075'000; 72% (was 2'505'555 64%) | 88% (was 90%) | 84% (was 83%) |
UniBe | Any | 1'208'333; 28% (was 1'502'777 41%) | 88% (was 82%) | 82% (was 81%) |
- Delivered vs pledged
CSCS-LCG2: pledged 50k, delivered 55.7k (was 51k)
UNIBE-LHEP: pledged 42k, delivered 17.1k (was 23.6k)
- VO shares at CSCS
Broadly OK
Drop in running cores for >1 week, and also in the ATLAS share
CSCS_shares_June:
CSCS_running_cores:
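As a cross-check, the percentage shares in the dashboard table and the delivered-vs-pledged ratios quoted above can be re-derived from the raw figures. The sketch below is a minimal illustration only: the numbers are copied from these minutes, while the function and variable names are hypothetical and not part of any accounting tool.

```python
# Figures copied from the minutes; all names here are illustrative assumptions.
WC_CORE_HOURS = {"CSCS": 3_075_000, "UniBe": 1_208_333}

# (pledged, delivered), in the same "k" units quoted in the minutes.
PLEDGES = {"CSCS-LCG2": (50.0, 55.7), "UNIBE-LHEP": (42.0, 17.1)}


def shares(values):
    """Return each entry's share of the total, as a whole-number percentage."""
    total = sum(values.values())
    return {name: round(100 * v / total) for name, v in values.items()}


def delivered_pct(pledged, delivered):
    """Return delivered as a percentage of pledged, to one decimal place."""
    return round(100 * delivered / pledged, 1)


if __name__ == "__main__":
    print(shares(WC_CORE_HOURS))
    for site, (pledged, delivered) in PLEDGES.items():
        print(site, delivered_pct(pledged, delivered), "% of pledge")
```

With the numbers above this gives a 72%/28% split of wall-clock core-hours, matching the table, and roughly 111% of pledge delivered at CSCS-LCG2 against roughly 41% at UNIBE-LHEP.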
UNIBE-ID
- Smooth ARC CE operation over the last period
- Downtime today: research is finally in service. Moved ARC CE dirs to the new filesystem
- Good usage of atlas-preempt partition and cluster as a whole after increasing qos to 1200 cores for ATLAS jobs.
UNIGE
- ARC CE deployment status escalated to management
- ARC now installed and arc.conf with me for review
NGI_CH
- CA status
- Procedure ongoing, but slow. QuoVadis not very responsive, and gazillions of bureaucratic issues to deal with
- Even the procedure for Unibe has not been finalised yet: the Trust Link has not been issued yet
- Once this is done and all steps documented, all the other institutes will have to do the same
- "High level" signature required, and several additional steps
- In the meantime, we had to request 4 "emergency" certificates from the old EGI CA. Could negotiate this with an EGI operations person
NGI-CH Open Tickets review
https://ggus.eu/index.php?mode=ticket_search&supportunit=NGI_CH&status=open&timeframe=any&orderticketsby=REQUEST_ID&orderhow=desc&search_submit=GO
4 tickets found
Ticket-ID | Type | VO | Site | Priority | Resp. Unit | Status | Last Update | Subject | Scope |
141276 | | none | | less urgent | NGI_CH | in progress | 2019-06-28 | yearly review of the information ... | EGI |
139574 | | dteam | CSCS-LCG2 | less urgent | NGI_CH assigned | in progress | 2019-06-14 | please configure mesh on ... | EGI |
131965 | | none | UNIBE-LHEP | less urgent | NGI_CH assigned | on hold | 2019-06-25 | IPv6 deployment at WLCG Tier-2 sites | EGI |
131432 | | none | CSCS-LCG2 | urgent | NGI_CH assigned involved | in progress | 2019-06-03 | Storage accounting deployment | EGI |
Other topics
Next meeting date:
A.O.B.
Attendants
- CSCS:
- CMS:
- ATLAS:
- LHCb:
- EGI:
Action items