Swiss Grid Operations Meeting on 2019-07-04 at 14:00

Site status

CSCS

  • Systems:

    • Issue in one cabinet: the network did an emergency throttling, one DVS died and some jobs failed; still restoring nodes

    • Prepared pallet with cables and some switches for Gianfranco

    • Still working on ARC v6, just got the certificate

  • Storage: dCache
    • CMS fixed space issues

    • Adjusted pledged space for all VOs

  • Scratch
    • Moving CMS from Sonexion to scratch/gpfs

accounting-slurm.png

PSI

UNIBE-LHEP

  • Went into downtime on 19th June for cluster re-deployment
  • Monthly summary: Pledged: 42k, delivered 18k (24k last month)
  • Ubelix contributing ~46% (23% typical)
  • Running an average >2.2k slots (2.5k typical during the previous pledge period)

    WC UNIBE-LHEP:

Screen_Shot_2019-07-01_at_23.20.36.png

6month_UNIBE-LHEP:

Screen_Shot_2019-07-01_at_23.31.11.png

  • Accounting numbers (from scheduler) from last month
    • Omitted this month


Swiss ATLAS statistics

  • Hammercloud availability

    ATLAS_HC_last-month:
    Screen_Shot_2019-07-01_at_23.39.19.png
    • ANALY_CSCS-HPC: 94% (85% last month)
    • CSCS-LCG2-HPC_MCORE: 95% (99% last month)
    • ANALY_UNIBE-LHEP: 56% (100% last month)
    • ANALY_UNIBE-LHEP-UBELIX: 100% (94% last month)
    • UNIBE-*UBELIX* : 100% (100% last month)
    • UNIBE-LHEP_MCORE: 56% (100% last month)

  • Slots used

    ATLAS_slots:
    Screen_Shot_2019-07-02_at_23.20.52.png

    • CSCS: 4.65k (4.2k last month)
    • UNIBE: 1.58k (2.2k last month)

  • CPU consumption

    ATLAS_CPU:
    Screen_Shot_2019-07-02_at_23.45.58.png

    • CSCS 75% (64% last month)
    • UNIBE 25% (36% last month)

  • Accounting Numbers from the ATLAS dashboard (May 2019) CSCS+UNIBE

    Cluster  Job Type  Produced WC core-hours              Good vs Bad WC %  CPU eff good jobs %
    CSCS     Any       3'075'000; 72% (was 2'505'555 64%)  88% (was 90%)     84% (was 83%)
    UniBe    Any       1'208'333; 28% (was 1'502'777 41%)  88% (was 82%)     82% (was 81%)

  • Delivered vs pledged
    CSCS-LCG2: pledged 50k, delivered 55.7k (was 51k)
    UNIBE-LHEP: pledged 42k, delivered 17.1k (was 23.6k)

    Screen_Shot_2019-07-03_at_00.20.11.png

  • VO shares at CSCS
    Broadly OK
    Drop in running cores for >1 week
    And also of ATLAS share

    CSCS_shares_June:
    Screen_Shot_2019-07-03_at_22.29.35.png

    CSCS_running_cores:
    Screen_Shot_2019-07-03_at_22.35.29.png
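
The shares and delivered-versus-pledged ratios quoted in this section can be cross-checked with a few lines of Python. This is only a minimal sketch, not part of any accounting tooling: the function names are hypothetical, the input figures are copied from the minutes above, and the pledge figures are treated as plain numbers in thousands because their unit is not stated here.

```python
# Figures copied from the minutes above. The unit of the pledge figures is
# not stated there, so they are handled as plain numbers (in thousands).
wc_core_hours = {"CSCS": 3_075_000, "UniBe": 1_208_333}
pledged = {"CSCS-LCG2": 50.0, "UNIBE-LHEP": 42.0}
delivered = {"CSCS-LCG2": 55.7, "UNIBE-LHEP": 17.1}


def shares(values):
    """Return each entry's share of the total, rounded to a whole percent."""
    total = sum(values.values())
    return {name: round(100 * value / total) for name, value in values.items()}


def delivered_fraction(delivered, pledged):
    """Return delivered/pledged per site as a percentage with one decimal."""
    return {
        site: round(100 * delivered[site] / pledged[site], 1)
        for site in pledged
    }


wc_shares = shares(wc_core_hours)
fractions = delivered_fraction(delivered, pledged)
print(wc_shares)
print(fractions)
```

With the numbers above, the wall-clock shares come out at 72% and 28%, matching the table, and the delivered fraction is above 100% for CSCS-LCG2 and roughly 41% for UNIBE-LHEP.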


UNIBE-ID

  • Smooth ARC CE operation over the last period
  • Downtime today: Finally research now in service. Moved ARC CE dirs to the new filesystem
  • Good usage of the atlas-preempt partition and of the cluster as a whole after increasing the QoS to 1200 cores for ATLAS jobs

UNIGE

  • ARC CE deployment status escalated to management
  • ARC now installed and arc.conf with me for review

NGI_CH

  • CA status
    • Procedure ongoing, but slow. QuoVadis not too reactive and gazillions of bureaucratic issues to deal with
    • Even the procedure for Unibe has not been finalised: the Trust Link has not been issued yet
    • Once this is done and all steps documented, all the other institutes will have to do the same
    • "High level" signature required, and several additional steps
    • In the meantime, we had to request 4 "emergency" certificates from the old EGI CA. Could negotiate this with someone from EGI operations

  • NGI-CH open tickets review
https://ggus.eu/index.php?mode=ticket_search&supportunit=NGI_CH&status=open&timeframe=any&orderticketsby=REQUEST_ID&orderhow=desc&search_submit=GO

4 Tickets found
Ticket-ID Type VO Site Priority Resp. Unit Status Last Update Subject Scope
141276   none   less urgent NGI_CH in progress 2019-06-28 yearly review of the information ... EGI
139574   dteam CSCS-LCG2 less urgent NGI_CH assigned in progress 2019-06-14 please configure mesh on ... EGI
131965   none UNIBE-LHEP less urgent NGI_CH assigned on hold 2019-06-25 IPv6 deployment at WLCG Tier-2 sites EGI
131432   none CSCS-LCG2 urgent NGI_CH assigned involved in progress 2019-06-03 Storage accounting deployment EGI

Other topics

  • Topic1
  • Topic2

Next meeting date:

A.O.B.

Attendants

  • CSCS:
  • CMS:
  • ATLAS:
  • LHCb:
  • EGI:

Action items

  • Item1

Topic attachments
Attachment                              Size     Date                Who                Comment
Screen_Shot_2019-07-01_at_23.20.36.png  82.8 K   2019-07-01 - 21:23  GianfrancoSciacca  WC UNIBE-LHEP
Screen_Shot_2019-07-01_at_23.31.11.png  207.1 K  2019-07-01 - 21:33  GianfrancoSciacca  6month_UNIBE-LHEP
Screen_Shot_2019-07-01_at_23.39.19.png  316.4 K  2019-07-01 - 21:41  GianfrancoSciacca  ATLAS_HC_last-month
Screen_Shot_2019-07-02_at_23.20.52.png  187.6 K  2019-07-02 - 21:22  GianfrancoSciacca  ATLAS_slots
Screen_Shot_2019-07-02_at_23.45.58.png  236.8 K  2019-07-02 - 21:47  GianfrancoSciacca  ATLAS_CPU
Screen_Shot_2019-07-03_at_00.20.11.png  124.4 K  2019-07-02 - 22:23  GianfrancoSciacca  ATLAS_delivered_vs_pledge
Screen_Shot_2019-07-03_at_22.29.35.png  149.1 K  2019-07-03 - 20:31  GianfrancoSciacca  CSCS_shares_June
Screen_Shot_2019-07-03_at_22.35.29.png  161.8 K  2019-07-03 - 20:39  GianfrancoSciacca  CSCS_running_cores
accounting-slurm.png                    15.6 K   2019-07-04 - 11:57  DinoConciatore

Topic revision: r7 - 2019-07-04 - DinoConciatore
 