
Swiss Grid Operations Meeting on 2016-02-04 at 14:00

Site status

CSCS

  • STORAGE

    Hardware / Physical install
    - 8 Feb: new dCache servers (4x)
    - 8 Feb: MPO in order to connect Phoenix to the CSCS SAN
    - 9 Feb: NETAPP E5660 (~0.5PB)

    dCache
    - The 'cleaner problem' (mainly affecting CMS) is no longer present. Space is freed automatically, as expected
    - ATLAS dumps are in place; some adjustment is still needed for 'atlasgroupdisk/perf-egamma' and 'atlasscratchdisk' ( https://xgus.ggus.eu/ngi_ch/index.php?mode=ticket_info&ticket_id=428 )

    GPFS
    - Unplanned maintenance was needed on Wed 3rd Feb in order to recreate the filesystem because of a metadata inconsistency problem.
  • Systems
    - Preparing and consolidating racks for new arrivals at the end of this month
    - Checking published HEPspec values
    - Tuned the Slurm config to improve cluster performance
    - Fixed two HP nodes, one with IB failures and the other with a faulty 1G man network card
    - Testing a complete Puppet installation for worker nodes: it is working fine; only some cvmfs parameters and the cream wrapper script remain to be checked

PSI

  • Xxx
  • Accounting numbers (from scheduler) from last month

UNIBE-LHEP

Operations

  • Nothing significant to report; stable operation on both systems
  • 256 new cores delivered yesterday; we hope to deploy them before the weekend
ATLAS specific operations

  • No progress on the storage dumps requested by ATLAS (due to no progress in the re-deployment of the DPM head node on SLC6)
  • ANALY_UNIBE-LHEP blacklisted in HC: no time to debug yet, but the impact is low since there are currently few ANALY jobs
  • A couple of stable weeks of operation for UNIBE-LHEP_CLOUD_MCORE, then we lost the cluster and have not been able to fix it yet
Accounting
  • Accounting numbers (from scheduler) from last month (Jan 2016)
    • CPU h: 792492 (ATLAS) - 12671 (t2k.org) - 1879 (uboone) - 25 (ops)
  • Accounting numbers (from ATLAS dashboard) from last month (Jan 2016)
    • CPU h: 662466 (774848 with cloud)
    • WC h: 679368 (796292 with cloud)
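
As a rough cross-check of the UNIBE-LHEP figures above, the following sketch sums the scheduler CPU hours and computes the CPU/wall-clock ratio from the ATLAS dashboard numbers. The variable names are illustrative, and reading CPU h / WC h as an efficiency proxy is an assumption, not something stated in these minutes.

```python
# Jan 2016 UNIBE-LHEP accounting figures, copied from the minutes above.
scheduler_cpu_h = {"ATLAS": 792492, "t2k.org": 12671, "uboone": 1879, "ops": 25}
dashboard = {
    "without cloud": {"cpu_h": 662466, "wc_h": 679368},
    "with cloud": {"cpu_h": 774848, "wc_h": 796292},
}

# Total CPU hours reported by the scheduler, and the ATLAS fraction of it.
total_cpu_h = sum(scheduler_cpu_h.values())
atlas_share = scheduler_cpu_h["ATLAS"] / total_cpu_h

# Assumption: CPU h / WC h is used here as a simple efficiency proxy.
cpu_wc_ratio = {label: v["cpu_h"] / v["wc_h"] for label, v in dashboard.items()}

print(f"Total scheduler CPU h: {total_cpu_h}")
print(f"ATLAS share of scheduler CPU h: {atlas_share:.1%}")
for label, ratio in cpu_wc_ratio.items():
    print(f"CPU/WC ratio ({label}): {ratio:.1%}")
```

Note that the scheduler and dashboard totals for ATLAS differ, so the two sources should not be mixed in a single ratio.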

UNIBE-ID

  • Xxx
  • Accounting numbers (from scheduler) from last month

UNIGE

Operations

  • Running smoothly: Higher user activity since last meeting
  • Grid (ATLAS) jobs: UNIGE-DPNC is in "Test" status and ~ 1/3 of jobs failed, apparently because they "ran out of memory". Needs checking
  • We plan a scheduled downtime at some point: needed for system and security upgrades (also related to getting involved in ATLAS production)
Storage
  • Dump of DPM SE for ATLAS experiment finally submitted (this dump should be provided once a month)
  • In addition to these ATLAS checks, we should clean up our DPM: old user data and other projects (to be done)
Outlook
  • Request for a network switch upgrade to 10 Gb/s and the acquisition of 3 GPUs already submitted (decision expected around March 2016)
  • Install Puppet for the DPM SE (and probably also for cluster configuration and setup, replacing yaim)
Accounting
  • Accounting numbers (from scheduler) from last month

NGI_CH

  • Nothing to report
  • NGI-CH Open Tickets review
https://ggus.eu/index.php?mode=ticket_search&supportunit=NGI_CH&status=open&timeframe=any&orderticketsby=REQUEST_ID&orderhow=desc&search_submit=GO

    • CSCS-LCG2
      • 117786 (ATLAS: storage dumps) almost done - two paths still need to be fixed
      • 119021 (LHCb team: jobs failed) no information provided - changed to "waiting for reply"
      • 119171 (CMS: Workflow failures) in progress
    • UNIBE-LHEP
      • 117899 (ATLAS: storage dumps) on hold
    • NGI_CH

Other topics

  • Topic1
  • Topic2
Next meeting date:

A.O.B.

Attendants

  • CSCS:
  • CMS:
  • ATLAS: Luis March
  • LHCb:
  • EGI: Luis March

Action items

  • Item1
Topic revision: r12 - 2016-02-04 - LuisMarch
 