
Swiss WLCG Operations Meeting on 2012-09-06

Agenda

CSCS Status

(Reports Miguel)
  1. Storage Element:
    • dCache extension: The storage extension has been chosen and is being purchased. It should arrive in November (AFAIK, Miguel).
    • dCache upgrade to 1.9.12: The process has started on preproduction, but since it is a complicated matter, we are being extra careful to ensure that no data is lost.
  2. WN:
    • WN: 12 extra Sandy Bridge nodes (384 job slots) are physically installed, but we have not yet had time to configure them. Will do ASAP.
    • WN software: As of today, wn[01-46] are gLite 3.2 WN and wn[47-59] are UMD 1 WN. During the next maintenance we plan to upgrade all nodes to UMD 1.
  3. Network:
    • Ethernet Network replacement: The Cisco switches have arrived and the Network Administrator at CSCS is preparing the infrastructure and configuration required for them.
  4. Problems:
    • Yesterday's problem with Argus: An error on Argus caused all CREAM-CEs to stop accepting jobs. A mail has been sent to argus-support; we are waiting for a reply.
    • We have a small problem with our KVM solution: Convirture is unable to work due to database corruption, so we have to shut down all KVM VMs during a maintenance to re-add them. We are thinking about a permanent solution, either commercial or open source, as long as it is rock-solid.
    • Sun HW is failing at an alarming rate. This week the old MDT connected to puppet lost 3 disks on a RAID-6. We were able to recover, with some filesystem corruption, but if this HW is failing, other hardware from the same batch might start failing too (critical ones: dCache head nodes). It is not yet clear whether this has affected our ability to install machines (kickstart files).
    • We have detected a problem with the NGI-DE/CH TopBDII: at times it is very slow to answer queries and, as a result, the status of CSCS appears degraded in the NGI checks. We have seen DESY using their own internal TopBDII, so we are thinking about doing the same for CSCS only, NOT for the whole NGI_CH cloud. At the moment the BDII at CERN is being used as the primary BDII, but this is a temporary solution. Endpoints currently configured: lcg-bdii.cern.ch:2170,bdii-fzk.gridka.de:2170
  5. AOB
    • Atlasvobox: We have seen that it is possible to use the Squid provided by Scientific Linux (currently used for CVMFS) to also host the atlasvobox. The process seems simple, but some work remains and thorough testing is needed. We are working on it.
    • Fabio requested access to our preproduction cluster to test some changes on the CREAM-CE. Please submit a ticket so we can get to work on it ASAP.
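
A note on the TopBDII item above: the comma-separated endpoint list can be checked from a script before deciding on a primary. The following is only a minimal sketch, not the procedure used at CSCS. The function names are hypothetical, and the TCP connect time is a crude stand-in for the real symptom (slow answers to LDAP queries), which this check does not measure.

```python
import socket
import time
from typing import Callable, List, Optional, Tuple

Endpoint = Tuple[str, int]


def parse_endpoints(spec: str) -> List[Endpoint]:
    """Split a 'host:port,host:port' string into (host, port) tuples."""
    endpoints = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        host, _, port = item.rpartition(":")
        endpoints.append((host, int(port)))
    return endpoints


def tcp_connect_time(endpoint: Endpoint, timeout: float = 5.0) -> Optional[float]:
    """Return seconds needed to open a TCP connection, or None on failure.

    Assumption: a slow or failing connect hints at a struggling BDII.
    A proper check would time an actual LDAP query instead.
    """
    start = time.monotonic()
    try:
        with socket.create_connection(endpoint, timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None


def first_responsive(
    endpoints: List[Endpoint],
    probe: Callable[[Endpoint], Optional[float]] = tcp_connect_time,
    max_seconds: float = 2.0,  # hypothetical threshold, tune as needed
) -> Optional[Endpoint]:
    """Return the first endpoint, in list order, that answers fast enough."""
    for endpoint in endpoints:
        elapsed = probe(endpoint)
        if elapsed is not None and elapsed <= max_seconds:
            return endpoint
    return None


# Example (performs real network connections when uncommented):
# eps = parse_endpoints("lcg-bdii.cern.ch:2170,bdii-fzk.gridka.de:2170")
# print(first_responsive(eps))
```

Keeping the probe as a parameter makes it easy to swap in a real LDAP query timer later without touching the selection logic.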

PSI Status

(Reports Fabio)
  • Designing a fast, HA, SAN-based 10 TB /home on GPFS with:
    • Two servers, e.g. 2U HP ProLiant, running GPFS 3.5 (node quorum with tiebreaker disks)
    • Well-tested dual card QLogic FC 8 Gbit/s
    • 2U 24-bay IBM DS3524 or 2U 24-bay SGI IS5000.
    • 6 Gbps SAS 2.5" 900 GB 10k disks, but I would like to put the GPFS metadata on SSD or 15k disks in RAID1 (opinions?)
    • Total cost with 10k disks: ~50k CHF.
    • BTW, GlusterFS (now called Red Hat Storage 2.0) can also implement a cheap HA /home with 2 NAS, although it is still missing features like snapshots.
  • Because of several WN panics, we introduced memory limits on the SGE queues: the default is 3 GB per job, and users can request up to 6 GB.
  • We introduced a hierarchical (recent/all) SGE accounting file to speed up qacct response times.
  • Testing dCache 1.9.12 inside our VMware testbed.
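
The minutes do not say how the recent/all accounting split is built, so the following is only a sketch of one possible approach: keep the full accounting file as "all" and derive a much smaller "recent" file from it, which qacct can then read through its -f option. It assumes the colon-separated layout of accounting(5) with end_time as the 11th field, in epoch seconds. The function name, file handling and cutoff are hypothetical.

```python
from typing import Iterable, List

# Assumption: 0-based position of end_time in a colon-separated
# SGE accounting record (qname is field 0).
END_TIME_FIELD = 10


def split_recent(lines: Iterable[str], cutoff_epoch: int) -> List[str]:
    """Return the accounting records whose end_time is >= cutoff_epoch.

    Comment lines, blank lines and records that cannot be parsed are
    skipped; they remain available in the full ("all") file.
    """
    recent = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split(":")
        try:
            end_time = int(fields[END_TIME_FIELD])
        except (IndexError, ValueError):
            continue
        if end_time >= cutoff_epoch:
            recent.append(line)
    return recent


# Example (hypothetical paths; keeps roughly the last 30 days):
# import time
# cutoff = int(time.time()) - 30 * 24 * 3600
# with open("accounting") as src, open("accounting.recent", "w") as dst:
#     dst.writelines(split_recent(src, cutoff))
```

Queries about recent jobs would then go to the small file first, falling back to the full file only when a job is not found there.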

UNIBE Status

(Reports Gianfranco):
  • Xxx

UNIGE Status

(Reports Szymon):
  • Xxx

Other topics

  • Topic1
  • Topic2

Next meeting date

AOB

Attendees

  • CSCS: Miguel
  • CMS: Fabio, Daniel
  • ATLAS:
  • LHCb:
  • EGI:

Action items

  • Item1
Topic revision: r7 - 2012-09-09 - FabioMartinelli
 