Scheduled Maintenance on 2012-07-04

On the first working Wednesday of next month, 2012-07-04, we will go into Scheduled Downtime. The downtime window runs from 9:00 to 18:00, but we will return to operation as soon as we finish.

As usual, the CMS and ATLAS queues will be closed 24 hours before the maintenance, and the LHCb queue will be closed 48 hours before it.
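
The queue closing times follow directly from the start of the downtime window. A minimal sketch of that arithmetic, assuming GNU date is available; the function name close_time is hypothetical and the times are treated as timezone-free wall-clock values:

```shell
#!/bin/sh
# close_time START HOURS
# Prints the time HOURS hours before START ("YYYY-MM-DD HH:MM").
# Assumes GNU date (-d and @epoch support). The arithmetic is done in UTC
# so that no daylight-saving shift can alter the result.
close_time() {
    start_epoch=$(date -u -d "$1" +%s) || return 1
    date -u -d "@$((start_epoch - $2 * 3600))" '+%Y-%m-%d %H:%M'
}

close_time '2012-07-04 09:00' 24   # CMS and ATLAS queues: prints 2012-07-03 09:00
close_time '2012-07-04 09:00' 48   # LHCb queue: prints 2012-07-02 09:00
```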

_REMOVE: REMEMBER TO ADD DOWNTIME IN GOCDB AND CLOSE THE QUEUES_

Summary of interventions

We will perform the following operations on the cluster:


Upgrade kernel on SL6 nodes

  • Description: There is a security issue affecting RHEL6 kernels, so we need to upgrade them.
  • Affected nodes: kvm01, cvmfs
  • Notes:

BIOS/ILOM upgrade on gpfs nodes

Torque upgrade to 2.4.17

  • Description: This release fixes two bugs that affect us.
  • Affected nodes: lrms[01-02], wn[01-46], cream[01-02], arc[01-02]
  • Notes:
    dsh -g WN -g CREAM_CE -g ARC_CE 'rpm -Uvh http://repo/torque-2.4.17-1.cri.x86_64.rpm http://repo/torque-client-2.4.17-1.cri.x86_64.rpm'
    dsh -w lrms[01-02] 'rpm -Uvh http://repo/torque-2.4.17-1.cri.x86_64.rpm http://repo/torque-client-2.4.17-1.cri.x86_64.rpm http://repo/torque-server-2.4.17-1.cri.x86_64.rpm http://repo/torque-devel-2.4.17-1.cri.x86_64.rpm'
    dsh -g WN -g CREAM_CE -g ARC_CE -g LRMS 'rpm -qa | grep ^torque | sort' | dshbak -c
    ssh lrms01 'grid-service stop'
    ssh lrms02 'grid-service restart'
    dsh -g WN -g CREAM_CE -g ARC_CE 'grid-service restart'
    ssh lrms01 'grid-service restart'
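
The rpm -qa check in the notes above is read by eye. As a hedged sketch, the same check could be scripted: the hypothetical helper below reads bare package names on stdin (one per line, as rpm -qa prints them, without the host prefix that dsh adds) and fails if any torque package is not at the expected version. It is an illustration, not part of the procedure.

```shell
#!/bin/sh
# check_torque_version EXPECTED
# Reads package names on stdin and returns non-zero if any package whose
# name starts with "torque-" does not carry version EXPECTED.
# Hypothetical helper; assumes names such as torque-client-2.4.17-1.cri.x86_64.
check_torque_version() {
    expected="$1"
    status=0
    while IFS= read -r pkg; do
        # Skip everything that is not a torque package.
        case "$pkg" in
            torque-*) ;;
            *) continue ;;
        esac
        # Accept both "torque-<version>-..." and "torque-<subpackage>-<version>-...".
        case "$pkg" in
            torque-"$expected"-*|torque-*-"$expected"-*) ;;
            *) echo "unexpected version: $pkg" >&2; status=1 ;;
        esac
    done
    return "$status"
}
```

For a single node it could be used as: ssh lrms01 'rpm -qa' | check_torque_version 2.4.17
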
Topic revision: r1 - 2012-06-28 - PabloFernandez
 