Announcing a Tier-3 shutdown

  • Announce the system halt on the GOC Pages (24h before if it is a scheduled downtime)
  • Announce the system halt on cms-tier3-users@lists.psi.ch

Shutting down the Tier-3

Temporary note: This sequence is based on a mail by Nina. I (Derek) followed the sequence and adapted it in some places, adding explicit commands where I could. TODO (Nina):

  • check sequence and provide explicit (and homogeneous as far as possible) commands.
  • the commands must be runnable from t3admin01, not from the laptop. I.e. configuration files that define the names of worker nodes, service nodes, etc. and that help to send parallel commands must be available on t3admin01. I do not care which parallel mechanism is used (cexec or pssh, etc). But the configuration and commands to run must be explicitly in this list and the config must be local to the admin node.
  • the ssh keys in .ssh/known_hosts on t3admin01 were completely out of date, and seemingly a lot of the WN keys have changed. This prevents working from t3admin01
  • /etc/hosts contains a number of obsolete entries (MeG nodes still inside). Also, we need to define whether the public addresses are kept within that file or only the private ones. At the moment the public addresses are incomplete.
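
The node-list and known_hosts items above could be handled roughly as follows. This is a minimal sketch, assuming the worker nodes are named t3wn10 to t3wn59 (the range used in the shutdown loop further down this page); the function names gen_wn_list and print_key_refresh_cmds are made up for illustration and are not existing tools on t3admin01. The second function only prints the ssh-keygen/ssh-keyscan commands, so they can be reviewed before anything is changed.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not existing Tier-3 tooling.

# Print one worker-node hostname per line, e.g. for use as a pssh -h host file.
# Arguments: first and last node number (assumed naming scheme: t3wn<NN>).
gen_wn_list() {
    seq "$1" "$2" | sed 's/^/t3wn/'
}

# Print (do not run) the commands that would replace stale host keys
# for every hostname read from stdin.
# Argument: path of the known_hosts file to update.
print_key_refresh_cmds() {
    while read -r host; do
        # Skip empty lines in the host list.
        [ -n "$host" ] || continue
        echo "ssh-keygen -R $host -f $1"
        echo "ssh-keyscan -H $host >> $1"
    done
}
```

Possible usage on the admin node: gen_wn_list 10 59 > wn-hosts, then print_key_refresh_cmds ~/.ssh/known_hosts < wn-hosts. Note that ssh-keyscan accepts whatever key a host presents, so this is only appropriate when the key changes are known to be legitimate (e.g. after a reinstallation).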


Before Downtime

  1. do announcement in :
  2. check list of nodes on t3admin02 in node-list-t3 directories
  3. stop the snapshot script of /work: on t3nfs02, in /etc/cron.daily/zfssnap, comment out the line
    # /opt/zfssnap/zfssnap $PERIOD $MAXSNAPS $BACKUPSERVER >>$LOG 2>&1
  4. when possible prepare FW updates on NetApp and HPs (t3nfs01,02 and t3admin02), etc. and prepare yum + kernel updates
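
The snapshot step above can also be scripted instead of being edited by hand. The following is a sketch under assumptions: the function names are made up, and the pattern assumes that the zfssnap call sits on a line of its own, as shown above. Both functions keep a .bak copy of the file and do nothing if the line is already in the desired state.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical.

# Comment out the active zfssnap call in the given cron script.
# Argument: path of the script (on t3nfs02: /etc/cron.daily/zfssnap).
disable_zfssnap() {
    sed -i.bak 's|^\([[:space:]]*\)\(/opt/zfssnap/zfssnap .*\)|\1# \2|' "$1"
}

# Reverse of disable_zfssnap: remove the leading "#" again after the downtime.
enable_zfssnap() {
    sed -i.bak 's|^\([[:space:]]*\)#[[:space:]]*\(/opt/zfssnap/zfssnap .*\)|\1\2|' "$1"
}
```

For example, ssh to t3nfs02 and call disable_zfssnap /etc/cron.daily/zfssnap before the downtime and enable_zfssnap with the same path afterwards.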

Downtime day:

  1. Prevent further logins to the user interfaces. Modify /etc/security/access_users.conf on the user interfaces by commenting out the lines that allow access for all CMS users
    #+ : cms : ALL
    - : ALL : ALL
  2. stop icinga notifications: on the emonma00 node, in /opt/icinga-config/tier3/objects/tier3_templates.cfg, comment out the members line, like
    define contactgroup{
            contactgroup_name       t3-admins
            alias                   TIER3 Administrators
       #    members          ......................
  3. Stop Nagios: ssh root@t3nagios /etc/init.d/nagios stop
  4. disable all user queues (Slurm partitions) on the WNs:
      ssh t3slurm "scontrol update PartitionName=gpu State=DRAIN;scontrol update PartitionName=wn State=DRAIN; scontrol update PartitionName=qgpu State=DRAIN;scontrol update PartitionName=quick State=DRAIN "
  5. Delete any remaining jobs in the queue system
  6. Unmount PNFS on the nodes
    1. umount /pnfs on all nodes, UIs and WNs (from t3admin02): pssh -h node-list-t3/slurm-clients -P "umount /pnfs/psi.ch/cms"
    2. comment out the /pnfs line in fstab to prevent mounting after reboots
            for n in $(seq 10 59); do echo t3wn$n; ssh t3wn$n "sed -i 's/\(t3dcachedb03:\/pnfs\/psi.ch\/cms.*\)/# COMMENTED FOR DOWNTIME \1/' /etc/fstab"; done
    3. stop puppet run on slurm clients (optional)
  7. if there is a shutdown of t3nfs02, then umount /work and, on the clients: "sed -i 's/\(t3nfs02.*\)/# Downtime \1/' /etc/fstab"
  8. and correspondingly for big maintenance days: umount /t3home and "sed -i 's/\(t3nfs.*\)/# Downtime \1/' /etc/fstab"
  9. Shut down the worker nodes
    1. Shut down the nodes
            for n in $(seq 10 59) ; do echo t3wn$n; ssh root@t3wn$n shutdown -h now ; sleep 1 ; done
    2. Check whether all nodes are down
      for n in $(seq 10 20) 22 23 $(seq 25 47) $(seq 49 59) ; do node="t3wn$n"; echo -n "$node: "; ipmitool -I lanplus -H rmwn$n -U root -f /root/private/ipmi-pw chassis power status ; done
  10. Stop PhEDEx on the t3cmsvobox (since it relies on dcache transfers). Note that PhEDEx runs as the phedex user and not as root.
          ssh phedex@t3cmsvobox /home/phedex/config/T3_CH_PSI/PhEDEx/tools/init.d/phedex_Debug stop
  11. Stop the xrootd/cmsd services on t3se01: ssh t3se01 "systemctl stop cmsd@clustered xrootd@clustered"
  12. dcache stop steps:
    1. stop doors on t3se01 - dcap/gsidcap, gsiftp, srm, xrootd - all visible from "dcache status", like
       ssh  t3se01 dcache stop dcap-t3se01Domain
      and stop the xrootd door on t3dcachedb03
    2. pools t3fs07-11: [root@t3admin02 ~]# pssh  -h node-list-t3/dcache-pools -P "dcache stop"
    3. Unmount PNFS from the SE and DB servers: ssh t3se01 umount /pnfs; ssh t3dcachedb03 umount /pnfs
    4. t3se01: ssh t3se01 dcache stop
    5. Stop dcache services on the DB server : ssh t3dcachedb03 dcache stop
    6. Stop Postgresql on the DB server: ssh t3dcachedb03 systemctl stop postgresql-11 (check with systemctl status postgresql-11)
    7. stop zookeeper on t3zkpr11-13: for n in $(seq 1 3); do ssh t3zkpr1$n systemctl stop zookeeper.service; done
  13. Stop the BDII: ssh root@t3bdii "/etc/init.d/bdii stop"
  14. frontier - VMs ?: (Derek: I left all VMs running. They will be shut down by the VM team)
  15. shut down t3nfs02, t3gpu01-2, t3admin02 and t3fs07-11; on t3fs07-10, power off the server first and the JBOD afterwards
  16. Shut down NetApp system (Link)
    • Make sure no background processes are in operation (SANtricity SMclient GUI)
    • Turn off controller enclosure
    • (Turn off any additional enclosure)
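
The fstab edits in the PNFS, /work and /t3home steps above have to be undone after the downtime, and running the same sed twice would prefix a line twice. A sketch of a reversible variant follows; the function names are made up, the marker is the one from the PNFS command above, and it is assumed that the fstab entry starts in the first column.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical.

# Marker text taken from the fstab command in the PNFS step.
MARKER='# COMMENTED FOR DOWNTIME '

# Comment out active fstab entries whose device field starts with the prefix.
# $1 = fstab file, $2 = device prefix, e.g. t3dcachedb03:/pnfs/psi.ch/cms
# Lines that already carry the marker no longer match, so a second run is harmless.
fstab_disable() {
    sed -i.bak "s|^\($2.*\)|${MARKER}\1|" "$1"
}

# Remove the marker again so that the entry is mounted on the next boot.
fstab_enable() {
    sed -i.bak "s|^${MARKER}\($2.*\)|\1|" "$1"
}
```

On a node this would be called as fstab_disable /etc/fstab t3dcachedb03:/pnfs/psi.ch/cms before the shutdown and fstab_enable with the same arguments when the Tier-3 is started again; a copy of the previous file is left in /etc/fstab.bak.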

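The power check in the worker-node step above is a one-shot loop, so it has to be repeated by hand until all nodes are down. A sketch of a polling variant is given here; the function names are made up, and power_status simply wraps the ipmitool call from that step. A node whose BMC cannot be reached is counted as "still on", since it was not confirmed to be off.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical.

# Query the power state of one worker node (argument: node number).
# Same ipmitool call as in the check step of the shutdown procedure.
power_status() {
    ipmitool -I lanplus -H "rmwn$1" -U root -f /root/private/ipmi-pw chassis power status
}

# Print the node numbers (given as arguments) that do not report "is off".
nodes_still_on() {
    for n in "$@"; do
        case "$(power_status "$n" 2>/dev/null)" in
            *"is off"*) ;;
            *) echo "$n" ;;
        esac
    done
}

# Poll until every node is off or the attempts run out.
# $1 = maximum number of attempts, $2 = seconds between attempts,
# remaining arguments = node numbers. Returns 0 if all nodes are off.
wait_all_off() {
    tries=$1; pause=$2; shift 2
    while [ "$tries" -gt 0 ]; do
        pending=$(nodes_still_on "$@")
        [ -z "$pending" ] && return 0
        # $pending is left unquoted on purpose to print the list on one line.
        echo "still on:" $pending
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] && sleep "$pause"
    done
    return 1
}
```

For example, wait_all_off 20 30 $(seq 10 20) 22 23 $(seq 25 47) $(seq 49 59) would check every 30 seconds for up to 20 attempts.
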
Start Tier-3

  1. power on hardware
  2. on the zookeeper VM nodes t3zkpr11-13, check systemctl status zookeeper, and run zkcli -server t3zkpr11 on t3zkpr11
  3. on t3dcachedb03, start and check postgres (systemctl start postgresql-11), check cron (systemctl status crond) and the dcache configuration (dcache check-config). Then start all main dcache services (dcache  start *Domain) except the doors (currently only one xrootd door is configured on t3dcachedb03)
  4. t3se01: start the services other than doors, as listed by dcache  status (e.g. dcache start info-t3se01Domain; currently dcache-t3se01Domain, pinmanager-t3se01Domain, spacemanager-t3se01Domain and transfermanagers-t3se01Domain should be started the same way); the doors (at the moment also configured on t3se01) should be started after the pools
  5. mount /pnfs on t3se01 and t3dcachedb03
  6. start dcache on pools t3fs01-11: [root@t3admin02 ~]# pssh -h node-list-t3/fs-dalco -P "dcache  start" (takes about 15-30 minutes; in case of a hardware issue, first switch on the JBOD and then the server)
  7. check dcache logs in /var/log/dcache on pools, dcachedb and se machines
  8. check if NetApp is visible from t3fs11: [root@t3nfs11 ~]# multipath -ll
  9. t3se01 start doors (if not done yet): dcap, gsidcap, gsiftp, srm, xrootd (for list, see dcache status) like dcache start  dcap-t3se01Domain, etc. and xrootd door on t3dcachedb03
  10. t3se01 check (and start) xrootd redirector: systemctl start cmsd@clustered ; systemctl start xrootd@clustered
  11. check on all UIs and WNs/CNs if /pnfs/psi.ch/cms is mounted like pssh -h node-list-t3/slurm-clients -P "mount |grep pnfs"
  12. Slurm: on t3slurm, scontrol update PartitionName=gpu State=UP and scontrol update PartitionName=wn State=UP (and likewise for any other partition that was drained at shutdown, e.g. qgpu and quick)
  13. run test-dCacheProtocols from UI
  14. Once the whole T3 is up, the following checks are useful:
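
Two of the startup steps lend themselves to small helpers: re-opening the Slurm partitions and checking the PNFS mount. The sketch below is an illustration only; the function names are made up, and the partition list is the one from the drain command in the shutdown procedure (gpu, wn, qgpu, quick). partition_cmds only prints the scontrol commands, and is_mounted reads `mount` output from stdin, so neither function changes anything by itself.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical.

# Partition names taken from the drain command in the shutdown procedure.
PARTITIONS="gpu wn qgpu quick"

# Print one scontrol command per partition for the requested state.
# Argument: DRAIN (before the downtime) or UP (after it).
partition_cmds() {
    for p in $PARTITIONS; do
        echo "scontrol update PartitionName=$p State=$1"
    done
}

# Succeed if the mount table read from stdin contains the given mount point.
# Matches the " on <mountpoint> " field of Linux mount output literally.
is_mounted() {
    grep -qF " on $1 "
}
```

After reviewing the output, the commands could be sent with ssh t3slurm "$(partition_cmds UP)". On a client node, mount | is_mounted /pnfs/psi.ch/cms || echo "pnfs missing on $(hostname)" reports a missing mount.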

-- DerekFeichtinger - 2019-01-03

Topic revision: r17 - 2020-03-16 - NinaLoktionova