<!-- keep this as a security measure:
   * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.LCGAdminGroup
   * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.LCGAdminGroup
#uncomment this if you want the page only be viewable by the internal people
#* Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.LCGAdminGroup
-->

---+!! Scheduled Maintenance on 2011-11-07, at 8:00am

Next Monday, 7th of November, we are going to migrate from Lustre to GPFS. This requires a full compute shutdown and will take one full day. We have reserved a second day in case something goes wrong but, as usual, we will end the downtime as soon as everything works. Central storage (dCache) will not be affected.

As usual, the CMS and ATLAS queues will be closed 24 hours before the maintenance, and the LHCb queue will be closed 48 hours before the maintenance.

---++!! Summary of interventions

We will perform the following operations on the cluster:

%TOC%

---

---++ %ICON{done}% Backup Lustre skeleton
   * *Description*: Back up the previous scratch directories into a tar file, so that they can be restored quickly later.
   * *Affected nodes*: wn[101-206], arc[01-02], cream[01-02]

*Steps*:
   * Make sure no job is running
   * Stop grid-service in all WNs and CEs
   * Clean up all data/job directories
   * Create the tar file and keep it safe

---++ %ICON{done}% Prepare GPFS Servers
   * *Description*: Do a fresh cleanup and preparation of the hardware on all GPFS nodes
   * *Affected nodes*: mds[1-2], oss[11-42], puppet

*Steps*:
   * Remove all Virident cards from puppet and oss12
   * Install Virident cards in mds[1-2] and remove their MDT controllers
   * Reinstall mds[1-2] with SL6.1
   * Upgrade Virident cards/software
   * Upgrade LSI controllers/cards on oss[21-42]
   * Deactivate half of the CPUs on all GPFS service nodes
   * Reinstall oss[11-42] with SL6.1
   * Install GPFS rpms, 3.4.0
   * Upgrade to GPFS rpms 3.4.0-8
   * Compile the GPL compatibility layer and install the resulting rpms
   * Run jbod-naming-scheme.sh to create udev rules
   * Reboot the servers to make sure the naming scheme is applied
   * Install the client rpms first, then make the GPFS cluster
   * Make the GPFS filesystem
   * Set up monitoring cron jobs for broken disks

---++ %ICON{done}% Prepare GPFS Clients
   * *Description*: Install GPFS kernel modules on all clients
   * *Affected nodes*: wn[101-206], arc[01-02], cream[01-02]
   * *Notes*: This may require kernel changes and consequent reboots

*Steps*:
   * Install OFED 1.5.3
   * Install GPFS 3.4.0-0 rpms, then upgrade to 3.4.0-8
   * Compile and install the GPFS GPL compatibility layer

---++ %ICON{done}% Apply fixes to CREAM
   * *Description*: Apply the following updates and fixes to the CREAM-CEs
<verbatim>
- Increase the number of pool accounts for LHCb VO.
- Apply tomcat5 memory tweaks.
- Move DNS entry for cream02 to IP in the infiniband network.
- Update UMD packages in CREAM machines.
</verbatim>
   * *Affected nodes*: cream[01-02], wn[101-206]
   * *Notes*:

---++ %ICON{done}% Update ARGUS
   * *Description*: Apply the following software updates to the ARGUS servers and *clean policies*
<verbatim>
argus-pap
argus-pdp
argus-pep-server
emi-argus
emi-version
yaim-argus_server
argus-pdp-pep-common
argus-pep-common
emi-trustmanager
emi-trustmanager-axis
</verbatim>
   * *Affected nodes*: argus[01-02]
   * *Notes*:

---++ %ICON{todo}% Configure ILOM on both NFS servers
   * *Description*:
   * *Affected nodes*: nfs[01-02]
   * *Notes*: nfs01 is done; nfs02 is missing an ILOM card, which needs to be purchased.
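
The "Affected nodes" entries on this page use a bracketed range notation such as wn[101-206] or arc[01-02]. As a rough illustration of how such an entry could be turned into a flat host list (for example to loop over nodes when stopping services), here is a minimal Python sketch. The function names =expand_nodes= and =expand_all= are hypothetical and not part of any site tooling. The sketch also assumes every range is contiguous and zero-padded to the width of its lower bound, which may not match the real hostnames (oss[11-42], for instance, may not cover every number in between).

```python
import re

# Matches a single bracketed numeric range such as "[101-206]" or "[01-02]".
_RANGE = re.compile(r"\[(\d+)-(\d+)\]")


def expand_nodes(spec):
    """Expand one entry like 'wn[101-206]' into ['wn101', ..., 'wn206'].

    Assumption: the range is contiguous, and numbers are zero-padded to
    the width of the lower bound. An entry without brackets (for example
    'puppet') is returned unchanged as a one-element list.
    """
    match = _RANGE.search(spec)
    if match is None:
        return [spec]
    low, high = match.group(1), match.group(2)
    if int(low) > int(high):
        raise ValueError(f"descending range in {spec!r}")
    width = len(low)
    prefix, suffix = spec[:match.start()], spec[match.end():]
    return [
        f"{prefix}{number:0{width}d}{suffix}"
        for number in range(int(low), int(high) + 1)
    ]


def expand_all(specs):
    """Expand a comma-separated list such as 'wn[101-206], arc[01-02]'."""
    hosts = []
    for spec in specs.split(","):
        spec = spec.strip()
        if spec:
            hosts.extend(expand_nodes(spec))
    return hosts
```

Calling =expand_all("wn[101-206], arc[01-02], cream[01-02]")= under these assumptions gives one hostname per worker node and CE, which can then be fed to whatever remote-execution tool is in use.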
Topic revision: r11 - 2011-11-09 - JasonTemple
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.