<!-- keep this as a security measure:
   * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.LCGAdminGroup
   * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.LCGAdminGroup
#uncomment this if you want the page only be viewable by the internal people
#* Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.LCGAdminGroup
-->
---+ dCache Full Restart procedure

Sometimes dCache needs to be restarted from zero, and you are most likely reading this because you don't know how to start it up. This document shows how to bring the service back after a full reboot.

The central piece is the LM domain, which starts on storage02 when you run =service dcache start=. Afterwards, every other domain on every other host connects to the LM, which makes that domain reachable inside dCache. If you restart the LM, all the other domains try to re-contact it after a few seconds and keep trying for a few minutes, but don't expect them to retry for much longer than that.

---++ Step 1. Reboot all machines

The dCache service does not come up automatically at boot, so a good starting point is to have all machines freshly rebooted.

---++ Step 2. Start dCache on storage02

You can probably go straight to the last step, but if you're not an expert you probably want to go through all the items here:
   * Postgres should start automatically at boot. Make sure /var/lib/pgsql is mounted and postgresql is running.
   * The chimera-nfs service also starts automatically at boot, but if you want to make sure, run =ps -axf | grep "chimera/"= and check that you get a java process with a lot of includes.
   * If you have upgraded the dCache RPM or even the Java RPM, you need to run =/opt/d-cache/install/install.sh= before starting the dCache service. It doesn't hurt to run it in any case.
   * Finally, run =service dcache start=

---++ Step 3. Start dCache on the rest of the systems

This step depends on the kind of machine you want to join to dCache.
Storage01, Linux pools and Solaris pools each have a different way to enable dCache, although in the end it is the same thing: starting the dcache init script. Let's go type by type.

---+++ Storage01

You can probably go straight to the last step, but if you're not an expert you probably want to go through all the items here:
   * Postgres should start automatically at boot. Make sure /var/lib/pgsql is mounted and postgresql is running.
   * The billing file database is on a separate mount point. Make sure /opt/d-cache/billing is mounted.
   * If you have upgraded the dCache RPM or even the Java RPM, you need to run =/opt/d-cache/install/install.sh= before starting the dCache service. It doesn't hurt to run it in any case.
   * Finally, run =service dcache start=

---+++ Thumpers (pools with Solaris)

You can probably go straight to the last step, but if you're not an expert you probably want to go through all the items here:
   * Check that the data mountpoints are available: =zfs list=
   * Make sure gmond is running; it tends not to start on its own. Check with =ps -e | grep gmond= and, if it is not present, run =svcadm clear gmond= and check again.
   * If you have upgraded the dCache or even the Java packages, you need to run =/opt/d-cache/install/install.sh= before starting the dCache service. It doesn't hurt to run it in any case.
   * Finally, run =/opt/d-cache/bin/dcache start= (from xen12, or any machine with dsh groups configured, run =dsh -g SE '/opt/d-cache/bin/dcache start'=)

---+++ Thors (pools with Linux)

You can probably go straight to the last step, but if you're not an expert you probably want to go through all the items here:
   * Check that the data mountpoints are available: =df -h | grep data1=
   * Make the logging directory: =mkdir /var/log/d-cache=
   * If you have upgraded the dCache RPM or even the Java RPM, you need to run =/opt/d-cache/install/install.sh= before starting the dCache service. It doesn't hurt to run it in any case.
   * Finally, run =service dcache start= (from xen12, or any machine with dsh groups configured, run =dsh -g SE2 'service dcache start'=)

---++ Step 4. Test the system

From a user interface, after creating your VOMS proxy, run these two test scripts and then check for offline pools:
   * chk_SE-dcache. This checks the basic direct commands that you can run against dCache. If you get a "no space available" error, you probably need to wait a bit until the pools start up and publish their free space to the dCache core.
   * chk_SE-lcgtools. This uses the complete information system to perform a copy (it is the kind of tool the jobs actually use). It has a lot of options, but it should run OK without parameters. If it doesn't, the first thing to check is whether the information system has published everything upwards. You can use a different BDII to speed things up, for example with the parameter "-b bdii.lcg.cscs.ch".
   * Check for offline pools. http://storage01.lcg.cscs.ch:2288/usageInfo should list all pools (check that they are all there); if the word OFFLINE does not appear in big red letters, you're probably fine.

If all this works, you're done. Congratulations!

-- Main.PabloFernandez - 2010-11-08
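
The offline-pool check from Step 4 can also be done from a terminal. The following is a minimal sketch, not part of dCache: the helper names =count_offline_lines= and =report_pools= are hypothetical, and the only thing taken from this page is that offline pools show up as the word OFFLINE on the usageInfo page. It does not check that every pool is listed, so you still need to look at the page for that.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical names, not dCache tools.

# Reads page text on stdin and prints how many lines contain OFFLINE.
count_offline_lines() {
    # grep -c prints the match count but exits non-zero when it is 0;
    # "|| true" keeps the helper usable in scripts that run with "set -e".
    grep -c 'OFFLINE' || true
}

# Reads page text on stdin and prints a one-line verdict.
# Returns 0 if no line mentions OFFLINE, 1 if some do, 2 if the page is empty.
report_pools() {
    page=$(cat)
    if [ -z "$page" ]; then
        echo "empty page: is the web interface up?"
        return 2
    fi
    offline=$(printf '%s\n' "$page" | count_offline_lines)
    if [ "$offline" -eq 0 ]; then
        echo "no OFFLINE pools reported"
    else
        echo "$offline line(s) mention OFFLINE"
        return 1
    fi
}
```

Assuming curl is installed on the user interface, it could be used as =curl -s http://storage01.lcg.cscs.ch:2288/usageInfo | report_pools=. The count is per line of the page, so it may be lower than the number of offline pools if several of them share one line.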
Topic revision: r4 - 2011-05-23 - PabloFernandez
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.