<!-- keep this as a security measure:
#uncomment if the subject should only be modifiable by the listed groups
   * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.CMSAdminGroup
   * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.CMSAdminGroup
#uncomment this if you want the page only be viewable by the listed groups
######* Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.CMSAdminGroup
-->
---+!! Node Type: %CALC{"$SUBSTITUTE(%TOPIC%,NodeType,)"}%

---++!! Firewall requirements

| *local port* | *open to* | *reason* |
<!-- Example line
#| 22/tcp | * | Example entry for ssh |
-->

%TOC{title="Table of contents"}%

---+ Installation

---++ Official Doc ( pretty chaotic )

https://twiki.cern.ch/twiki/bin/view/CMSPublic/PhedexAdminDocsInstallation

---++ CSCS Similar Doc

Refer to the description on [[LCGTier2/CmsVObox][LCGTier2/CmsVObox]]. There is one important difference: *while we use FTS channels for the transfers to the Tier-2, we use the SRM backend for transfers to the Tier-3*, because we do not have an FTS channel for PSI. This issue is linked to registering PSI as a regular grid site, which until recently was not possible, since we only support a Grid SE but not a CE. Consequently there is no =fts.map= file in the configuration area of the !PhEDEx services.
---++ Installation by Puppet

Installation is described by the Puppet files =tier3-baseclasses.pp= and =SL6_vobox.pp=, both saved in the dir =puppetdirnodes=, where =puppetdirnodes= is an alias defined in the following list:
<pre>
alias kscustom64='cd /afs/psi.ch/software/linux/dist/scientific/64/custom'
alias ksdir='cd /afs/psi.ch/software/linux/kickstart/configs'
alias puppetdir='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/'
alias puppetdirnodes='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/manifests/nodes'
alias puppetdirredhat='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/modules/Tier3/files/RedHat'
alias puppetdirsolaris='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/modules/Tier3/files/Solaris/5.10'
alias yumdir6='cd /afs/psi.ch/software/linux/dist/scientific/6/scripts'
</pre>

---++ local X509

<pre>
# ll /home/phedex/.globus/
total 4
lrwxrwxrwx 1 phedex phedex   31 Apr 13 18:44 usercert.pem -> /etc/grid-security/hostcert.pem
-r-------- 1 phedex phedex 1679 Apr 13 18:44 userkey.pem
</pre>

---++ =/cvmfs=

First study the CVMFS page. Be aware of https://twiki.cern.ch/twiki/bin/view/CMSPublic/CernVMFS4cms and the local =%BLUE%/cvmfs/cms.cern.ch%ENDCOLOR%= automatic mount point, since =/cvmfs= is nowadays used by our PhEDEx configurations:
<pre>
[root@t3cmsvobox01 git]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2       5.7G  4.3G  1.2G  79% /
tmpfs           3.9G     0  3.9G   0% /dev/shm
/dev/sda1       477M   32M  420M   7% /boot
/dev/sda5       2.9G  640M  2.1G  24% /home
/dev/sdb1        20G  9.1G   11G  46% /opt/cvmfs_local   <-- local /cvmfs cache
/dev/sda6       969M  1.7M  917M   1% /tmp
/dev/sda7       5.7G  874M  4.6G  16% /var
/dev/sdc1       9.9G  102M  9.3G   2% /var/cache/openafs
t3fs06:/shome   6.7T  5.0T  1.8T  75% /shome
t3fs05:/swshare 1.8T  562G  1.3T  31% /swshare
AFS             2.0T     0  2.0T   0% /afs
cvmfs2           14G  9.0G  4.7G  66% %BLUE%/cvmfs/cms.cern.ch%ENDCOLOR%
</pre>
Because of
=/cvmfs/cms.cern.ch/SITECONF/local/PhEDEx/storage.xml=, which in turn is linked here:
<pre>
# ll /home/phedex/config/COMP/SITECONF/T3_CH_PSI/PhEDEx/storage.xml
lrwxrwxrwx 1 phedex phedex 52 Apr 13 18:45 /home/phedex/config/COMP/SITECONF/T3_CH_PSI/PhEDEx/storage.xml -> /cvmfs/cms.cern.ch/SITECONF/local/PhEDEx/storage.xml
</pre>

---++ Pitfalls in dcache-srmclient-2.10.7-1 ( currently the latest dcache-srmclient )

Strangely, PhEDEx has a strong dependency on =dcache-srmclient= ; by strong we mean that you can't use equivalent SRM tools like =lcg-cp= or =gfal-copy=. In its latest version, Fabio noticed that:
<pre>
srmcp as in dcache-srmclient-2.2.4-2.el6.x86_64 had, by default, -delegate=%BLUE%true%ENDCOLOR%
srmcp as in dcache-srmclient-2.10.7-1.noarch has now, by default, -delegate=%BLUE%false%ENDCOLOR%
</pre>
Paul Millar ( a primary dCache Dev ) commented in this way:
<pre>
srmcp tries to avoid the wall-clock time and CPU overhead of delegation if that
delegation isn't necessary. Unfortunately, there is a bug: the copyjobfile
( used by PhEDEx ) option is not consulted when determining whether third-party
transfers are involved.
The consequence is that all such transfers are considered second-party and no
delegation is done.
</pre>
This bug badly affects PhEDEx ; because of it, a working =PhEDEx/dcache-srmclient-2.2.4-2= configuration will stop working simply by migrating to =PhEDEx/dcache-srmclient-2.10.7-1.noarch=, and you'll get ( cryptic ) errors like:
<pre>
21 Apr 2015 07:11:13 (SRM-t3se01) [192.33.123.205:52205 VI8:439841:srm2:copy:-2098574001] failed to connect to srm://storage01.lcg.cscs.ch:8443/srm/managerv2?SFN=/pnfs/lcg.cscs.ch/cms/trivcat/store/mc/RunIIWinter15GS/RSGravToWW_kMpl01_M-2000_TuneCUETP8M1_13TeV-pythia8/GEN-SIM/MCRUN2_71_V1-v1/30000/AACEC97E-11B0-E411-9245-001E68862A32.root %RED%credential remaining lifetime is less then a minute%ENDCOLOR%
</pre>
Fabio fixed this by explicitly requesting =%RED%-delegate=true%ENDCOLOR%= to bypass the current =copyjob= bug:
<pre>
[root@t3cmsvobox01 PhEDEx]# grep -Hn srmcp /home/phedex/config/SITECONF/T3_CH_PSI/PhEDEx/ConfigPart* | grep -v \#
/home/phedex/config/SITECONF/T3_CH_PSI/PhEDEx/ConfigPart.DebugServices:13: -command srmcp,%RED%-delegate=true%ENDCOLOR%,-pushmode=true,-debug=true,-retry_num=2,-protocols=gsiftp,-srm_protocol_version=2,-streams_num=1,-globus_tcp_port_range=20000:25000
/home/phedex/config/SITECONF/T3_CH_PSI/PhEDEx/ConfigPart.Standard:13: -command srmcp,%RED%-delegate=true%ENDCOLOR%,-pushmode=true,-debug=true,-retry_num=2,-protocols=gsiftp,-srm_protocol_version=2,-streams_num=1,-globus_tcp_port_range=20000:25000
</pre>
Fabio noticed another bug, again in =dcache-srmclient-2.10.7-1=, where the default proxy location =/tmp/x509up_u`id -u`= is considered even if we explicitly specify the option =-x509_user_proxy= to use a different path:
<pre>
Dear Paul and dCache colleagues,

I believe I've found another bug in dcache-srmclient-2.10.7-1.noarch

$ srmls -debug=false -x509_user_proxy=/home/phedex/gridcert/proxy.cert -retry_num=0
'srm://t3se01.psi.ch:8443/srm/managerv2?SFN=/pnfs/psi.ch/cms/trivcat/store/mc/RunIIWinter15GS/RSGravToWWToLNQQ_kMpl01_M-4000_TuneCUETP8M1_13TeV-pythia8/GEN-SIM/MCRUN2_71_V1-v1/10000/2898A22B-62B0-E411-B1D4-002590D600EE.root'

srm client error: %RED%java.lang.IllegalArgumentException: Multiple entries with same key:%ENDCOLOR% x509_user_proxy=/home/phedex/gridcert/proxy.cert and x509_user_proxy=/tmp/x509up_u205
</pre>
Fabio fixed it by tweaking the following PhEDEx scripts:
<pre>
[root@t3cmsvobox01 PhEDEx]# grep -Hn %RED%export%ENDCOLOR% /home/phedex/config/SITECONF/T3_CH_PSI/PhEDEx/FileDownload* --color
/home/phedex/config/SITECONF/T3_CH_PSI/PhEDEx/FileDownloadDelete:14: %RED%export%ENDCOLOR% X509_USER_PROXY=/home/phedex/gridcert/proxy.cert && srmrm -retry_num=0 "$pfn";
/home/phedex/config/SITECONF/T3_CH_PSI/PhEDEx/FileDownloadSRMVerify:31: *managerv2* ) echo $(%RED%export%ENDCOLOR% X509_USER_PROXY=/home/phedex/gridcert/proxy.cert && srmls -debug=false -retry_num=0 "$path" 2>/dev/null| grep $file | cut -d\ -f3);;
/home/phedex/config/SITECONF/T3_CH_PSI/PhEDEx/FileDownloadSRMVerify:44: fields=($(%RED%export%ENDCOLOR% X509_USER_PROXY=/home/phedex/gridcert/proxy.cert && srmls -l -debug=false -retry_num=0 "$pfn" 2>/dev/null| grep Checksum))
/home/phedex/config/SITECONF/T3_CH_PSI/PhEDEx/FileDownloadSRMVerify:116: *managerv2*) %RED%export%ENDCOLOR% X509_USER_PROXY=/home/phedex/gridcert/proxy.cert && srmrm -retry_num=0 "$pfn";;
</pre>

---++ PhEDEx =git= repo cloned as a reference

To follow the progress of the PhEDEx code, keep the local git repo updated:
<pre>
[root@t3cmsvobox01 git]# su - phedex
-bash-4.1$ cd git
-bash-4.1$ cd PHEDEX/
-bash-4.1$ %BLUE%git pull%ENDCOLOR%
remote: Counting objects: 14, done.
remote: Compressing objects: 100% (8/8), done.
remote: Total 14 (delta 2), reused 0 (delta 0), pack-reused 6
Unpacking objects: 100% (14/14), done.
From https://github.com/dmwm/PHEDEX
   7768ae7..66c984f  master     -> origin/master
Updating 7768ae7..66c984f
Fast-forward
 Contrib/subscription_info.py | 126 ++++++++++++++++++++++++++++++++++++++++++
 Utilities/testSpace/testAuth |  27 +++++++++
 2 files changed, 153 insertions(+), 0 deletions(-)
 create mode 100755 Contrib/subscription_info.py
 create mode 100644 Utilities/testSpace/testAuth
</pre>

---++ How to connect to the PhEDEx DBs

PhEDEx itself connects to the CERN Oracle DBs, and you can directly inspect them with =sqlplus= ; in another shell you can watch your =sqlplus= connections with =netstat -tp | grep sqlplus= and kill them with =killall sqlplus= if =sqlplus= hangs. In real life you'll seldom need to connect with =sqlplus=, but it's important to be aware of this option:
<pre>
[root@t3cmsvobox01 phedex]# su - phedex
-bash-4.1$ source /home/phedex/PHEDEX/etc/profile.d/env.sh
-bash-4.1$ which sqlplus
~/sw/slc6_amd64_gcc461/external/oracle/11.2.0.3.0__10.2.0.4.0/bin/sqlplus
-bash-4.1$ sqlplus $(/home/phedex/PHEDEX/Utilities/OracleConnectId -db /home/phedex/config/DBParam.PSI:%BLUE%Prod%ENDCOLOR%/PSI)

SQL*Plus: Release 11.2.0.3.0 Production on Wed May 27 14:16:11 2015

Copyright (c) 1982, 2011, Oracle.  All rights reserved.
Connected to:
%BLUE%Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining and Real Application Testing options%ENDCOLOR%

SQL> select id,name from t_adm_node where name like '%CSCS%' or name like '%PSI%' ;

        ID NAME
---------- --------------------
        27 T2_CH_CSCS
%ORANGE%821 T3_CH_PSI%ENDCOLOR%

SQL> select distinct r.id, r.created_by, r.time_create, r.comments reqcomid, rds.dataset_id, rds.name, rd.decided_by, rd.time_decided, rd.comments accomid
     from t_req_request r
     join t_req_type rt on rt.id = r.type
     join t_req_node rn on rn.request = r.id
     left join t_req_decision rd on rd.request = r.id and rd.node = rn.node
     join t_req_dataset rds on rds.request = r.id
     where rn.node = %ORANGE%821%ENDCOLOR% and rt.name = 'xfer' and rd.decision = 'y'
       and dataset_id in ( select distinct b.dataset from t_dps_block b
                           join t_dps_block_replica br on b.id = br.block
                           join t_dps_dataset d on d.id = b.dataset
                           where node = %ORANGE%821%ENDCOLOR% )
     order by r.time_create desc ;

        ID CREATED_BY TIME_CREATE   REQCOMID DATASET_ID NAME
---------- ---------- ----------- ---------- ----------
-------------------------------------------------------------------------------------------- ---------- ------------ ----------
    441651     786542  1429196738     303750     674704 /RSGravToWW_kMpl01_M-1800_TuneCUETP8M1_13TeV-pythia8/RunIIWinter15GS-MCRUN2_71_V1-v1/GEN-SIM     786664   1429287626     303779
    441651     786542  1429196738     303750     674709 /RSGravToWW_kMpl01_M-2500_TuneCUETP8M1_13TeV-pythia8/RunIIWinter15GS-MCRUN2_71_V1-v1/GEN-SIM
...
</pre>

---+ Regular Maintenance work

---++ Keeping updated CMS GIT Siteconf

If you modify the local PhEDEx configurations then you'll have to publish these changes into https://git.cern.ch/reps/siteconf ; your CERN id + password are required. The following is the =.git/config= file used by Fabio: %TWISTY%
<pre>
[martinelli_f@t3ui18 siteconf]$ cat /shome/martinelli_f/git/siteconf/.git/config
[core]
        repositoryformatversion = 0
        filemode = true
        bare = false
        logallrefupdates = true
[remote "origin"]
        fetch = +refs/heads/*:refs/remotes/origin/*
        url = https://martinel@git.cern.ch/reps/siteconf
[branch "master"]
        remote = origin
        merge = refs/heads/master
[user]
        name = Fabio Martinelli
        email = fabio.martinelli@psi.ch
[merge]
        tool = vimdiff
[color]
        diff = auto
        status = auto
        branch = auto
</pre>
%ENDTWISTY%

---++ Nagios

https://t3nagios.psi.ch/check_mk/index.py?start_url=%2Fcheck_mk%2Fview.py%3Fview_name%3Dhost%26host%3Dt3cmsvobox01%26site%3D

---++ Checking the recent transfer errors

https://cmsweb.cern.ch/phedex/prod/Activity::ErrorInfo?tofilter=T3_CH_PSI&fromfilter=&report_code=.*&xfer_code=.*&to_pfn=.*&from_pfn=.*&log_detail=.*&log_validate=.*&.submit=Update#

---++ Dataset cleaning

This task must be done regularly ( once every 2 months, for example ), both for CSCS and PSI.
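The accounting step of this procedure boils down to summing the 4th column ( the dataset size in GB ) of the pasted list with a one-line awk and converting to TB. A self-contained sanity check of that one-liner, using a toy list with made-up dataset names:

```shell
# Toy stand-in for the list pasted from the twiki; the 4th field is the size in GB
cat > /tmp/tmp.list <<'EOF'
keep 225527 /Foo/Bar-v1/AODSIM       25.5
keep 269087 /Baz/Qux-v1/GEN-SIM-RECO 58.6
EOF
# Same one-liner as used in the procedure: sum the GB column, print TB
cat /tmp/tmp.list | awk 'BEGIN{sum=0}{sum+=$4}END{print sum/1024.}'
```

On this toy input the sum is 84.1 GB, i.e. roughly 0.082 TB.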
*Getting the datasets list*

Connect to t3cmsvobox as root and:
<verbatim>
su - phedex
cd svn-sandbox/phedex/DB-query-tools/
source /home/phedex/PHEDEX/etc/profile.d/env.sh
./ListSiteDataInfo.pl -w -t --db ~/config/DBParam.PSI:Prod/PSI -s "%CSCS%" | grep "eleted"
./ListSiteDataInfo.pl -w -t --db ~/config/DBParam.PSI:Prod/PSI -s "%CSCS%" | grep -vE "Dutta|Fanfani|Kress|Magini|Wuerthwein|Belforte|Spinoso|Ajit|DataOps|eleted|StoreResults|Argiro|Klute|vocms237|IntelROCCS"
./ListSiteDataInfo.pl -w -t --db ~/config/DBParam.PSI:Prod/PSI -s "%PSI%"
</verbatim>
The *first* Perl command creates a list of datasets that can be safely deleted from CSCS, as they are just support requests for transfers to PSI ( check that the transfer happened safely ). <br />
The *second* command creates a list that avoids including central requests and the ones that can be deleted from CSCS.<br />
The *third* command produces a list for PSI.

The datasets proposed for deletion are all the datasets whose *retention time has expired*.

*Publishing the list and notifying the users*

The due date for feedback is usually in a week. The lists must be published in DataSetCleaningQuery ( previous lists must be deleted ). To get the total size proposed for deletion, create a temporary text file with the list pasted from the twiki and then do:
<verbatim>
cat tmp.list | awk 'BEGIN{sum=0}{sum+=$4}END{print sum/1024.}'
</verbatim>
This will give the total size in TB. An email like this must be sent to the =cms-tier3-users@lists.psi.ch= mailing list:
<verbatim>
Subject: Dataset deletion proposal and request for User Data cleaning - Due date: 28 Oct 2011, 9:00

Dear all,

a new cleaning campaign is needed, both at CSCS and PSI.
You can find the list and the instructions on how to request to keep the data here:

https://wiki.chipp.ch/twiki/bin/view/CmsTier3/DataSetCleaningQuery

The data contained in the lists amount to 47TB / 44TB for CSCS / PSI.
If you need to store a dataset both at CSCS and at PSI please also reply to this email explaining why.

Please remember to clean up your user folder at CSCS regularly; a usage overview can be found at [1] and [2]

Thanks,
Daniel

[1] http://ganglia.lcg.cscs.ch/ganglia/cms_sespace.txt
[2] http://ganglia.lcg.cscs.ch/ganglia/files_cms.html
</verbatim>

---++ Dataset cleaning - 2nd version

Derek also wrote this less cryptic Python tool ( with it you don't need to know the Oracle DB tables and columns, nor Perl ):
<pre>
[root@t3cmsvobox01 DB-query-tools]# ./ListSiteDataInfoWS.py --site T3_CH_PSI
Getting the data from the data service...
| *keep?*| *ID*| *Dataset*|*Size(GB)*| *Group*|*Requested on*|*Requested by*|*Comments*|*Comments2*|
| | 225527|/GluGluToHToWWTo2L2Nu_M-160_7TeV-powheg-pythia6/Winter10-E7TeV_ProbDist_2011Flat_BX156_START39_V8-v1/AODSIM|25.5| b-tagging|2011-02-18 13:35:49|Wolfram Erdmann|retention time April 2011|to be deleted from CSCS|
| | 269087|/BdToMuMu_2MuPtFilter_7TeV-pythia6-evtgen/Summer11-PU_S4_START42_V11-v1/GEN-SIM-RECO|58.6| b-physics|2011-06-08 12:34:25|Christoph Naegeli|retention-time: 2011-10-31| |
| | 320266|/RelValProdTTbar/SAM-MC_42_V12_SAM-v1/GEN-SIM-RECO|3.1| FacOps|2011-09-13 09:58:51|Andrea Sciaba| |Centrally approved (Nicolo)|
...
</pre>

---++ Renewing myproxy certificate for !PhEDEx transfers (once each 11 months)

Nagios daily checks the [[https://t3nagios.psi.ch/nagios/cgi-bin/extinfo.cgi?type=2&host=t3cmsvobox&service=CMS+VOMS+proxy+age][voms proxy lifetime]] used by PhEDEx; this proxy is a CMS proxy of Fabio's, and because of that all the PhEDEx files uploaded to =/pnfs/psi.ch/cms/= belong to his account. If you change that proxy then you have to change the ownership of the related files/dirs in =/pnfs/psi.ch/cms= ; specifically you'll want to change the owner of =/pnfs/psi.ch/cms/trivcat/store/data= , otherwise you will get a lot of =permission denied= errors.
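The remaining lifetime of a certificate or proxy file can also be checked by hand with =openssl x509 -checkend= ( this is a generic technique, not necessarily what the Nagios probe itself runs; the toy certificate below merely stands in for =/home/phedex/gridcert/proxy.cert= ):

```shell
# Toy: create a 30-day self-signed cert standing in for the PhEDEx proxy file
openssl req -x509 -newkey rsa:2048 -nodes -subj '/CN=toy-proxy' \
        -days 30 -keyout /tmp/toy.key -out /tmp/toy.pem 2>/dev/null

# -checkend N exits 0 if the cert is still valid N seconds from now
openssl x509 -checkend 86400 -noout -in /tmp/toy.pem \
  && echo 'still valid tomorrow'

# A 30-day cert is not valid 60 days from now, so this branch fires
openssl x509 -checkend $((60*86400)) -noout -in /tmp/toy.pem \
  || echo 'expires within 60 days'
```

On the real vobox you would point =-in= at the proxy file used by the PhEDEx agents.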
The following shows how to upload a long-life proxy into =myproxy.cern.ch= :
<pre>
%BLUE%$%ENDCOLOR% myproxy-init -t 168 -R 't3cmsvobox.psi.ch' -l %GREEN%psi_phedex_fabio%ENDCOLOR% -x -k renewable -s myproxy.cern.ch -c %RED%8700%ENDCOLOR%
Your identity: /DC=com/DC=quovadisglobal/DC=grid/DC=switch/DC=users/C=CH/O=Paul-Scherrer-Institut (PSI)/CN=Fabio Martinelli
Enter GRID pass phrase for this identity:
Creating proxy .......................................................................................................................................... Done
Proxy Verify OK
Warning: your certificate and proxy will expire Thu Dec 10 01:00:00 2015
which is within the requested lifetime of the proxy
A proxy valid for %RED%8700%ENDCOLOR% hours (%RED%362.5 days%ENDCOLOR%) for user %GREEN%psi_phedex_fabio%ENDCOLOR% now exists on myproxy.cern.ch.

# That %RED%362.5 days%ENDCOLOR% is wrong !
%BLUE%$%ENDCOLOR% myproxy-info -s myproxy.cern.ch -l %GREEN%psi_phedex_fabio%ENDCOLOR%
username: %GREEN%psi_phedex_fabio%ENDCOLOR%
owner: /DC=com/DC=quovadisglobal/DC=grid/DC=switch/DC=users/C=CH/O=Paul-Scherrer-Institut (PSI)/CN=Fabio Martinelli
  name: renewable
  renewal policy: */CN=t3cmsvobox.psi.ch
  timeleft: 6249:20:19  (%RED%260.4 days%ENDCOLOR%)
</pre>
The present myproxy servers have problems with PSI host certificates from SWITCH, because they contain a "(PSI)" substring and the parentheses are not correctly escaped in the regexp matching of the myproxy code. Therefore, the renewer DN ( the -R argument to myproxy-init below ) and the _allowed renewers policy on the myproxy server_ need to be defined with wildcards to enable the matching to succeed.
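Why the unescaped parentheses break the match can be reproduced with any POSIX regexp tool; here =grep -E= merely stands in for the myproxy policy matcher ( the exact regexp flavour myproxy uses is an assumption ):

```shell
# The host DN from above, containing the problematic "(PSI)" substring
dn='/DC=com/DC=quovadisglobal/DC=grid/DC=switch/DC=hosts/C=CH/ST=Aargau/L=Villigen/O=Paul-Scherrer-Institut (PSI)/OU=AIT/CN=t3cmsvobox.psi.ch'

# Escaped parentheses match the literal "(PSI)" -> count 1
printf '%s\n' "$dn" | grep -Ec 'Paul-Scherrer-Institut \(PSI\)'

# Unescaped, "(PSI)" becomes a regexp group, i.e. the pattern effectively
# looks for "Paul-Scherrer-Institut PSI" -> count 0, no match
printf '%s\n' "$dn" | grep -Ec 'Paul-Scherrer-Institut (PSI)' || true

# A wildcard policy that only pins the host CN sidesteps the issue -> count 1
printf '%s\n' "$dn" | grep -Ec '/CN=t3cmsvobox\.psi\.ch$'
```

This is why the renewer DN and the server-side policy are written as =*/CN=t3cmsvobox.psi.ch= instead of the full DN.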
<pre>
voms-proxy-init -voms cms
myproxyserver=myproxy.cern.ch
<span style="text-decoration: line-through;">servicecert="/DC=com/DC=quovadisglobal/DC=grid/DC=switch/DC=hosts/C=CH/ST=Aargau/L=Villigen/O=Paul-Scherrer-Institut (PSI)/OU=AIT/CN=t3cmsvobox.psi.ch"</span>
servicecert='*/CN=t3cmsvobox.psi.ch'
myproxy-init -s $myproxyserver -l psi_phedex -x -R "$servicecert" -c 720
scp ~/.x509up_u$(id -u) phedex@t3ui01:gridcert/proxy.cert

# for testing, you can try
myproxy-info -s $myproxyserver -l psi_phedex
</pre>
As the phedex user do
<pre>
chmod 600 ~/gridcert/proxy.cert
</pre>
You should test whether the renewal of the certificate works for the phedex user:
<pre>
unset X509_USER_PROXY  # make sure that the service credentials from ~/.globus are used!
voms-proxy-init        # initializes the service proxy cert that is allowed to retrieve the user cert
myproxyserver=myproxy.cern.ch
myproxy-get-delegation -s $myproxyserver -v -l psi_phedex -a /home/phedex/gridcert/proxy.cert -o /tmp/gagatest
export X509_USER_PROXY=/tmp/gagatest
srm-get-metadata srm://t3se01.psi.ch:8443/srm/managerv1?SFN=/pnfs/psi.ch/cms
rm /tmp/gagatest
</pre>

---++ Storage Consistency Checks

From time to time the transfer team will ask for input for their [[https://twiki.cern.ch/twiki/bin/view/CMSPublic/CompOpsTransferTeamConsistencyChecks][storage consistency check]] ( so far only for T2 ); the last CSCS check was in [[https://ggus.eu/index.php?mode=ticket_info&ticket_id=101366][Feb 2014]]. To perform a 'Storage Consistency Check' we need to complete the following steps:

   * make sure PhEDEx is updated to the latest version and its config is committed in [[https://twiki.cern.ch/twiki/bin/view/CMSPublic/CompOpsSiteconfMigrationToGit][GIT]]
   * ask CSCS admins for a storage dump <pre>python chimera-dump.py -s /pnfs/lcg.cscs.ch/cms -c fulldump -g -o /tmp/outfile</pre>
   * convert the file using: <pre>sed -e 's#/pnfs/lcg.cscs.ch/cms/trivcat/store/\(mc\|data\|generator\|results\|hidata\|himc\|lumi\|relval\)/#/store/\1/#' \
    -e '/<entry name="\/pnfs\/lcg.cscs.ch\/cms\/.*<\/entry>/d' \
    -e 's#<dCache:location>.*</dCache:location>##' \
    outfile.xml | uniq > storagedump.xml
</pre>
   * compress, store on AFS, and send the path to the transfer team
   * take the file you get back from the transfer team with the LFNs to be deleted <pre>for LFN in $(cat SCC_Nov2012_CSCS_LFNsToBeRemoved.txt); do
  lcg-del -b -D srmv2 -l srm://storage01.lcg.cscs.ch:8443/srm/managerv2?SFN=/pnfs/lcg.cscs.ch/cms/trivcat/$LFN;
done
</pre>

---+ Emergency Measures

<!--
#List any measures that must be taken in case of some major incident, e.g. whether a mailing
#list must be contacted or whether other services need to be shut down, etc.
-->

---+ Services

---++ =/home/phedex/phedex_start.sh=

To be manually invoked after a server restart !

---++ =/home/phedex/phedex_stop.sh=

---++ =/home/phedex/phedex_status.sh=

<!--
#List all the important services, their installation, configuration and how to start and stop them
-->

<pre>
-bash-4.1$ pstree -uh phedex -la
%BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Toolkit/Transfer/FileDownload -state /home/phedex/state/Debug/incoming/download/ -log /home/phedex/log/Debug/download -verbose -db /home/phedex/config/DBParam.PSI:Debug/PSI -nodesT3_CH
%BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Toolkit/Transfer/FileExport -state /home/phedex/state/Debug/incoming/fileexport/ -log /home/phedex/log/Debug/fileexport -db /home/phedex/config/DBParam.PSI:Debug/PSI -nodes T3_CH_PSI-s
%BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Toolkit/Transfer/FileRemove -state /home/phedex/state/Debug/incoming/fileremove/ -log /home/phedex/log/Debug/fileremove -node T3_CH_PSI -db /home/phedex/config/DBParam.PSI:Debug/PSI-pr
%BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Toolkit/Verify/BlockDownloadVerify -state /home/phedex/state/Debug/incoming/blockverify/ -log /home/phedex/log/Debug/blockverify --db /home/phedex/config/DBParam.PSI:Debug/PSI --nodesT
%BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Utilities/AgentFactory.pl -state
/home/phedex/state/Debug/incoming/watchdog/ -log /home/phedex/log/Debug/watchdog -db /home/phedex/config/DBParam.PSI:Debug/PSI -node T3_CH_PSI -config/
%BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Utilities/AgentFactoryLite.pl -state /home/phedex/state/Debug/incoming/WatchdogLite/ -log /home/phedex/log/Debug/WatchdogLite -node T3_CH_PSI -agent_list watchdog
%BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Toolkit/Transfer/FileDownload -state /home/phedex/state/Dev/incoming/download/ -log /home/phedex/log/Dev/download -verbose -db /home/phedex/config/DBParam.PSI:Dev/PSI -nodes T3_CH_PSI-
%BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Toolkit/Transfer/FileExport -state /home/phedex/state/Dev/incoming/fileexport/ -log /home/phedex/log/Dev/fileexport -db /home/phedex/config/DBParam.PSI:Dev/PSI -nodes T3_CH_PSI-storage
%BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Toolkit/Transfer/FileRemove -state /home/phedex/state/Dev/incoming/fileremove/ -log /home/phedex/log/Dev/fileremove -node T3_CH_PSI -db /home/phedex/config/DBParam.PSI:Dev/PSI-protocol
%BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Toolkit/Verify/BlockDownloadVerify -state /home/phedex/state/Dev/incoming/blockverify/ -log /home/phedex/log/Dev/blockverify --db /home/phedex/config/DBParam.PSI:Dev/PSI --nodesT3_CH_P
%BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Utilities/AgentFactory.pl -state /home/phedex/state/Dev/incoming/watchdog/ -log /home/phedex/log/Dev/watchdog -db /home/phedex/config/DBParam.PSI:Dev/PSI -node T3_CH_PSI -config/home/p
%BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Utilities/AgentFactoryLite.pl -state /home/phedex/state/Dev/incoming/WatchdogLite/ -log /home/phedex/log/Dev/WatchdogLite -node T3_CH_PSI -agent_list watchdog
%BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Toolkit/Transfer/FileDownload -state /home/phedex/state/Prod/incoming/download/ -log /home/phedex/log/Prod/download -verbose -db /home/phedex/config/DBParam.PSI:Prod/PSI -nodesT3_CH_PS
%BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Toolkit/Transfer/FileExport -state
/home/phedex/state/Prod/incoming/fileexport/ -log /home/phedex/log/Prod/fileexport -db /home/phedex/config/DBParam.PSI:Prod/PSI -nodes T3_CH_PSI-stor
%BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Toolkit/Transfer/FileRemove -state /home/phedex/state/Prod/incoming/fileremove/ -log /home/phedex/log/Prod/fileremove -node T3_CH_PSI -db /home/phedex/config/DBParam.PSI:Prod/PSI-proto
%BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Toolkit/Verify/BlockDownloadVerify -state /home/phedex/state/Prod/incoming/blockverify/ -log /home/phedex/log/Prod/blockverify --db /home/phedex/config/DBParam.PSI:Prod/PSI --nodesT3_C
%BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Utilities/AgentFactory.pl -state /home/phedex/state/Prod/incoming/watchdog/ -log /home/phedex/log/Prod/watchdog -db /home/phedex/config/DBParam.PSI:Prod/PSI -node T3_CH_PSI -config/hom
%BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Utilities/AgentFactoryLite.pl -state /home/phedex/state/Prod/incoming/WatchdogLite/ -log /home/phedex/log/Prod/WatchdogLite -node T3_CH_PSI -agent_list watchdog
bash
  └─pstree -uh phedex -la
</pre>

---++ =netstat -tp=

<pre>
[root@t3cmsvobox01 git]# netstat -tp
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       PID/Program name
tcp        0      0 t3cmsvobox01.psi.ch:42088   itrac50063-v.cern.ch:10121  ESTABLISHED 23470/perl
tcp        0      0 t3cmsvobox01.psi.ch:34154   itrac50063-v.cern.ch:10121  ESTABLISHED 22967/perl
tcp        0      0 t3cmsvobox01.psi.ch:38659   t3ldap01.psi.ch:ldaps       ESTABLISHED 5581/nslcd
tcp        0      0 t3cmsvobox01.psi.ch:38662   t3ldap01.psi.ch:ldaps       ESTABLISHED 5581/nslcd
tcp        0      0 t3cmsvobox01.psi.ch:43269   itrac50063-v.cern.ch:10121  ESTABLISHED 23973/perl
tcp        0      0 t3cmsvobox01.psi.ch:41053   itrac50063-v.cern.ch:10121  ESTABLISHED 22846/perl
tcp        0      0 t3cmsvobox01.psi.ch:43886   t3admin01.psi.ch:4505       ESTABLISHED 9353/python2.6   <-- salt minion
tcp        0      0 t3cmsvobox01.psi.ch:40990   itrac50063-v.cern.ch:10121  ESTABLISHED 23267/perl
tcp        0      0 t3cmsvobox01.psi.ch:51930   t3service01.p:fujitsu-dtcns
ESTABLISHED 1224/syslog-ng
tcp        1      0 t3cmsvobox01.psi.ch:39198   t3frontier01.psi.ch:squid   CLOSE_WAIT  2530/cvmfs2
tcp        0      0 t3cmsvobox01.psi.ch:41978   itrac50063-v.cern.ch:10121  ESTABLISHED 23770/perl
tcp        1      0 t3cmsvobox01.psi.ch:55127   t3frontier01.psi.ch:squid   CLOSE_WAIT  2530/cvmfs2
tcp        0      0 t3cmsvobox01.psi.ch:38663   t3ldap01.psi.ch:ldaps       ESTABLISHED 5581/nslcd
tcp        0      0 t3cmsvobox01.psi.ch:733     t3fs06.psi.ch:nfs           ESTABLISHED -
tcp        0      0 t3cmsvobox01.psi.ch:41150   itrac50063-v.cern.ch:10121  ESTABLISHED 23852/perl
tcp        0      0 t3cmsvobox01.psi.ch:42399   itrac50063-v.cern.ch:10121  ESTABLISHED 22764/perl
tcp        0      0 t3cmsvobox01.psi.ch:38660   t3ldap01.psi.ch:ldaps       ESTABLISHED 5581/nslcd
tcp        0      0 t3cmsvobox01.psi.ch:41061   t3admin01.psi.ch:4506       ESTABLISHED 9353/python2.6   <-- salt minion
tcp        0      0 t3cmsvobox01.psi.ch:38674   t3ldap01.psi.ch:ldaps       ESTABLISHED 5581/nslcd
tcp        0      0 t3cmsvobox01.psi.ch:41821   itrac50063-v.cern.ch:10121  ESTABLISHED 23349/perl
</pre>

---++ Checking each CMS pool by Nagios through both =t3se01:SRM= and =t3dcachedb:Xrootd=

Through =t3cmsvobox=, which is in turn contacted by =t3nagios=, we retrieve a file from each CMS pool via both =t3se01:SRM= and =t3dcachedb:Xrootd= :

https://t3nagios.psi.ch/check_mk/index.py?start_url=%2Fcheck_mk%2Fview.py%3Fview_name%3Dhost%26host%3Dt3cmsvobox01%26site%3D

In both cases the test files retrieved are:
<pre>
[martinelli_f@t3ui12 ~]$ find /pnfs/psi.ch/cms/t3-nagios/ | grep M | sort
/pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs01_cms
/pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs02_cms
...
/pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_9
</pre>
The related dCache files obviously have to be placed on the right CMS pool, otherwise the Nagios tests will be wrong ! To easily check where they are really placed, run this SQL code ( in this example some test files are %RED%erroneously%ENDCOLOR% available in the wrong pool !
that was due to a bad =migration cache= command ) %TWISTY%
<pre>
[root@t3dcachedb03 ~]# psql -U nagios -d chimera -c " select path,ipnfsid,pools from v_pnfs where path like '%1MB-test-file_pool_%' ; "
                            path                             |               ipnfsid                |               pools
-------------------------------------------------------------+--------------------------------------+------------------------------------
 /pnfs/psi.ch/dteam/t3-nagios/1MB-test-file_pool_t3fs09_ops  | 0000BCDA4B329DA94D64AAAFE7C0C7501E5C | t3fs09_ops
 /pnfs/psi.ch/dteam/t3-nagios/1MB-test-file_pool_t3fs08_ops  | 0000358B14867ED5402184C2C22F81EFC861 | t3fs08_ops
 /pnfs/psi.ch/dteam/t3-nagios/1MB-test-file_pool_t3fs07_ops  | 0000409BB804C95944A38DBE8220B416A8A3 | t3fs07_ops
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_9  | 0000B58A7FA17778439F8F6F47C5CBBED5E7 | %RED%t3fs03_cms t3fs11_cms %ENDCOLOR%t3fs14_cms_9
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_8  | 00001A2FD52D31DB4CCAB99C8B8336522339 | %RED%t3fs09_cms t3fs11_cms %ENDCOLOR%t3fs14_cms_8
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_7  | 000018AA61C1E30F43709F0D9FE3B9CD65D1 | %RED%t3fs03_cms %ENDCOLOR%t3fs14_cms_7
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_6  | 0000E88C6CBB2D5A4365B11BE2EDD1554366 | %RED%t3fs02_cms %ENDCOLOR%t3fs14_cms_6
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_5  | 000200000000000006300738             | %RED%t3fs10_cms %ENDCOLOR%t3fs14_cms_5
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_4  | 0002000000000000052EF198             | %RED%t3fs03_cms %ENDCOLOR%t3fs14_cms_4
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_3  | 0002000000000000052EF168             | %RED%t3fs03_cms %ENDCOLOR%t3fs14_cms_3
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_2  | 0002000000000000052EF138             | %RED%t3fs07_cms %ENDCOLOR%t3fs14_cms_2
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_11 | 00003616229002194F439925DA3C7F1CFA02 | t3fs14_cms_11
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_10 | 0000B3D6A96EF961473AACB05F80CF9D6892 |
t3fs14_cms_10
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_1  | 0002000000000000052EF108             | %RED%t3fs02_cms t3fs11_cms %ENDCOLOR%t3fs14_cms_1
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_0  | 0000A6470E0458354BD99D6C2DD27B196DCC | %RED%t3fs08_cms %ENDCOLOR%t3fs14_cms_0
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms    | 0002000000000000052EF0D8             | %RED%t3fs03_cms t3fs04_cms %ENDCOLOR%t3fs14_cms
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs13_cms_9  | 00004783F9158A5941B284342FF4A8EDE126 | %RED%t3fs08_cms %ENDCOLOR%t3fs13_cms_9
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs13_cms_8  | 0000132841305C27434891574015FD2CF923 | %RED%t3fs09_cms %ENDCOLOR%t3fs13_cms_8
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs13_cms_7  | 00003FC27733ACBA4A809677419256FE22F9 | %RED%t3fs02_cms t3fs11_cms %ENDCOLOR%t3fs13_cms_7
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs13_cms_6  | 0002000000000000072F8630             | %RED%t3fs07_cms t3fs11_cms %ENDCOLOR%t3fs13_cms_6
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs13_cms_5  | 0002000000000000052EF0A8             | %RED%t3fs03_cms %ENDCOLOR%t3fs13_cms_5
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs13_cms_4  | 0002000000000000052EF078             | %RED%t3fs10_cms t3fs11_cms %ENDCOLOR%t3fs13_cms_4
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs13_cms_3  | 0002000000000000052EF048             | %RED%t3fs10_cms %ENDCOLOR%t3fs13_cms_3
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs13_cms_2  | 0002000000000000052EF018             | %RED%t3fs02_cms %ENDCOLOR%t3fs13_cms_2
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs13_cms_11 | 00000DB49D5B69EB4C568834BD162C3DA8E7 | t3fs13_cms_11
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs13_cms_10 | 0000073FF4F754BB4AB1B4599F412811BDA2 | t3fs13_cms_10
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs13_cms_1  | 00000CB9E97140F940CD973C319045B43FDA | %RED%t3fs04_cms t3fs11_cms %ENDCOLOR%t3fs13_cms_1
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs13_cms_0  | 00005560491A76DE49DBA142D3BE3CFE38D5 |
%RED%t3fs02_cms t3fs11_cms %ENDCOLOR%t3fs13_cms_0
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs13_cms    | 0002000000000000052EEFB8             | %RED%t3fs07_cms t3fs11_cms %ENDCOLOR%t3fs13_cms
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs11_cms    | 00009E4A9774085C4799B5C9C827DA03406F | t3fs11_cms
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs10_cms    | 000005D1DD24CA14448694E5C46A8AA8E91F | t3fs10_cms
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs09_cms    | 0000479ED8FDDC374BC68827AEDF1C146686 | t3fs09_cms
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs08_cms    | 00003A989AB6D1074D738594B1D01E2D03DE | t3fs08_cms
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs07_cms    | 0000119DDCFD0C5F42B89769BC9C104A997F | t3fs07_cms
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs04_cms_1  | 0002000000000000063D8C68             | t3fs04_cms_1
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs04_cms    | 00020000000000000395B300             | t3fs04_cms
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs03_cms    | 000200000000000006391F88             | t3fs03_cms
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs02_cms    | 00020000000000000330BF10             | t3fs02_cms
 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs01_cms    | 00020000000000000330BF90             | t3fs01_cms
</pre>
%ENDTWISTY%

---+ Backups

OS snapshots are taken nightly by the PSI VMWare Team ( contact Peter Huesser ); in addition we have LinuxBackupsByLegato to recover a single file.
| *NodeTypeForm* ||
| Hostnames | t3cmsvobox ( t3cmsvobox01 ) |
| Services | PhEDEx 4.1.3 |
| Hardware | PSI DMZ VMWare cluster |
| Install Profile | vobox |
| Guarantee/maintenance until | VMWare PSI Cluster |
Topic revision: r38 - 2015-11-16 - FabioMartinelli