<!-- keep this as a security measure:
#uncomment if the subject should only be modifiable by the listed groups
   * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.CMSAdminGroup
   * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.CMSAdminGroup
#uncomment this if you want the page only be viewable by the listed groups
# ######* Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.CMSAdminGroup
-->

---+!! Node Type: %CALC{"$SUBSTITUTE(%TOPIC%,NodeType,)"}%

---++!! Firewall requirements

| *local port* | *open to* | *reason* |
<!-- Example line
#| 22/tcp | * | Example entry for ssh |
-->

---

%TOC{title="Table of contents"}%

---+ Installation

---++ Official Doc ( pretty chaotic )

https://twiki.cern.ch/twiki/bin/view/CMSPublic/PhedexAdminDocsInstallation

---++ CSCS Similar Doc

Refer to the description on the [[LCGTier2/CmsVObox][LCGTier2/CmsVObox]]. There is one important difference: *while we use FTS channels for the transfers to the Tier-2, we use the SRM backend for transfers to the Tier-3*, because we do not have an FTS channel for PSI. This issue is linked to registering PSI as a regular grid site, which until recently was not possible, since we only support a Grid SE but not a CE. Thus there is no =fts.map= file in the configuration area for the !PhEDEx services.
---++ Installation by Puppet

Installation is described by the Puppet files =tier3-baseclasses.pp= and =SL6_vobox.pp=, both saved in the directory reached by =puppetdirnodes=, where =puppetdirnodes= is an alias defined in the following list:
<pre>
alias kscustom64='cd /afs/psi.ch/software/linux/dist/scientific/64/custom'
alias ksdir='cd /afs/psi.ch/software/linux/kickstart/configs'
alias puppetdir='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/'
alias puppetdirnodes='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/manifests/nodes'
alias puppetdirredhat='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/modules/Tier3/files/RedHat'
alias puppetdirsolaris='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/modules/Tier3/files/Solaris/5.10'
alias yumdir6='cd /afs/psi.ch/software/linux/dist/scientific/6/scripts'
</pre>

---++ local X509

<pre>
# ll /home/phedex/.globus/
total 4
lrwxrwxrwx 1 phedex phedex   31 Apr 13 18:44 usercert.pem -> /etc/grid-security/hostcert.pem
-r-------- 1 phedex phedex 1679 Apr 13 18:44 userkey.pem
</pre>

---++ =/cvmfs=

Study the CVMFS page first. Be aware of https://twiki.cern.ch/twiki/bin/view/CMSPublic/CernVMFS4cms and of the local =%BLUE%/cvmfs/cms.cern.ch%ENDCOLOR%= automatic mount point, since =/cvmfs= is nowadays used by our PhEDEx configurations:
<pre>
[root@t3cmsvobox01 git]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda2             5.7G  4.3G  1.2G  79% /
tmpfs                 3.9G     0  3.9G   0% /dev/shm
/dev/sda1             477M   32M  420M   7% /boot
/dev/sda5             2.9G  640M  2.1G  24% /home
/dev/sdb1              20G  9.1G   11G  46% /opt/cvmfs_local <-- local /cvmfs cache
/dev/sda6             969M  1.7M  917M   1% /tmp
/dev/sda7             5.7G  874M  4.6G  16% /var
/dev/sdc1             9.9G  102M  9.3G   2% /var/cache/openafs
t3fs06:/shome         6.7T  5.0T  1.8T  75% /shome
t3fs05:/swshare       1.8T  562G  1.3T  31% /swshare
AFS                   2.0T     0  2.0T   0% /afs
cvmfs2                 14G  9.0G  4.7G  66% %BLUE%/cvmfs/cms.cern.ch%ENDCOLOR%
</pre>
Because of
=/cvmfs/cms.cern.ch/SITECONF/local/PhEDEx/storage.xml= that in turn is linked here : <pre> # ll /home/phedex/config/COMP/SITECONF/T3_CH_PSI/PhEDEx/storage.xml lrwxrwxrwx 1 phedex phedex 52 Apr 13 18:45 /home/phedex/config/COMP/SITECONF/T3_CH_PSI/PhEDEx/storage.xml -> /cvmfs/cms.cern.ch/SITECONF/local/PhEDEx/storage.xml </pre> ---++ Pitfalls in dcache-srmclient-2.10.7-1 ( currently the latest dcache-srmclient ) Strangely PhEDEx has a strong dependency on =dcache-srmclient= ; by strong we mean that you can't use equivalent SRM tools like =lcg-cp= or =gfal-copy= ; in its latest version, Fabio noticed that : <pre> srmcp as in dcache-srmclient-2.2.4-2.el6.x86_64 had, by default, -delegate=%BLUE%true%ENDCOLOR% srmcp as in dcache-srmclient-2.10.7-1.noarch has now, by default, -delegate=%BLUE%false%ENDCOLOR% </pre> Paul Millar ( a primary dCache Dev ) commented in this way : <pre> srmcp tries to avoid the wall-clock time and CPU overhead of delegation if that delegation isn't necessary. Unfortunately, there is a bug: the copyjobfile ( used by PhEDEx ) option is not consulted when determining whether third-party transfers are involved. 
The consequence is that all such transfers are considered second-party and no delegation is done.</pre>
This bug badly affects PhEDEx: because of it, a working =PhEDEx/dcache-srmclient-2.2.4-2= configuration stops working after simply migrating to =PhEDEx/dcache-srmclient-2.10.7-1.noarch=, and you'll get ( cryptic ) errors like:
<pre>
21 Apr 2015 07:11:13 (SRM-t3se01) [192.33.123.205:52205 VI8:439841:srm2:copy:-2098574001] failed to connect to srm://storage01.lcg.cscs.ch:8443/srm/managerv2?SFN=/pnfs/lcg.cscs.ch/cms/trivcat/store/mc/RunIIWinter15GS/RSGravToWW_kMpl01_M-2000_TuneCUETP8M1_13TeV-pythia8/GEN-SIM/MCRUN2_71_V1-v1/30000/AACEC97E-11B0-E411-9245-001E68862A32.root %RED%credential remaining lifetime is less then a minute%ENDCOLOR%
</pre>
Fabio fixed this by explicitly requesting =%RED%-delegate=true%ENDCOLOR%= to bypass the current =copyjob= bug:
<pre>
[root@t3cmsvobox01 PhEDEx]# grep -Hn srmcp /home/phedex/config/SITECONF/T3_CH_PSI/PhEDEx/ConfigPart* | grep -v \#
/home/phedex/config/SITECONF/T3_CH_PSI/PhEDEx/ConfigPart.DebugServices:13: -command srmcp,%RED%-delegate=true%ENDCOLOR%,-pushmode=true,-debug=true,-retry_num=2,-protocols=gsiftp,-srm_protocol_version=2,-streams_num=1,-globus_tcp_port_range=20000:25000
/home/phedex/config/SITECONF/T3_CH_PSI/PhEDEx/ConfigPart.Standard:13: -command srmcp,%RED%-delegate=true%ENDCOLOR%,-pushmode=true,-debug=true,-retry_num=2,-protocols=gsiftp,-srm_protocol_version=2,-streams_num=1,-globus_tcp_port_range=20000:25000
</pre>
Fabio noticed another bug in =dcache-srmclient-2.10.7-1=: the default proxy location =/tmp/x509up_u`id -u`= is still considered even when the option =-x509_user_proxy= explicitly specifies a different path:
<pre>
Dear Paul and dCache colleagues,
I believe I've found another bug in dcache-srmclient-2.10.7-1.noarch
$ srmls -debug=false -x509_user_proxy=/home/phedex/gridcert/proxy.cert -retry_num=0
'srm://t3se01.psi.ch:8443/srm/managerv2?SFN=/pnfs/psi.ch/cms/trivcat/store/mc/RunIIWinter15GS/RSGravToWWToLNQQ_kMpl01_M-4000_TuneCUETP8M1_13TeV-pythia8/GEN-SIM/MCRUN2_71_V1-v1/10000/2898A22B-62B0-E411-B1D4-002590D600EE.root' srm client error: %RED%java.lang.IllegalArgumentException: Multiple entries with same key:%ENDCOLOR% x509_user_proxy=/home/phedex/gridcert/proxy.cert and x509_user_proxy=/tmp/x509up_u205 </pre> Fabio fixed it by tweaking the following PhEDEx scripts : <pre> [root@t3cmsvobox01 PhEDEx]# grep -Hn %RED%export%ENDCOLOR% /home/phedex/config/SITECONF/T3_CH_PSI/PhEDEx/FileDownload* --color /home/phedex/config/SITECONF/T3_CH_PSI/PhEDEx/FileDownloadDelete:14: %RED%export%ENDCOLOR% X509_USER_PROXY=/home/phedex/gridcert/proxy.cert && srmrm -retry_num=0 "$pfn"; /home/phedex/config/SITECONF/T3_CH_PSI/PhEDEx/FileDownloadSRMVerify:31: *managerv2* ) echo $(%RED%export%ENDCOLOR% X509_USER_PROXY=/home/phedex/gridcert/proxy.cert && srmls -debug=false -retry_num=0 "$path" 2>/dev/null| grep $file | cut -d\ -f3);; /home/phedex/config/SITECONF/T3_CH_PSI/PhEDEx/FileDownloadSRMVerify:44: fields=($(%RED%export%ENDCOLOR% X509_USER_PROXY=/home/phedex/gridcert/proxy.cert && srmls -l -debug=false -retry_num=0 "$pfn" 2>/dev/null| grep Checksum)) /home/phedex/config/SITECONF/T3_CH_PSI/PhEDEx/FileDownloadSRMVerify:116: *managerv2*) %RED%export%ENDCOLOR% X509_USER_PROXY=/home/phedex/gridcert/proxy.cert && srmrm -retry_num=0 "$pfn";; </pre> ---++ PhEDEx =git= repo cloned as a reference To observe the PhEDEx code progresses keep updated the local git repo : <pre> [root@t3cmsvobox01 git]# su - phedex -bash-4.1$ cd git -bash-4.1$ cd PHEDEX/ -bash-4.1$ %BLUE%git pull%ENDCOLOR% remote: Counting objects: 14, done. remote: Compressing objects: 100% (8/8), done. remote: Total 14 (delta 2), reused 0 (delta 0), pack-reused 6 Unpacking objects: 100% (14/14), done. 
From https://github.com/dmwm/PHEDEX 7768ae7..66c984f master -> origin/master Updating 7768ae7..66c984f Fast-forward Contrib/subscription_info.py | 126 ++++++++++++++++++++++++++++++++++++++++++ Utilities/testSpace/testAuth | 27 +++++++++ 2 files changed, 153 insertions(+), 0 deletions(-) create mode 100755 Contrib/subscription_info.py create mode 100644 Utilities/testSpace/testAuth </pre> ---++ How to connect to the PhEDEx DBs PhEDEx itself connects to the CERN Oracle DBs and you can directly inspect them by =sqlplus= ; in another shell observe by =netstat -tp | grep sqlplus= your =sqlplus= connections and kill them by =killall sqlplus= if =sqlplus= will hang ; in real life you'll seldom need to connect by =sqlplus= but it's important to be aware about this option : <pre>[root@t3cmsvobox01 phedex]# su - phedex -bash-4.1$ source /home/phedex/PHEDEX/etc/profile.d/env.sh -bash-4.1$ which sqlplus ~/sw/slc6_amd64_gcc461/external/oracle/11.2.0.3.0__10.2.0.4.0/bin/sqlplus -bash-4.1$ sqlplus $(/home/phedex/PHEDEX/Utilities/OracleConnectId -db /home/phedex/config/DBParam.PSI:%BLUE%Prod%ENDCOLOR%/PSI) SQL*Plus: Release 11.2.0.3.0 Production on Wed May 27 14:16:11 2015 Copyright (c) 1982, 2011, Oracle. All rights reserved. 
Connected to:%BLUE% Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production With the Partitioning, Real Application Clusters, OLAP, Data Mining and Real Application Testing options%ENDCOLOR% SQL> select id,name from t_adm_node where name like '%CSCS%' or name like '%PSI%' ; ID NAME ---------- -------------------- 27 T2_CH_CSCS %ORANGE%821 T3_CH_PSI%ENDCOLOR% SQL> select distinct r.id, r.created_by, r.time_create,r.comments reqcomid, rds.dataset_id, rds.name, rd.decided_by, rd.time_decided, rd.comments accomid from t_req_request r join t_req_type rt on rt.id = r.type join t_req_node rn on rn.request = r.id left join t_req_decision rd on rd.request = r.id and rd.node = rn.node join t_req_dataset rds on rds.request = r.id where rn.node = %ORANGE%821%ENDCOLOR% and rt.name = 'xfer' and rd.decision = 'y' and dataset_id in (select distinct b.dataset from t_dps_block b join t_dps_block_replica br on b.id = br.block join t_dps_dataset d on d.id = b.dataset where node = %ORANGE%821%ENDCOLOR% ) order by r.time_create desc ; ID CREATED_BY TIME_CREATE REQCOMID DATASET_ID NAME DECIDED_BY TIME_DECIDED ACCOMID ---------- ---------- ----------- ---------- ---------- 
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ---------- ------------ ---------- 441651 786542 1429196738 303750 674704 /RSGravToWW_kMpl01_M-1800_TuneCUETP8M1_13TeV-pythia8/RunIIWinter15GS-MCRUN2_71_V1-v1/GEN-SIM 786664 1429287626 303779 441651 786542 1429196738 303750 674709 /RSGravToWW_kMpl01_M-2500_TuneCUETP8M1_13TeV-pythia8/RunIIWinter15GS-MCRUN2_71_V1-v1/GEN-SIM ... 
</pre>

---+ Regular Maintenance work

---++ Keeping the CMS GIT Siteconf updated

If you modify the local PhEDEx configurations then you'll have to publish these changes into https://git.cern.ch/reps/siteconf ; your CERN id and password are required. The =.git/config= file used by Fabio follows:%TWISTY%
<pre>
[martinelli_f@t3ui18 siteconf]$ cat /shome/martinelli_f/git/siteconf/.git/config
[core]
        repositoryformatversion = 0
        filemode = true
        bare = false
        logallrefupdates = true
[remote "origin"]
        fetch = +refs/heads/*:refs/remotes/origin/*
        url = https://martinel@git.cern.ch/reps/siteconf
[branch "master"]
        remote = origin
        merge = refs/heads/master
[user]
        name = Fabio Martinelli
        email = fabio.martinelli@psi.ch
[merge]
        tool = vimdiff
[color]
        diff = auto
        status = auto
        branch = auto
</pre>%ENDTWISTY%

---++ Nagios

https://t3nagios.psi.ch/check_mk/index.py?start_url=%2Fcheck_mk%2Fview.py%3Fview_name%3Dhost%26host%3Dt3cmsvobox01%26site%3D

---++ Checking the recent transfer errors

https://cmsweb.cern.ch/phedex/prod/Activity::ErrorInfo?tofilter=T3_CH_PSI&fromfilter=&report_code=.*&xfer_code=.*&to_pfn=.*&from_pfn=.*&log_detail=.*&log_validate=.*&.submit=Update#

---++ Dataset cleaning

This task must be done regularly (once every 2 months, for example), both for CSCS and PSI.
*Getting the datasets list*
<verbatim>
ssh root@t3cmsvobox.psi.ch
su - phedex
cd svn-sandbox/phedex/DB-query-tools/
source /home/phedex/PHEDEX/4.1.7/etc/profile.d/env.sh  # <-- change that 4.1.7 if newer
./ListSiteDataInfo.pl -w -t --db ~/config/DBParam.PSI:Prod/PSI -s "%CSCS%" | grep "eleted"
./ListSiteDataInfo.pl -w -t --db ~/config/DBParam.PSI:Prod/PSI -s "%CSCS%" | grep -vE "Dutta|Fanfani|Kress|Magini|Wuerthwein|Belforte|Spinoso|Ajit|DataOps|eleted|StoreResults|Argiro|Klute|vocms237|IntelROCCS"
./ListSiteDataInfo.pl -w -t --db ~/config/DBParam.PSI:Prod/PSI -s "%PSI%"
</verbatim>
The *first* Perl command creates a list of datasets that can be safely deleted from CSCS, as they are just support requests for transfers to PSI (check that the transfer completed successfully). <br />
The *second* command creates a list that excludes the central requests and the datasets that can be deleted from CSCS.<br />
The *third* command produces a list for PSI.

Datasets which are proposed for deletion are all the datasets which have an *expired retention time*.

*Publishing the list and notifying users*

The due date for feedback is usually one week later. Lists must be published in DataSetCleaningQuery (previous lists must be deleted). To get the total size proposed for deletion, create a temporary text file with the list pasted from the twiki and then run:
<verbatim>
cat tmp.list | awk 'BEGIN{sum=0}{sum+=$4}END{print sum/1024.}'
</verbatim>
This gives the total size in TB (the fourth column holds the dataset sizes in GB). An email like this must be sent to the =cms-tier3-users@lists.psi.ch= mailing list:
<verbatim>
Subject: Dataset deletion proposal and request for User Data cleaning - Due date: 28 Oct 2011, 9:0

Dear all,

a new cleaning campaign is needed, both at CSCS and PSI. You can find the list and the instructions on how to request to keep the data here:

https://wiki.chipp.ch/twiki/bin/view/CmsTier3/DataSetCleaningQuery

The data contained in the lists amount to 47TB / 44TB for CSCS / PSI.
If you need to store a dataset both at CSCS and at PSI please also reply to this email explaining why.

Please remember to clean up your user folder at CSCS regularly; a usage overview can be found at [1] and [2]

Thanks,
Daniel

[1] http://ganglia.lcg.cscs.ch/ganglia/cms_sespace.txt
[2] http://ganglia.lcg.cscs.ch/ganglia/files_cms.html
</verbatim>

---++ Dataset cleaning - 2nd version

Derek also wrote this less cryptic Python tool ( you don't need to know the Oracle DB tables and columns, nor Perl ):
<pre>[root@t3cmsvobox01 DB-query-tools]# ./ListSiteDataInfoWS.py --site T3_CH_PSI
Getting the data from the data service...
| *keep?*| *ID*| *Dataset*|*Size(GB)*| *Group*|*Requested on*|*Requested by*|*Comments*|*Comments2*|
| | 225527|/GluGluToHToWWTo2L2Nu_M-160_7TeV-powheg-pythia6/Winter10-E7TeV_ProbDist_2011Flat_BX156_START39_V8-v1/AODSIM|25.5| b-tagging|2011-02-18 13:35:49|Wolfram Erdmann|retention time April 2011|to be deleted from CSCS|
| | 269087|/BdToMuMu_2MuPtFilter_7TeV-pythia6-evtgen/Summer11-PU_S4_START42_V11-v1/GEN-SIM-RECO|58.6| b-physics|2011-06-08 12:34:25|Christoph Naegeli|retention-time: 2011-10-31| |
| | 320266|/RelValProdTTbar/SAM-MC_42_V12_SAM-v1/GEN-SIM-RECO|3.1| FacOps|2011-09-13 09:58:51|Andrea Sciaba| |Centrally approved (Nicolo)|
...
</pre>

---++ Renewing myproxy certificate for !PhEDEx transfers (once every 11 months)

Nagios checks daily the [[https://t3nagios.psi.ch/nagios/cgi-bin/extinfo.cgi?type=2&host=t3cmsvobox&service=CMS+VOMS+proxy+age][voms proxy lifetime]] used by PhEDEx. This proxy is Fabio's CMS proxy, and because of that all the PhEDEx files uploaded in =/pnfs/psi.ch/cms/= belong to his account. If you change that proxy then you have to change the ownership of the related files and directories in =/pnfs/psi.ch/cms= ; specifically you'll want to change the owner of =/pnfs/psi.ch/cms/trivcat/store/data= , otherwise you will get a lot of =permission denied= errors.
The following shows how to upload a long-lived proxy into =myproxy.cern.ch=:
<pre>%BLUE%$%ENDCOLOR% myproxy-init -t 168 -R 't3cmsvobox.psi.ch' -l %GREEN%psi_phedex_fabio%ENDCOLOR% -x -k renewable -s myproxy.cern.ch -c %RED%8700%ENDCOLOR%
Your identity: /DC=com/DC=quovadisglobal/DC=grid/DC=switch/DC=users/C=CH/O=Paul-Scherrer-Institut (PSI)/CN=Fabio Martinelli
Enter GRID pass phrase for this identity:
Creating proxy .......................................................................................................................................... Done
Proxy Verify OK
Warning: your certificate and proxy will expire Thu Dec 10 01:00:00 2015
which is within the requested lifetime of the proxy
A proxy valid for %RED%8700%ENDCOLOR% hours (%RED%362.5 days%ENDCOLOR%) for user %GREEN%psi_phedex_fabio%ENDCOLOR% now exists on myproxy.cern.ch.
# That %RED%362.5 days%ENDCOLOR% is wrong ! See the warning above and the timeleft reported below.

%BLUE%$%ENDCOLOR% myproxy-info -s myproxy.cern.ch -l %GREEN%psi_phedex_fabio%ENDCOLOR%
username: %GREEN%psi_phedex_fabio%ENDCOLOR%
owner: /DC=com/DC=quovadisglobal/DC=grid/DC=switch/DC=users/C=CH/O=Paul-Scherrer-Institut (PSI)/CN=Fabio Martinelli
  name: renewable
  renewal policy: */CN=t3cmsvobox.psi.ch
  timeleft: 6249:20:19 (%RED%260.4 days%ENDCOLOR%)
</pre>
The present myproxy servers have problems with host certificates for PSI from SWITCH, because their DN contains a "(PSI)" substring and the parentheses are not correctly escaped in the regexp matching of the myproxy code. Therefore, the renewer DN (the =-R= argument to =myproxy-init= below) and the _allowed renewers policy on the myproxy server_ need to be defined with wildcards so that the matching succeeds.
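The escaping problem can be reproduced outside myproxy. The following is only a minimal Python sketch of this general class of bug, not the actual myproxy code: the three helper functions are hypothetical names introduced for illustration, and the host DN is the one quoted on this page. Used as an unescaped regular expression, the DN does not match itself, because =(PSI)= is read as a group that matches the text =PSI= without parentheses; a wildcard policy on the =CN= avoids the parentheses entirely.

```python
import fnmatch
import re

# Host DN as quoted on this page; note the literal "(PSI)" in the O= field.
HOST_DN = (
    "/DC=com/DC=quovadisglobal/DC=grid/DC=switch/DC=hosts/C=CH/ST=Aargau"
    "/L=Villigen/O=Paul-Scherrer-Institut (PSI)/OU=AIT/CN=t3cmsvobox.psi.ch"
)


def matches_as_raw_regex(policy, dn):
    """Hypothetical matcher: uses the policy string as a regex WITHOUT escaping."""
    return re.fullmatch(policy, dn) is not None


def matches_as_escaped_regex(policy, dn):
    """Same matcher with the policy escaped, so "(" and ")" are literal characters."""
    return re.fullmatch(re.escape(policy), dn) is not None


def matches_as_wildcard(policy, dn):
    """Shell-style wildcard matching, as in the policy '*/CN=t3cmsvobox.psi.ch'."""
    return fnmatch.fnmatchcase(dn, policy)


if __name__ == "__main__":
    # The full DN used as a raw regex fails to match itself.
    print("raw regex    :", matches_as_raw_regex(HOST_DN, HOST_DN))
    # Escaping the parentheses would have fixed the comparison.
    print("escaped regex:", matches_as_escaped_regex(HOST_DN, HOST_DN))
    # The wildcard policy sidesteps the "(PSI)" part of the DN.
    print("wildcard     :", matches_as_wildcard("*/CN=t3cmsvobox.psi.ch", HOST_DN))
```

The wildcard form is the workaround that can be applied on our side, since the matching code on the myproxy server is not under our control.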
<pre> voms-proxy-init -voms cms myproxyserver=myproxy.cern.ch <span style="text-decoration: line-through;">servicecert="/DC=com/DC=quovadisglobal/DC=grid/DC=switch/DC=hosts/C=CH/ST=Aargau/L=Villigen/O=Paul-Scherrer-Institut (PSI)/OU=AIT/CN=t3cmsvobox.psi.ch"</span> servicecert='*/CN=t3cmsvobox.psi.ch' myproxy-init -s $myproxyserver -l psi_phedex -x -R "$servicecert" -c 720 scp ~/.x509up_u$(id -u) phedex@t3ui01:gridcert/proxy.cert # for testing, you can try myproxy-info -s $myproxyserver -l psi_phedex </pre> As the phedex user do <pre>chmod 600 ~/gridcert/proxy.cert </pre> You should test whether the renewal of the certificate works for the phedex user: unset X509_USER_PROXY # make sure that the service credentials from ~/.globus are used! <pre>voms-proxy-init # initializes the service proxy cert that is allowed to retrieve the user cert myproxyserver=myproxy.cern.ch myproxy-get-delegation -s $myproxyserver -v -l psi_phedex -a /home/phedex/gridcert/proxy.cert -o /tmp/gagatest export X509_USER_PROXY=/tmp/gagatest srm-get-metadata srm://t3se01.psi.ch:8443/srm/managerv1?SFN=/pnfs/psi.ch/cms rm /tmp/gagatest </pre> ---++ Storage Consistency Checks From time to time the transfer team will ask for input for their [[https://twiki.cern.ch/twiki/bin/view/CMSPublic/CompOpsTransferTeamConsistencyChecks][storage consistency check]] (so far only for T2); the last CSCS check was in [[https://ggus.eu/index.php?mode=ticket_info&ticket_id=101366][Feb 2014]] ; to perform a 'Storage Consistency Check' we need to complete the following steps: * make sure PhEDEx is updated to the latest version and its config is committed in [[https://twiki.cern.ch/twiki/bin/view/CMSPublic/CompOpsSiteconfMigrationToGit][GIT]] * ask CSCS admins for a storage dump <pre>python chimera-dump.py -s /pnfs/lcg.cscs.ch/cms -c fulldump -g -o /tmp/outfile</pre> * convert the file using: <pre>sed -e 's#/pnfs/lcg.cscs.ch/cms/trivcat/store/\(mc\|data\|generator\|results\|hidata\|himc\|lumi\|relval\)/#/store/\1/#' \ 
-e '/<entry name="\/pnfs\/lcg.cscs.ch\/cms\/.*<\/entry>/d' \ -e 's#<dCache:location>.*</dCache:location>##' \ outfile.xml | uniq > storagedump.xml </pre> * compress, store on AFS, and send path to transfer team * take the file you get back from the transfer team with the LFNs to be deleted <pre>for LFN in $(cat SCC_Nov2012_CSCS_LFNsToBeRemoved.txt); do lcg-del -b -D srmv2 -l srm://storage01.lcg.cscs.ch:8443/srm/managerv2?SFN=/pnfs/lcg.cscs.ch/cms/trivcat/$LFN; done </pre> ---+ Emergency Measures <!-- #List any measures that must be taken in case of some major incident, e.g. whether a mailing #list must be contacted or whether other services need to be shut down, etc. --> ---+ Services ---++ =/home/phedex/phedex_start.sh= To be manually invoked after a server restart ! ---++ =/home/phedex/phedex_stop.sh= ---++ =/home/phedex/phedex_status.sh= <!-- #List all the important services, their installation, configuration and how to start and stop them --> <pre> -bash-4.1$ pstree -uh phedex -la %BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Toolkit/Transfer/FileDownload -state /home/phedex/state/Debug/incoming/download/ -log /home/phedex/log/Debug/download -verbose -db /home/phedex/config/DBParam.PSI:Debug/PSI -nodesT3_CH %BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Toolkit/Transfer/FileExport -state /home/phedex/state/Debug/incoming/fileexport/ -log /home/phedex/log/Debug/fileexport -db /home/phedex/config/DBParam.PSI:Debug/PSI -nodes T3_CH_PSI-s %BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Toolkit/Transfer/FileRemove -state /home/phedex/state/Debug/incoming/fileremove/ -log /home/phedex/log/Debug/fileremove -node T3_CH_PSI -db /home/phedex/config/DBParam.PSI:Debug/PSI-pr %BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Toolkit/Verify/BlockDownloadVerify -state /home/phedex/state/Debug/incoming/blockverify/ -log /home/phedex/log/Debug/blockverify --db /home/phedex/config/DBParam.PSI:Debug/PSI --nodesT %BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Utilities/AgentFactory.pl -state 
/home/phedex/state/Debug/incoming/watchdog/ -log /home/phedex/log/Debug/watchdog -db /home/phedex/config/DBParam.PSI:Debug/PSI -node T3_CH_PSI -config/ %BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Utilities/AgentFactoryLite.pl -state /home/phedex/state/Debug/incoming/WatchdogLite/ -log /home/phedex/log/Debug/WatchdogLite -node T3_CH_PSI -agent_list watchdog %BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Toolkit/Transfer/FileDownload -state /home/phedex/state/Dev/incoming/download/ -log /home/phedex/log/Dev/download -verbose -db /home/phedex/config/DBParam.PSI:Dev/PSI -nodes T3_CH_PSI- %BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Toolkit/Transfer/FileExport -state /home/phedex/state/Dev/incoming/fileexport/ -log /home/phedex/log/Dev/fileexport -db /home/phedex/config/DBParam.PSI:Dev/PSI -nodes T3_CH_PSI-storage %BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Toolkit/Transfer/FileRemove -state /home/phedex/state/Dev/incoming/fileremove/ -log /home/phedex/log/Dev/fileremove -node T3_CH_PSI -db /home/phedex/config/DBParam.PSI:Dev/PSI-protocol %BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Toolkit/Verify/BlockDownloadVerify -state /home/phedex/state/Dev/incoming/blockverify/ -log /home/phedex/log/Dev/blockverify --db /home/phedex/config/DBParam.PSI:Dev/PSI --nodesT3_CH_P %BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Utilities/AgentFactory.pl -state /home/phedex/state/Dev/incoming/watchdog/ -log /home/phedex/log/Dev/watchdog -db /home/phedex/config/DBParam.PSI:Dev/PSI -node T3_CH_PSI -config/home/p %BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Utilities/AgentFactoryLite.pl -state /home/phedex/state/Dev/incoming/WatchdogLite/ -log /home/phedex/log/Dev/WatchdogLite -node T3_CH_PSI -agent_list watchdog %BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Toolkit/Transfer/FileDownload -state /home/phedex/state/Prod/incoming/download/ -log /home/phedex/log/Prod/download -verbose -db /home/phedex/config/DBParam.PSI:Prod/PSI -nodesT3_CH_PS %BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Toolkit/Transfer/FileExport -state 
/home/phedex/state/Prod/incoming/fileexport/ -log /home/phedex/log/Prod/fileexport -db /home/phedex/config/DBParam.PSI:Prod/PSI -nodes T3_CH_PSI-stor %BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Toolkit/Transfer/FileRemove -state /home/phedex/state/Prod/incoming/fileremove/ -log /home/phedex/log/Prod/fileremove -node T3_CH_PSI -db /home/phedex/config/DBParam.PSI:Prod/PSI-proto %BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Toolkit/Verify/BlockDownloadVerify -state /home/phedex/state/Prod/incoming/blockverify/ -log /home/phedex/log/Prod/blockverify --db /home/phedex/config/DBParam.PSI:Prod/PSI --nodesT3_C %BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Utilities/AgentFactory.pl -state /home/phedex/state/Prod/incoming/watchdog/ -log /home/phedex/log/Prod/watchdog -db /home/phedex/config/DBParam.PSI:Prod/PSI -node T3_CH_PSI -config/hom %BLUE%perl%ENDCOLOR% /home/phedex/PHEDEX/Utilities/AgentFactoryLite.pl -state /home/phedex/state/Prod/incoming/WatchdogLite/ -log /home/phedex/log/Prod/WatchdogLite -node T3_CH_PSI -agent_list watchdog bash └─pstree -uh phedex -la </pre>%ENDTWISTY% ---++ =netstat -tp= <pre> [root@t3cmsvobox01 git]# netstat -tp Active Internet connections (w/o servers) Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name tcp 0 0 t3cmsvobox01.psi.ch:42088 itrac50063-v.cern.ch:10121 ESTABLISHED 23470/perl tcp 0 0 t3cmsvobox01.psi.ch:34154 itrac50063-v.cern.ch:10121 ESTABLISHED 22967/perl tcp 0 0 t3cmsvobox01.psi.ch:38659 t3ldap01.psi.ch:ldaps ESTABLISHED 5581/nslcd tcp 0 0 t3cmsvobox01.psi.ch:38662 t3ldap01.psi.ch:ldaps ESTABLISHED 5581/nslcd tcp 0 0 t3cmsvobox01.psi.ch:43269 itrac50063-v.cern.ch:10121 ESTABLISHED 23973/perl tcp 0 0 t3cmsvobox01.psi.ch:41053 itrac50063-v.cern.ch:10121 ESTABLISHED 22846/perl tcp 0 0 t3cmsvobox01.psi.ch:43886 t3admin01.psi.ch:4505 ESTABLISHED 9353/python2.6 <-- salt minion tcp 0 0 t3cmsvobox01.psi.ch:40990 itrac50063-v.cern.ch:10121 ESTABLISHED 23267/perl tcp 0 0 t3cmsvobox01.psi.ch:51930 t3service01.p:fujitsu-dtcns 
ESTABLISHED 1224/syslog-ng tcp 1 0 t3cmsvobox01.psi.ch:39198 t3frontier01.psi.ch:squid CLOSE_WAIT 2530/cvmfs2 tcp 0 0 t3cmsvobox01.psi.ch:41978 itrac50063-v.cern.ch:10121 ESTABLISHED 23770/perl tcp 1 0 t3cmsvobox01.psi.ch:55127 t3frontier01.psi.ch:squid CLOSE_WAIT 2530/cvmfs2 tcp 0 0 t3cmsvobox01.psi.ch:38663 t3ldap01.psi.ch:ldaps ESTABLISHED 5581/nslcd tcp 0 0 t3cmsvobox01.psi.ch:733 t3fs06.psi.ch:nfs ESTABLISHED - tcp 0 0 t3cmsvobox01.psi.ch:41150 itrac50063-v.cern.ch:10121 ESTABLISHED 23852/perl tcp 0 0 t3cmsvobox01.psi.ch:42399 itrac50063-v.cern.ch:10121 ESTABLISHED 22764/perl tcp 0 0 t3cmsvobox01.psi.ch:38660 t3ldap01.psi.ch:ldaps ESTABLISHED 5581/nslcd tcp 0 0 t3cmsvobox01.psi.ch:41061 t3admin01.psi.ch:4506 ESTABLISHED 9353/python2.6 <-- salt minion tcp 0 0 t3cmsvobox01.psi.ch:38674 t3ldap01.psi.ch:ldaps ESTABLISHED 5581/nslcd tcp 0 0 t3cmsvobox01.psi.ch:41821 itrac50063-v.cern.ch:10121 ESTABLISHED 23349/perl </pre> ---++ Checking each CMS pool by Nagios through both =t3se01:SRM= and =t3dcachedb:Xrootd= By =t3cmsvobox= in turn contacted by =t3nagios= we retrieve a file from each CMS pool through both =t3se01:SRM= and =t3dcachedb:Xrootd= https://t3nagios.psi.ch/check_mk/index.py?start_url=%2Fcheck_mk%2Fview.py%3Fview_name%3Dhost%26host%3Dt3cmsvobox01%26site%3D In both the cases the test files retrieved are : <pre>[martinelli_f@t3ui12 ~]$ find /pnfs/psi.ch/cms/t3-nagios/ | grep M | sort /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs01_cms /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs02_cms ... /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_9 </pre> The related dCache files have to be obviously placed on the right CMS pool otherwise the Nagios tests will be wrong ! To easily check where they are really placed run this SQL code ( in this example some test files are %RED%erroneously%ENDCOLOR% available in the wrong pool ! 
that was due to a bad =migration cache= command ) %TWISTY% <pre> [root@t3dcachedb03 ~]# psql -U nagios -d chimera -c " select path,ipnfsid,pools from v_pnfs where path like '%1MB-test-file_pool_%' ; " path | ipnfsid | pools -------------------------------------------------------------+--------------------------------------+------------------------------------ /pnfs/psi.ch/dteam/t3-nagios/1MB-test-file_pool_t3fs09_ops | 0000BCDA4B329DA94D64AAAFE7C0C7501E5C | t3fs09_ops /pnfs/psi.ch/dteam/t3-nagios/1MB-test-file_pool_t3fs08_ops | 0000358B14867ED5402184C2C22F81EFC861 | t3fs08_ops /pnfs/psi.ch/dteam/t3-nagios/1MB-test-file_pool_t3fs07_ops | 0000409BB804C95944A38DBE8220B416A8A3 | t3fs07_ops /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_9 | 0000B58A7FA17778439F8F6F47C5CBBED5E7 | t3fs03_cms t3fs11_cms t3fs14_cms_9 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_8 | 00001A2FD52D31DB4CCAB99C8B8336522339 | t3fs09_cms t3fs11_cms t3fs14_cms_8 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_7 | 000018AA61C1E30F43709F0D9FE3B9CD65D1 | t3fs03_cms t3fs14_cms_7 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_6 | 0000E88C6CBB2D5A4365B11BE2EDD1554366 | t3fs02_cms t3fs14_cms_6 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_5 | 000200000000000006300738 | t3fs10_cms t3fs14_cms_5 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_4 | 0002000000000000052EF198 | t3fs03_cms t3fs14_cms_4 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_3 | 0002000000000000052EF168 | t3fs03_cms t3fs14_cms_3 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_2 | 0002000000000000052EF138 | t3fs07_cms t3fs14_cms_2 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_11 | 00003616229002194F439925DA3C7F1CFA02 | t3fs10_cms t3fs14_cms_11 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_10 | 0000B3D6A96EF961473AACB05F80CF9D6892 | t3fs07_cms t3fs14_cms_10 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_1 | 0002000000000000052EF108 
| t3fs02_cms t3fs11_cms t3fs14_cms_1 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_0 | 0000A6470E0458354BD99D6C2DD27B196DCC | t3fs08_cms t3fs14_cms_0 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms | 0002000000000000052EF0D8 | t3fs03_cms t3fs04_cms t3fs14_cms /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs13_cms_9 | 00004783F9158A5941B284342FF4A8EDE126 | t3fs08_cms t3fs13_cms_9 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs13_cms_8 | 0000132841305C27434891574015FD2CF923 | t3fs09_cms t3fs13_cms_8 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs13_cms_7 | 00003FC27733ACBA4A809677419256FE22F9 | t3fs02_cms t3fs11_cms t3fs13_cms_7 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs13_cms_6 | 0002000000000000072F8630 | t3fs07_cms t3fs11_cms t3fs13_cms_6 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs13_cms_5 | 0002000000000000052EF0A8 | t3fs03_cms t3fs13_cms_5 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs13_cms_4 | 0002000000000000052EF078 | t3fs10_cms t3fs11_cms t3fs13_cms_4 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs13_cms_3 | 0002000000000000052EF048 | t3fs10_cms t3fs13_cms_3 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs13_cms_2 | 0002000000000000052EF018 | t3fs02_cms t3fs13_cms_2 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs13_cms_11 | 00000DB49D5B69EB4C568834BD162C3DA8E7 | t3fs09_cms t3fs13_cms_11 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs13_cms_10 | 0000073FF4F754BB4AB1B4599F412811BDA2 | t3fs10_cms t3fs13_cms_10 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs13_cms_1 | 00000CB9E97140F940CD973C319045B43FDA | t3fs04_cms t3fs11_cms t3fs13_cms_1 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs13_cms_0 | 00005560491A76DE49DBA142D3BE3CFE38D5 | t3fs02_cms t3fs11_cms t3fs13_cms_0 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs13_cms | 0002000000000000052EEFB8 | t3fs07_cms t3fs11_cms t3fs13_cms /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs11_cms | 00009E4A9774085C4799B5C9C827DA03406F | t3fs11_cms 
/pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs10_cms | 000005D1DD24CA14448694E5C46A8AA8E91F | t3fs10_cms /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs09_cms | 0000479ED8FDDC374BC68827AEDF1C146686 | t3fs09_cms /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs08_cms | 00003A989AB6D1074D738594B1D01E2D03DE | t3fs08_cms /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs07_cms | 0000119DDCFD0C5F42B89769BC9C104A997F | t3fs07_cms /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs04_cms_1 | 0002000000000000063D8C68 | t3fs04_cms_1 /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs04_cms | 00020000000000000395B300 | t3fs04_cms /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs03_cms | 000200000000000006391F88 | t3fs03_cms /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs02_cms | 00020000000000000330BF10 | t3fs02_cms /pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs01_cms | 00020000000000000330BF90 | t3fs01_cms </pre> %ENDTWISTY% ---+ Backups OS snapshots are nightly taken by the PSI VMWare Team ( contact Peter Huesser ) + we have LinuxBackupsByLegato to recover a single file.
| *NodeTypeForm* ||
| *Hostnames* | t3cmsvobox ( t3cmsvobox01 ) |
| *Services* | PhEDEx 4.1.3 |
| *Hardware* | PSI DMZ VMWare cluster |
| *Install Profile* | vobox |
| *Guarantee/maintenance until* | VMWare PSI Cluster |
Topic revision: r41 - 2016-01-21 - FabioMartinelli
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.