Node Type: CmsVoBox

Firewall requirements

| *local port* | *open to* | *reason* |


Official Doc ( pretty chaotic )

CSCS Similar Doc

Refer to the description on the LCGTier2/CmsVObox.

There is one important difference between CSCS and PSI: while we use FTS channels for the transfers to CSCS, we use the SRM backend for transfers to PSI, because we do not have an FTS channel for PSI. This issue is linked to registering PSI as a regular grid site, which until recently was not possible, since we only support a Grid SE but not a CE.

Thus there isn't a file in the configuration area for the PhEDEx services.

Installation by Puppet

Installations are done by Fabio at PSI; usually nobody apart from him needs to care about this task.

The installation is described by the Puppet files tier3-baseclasses.pp and SL6_vobox.pp, both saved in the directory pdirmanifests, where pdirmanifests is defined among Fabio's shell aliases :

alias ROOT='. /afs/ && . /afs/'
alias cscsela='ssh -AX'
alias cscslogin='ssh -AX'
alias cscspub='ssh -AX'
alias dcache='ssh -2 -l admin -p 22224'
alias dcache04='ssh -2 -l admin -p 22224'
alias gempty='git commit --allow-empty-message -m '\'''\'''
alias kscustom54='cd /afs/'
alias kscustom57='cd /afs/'
alias kscustom60='cd /afs/'
alias kscustom64='cd /afs/'
alias kscustom66='cd /afs/'
alias ksdir='cd /afs/'
alias ksprepostdir='cd /afs/'
alias l.='ls -d .* --color=auto'
alias ll='ls -l --color=auto'
alias ls='ls --color=tty'
alias mc='. /usr/libexec/mc/'
alias pdir='cd /afs/'
alias pdirf='cd /afs/'
alias pdirmanifests='cd /afs/'
alias pdirredhat='cd /afs/'
alias pdirsolaris='cd /afs/'
alias vi='vim'
alias which='alias | /usr/bin/which --tty-only --read-alias --show-dot --show-tilde'
alias yumdir5='cd /afs/'
alias yumdir6='cd /afs/'
alias yumdir7='cd /afs/'
alias yumdir7old='cd /afs/'

local X509

A host X509 certificate is essential to regularly renew the proxy saved in /home/phedex/gridcert/proxy.cert ; the phedex user picks it up through ~/.globus :
# ll /home/phedex/.globus/
total 4
lrwxrwxrwx 1 phedex phedex   31 Apr 13 18:44 usercert.pem -> /etc/grid-security/hostcert.pem
-r-------- 1 phedex phedex 1679 Apr 13 18:44 userkey.pem

[root@t3cmsvobox01 ~]# grid-cert-info --file  /etc/grid-security/hostcert.pem
        Version: 3 (0x2)
        Serial Number: 131 (0x83)
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: DC=ORG, DC=SEE-GRID, CN=SEE-GRID CA 2013
            Not Before: Feb  3 12:05:29 2016 GMT
            Not After : Feb  2 12:05:29 2017 GMT
        Subject: DC=EU, DC=EGI, C=CH,
This is the cron job that constantly renews the proxy /home/phedex/gridcert/proxy.cert used by PhEDEx for its data transfers ; it also produces the PhEDEx stats in /shome/phedex/phedex-statistics.txt :
-bash-4.1$ cat /etc/cron.d/phedex 


#22 */4 * * * phedex source /home/phedex/PHEDEX/$PHEDEXVER/etc/profile.d/ ; unset X509_USER_PROXY ; /usr/bin/voms-proxy-init ; /usr/bin/myproxy-get-delegation -s -k renewable -v -l psi_phedex_2016_fabio -a /home/phedex/gridcert/proxy.cert -o /home/phedex/gridcert/proxy.cert; export X509_USER_PROXY=/home/phedex/gridcert/proxy.cert; /usr/bin/voms-proxy-init -noregen -voms cms

# Feb '16
22 */4 * * * phedex source /home/phedex/PHEDEX/$PHEDEXVER/etc/profile.d/ ; unset X509_USER_PROXY ; /usr/bin/voms-proxy-init ; /usr/bin/myproxy-get-delegation -s -k renewable -v -l psi_t3cmsvobox_phedex_joosep_2016 -a /home/phedex/gridcert/proxy.cert -o /home/phedex/gridcert/proxy.cert; export X509_USER_PROXY=/home/phedex/gridcert/proxy.cert; /usr/bin/voms-proxy-init -noregen -voms cms

# logrotate (the config file was generated by /home/phedex/config/SITECONF/T3_CH_PSI/PhEDEx/
05 0 * * * phedex /usr/sbin/logrotate -s /home/phedex/state/logrotate.state /home/phedex/config/logrotate.conf

# log parsing and writing of results to the shared home area for consumption by the external web server
*/15 * * * * phedex source /home/phedex/PHEDEX/$PHEDEXVER/etc/profile.d/; echo -e generated on `date` "\n------------------------" > $SUMMARYFILE; echo "Prod:" >> $SUMMARYFILE;/home/phedex/init.d/phedex_Prod status >> $SUMMARYFILE;  echo "Debug:" >> $SUMMARYFILE;/home/phedex/init.d/phedex_Debug status >> $SUMMARYFILE; /home/phedex/PHEDEX/$PHEDEXVER/Utilities/InspectPhedexLog -c 300 -es "-12 hours" -d /home/phedex/log/Prod/download /home/phedex/log/Debug/download >> $SUMMARYFILE 2>/dev/null
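
As a reading aid for the schedule fields above, the following minimal Python sketch expands a `MIN */STEP * * *` entry into its daily run times. The helper name `cron_run_times` is made up for this illustration and only covers this simple form of schedule, not cron syntax in general.

```python
def cron_run_times(minute, hour_step):
    """Expand a 'MIN */STEP * * *' cron schedule into the daily HH:MM run times."""
    return [f"{hour:02d}:{minute:02d}" for hour in range(0, 24, hour_step)]


# "22 */4 * * *" is the schedule of the proxy renewal entry.
proxy_renewal_times = cron_run_times(22, 4)
print(proxy_renewal_times)
```

So the proxy is renewed six times a day, at minute 22 of every fourth hour.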


Read the CVMFS page

Be aware of the local /cvmfs/ automatic mount point, since /cvmfs is nowadays used by our PhEDEx configurations :

[root@t3cmsvobox01 git]# df -h 
Filesystem       Size  Used Avail Use% Mounted on
/dev/sda2        5.7G  4.3G  1.2G  79% /
tmpfs            3.9G     0  3.9G   0% /dev/shm
/dev/sda1        477M   32M  420M   7% /boot
/dev/sda5        2.9G  640M  2.1G  24% /home
/dev/sdb1         20G  9.1G   11G  46% /opt/cvmfs_local  <-- local /cvmfs cache
/dev/sda6        969M  1.7M  917M   1% /tmp
/dev/sda7        5.7G  874M  4.6G  16% /var
/dev/sdc1        9.9G  102M  9.3G   2% /var/cache/openafs
t3fs06:/shome    6.7T  5.0T  1.8T  75% /shome
t3fs05:/swshare  1.8T  562G  1.3T  31% /swshare
AFS              2.0T     0  2.0T   0% /afs
cvmfs2            14G  9.0G  4.7G  66% /cvmfs/
This matters because of /cvmfs/ , which in turn is the target of this symlink :
# ll /home/phedex/config/COMP/SITECONF/T3_CH_PSI/PhEDEx/storage.xml
lrwxrwxrwx 1 phedex phedex 52 Apr 13 18:45 /home/phedex/config/COMP/SITECONF/T3_CH_PSI/PhEDEx/storage.xml -> /cvmfs/

Pitfalls in dcache-srmclient-2.10.7-1 ( currently the latest dcache-srmclient )

Strangely, PhEDEx has a strong dependency on dcache-srmclient : you can't replace it with equivalent SRM tools like lcg-cp or gfal-copy. In its latest version Fabio noticed a changed default :
srmcp as in dcache-srmclient-2.2.4-2.el6.x86_64 had, by default, -delegate=true
srmcp as in dcache-srmclient-2.10.7-1.noarch has now, by default, -delegate=false 
Paul Millar ( a primary dCache Dev ) commented as follows :
srmcp tries to avoid the wall-clock time and CPU overhead of delegation if that delegation isn't necessary. 
Unfortunately, there is a bug: the copyjobfile ( used by PhEDEx ) option is not consulted when determining 
whether third-party transfers are involved. The consequence is that all such transfers are considered 
second-party and no delegation is done.
This bug badly affects PhEDEx : a working PhEDEx/dcache-srmclient-2.2.4-2 configuration will stop working simply by migrating to PhEDEx/dcache-srmclient-2.10.7-1.noarch, and you'll get ( cryptic ) errors like :
21 Apr 2015 07:11:13 (SRM-t3se01) [ VI8:439841:srm2:copy:-2098574001]
  failed to connect to srm://
     credential remaining lifetime is less then a minute
Fabio fixed this by explicitly requesting -delegate=true to bypass the current copyjob bug :
[root@t3cmsvobox01 PhEDEx]# grep -Hn srmcp /home/phedex/config/SITECONF/T3_CH_PSI/PhEDEx/ConfigPart* | grep -v \#
/home/phedex/config/SITECONF/T3_CH_PSI/PhEDEx/ConfigPart.DebugServices:13:   -command     srmcp,-delegate=true,-pushmode=true,-debug=true,-retry_num=2,-protocols=gsiftp,-srm_protocol_version=2,-streams_num=1,-globus_tcp_port_range=20000:25000
/home/phedex/config/SITECONF/T3_CH_PSI/PhEDEx/ConfigPart.Standard:13:   -command     srmcp,-delegate=true,-pushmode=true,-debug=true,-retry_num=2,-protocols=gsiftp,-srm_protocol_version=2,-streams_num=1,-globus_tcp_port_range=20000:25000
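
To double check that a `-command` value really forces delegation, a small Python sketch like the one below can split the comma-separated string into its options. The function name `parse_srmcp_command` is an assumption made for this example; it is not part of PhEDEx or dcache-srmclient.

```python
def parse_srmcp_command(command):
    """Split a PhEDEx '-command' value into the tool name and its -key=value options."""
    tool, *options = command.split(",")
    parsed = {"tool": tool}
    for option in options:
        # "-delegate=true" -> key "delegate", value "true"
        key, _, value = option.lstrip("-").partition("=")
        parsed[key] = value
    return parsed


# The value copied from ConfigPart.Standard above.
command = (
    "srmcp,-delegate=true,-pushmode=true,-debug=true,-retry_num=2,"
    "-protocols=gsiftp,-srm_protocol_version=2,-streams_num=1,"
    "-globus_tcp_port_range=20000:25000"
)
options = parse_srmcp_command(command)
delegation_forced = options.get("delegate") == "true"
print("delegate explicitly enabled:", delegation_forced)
```

If `delegate` is missing from the parsed options, srmcp falls back to its built-in default, which is exactly the case that broke after the upgrade.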
Fabio noticed another bug in dcache-srmclient-2.10.7-1 : the default proxy location /tmp/x509up_u`id -u` is still considered even if we explicitly pass the option -x509_user_proxy to use a different path. He reported it as follows :
Dear Paul and dCache colleagues, I believe I've found another bug in dcache-srmclient-2.10.7-1.noarch
$ srmls -debug=false -x509_user_proxy=/home/phedex/gridcert/proxy.cert -retry_num=0 'srm://'
srm client error:
java.lang.IllegalArgumentException: Multiple entries with same key:
x509_user_proxy=/home/phedex/gridcert/proxy.cert and
Fabio worked around it by exporting X509_USER_PROXY in the following PhEDEx scripts, instead of passing -x509_user_proxy :
[root@t3cmsvobox01 PhEDEx]# grep -Hn export /home/phedex/config/SITECONF/T3_CH_PSI/PhEDEx/FileDownload* --color
/home/phedex/config/SITECONF/T3_CH_PSI/PhEDEx/FileDownloadDelete:14:   export X509_USER_PROXY=/home/phedex/gridcert/proxy.cert && srmrm -retry_num=0 "$pfn";
/home/phedex/config/SITECONF/T3_CH_PSI/PhEDEx/FileDownloadSRMVerify:31:    *managerv2* ) echo $(export X509_USER_PROXY=/home/phedex/gridcert/proxy.cert && srmls -debug=false -retry_num=0 "$path" 2>/dev/null| grep $file | cut -d\  -f3);;
/home/phedex/config/SITECONF/T3_CH_PSI/PhEDEx/FileDownloadSRMVerify:44:        fields=($(export X509_USER_PROXY=/home/phedex/gridcert/proxy.cert && srmls -l -debug=false -retry_num=0 "$pfn" 2>/dev/null| grep Checksum))
/home/phedex/config/SITECONF/T3_CH_PSI/PhEDEx/FileDownloadSRMVerify:116:        *managerv2*) export X509_USER_PROXY=/home/phedex/gridcert/proxy.cert && srmrm -retry_num=0 "$pfn";;

PhEDEx git repo cloned as a reference

To follow the progress of the PhEDEx code, keep the local git repo up to date :
[root@t3cmsvobox01 git]# su - phedex
-bash-4.1$ cd git
-bash-4.1$ cd PHEDEX/
-bash-4.1$ git pull
remote: Counting objects: 14, done.
remote: Compressing objects: 100% (8/8), done.
remote: Total 14 (delta 2), reused 0 (delta 0), pack-reused 6
Unpacking objects: 100% (14/14), done.
   7768ae7..66c984f  master     -> origin/master
Updating 7768ae7..66c984f
 Contrib/ |  126 ++++++++++++++++++++++++++++++++++++++++++
 Utilities/testSpace/testAuth |   27 +++++++++
 2 files changed, 153 insertions(+), 0 deletions(-)
 create mode 100755 Contrib/
 create mode 100644 Utilities/testSpace/testAuth

How to connect to the PhEDEx DBs

PhEDEx itself connects to the CERN Oracle DBs, and you can inspect them directly with sqlplus. In another shell, watch your sqlplus connections with netstat -tp | grep sqlplus and, if sqlplus hangs, kill them with killall sqlplus. In real life you'll seldom need to connect with sqlplus, but it's important to be aware of this option :
[root@t3cmsvobox01 phedex]# su - phedex
-bash-4.1$ source /home/phedex/PHEDEX/etc/profile.d/
-bash-4.1$ which sqlplus
-bash-4.1$ sqlplus $(/home/phedex/PHEDEX/Utilities/OracleConnectId -db /home/phedex/config/DBParam.PSI:Prod/PSI)
SQL*Plus: Release Production on Wed May 27 14:16:11 2015
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options

SQL> select id,name from t_adm_node where name like '%CSCS%' or name like '%PSI%' ; 
---------- --------------------
   27 T2_CH_CSCS
  821 T3_CH_PSI

SQL> select distinct, r.created_by, r.time_create,r.comments reqcomid, rds.dataset_id,, rd.decided_by, rd.time_decided, rd.comments accomid  from t_req_request r join t_req_type rt on = r.type join t_req_node rn on rn.request = left join t_req_decision rd on rd.request = and rd.node = rn.node join t_req_dataset rds on rds.request = where rn.node = 821 and = 'xfer' and rd.decision = 'y' and dataset_id in (select distinct b.dataset  from t_dps_block b join t_dps_block_replica br on = br.block join t_dps_dataset d on = b.dataset where node = 821 ) order by r.time_create desc ; 

        ID CREATED_BY TIME_CREATE   REQCOMID DATASET_ID NAME                                                                                         DECIDED_BY TIME_DECIDED    ACCOMID
---------- ---------- ----------- ---------- ---------- -------------------------------------------------------------------------------------------- ---------- ------------ ----------
    441651     786542  1429196738     303750     674704 /RSGravToWW_kMpl01_M-1800_TuneCUETP8M1_13TeV-pythia8/RunIIWinter15GS-MCRUN2_71_V1-v1/GEN-SIM     786664   1429287626     303779
    441651     786542  1429196738     303750     674709 /RSGravToWW_kMpl01_M-2500_TuneCUETP8M1_13TeV-pythia8/RunIIWinter15GS-MCRUN2_71_V1-v1/GEN-SIM
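
The TIME_CREATE and TIME_DECIDED values look like Unix epoch seconds; that is an assumption, since the page does not state the unit. Under that assumption, a short Python sketch converts them to readable UTC timestamps (the helper name `epoch_to_utc` is made up here):

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds):
    """Format Unix epoch seconds as a UTC timestamp string."""
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


# Values taken from the first row of the query output above.
time_create = epoch_to_utc(1429196738)
time_decided = epoch_to_utc(1429287626)
print(time_create)   # 2015-04-16 15:05:38 UTC
print(time_decided)  # 2015-04-17 16:20:26 UTC
```

Read this way, the request was created in mid April 2015 and approved about one day later.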

Regular Maintenance work

Keeping the CMS GIT Siteconf updated

If you modify the local PhEDEx configurations then you have to publish these changes as described on


checks on t3nagios

Checking the recent transfer errors

Dataset cleaning

This task must be done regularly (once every 2 months, for example), both for CSCS and PSI.

Getting the datasets list

 su - phedex
 cd svn-sandbox/phedex/DB-query-tools/
 source /home/phedex/PHEDEX/4.1.7/etc/profile.d/  # <-- change that 4.1.7 if newer
 ./ -w -t --db ~/config/DBParam.PSI:Prod/PSI -s "%CSCS%" | grep  "eleted"
 ./ -w -t --db ~/config/DBParam.PSI:Prod/PSI -s "%CSCS%" | grep -vE "Paus|Dynamo|Dutta|Fanfani|Kress|Magini|Wuerthwein|Belforte|Spinoso|Ajit|DataOps|eleted|StoreResults|Argiro|Klute|Cremonesi|Jean-Roch Vlimant|vocms[0-9]+|cmsgwms-submit[0-9]+|IntelROCCS|retention time: 2016|Retention date: 2016" <-- adapt that 2016
 ./ -w -t --db ~/config/DBParam.PSI:Prod/PSI -s "%PSI%" | grep -Ev "retention time: 2016|Retention date: 2016" <-- adapt that 2016

The first Perl command creates a list of datasets that can be safely deleted from CSCS, as they are just support requests for transfers to PSI (check that the transfer completed successfully).
The second command creates a list that excludes the central requests and the datasets that can be deleted from CSCS.
The third command produces a list for PSI.

Datasets which are proposed for deletion are all the datasets which have an expired retention time.
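
The `grep -Ev "retention time: 2016|Retention date: 2016"` step drops every line whose retention comment mentions the current year, which is why the year has to be adapted by hand. A minimal Python sketch of the same filter with the year as a parameter follows; the function name and the sample lines are hypothetical, real lines come from the DB query tool.

```python
import re


def make_retention_filter(year):
    """Return a predicate that is True for lines NOT mentioning a retention in `year`."""
    pattern = re.compile(rf"retention time: {year}|Retention date: {year}")
    return lambda line: pattern.search(line) is None


# Hypothetical output lines, only for illustration.
sample_lines = [
    "| 1 | /A/B/AODSIM | retention time: 2016-12-31 |",
    "| 2 | /C/D/GEN-SIM | Retention date: 2015-06-30 |",
    "| 3 | /E/F/RECO | no retention given |",
]
keep = make_retention_filter(2016)
remaining_lines = [line for line in sample_lines if keep(line)]
for line in remaining_lines:
    print(line)
```

Like the grep, this only matches the year as text; it does not compare full dates, so a retention date earlier in the current year is also filtered out.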

Publishing the list and notifying users

The due date for feedback is usually one week later. Lists must be published in DataSetCleaningQuery (previous lists must be deleted). To get the total size proposed for deletion, create a temporary text file with the list pasted from the twiki and then run:

cat tmp.list | awk 'BEGIN{sum=0}{sum+=$4}END{print sum/1024.}'

This will give the total size in TB.
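
For readers less used to awk, the same sum can be sketched in Python. It assumes, like the awk one-liner, that the 4th whitespace-separated field of each line is the dataset size in GB; the function name and the sample lines are made up for the example. Unlike awk, fields that are not plain numbers are simply skipped.

```python
def total_size_tb(lines, column=4):
    """Sum the given whitespace-separated column (GB) and return the total in TB."""
    total_gb = 0.0
    for line in lines:
        fields = line.split()
        if len(fields) < column:
            continue  # short or empty line
        try:
            total_gb += float(fields[column - 1])
        except ValueError:
            continue  # header or other non-numeric field
    return total_gb / 1024.0


# Hypothetical list lines: id, dataset, group, size in GB.
sample_list = [
    "1 /A/B/AODSIM b-tagging 512.0",
    "2 /C/D/GEN-SIM b-physics 1536.0",
    "",
]
print(total_size_tb(sample_list))
```

With a real file, the lines can be read with `open("tmp.list")` and passed to the same function.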

An email like this must be sent to the mailing list:

Subject: Dataset deletion proposal and request for User Data cleaning - Due date: 28 Oct 2011, 9:0
Dear all,
a new cleaning campaign is needed, both at CSCS and PSI. You can find the list and the instructions on how to request to keep the data here:

The data contained in the lists amount to 47TB / 44TB for CSCS / PSI.
If you need to store a dataset both at CSCS and at PSI please also reply to this email explaining why.
Please remember to clean up your user folder at CSCS regularly; a usage overview can be found at [1] and [2]



Dataset cleaning - 2nd version

Derek also wrote a less cryptic Python tool for this ; to use it you don't need to know the Oracle DB tables and columns, or Perl :

[root@t3cmsvobox01 DB-query-tools]# ./ --site T3_CH_PSI 
Getting the data from the data service...

|   *keep?*|      *ID*| *Dataset*|*Size(GB)*|   *Group*|*Requested on*|*Requested by*|*Comments*|*Comments2*|
|          |    225527|/GluGluToHToWWTo2L2Nu_M-160_7TeV-powheg-pythia6/Winter10-E7TeV_ProbDist_2011Flat_BX156_START39_V8-v1/AODSIM|25.5| b-tagging|2011-02-18 13:35:49|Wolfram Erdmann|retention time April 2011|to be deleted from CSCS|
|          |    269087|/BdToMuMu_2MuPtFilter_7TeV-pythia6-evtgen/Summer11-PU_S4_START42_V11-v1/GEN-SIM-RECO|58.6| b-physics|2011-06-08 12:34:25|Christoph Naegeli|retention-time: 2011-10-31|          |
|          |    320266|/RelValProdTTbar/SAM-MC_42_V12_SAM-v1/GEN-SIM-RECO|3.1|    FacOps|2011-09-13 09:58:51|Andrea Sciaba|          |Centrally approved (Nicolo)|

Renewing the myproxy certificate saved in (seldom, once every ~11 months)

Nagios checks daily the lifetime of the voms proxy used by PhEDEx. This proxy is either Fabio's or Joosep's, and because of that all the PhEDEx files uploaded in /pnfs/ belong to one of these 2 accounts ( but not chaotically to both ). If you change that proxy then you have to change the ownership of ALL the related files/dirs in /pnfs/ ; specifically you'll want to change the owner of /pnfs/ , otherwise each PhEDEx file transfer will fail with permission denied.

The following shows how to upload a long-lived proxy into :

$ myproxy-init -t 168 -R '' -l psi_phedex_fabio -x -k renewable -s -c 8700
Your identity: /DC=com/DC=quovadisglobal/DC=grid/DC=switch/DC=users/C=CH/O=Paul-Scherrer-Institut (PSI)/CN=Fabio Martinelli
Enter GRID pass phrase for this identity:
Creating proxy .......................................................................................................................................... Done
Proxy Verify OK

Warning: your certificate and proxy will expire Thu Dec 10 01:00:00 2015
 which is within the requested lifetime of the proxy
A proxy valid for 8700 hours (362.5 days) for user psi_phedex_fabio now exists on

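# That 362.5 days is wrong ! The proxy cannot outlive the certificate ( see the warning above ) ; myproxy-info reports the real lifetime :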

$ myproxy-info -s  -l psi_phedex_fabio
username: psi_phedex_fabio
owner: /DC=com/DC=quovadisglobal/DC=grid/DC=switch/DC=users/C=CH/O=Paul-Scherrer-Institut (PSI)/CN=Fabio Martinelli
  name: renewable
  renewal policy: */
  timeleft: 6249:20:19  (260.4 days)
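
To compare the requested lifetime with what myproxy-info actually reports, the `timeleft` value can be converted from hours:minutes:seconds to days. A minimal Python sketch, with a helper name invented for the example:

```python
def timeleft_to_days(timeleft):
    """Convert a myproxy-info 'HHHH:MM:SS' timeleft value into days."""
    hours, minutes, seconds = (int(part) for part in timeleft.split(":"))
    return (hours + minutes / 60 + seconds / 3600) / 24


requested_days = 8700 / 24                      # the -c 8700 hours asked for
actual_days = timeleft_to_days("6249:20:19")    # what myproxy-info reports
print(f"requested: {requested_days:.1f} days, actual: {actual_days:.1f} days")
```

The gap of roughly 100 days is the part of the requested lifetime that lies beyond the expiry of the certificate itself.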

The present myproxy servers have problems with host certificates for PSI from SWITCH, because they contain a "(PSI)" substring, and the parentheses are not correctly escaped in the regexp matching of the myproxy code. Therefore, the renewer DN (-R argument to myproxy-init below) and the allowed renewers policy on the myproxy server need to be defined with wildcards to enable the matching to succeed.
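
The effect of the unescaped parentheses can be reproduced with any regular expression engine. The sketch below uses Python's `re` and `fnmatch` modules purely as stand-ins; it does not claim to mirror the actual myproxy matching code, and it uses the user DN shown earlier on this page, since the host DN is not fully known. The wildcard pattern is a hypothetical example.

```python
import fnmatch
import re

dn = (
    "/DC=com/DC=quovadisglobal/DC=grid/DC=switch/DC=users/C=CH/"
    "O=Paul-Scherrer-Institut (PSI)/CN=Fabio Martinelli"
)

# Used as a regexp, "(PSI)" is a capture group that matches the text "PSI",
# so the literal parentheses in the DN are never matched.
unescaped_matches = re.fullmatch(dn, dn) is not None

# Escaping the DN makes the parentheses literal again.
escaped_matches = re.fullmatch(re.escape(dn), dn) is not None

# A wildcard pattern sidesteps the problem by not spelling out "(PSI)".
wildcard_matches = fnmatch.fnmatchcase(
    dn, "*/O=Paul-Scherrer-Institut*/CN=Fabio Martinelli"
)

print(unescaped_matches, escaped_matches, wildcard_matches)
```

This is why a DN that is textually identical on both sides can still fail to match, and why the wildcard form works.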

voms-proxy-init -voms cms
servicecert="/DC=com/DC=quovadisglobal/DC=grid/DC=switch/DC=hosts/C=CH/ST=Aargau/L=Villigen/O=Paul-Scherrer-Institut (PSI)/OU=AIT/"
myproxy-init -s $myproxyserver -l psi_phedex -x       -R "$servicecert" -c 720
scp ~/.x509up_u$(id -u) phedex@t3ui01:gridcert/proxy.cert
#  for testing, you can try
myproxy-info -s $myproxyserver -l psi_phedex

As the phedex user do

chmod 600 ~/gridcert/proxy.cert

You should test whether the renewal of the certificate works for the phedex user:

unset X509_USER_PROXY # make sure that the service credentials from ~/.globus are used!

voms-proxy-init  # initializes the service proxy cert that is allowed to retrieve the user cert
myproxy-get-delegation -s $myproxyserver -v -l psi_phedex              -a /home/phedex/gridcert/proxy.cert -o /tmp/gagatest

export X509_USER_PROXY=/tmp/gagatest
srm-get-metadata srm://
rm /tmp/gagatest

Emergency Measures



To be manually invoked after a server restart !



current phedex status
perl /home/phedex/PHEDEX/4.1.7/Toolkit/Transfer/FileDownload -state /home/phedex/state/Debug/incoming/download/ -log /home/phedex/log/Debug/download -verbose -db/home/phed
perl /home/phedex/PHEDEX/4.1.7/Toolkit/Transfer/FileExport -state /home/phedex/state/Debug/incoming/fileexport/ -log /home/phedex/log/Debug/fileexport -db/home/phedex/conf
perl /home/phedex/PHEDEX/4.1.7/Toolkit/Transfer/FileRemove -state /home/phedex/state/Debug/incoming/fileremove/ -log /home/phedex/log/Debug/fileremove -node T3_CH_PSI -db/
perl /home/phedex/PHEDEX/4.1.7/Toolkit/Verify/BlockDownloadVerify -state /home/phedex/state/Debug/incoming/blockverify/ -log /home/phedex/log/Debug/blockverify --db/home/p
perl /home/phedex/PHEDEX/4.1.7/Utilities/ -state /home/phedex/state/Debug/incoming/watchdog/ -log /home/phedex/log/Debug/watchdog -db/home/phedex/config/DBP
perl /home/phedex/PHEDEX/4.1.7/Utilities/ -state /home/phedex/state/Debug/incoming/WatchdogLite/ -log /home/phedex/log/Debug/WatchdogLite -nodeT3_CH_PSI
perl /home/phedex/PHEDEX/4.1.7/Toolkit/Transfer/FileDownload -state /home/phedex/state/Dev/incoming/download/ -log /home/phedex/log/Dev/download -verbose -db/home/phedex/c
perl /home/phedex/PHEDEX/4.1.7/Toolkit/Transfer/FileExport -state /home/phedex/state/Dev/incoming/fileexport/ -log /home/phedex/log/Dev/fileexport -db/home/phedex/config/D
perl /home/phedex/PHEDEX/4.1.7/Toolkit/Transfer/FileRemove -state /home/phedex/state/Dev/incoming/fileremove/ -log /home/phedex/log/Dev/fileremove -node T3_CH_PSI -db/home
perl /home/phedex/PHEDEX/4.1.7/Toolkit/Verify/BlockDownloadVerify -state /home/phedex/state/Dev/incoming/blockverify/ -log /home/phedex/log/Dev/blockverify --db/home/phede
perl /home/phedex/PHEDEX/4.1.7/Utilities/ -state /home/phedex/state/Dev/incoming/watchdog/ -log /home/phedex/log/Dev/watchdog -db/home/phedex/config/DBParam
perl /home/phedex/PHEDEX/4.1.7/Utilities/ -state /home/phedex/state/Dev/incoming/WatchdogLite/ -log /home/phedex/log/Dev/WatchdogLite -node T3_CH_PSI-ag
perl /home/phedex/PHEDEX/4.1.7/Toolkit/Transfer/FileDownload -state /home/phedex/state/Prod/incoming/download/ -log /home/phedex/log/Prod/download -verbose -db/home/phedex
perl /home/phedex/PHEDEX/4.1.7/Toolkit/Transfer/FileExport -state /home/phedex/state/Prod/incoming/fileexport/ -log /home/phedex/log/Prod/fileexport -db/home/phedex/config
perl /home/phedex/PHEDEX/4.1.7/Toolkit/Transfer/FileRemove -state /home/phedex/state/Prod/incoming/fileremove/ -log /home/phedex/log/Prod/fileremove -node T3_CH_PSI -db/ho
perl /home/phedex/PHEDEX/4.1.7/Toolkit/Verify/BlockDownloadVerify -state /home/phedex/state/Prod/incoming/blockverify/ -log /home/phedex/log/Prod/blockverify --db/home/phe
perl /home/phedex/PHEDEX/4.1.7/Utilities/ -state /home/phedex/state/Prod/incoming/watchdog/ -log /home/phedex/log/Prod/watchdog -db/home/phedex/config/DBPar
perl /home/phedex/PHEDEX/4.1.7/Utilities/ -state /home/phedex/state/Prod/incoming/WatchdogLite/ -log /home/phedex/log/Prod/WatchdogLite -node T3_CH_PSI-

      └─pstree -uh phedex -la

source /home/phedex/PHEDEX/4.1.7/etc/profile.d/
/home/phedex/PHEDEX/4.1.7/Utilities/InspectPhedexLog -es "-1 hours" -d /home/phedex/log/Prod/download /home/phedex/log/Debug/download
/home/phedex/PHEDEX/4.1.7/Utilities/InspectPhedexLog -es "-1 days"  -d /home/phedex/log/Prod/download /home/phedex/log/Debug/download

How to update

How to test /home/phedex/config/SITECONF/T3_CH_PSI/PhEDEx/storage.xml
source /home/phedex/PHEDEX/4.1.7/etc/profile.d/
/home/phedex/PHEDEX/4.1.7/Utilities/TestCatalogue -c /home/phedex/config/SITECONF/T3_CH_PSI/PhEDEx/storage.xml  -p srmv2 -L /store/data/file

StorageConsistencyCheck example
source /home/phedex/PHEDEX/4.1.7/etc/profile.d/
/home/phedex/PHEDEX/4.1.7/Utilities/StorageConsistencyCheck -db /home/phedex/config/DBParam.PSI:Prod/PSI -lfnlist /home/phedex/PSI.lfnlist.txt  -node T3_CH_PSI
/home/phedex/PHEDEX/4.1.7/Utilities/StorageConsistencyCheck -db /home/phedex/config/DBParam.PSI:Prod/PSI -lfnlist /home/phedex/CSCS.lfnlist.txt -node T2_CH_CSCS
[root@t3dcachedb03 ~]# psql -U nagios -d chimera -c " select path from v_pnfs where path like '/pnfs/' ; " -t -q  -o ./PSI.txt <-------- to get the LFN 

PSI agents as perceived from the CERN DBs
source /home/phedex/PHEDEX/4.1.7/etc/profile.d/
/home/phedex/PHEDEX/4.1.7/Utilities/ShowAgents -db /home/phedex/config/DBParam.PSI:Prod/PSI -node T3_CH_PSI

PSI Prod Datasets

PSI Prod Errors

==========================================================  <-- HowTo Doc
source /home/phedex/PHEDEX/4.1.7/etc/profile.d/
cd svn-sandbox/phedex/DB-query-tools/
source /home/phedex/PHEDEX/4.1.7/etc/profile.d/  
./ -w -t --db ~/config/DBParam.PSI:Prod/PSI -s "%CSCS%" | grep  "eleted"
./ -w -t --db ~/config/DBParam.PSI:Prod/PSI -s "%CSCS%" | grep -vE "Dutta|Fanfani|Kress|Magini|Wuerthwein|Belforte|Spinoso|Ajit|DataOps|eleted|StoreResults|Argiro|Klute|Cremonesi|Jean-Roch Vlimant|vocms[0-9]+|cmsgwms-submit[0-9]+|IntelROCCS|retention time: 2016|Retention date: 2016"
./ -w -t --db ~/config/DBParam.PSI:Prod/PSI -s "%PSI%" | grep -Ev "retention time: 2016|Retention date: 2016"   <-- where to publish the outputs

Be aware that if a dataset appeared, disappeared and appeared again, the script as it is written today will report the 1st old occurrence, so the old user and the old retention time.


netstat -tp

[root@t3cmsvobox01 git]# netstat -tp
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       PID/Program name   
tcp        0      0       TIME_WAIT   -                   
tcp        0      0   t3service01.p:fujitsu-dtcns ESTABLISHED 1154/syslog-ng      
tcp        0      0       ESTABLISHED 21373/nslcd         
tcp        0      0  TIME_WAIT   -                   
tcp        0      0   ESTABLISHED 2084/xrdcp          
tcp        0      0   ESTABLISHED 1810/xrdcp          
tcp        0      0           ESTABLISHED -                   
tcp        0      0  ESTABLISHED 30705/perl          
tcp        0      0  ESTABLISHED 31158/perl          
tcp        0      0  ESTABLISHED 29777/perl          
tcp        0      0  TIME_WAIT   -                   
tcp        1      0   CLOSE_WAIT  29416/cvmfs2        
tcp        0      0  ESTABLISHED 30198/perl          
tcp        0      0  ESTABLISHED 29695/perl          
tcp        1      0       CLOSE_WAIT  2057/nrpe           
tcp        0      0     TIME_WAIT   -                   
tcp        0      0       ESTABLISHED 21525/python2.6     
tcp        0      0  ESTABLISHED 29898/perl          
tcp        0      0     TIME_WAIT   -                   
tcp        1      0   CLOSE_WAIT  29416/cvmfs2        
tcp        0      0       ESTABLISHED 21373/nslcd         
tcp        1      0       CLOSE_WAIT  1783/nrpe           
tcp        0      0      ESTABLISHED 1571/sshd           
tcp        0      0       ESTABLISHED 21373/nslcd         
tcp        0      0  ESTABLISHED 30280/perl          
tcp        0      0  ESTABLISHED 30401/perl          
tcp        0      0     TIME_WAIT   -                   
tcp        0      0  ESTABLISHED 30992/perl          
tcp        0      0  ESTABLISHED 30540/perl          
tcp        0      0       ESTABLISHED 21373/nslcd         
tcp        0      0       ESTABLISHED 21373/nslcd         
tcp        0      0     TIME_WAIT   -                   
tcp        0      0  TIME_WAIT   -                   
tcp        1      0   CLOSE_WAIT  29416/cvmfs2        
tcp        0      0        TIME_WAIT   -                   
tcp        0      0       ESTABLISHED 21525/python2.6     

Checking each CMS pool by Nagios through both the t3se01:SRM and t3dcachedb:Xrootd dCache doors

By t3cmsvobox , in turn contacted by t3nagios , we retrieve a file from each CMS pool through both t3se01:SRM and t3dcachedb:Xrootd

In both cases the retrieved test files are :

[martinelli_f@t3ui12 ~]$ find /pnfs/ | grep M | sort

The related dCache files obviously have to be placed on the right CMS pool, otherwise the Nagios tests will be wrong ! To easily check where they are really placed, run this SQL code ( in this example some test files are erroneously available in the wrong pool ! That was due to a bad migration cache command ) :

[root@t3dcachedb03 ~]# psql -U nagios -d chimera -c " select path,ipnfsid,pools from v_pnfs where path like '%1MB-test-file_pool_%' ; " 
                            path                             |               ipnfsid                |               pools                
 /pnfs/  | 0000BCDA4B329DA94D64AAAFE7C0C7501E5C | t3fs09_ops
 /pnfs/  | 0000358B14867ED5402184C2C22F81EFC861 | t3fs08_ops
 /pnfs/  | 0000409BB804C95944A38DBE8220B416A8A3 | t3fs07_ops
 /pnfs/  | 0000B58A7FA17778439F8F6F47C5CBBED5E7 | t3fs03_cms t3fs11_cms t3fs14_cms_9
 /pnfs/  | 00001A2FD52D31DB4CCAB99C8B8336522339 | t3fs09_cms t3fs11_cms t3fs14_cms_8
 /pnfs/  | 000018AA61C1E30F43709F0D9FE3B9CD65D1 | t3fs03_cms t3fs14_cms_7
 /pnfs/  | 0000E88C6CBB2D5A4365B11BE2EDD1554366 | t3fs02_cms t3fs14_cms_6
 /pnfs/  | 000200000000000006300738             | t3fs10_cms t3fs14_cms_5
 /pnfs/  | 0002000000000000052EF198             | t3fs03_cms t3fs14_cms_4
 /pnfs/  | 0002000000000000052EF168             | t3fs03_cms t3fs14_cms_3
 /pnfs/  | 0002000000000000052EF138             | t3fs07_cms t3fs14_cms_2
 /pnfs/ | 00003616229002194F439925DA3C7F1CFA02 | t3fs10_cms t3fs14_cms_11
 /pnfs/ | 0000B3D6A96EF961473AACB05F80CF9D6892 | t3fs07_cms t3fs14_cms_10
 /pnfs/  | 0002000000000000052EF108             | t3fs02_cms t3fs11_cms t3fs14_cms_1
 /pnfs/  | 0000A6470E0458354BD99D6C2DD27B196DCC | t3fs08_cms t3fs14_cms_0
 /pnfs/    | 0002000000000000052EF0D8             | t3fs03_cms t3fs04_cms t3fs14_cms
 /pnfs/  | 00004783F9158A5941B284342FF4A8EDE126 | t3fs08_cms t3fs13_cms_9
 /pnfs/  | 0000132841305C27434891574015FD2CF923 | t3fs09_cms t3fs13_cms_8
 /pnfs/  | 00003FC27733ACBA4A809677419256FE22F9 | t3fs02_cms t3fs11_cms t3fs13_cms_7
 /pnfs/  | 0002000000000000072F8630             | t3fs07_cms t3fs11_cms t3fs13_cms_6
 /pnfs/  | 0002000000000000052EF0A8             | t3fs03_cms t3fs13_cms_5
 /pnfs/  | 0002000000000000052EF078             | t3fs10_cms t3fs11_cms t3fs13_cms_4
 /pnfs/  | 0002000000000000052EF048             | t3fs10_cms t3fs13_cms_3
 /pnfs/  | 0002000000000000052EF018             | t3fs02_cms t3fs13_cms_2
 /pnfs/ | 00000DB49D5B69EB4C568834BD162C3DA8E7 | t3fs09_cms t3fs13_cms_11
 /pnfs/ | 0000073FF4F754BB4AB1B4599F412811BDA2 | t3fs10_cms t3fs13_cms_10
 /pnfs/  | 00000CB9E97140F940CD973C319045B43FDA | t3fs04_cms t3fs11_cms t3fs13_cms_1
 /pnfs/  | 00005560491A76DE49DBA142D3BE3CFE38D5 | t3fs02_cms t3fs11_cms t3fs13_cms_0
 /pnfs/    | 0002000000000000052EEFB8             | t3fs07_cms t3fs11_cms t3fs13_cms
 /pnfs/    | 00009E4A9774085C4799B5C9C827DA03406F | t3fs11_cms
 /pnfs/    | 000005D1DD24CA14448694E5C46A8AA8E91F | t3fs10_cms
 /pnfs/    | 0000479ED8FDDC374BC68827AEDF1C146686 | t3fs09_cms
 /pnfs/    | 00003A989AB6D1074D738594B1D01E2D03DE | t3fs08_cms
 /pnfs/    | 0000119DDCFD0C5F42B89769BC9C104A997F | t3fs07_cms
 /pnfs/  | 0002000000000000063D8C68             | t3fs04_cms_1
 /pnfs/    | 00020000000000000395B300             | t3fs04_cms
 /pnfs/    | 000200000000000006391F88             | t3fs03_cms
 /pnfs/    | 00020000000000000330BF10             | t3fs02_cms
 /pnfs/    | 00020000000000000330BF90             | t3fs01_cms
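
Spotting the misplaced files by eye in such a listing is tedious, so a small Python sketch can flag them. It rests on two assumptions that are not confirmed by this page: that each test file name ends with `1MB-test-file_pool_` followed by the name of the pool it is meant for, and that a correctly placed file lives on that pool only. The paths in the sample rows are hypothetical, because the real ones are not shown in full above.

```python
MARKER = "1MB-test-file_pool_"


def misplaced_rows(rows):
    """Return (expected_pool, actual_pools) for rows not stored on exactly the expected pool."""
    problems = []
    for row in rows:
        path, _ipnfsid, pools = (field.strip() for field in row.split("|"))
        expected_pool = path.rsplit(MARKER, 1)[-1]  # assumed naming convention
        actual_pools = pools.split()
        if actual_pools != [expected_pool]:
            problems.append((expected_pool, actual_pools))
    return problems


# Hypothetical paths combined with pool lists like the ones in the listing.
sample_rows = [
    "/pnfs/example/1MB-test-file_pool_t3fs14_cms_9 | 0000B58A7FA17778439F8F6F47C5CBBED5E7 | t3fs03_cms t3fs11_cms t3fs14_cms_9",
    "/pnfs/example/1MB-test-file_pool_t3fs01_cms | 00020000000000000330BF90 | t3fs01_cms",
]
problems = misplaced_rows(sample_rows)
for expected_pool, actual_pools in problems:
    print(f"{expected_pool}: found on {', '.join(actual_pools)}")
```

The rows can be produced with the psql options `-t -q` so that header and footer lines do not need to be filtered out.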


OS snapshots are taken nightly by the PSI VMWare Team ( contact Peter Huesser ) ; in addition we have LinuxBackupsByLegato to recover a single file.

| Hostnames | t3cmsvobox ( t3cmsvobox01 ) |
| Services | PhEDEx 4.1.7 |
| Hardware | PSI DMZ VMWare cluster |
| Install Profile | vobox |
| Guarantee/maintenance until | VMWare PSI Cluster |
Topic revision: r48 - 2016-11-08 - FabioMartinelli