How to work with Storage Element
SE clients
Storage Element data (based on dCache) are located under the directory /pnfs/psi.ch/cms/trivcat/store/user/username. The data are accessible with the standard gfal2 (Grid File Access Library), xrdcp and dcap (obsolete) utilities.
On login and compute nodes /pnfs is mounted Read-Only.
With /pnfs one can use the common Linux commands cd, ls, find, du, stat, i.e. meta-data based commands displaying the file list, size, last access time, etc., but not file-content commands (it is not possible to cat or grep a file).
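For instance, a minimal sketch of what does and does not work on the read-only mount (somedir and somefile.root below are hypothetical placeholders in your own user directory):
$ ls -l /pnfs/psi.ch/cms/trivcat/store/user/$USER/                  # meta-data: works
$ du -sh /pnfs/psi.ch/cms/trivcat/store/user/$USER/somedir          # meta-data: works
$ stat /pnfs/psi.ch/cms/trivcat/store/user/$USER/somefile.root      # meta-data: works
$ cat /pnfs/psi.ch/cms/trivcat/store/user/$USER/somefile.root       # file content: not possible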
Here are examples of how to copy a file from dCache to a local machine and vice versa:
XROOTD LAN (local area network, for local access from UI and worker nodes)
xrdfs executed on a UI against the Xrootd LAN service:
$ xrdfs t3dcachedb03.psi.ch ls -l -u //pnfs/psi.ch/cms/trivcat/store/user/$USER/
...
-rw- 2015-03-15 22:03:41 5356235878 root://192.33.123.26:1094///pnfs/psi.ch/cms/trivcat/store/user/martinelli_f/xroot
-rw- 2015-03-15 22:06:04 131870 root://192.33.123.26:1094///pnfs/psi.ch/cms/trivcat/store/user/martinelli_f/xrootd.
-rw- 2015-03-15 22:06:45 1580023632 root://192.33.123.26:1094///pnfs/psi.ch/cms/trivcat/store/user/martinelli_f/ZllH.DiJetPt.Mar1.DY1JetsToLL_M-50_TuneZ2Star_8TeV-madgraph_procV2_mergeV1V2.root
...
xrdcp executed on a UI against the Xrootd LAN service:
$ xrdcp -d 1 root://t3dcachedb.psi.ch:1094///pnfs/psi.ch/cms/trivcat/store/user/$USER/ZllH.DiJetPt.Mar1.DY1JetsToLL_M-50_TuneZ2Star_8TeV-madgraph_procV2_mergeV1V2.root /dev/null -f
[1.472GB/1.472GB][100%][==================================================][94.18MB/s]
XROOTD WAN (wide area network, access from outside of Tier-3, stage-in / stage-out)
The read-write Xrootd service is reachable at
root://t3se01.psi.ch:1094//
Do NOT use this service for local analysis jobs. We limit the number of parallel transfers through this door, since it should normally be used only for efficient WAN copies, i.e. transfers of large files at high bandwidth; too many of these transfers could harm the availability of the Tier-3's small number of storage servers. If you use this door for your analysis jobs you will find that many of them get queued.
$ xrdfs cms-xrd-transit.cern.ch locate /store/mc/RunIIFall15MiniAODv2/ZprimeToWW_narrow_M-3500_13TeV-madgraph/MINIAODSIM/PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/00000/86A261F4-3BB8-E511-88EE-C81F66B73F37.root
[::192.33.123.24]:1095 Server Read
$ host 192.33.123.24
24.123.33.192.in-addr.arpa domain name pointer t3se01.psi.ch.
$ xrdcp --force root://cms-xrd-transit.cern.ch//store/mc/RunIIFall15MiniAODv2/ZprimeToWW_narrow_M-3500_13TeV-madgraph/MINIAODSIM/PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/00000/86A261F4-3BB8-E511-88EE-C81F66B73F37.root /dev/null
[32MB/32MB][100%][==================================================][32MB/s]
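A stage-out in the opposite direction works the same way; a minimal sketch, assuming a local file myfile.root (hypothetical name) produced at a remote site, with your own T3 user directory as destination:
$ xrdcp myfile.root root://t3se01.psi.ch:1094//pnfs/psi.ch/cms/trivcat/store/user/$USER/myfile.root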
ROOT examples
- Reading a file in ROOT by xrootd
https://confluence.slac.stanford.edu/display/ds/Using+Xrootd+from+root
$ root -l
root [0] TFile *_file0 = TFile::Open("root://t3dcachedb03.psi.ch:1094//pnfs/psi.ch/cms/trivcat/store/user/leo/whatever.root")
- Reading a file in ROOT by dcap
$ root -l
root [0] TFile *_file0 = TFile::Open("dcap://t3se01.psi.ch:22125//pnfs/psi.ch/cms/trivcat/store/user/leo/whatever.root")
- Merging two ROOT files online by hadd and gsidcap
To merge two ROOT files located at T3 online, i.e. without copying them to local disk first, you can use the ROOT tool hadd:
$ source /swshare/ROOT/root_v5.34.18_slc6_amd64_py26_pythia6/bin/thisroot.sh
$ which hadd
/swshare/ROOT/root_v5.34.18_slc6_amd64_py26_pythia6/bin/hadd
$ hadd -f0 \
    gsidcap://t3se01:22128/pnfs/psi.ch/cms/trivcat/store/user/at3user/merged.root \
    gsidcap://t3se01:22128/pnfs/psi.ch/cms/trivcat/store/user/at3user/babies/QCD-Pt300to475/QCD_Pt300to470_PU_S14_POSTLS170/treeProducerSusyFullHad_tree_12.root \
    gsidcap://t3se01:22128/pnfs/psi.ch/cms/trivcat/store/user/at3user/babies/QCD-Pt300to475/QCD_Pt300to470_PU_S14_POSTLS170/treeProducerSusyFullHad_tree_13.root
hadd Target file: gsidcap://t3se01:22128/pnfs/psi.ch/cms/trivcat/store/user/at3user/merged.root
hadd Source file 1: gsidcap://t3se01:22128/pnfs/psi.ch/cms/trivcat/store/user/at3user/babies/QCD-Pt300to475/QCD_Pt300to470_PU_S14_POSTLS170/treeProducerSusyFullHad_tree_12.root
hadd Source file 2: gsidcap://t3se01:22128/pnfs/psi.ch/cms/trivcat/store/user/at3user/babies/QCD-Pt300to475/QCD_Pt300to470_PU_S14_POSTLS170/treeProducerSusyFullHad_tree_13.root
hadd Target path: gsidcap://t3se01:22128/pnfs/psi.ch/cms/trivcat/store/user/at3user/merged.root:/
$ ll /pnfs/psi.ch/cms/trivcat/store/user/at3user/merged.root
-rw-r--r-- 1 at3user ethz-susy 87M Oct 3 16:41 /pnfs/psi.ch/cms/trivcat/store/user/at3user/merged.root
Because the gsidcap protocol is usually offered only as a LAN protocol at a Tier 1/2/3, you are supposed to run hadd from a t3ui* server, not from lxplus or some other external UI, and both the input ROOT files and the merged result must be stored at T3.
GFAL2 examples
gfal2 is the CERN standard toolset for interacting with all Grid SEs. The following gfal2 tools are available: gfal-cat, gfal-copy, gfal-ls, gfal-mkdir, gfal-rm, gfal-save, gfal-sum, gfal-xattr, each with a corresponding manual page, e.g.
$ man gfal-rm
We recommend using root://t3dcachedb03.psi.ch to upload/download a file:
$ gfal-copy --force root://t3dcachedb03.psi.ch/pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_11 file:////$PWD/
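An upload works the same way with source and destination swapped; a minimal sketch, assuming a local file myfile.root (hypothetical name) and your own user directory on the SE:
$ gfal-copy file://$PWD/myfile.root root://t3dcachedb03.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/$USER/myfile.root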
Some other protocol options:
$ gfal-copy --force gsiftp://t3se01.psi.ch/pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_11 file:////$PWD/
$ gfal-copy --force gsidcap://t3se01.psi.ch/pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_11 file:////$PWD/
Further examples:
$ gfal-mkdir root://t3dcachedb03.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/user_id/testdir
$ gfal-copy root://t3dcachedb03.psi.ch//pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_11 file:/dev/null -f
Copying root://t3dcachedb03.psi.ch//pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_11 [DONE] after 0s
$ gfal-ls -l root://t3dcachedb03.psi.ch//pnfs/psi.ch/cms/trivcat/store/user
dr-xr-xr-x 0 0 0 512 Feb 21 2013 alschmid
...
Remove a file from dCache:
$ gfal-rm root://t3dcachedb03.psi.ch//pnfs/psi.ch/cms/trivcat/store/user/auser/myfile
Erasing a whole remote (non-empty) dir with all its content:
$ gfal-rm -r root://t3dcachedb03.psi.ch//pnfs/psi.ch/cms/trivcat/store/user/auser/dir-name
Action of the gfal-save and gfal-cat commands:
$ cat myfile
Hello T3
$ cat myfile | gfal-save root://t3dcachedb03.psi.ch//pnfs/psi.ch/cms/trivcat/store/user/auser/myfile
$ gfal-cat root://t3dcachedb03.psi.ch//pnfs/psi.ch/cms/trivcat/store/user/auser/myfile
Hello T3
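gfal-sum, listed above, can ask the SE for a file checksum; a minimal sketch on the same file (assuming ADLER32, the checksum type commonly used by dCache):
$ gfal-sum root://t3dcachedb03.psi.ch//pnfs/psi.ch/cms/trivcat/store/user/auser/myfile ADLER32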
DCAP examples
To copy a file from the SE to a local disk, for instance /scratch, use the dcap (the only option to transfer data without a grid certificate) and gsidcap protocols. These protocols are blocked towards the outside, so you cannot use them to download files from a machine outside of PSI like lxplus.
dccp dcap://t3se01.psi.ch:22125//pnfs/psi.ch/cms/testing/test100 /scratch/myfile
dccp gsidcap://t3se01.psi.ch:22128/pnfs/psi.ch/cms/testing/test100 /scratch/myfile
Setting data access permissions
Every user directory on the T3 SE, /pnfs/psi.ch/cms/trivcat/store/user/username, grants write permission only to the owner; by default even your group members will not be able to alter your SE dir, as shown in the following example:
$ srm-get-permissions srm://t3se01.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/username
# file : srm://t3se01.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/username
# owner : 2980
owner:2980:RWX <---- 2980 is the UID
user:2980:RWX
group:500:RX <---- no group write ; 500 is the GID
other:RX
If you need to create a /pnfs dir where group members can also write and delete files, do the following:
$ srmmkdir srm://t3se01.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/username/TESTDIR
$ srm-get-permissions srm://t3se01.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/username/TESTDIR
$ srm-set-permissions -type=ADD -group=RWX srm://t3se01.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/username/TESTDIR
$ srm-get-permissions srm://t3se01.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/username/TESTDIR
# file : srm://t3se01.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/username/TESTDIR
# owner : 2980
owner:2980:RWX
user:2980:RWX
group:500:RWX <---- now your group members can write files and dirs
other:RX
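To open up an already existing directory tree, one possible sketch (assuming the TESTDIR tree from above; note that srm-set-permissions acts on one entry at a time) is to walk the read-only /pnfs mount and adjust each subdirectory:
$ for d in $(find /pnfs/psi.ch/cms/trivcat/store/user/username/TESTDIR -type d); do srm-set-permissions -type=ADD -group=RWX srm://t3se01.psi.ch$d; done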
Getting data from remote SEs to the T3 SE
Official datasets
For official data sets/blocks that are registered in CMS DBS you must use the PhEDEx system.
Private datasets
The recommended way to transfer private datasets (non-CMSSW ntuples) between sites is the File Transfer Service (FTS); see its documentation. In a nutshell, you need to prepare a file that contains lines like
protocol://source protocol://destination
An example file is the following:
gsiftp://storage01.lcg.cscs.ch//pnfs/lcg.cscs.ch/cms/trivcat/store/user/jpata/tth/Sep29_v1/ttHTobb_M125_13TeV_powheg_pythia8/Sep29_v1/161003_055207/0000/tree_973.root gsiftp://t3se01.psi.ch//pnfs/psi.ch/cms/trivcat/store/user/jpata/tth/Sep29_v1/ttHTobb_M125_13TeV_powheg_pythia8/Sep29_v1/161003_055207/0000/tree_973.root
gsiftp://storage01.lcg.cscs.ch//pnfs/lcg.cscs.ch/cms/trivcat/store/user/jpata/tth/Sep29_v1/ttHTobb_M125_13TeV_powheg_pythia8/Sep29_v1/161003_055207/0000/tree_981.root gsiftp://t3se01.psi.ch//pnfs/psi.ch/cms/trivcat/store/user/jpata/tth/Sep29_v1/ttHTobb_M125_13TeV_powheg_pythia8/Sep29_v1/161003_055207/0000/tree_981.root
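One way to generate such a list for a whole directory (a sketch, assuming the same username at both sites and a hypothetical flat directory mydir containing only files) is to combine gfal-ls with a small shell loop:
$ SRC=gsiftp://storage01.lcg.cscs.ch//pnfs/lcg.cscs.ch/cms/trivcat/store/user/$USER/mydir
$ DST=gsiftp://t3se01.psi.ch//pnfs/psi.ch/cms/trivcat/store/user/$USER/mydir
$ gfal-ls $SRC | while read f; do echo "$SRC/$f $DST/$f"; done > files.txt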
Then, you can submit the transfer with
$ fts-transfer-submit -s https://fts3-pilot.cern.ch:8446 -f files.txt
a360e11e-ab3b-11e6-8fe7-02163e00a39b
You will get back an ID string, which you can use to monitor your transfer on the site
https://fts3-pilot.cern.ch:8449/fts3/ftsmon/#
The transfers will proceed in parallel; you can also specify resubmission options for failed jobs.
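The job ID can also be checked from the command line; a minimal sketch with fts-transfer-status (assuming the client is installed alongside fts-transfer-submit), using the ID returned above:
$ fts-transfer-status -s https://fts3-pilot.cern.ch:8446 a360e11e-ab3b-11e6-8fe7-02163e00a39b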
The site prefixes can typically be found out from existing transfer logs on the grid or by inspecting the files in
/cvmfs/cms.cern.ch/SITECONF/T2_CH_CSCS/JobConfig/site-local-config.xml
/cvmfs/cms.cern.ch/SITECONF/T2_CH_CSCS/PhEDEx/storage.xml
Some useful examples are below:
gsiftp://storage01.lcg.cscs.ch//pnfs/lcg.cscs.ch/cms/trivcat
gsiftp://t3se01.psi.ch//pnfs/psi.ch/cms/trivcat
gsiftp://eoscmsftp.cern.ch//eos/cms/
Job Stageout from other remote sites
You can try to stage out your CRAB3 job outputs directly to T3_CH_PSI, but if these transfers get too slow and/or unreliable, then stage out first to T2_CH_CSCS and afterwards copy your files to T3_CH_PSI with Leonardo Sala's data_replica.py or any other Grid tool able to transfer files in parallel between two sites.
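For a handful of files, a minimal alternative sketch is a plain gfal-copy between the two gsiftp prefixes listed above (the mydir/tree_973.root path is hypothetical, and the copies run one at a time, so this does not scale to large datasets):
$ gfal-copy gsiftp://storage01.lcg.cscs.ch//pnfs/lcg.cscs.ch/cms/trivcat/store/user/$USER/mydir/tree_973.root gsiftp://t3se01.psi.ch//pnfs/psi.ch/cms/trivcat/store/user/$USER/mydir/tree_973.root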