<!-- keep this as a security measure:
#uncomment if the topic should only be modifiable by the listed groups
   * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.CMSAdminGroup
   * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.CMSAdminGroup
#uncomment this if you want the topic to be viewable only by the listed groups
#   * Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.CMSAdminGroup,Main.CMSAdminReaderGroup
-->
---+!! How to work with the SE

%TOC%

---++ SE clients

The storage data (based on dCache) can be found under the directory =/pnfs= and accessed using the =dcap=, =xrootd= and GridFTP protocols.

---+++ Read-Only NFSv3 /pnfs

On the =t3ui*= and =t3wn*= servers the =/pnfs= namespace is mounted Read-Only, so you can easily use the common Linux commands =cd=, =ls=, =find=, =du=, =stat= and so on.
<!-- a daily updated, pre-ordered by dir size report is always available on http://t3mon.psi.ch/PSIT3-custom/v_pnfs_top_dirs.txt with the limitation that it might be max 12h old -->

A couple of =find /pnfs/psi.ch/cms/= examples :
   * =find /pnfs/psi.ch/cms/ -atime +50 -iname '*root' -uid `id -u $USER`=
   * =find /pnfs/psi.ch/cms/ -atime +50 -type d -uid `id -u $USER`=

Note that only metadata-based commands will work on =/pnfs= (e.g. displaying the file list, file size, last access time, etc.) but *no* file-content commands: it is not possible to =cat= or =grep= a file. There is however a special command called =dccp= to copy a file to a local file system like =/scratch=. Use it like this :
<verbatim>
dccp /pnfs/psi.ch/cms/trivcat/store/user/path/to/somefile.root /scratch/$USER/
</verbatim>
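For example, a purely metadata-based session on a UI could look like the following sketch ( the user directory layout and the file name =somefile.root= are assumptions ) :
<pre>
$ du -sh /pnfs/psi.ch/cms/trivcat/store/user/$USER/             # total size of your SE dir
$ ls -ltr /pnfs/psi.ch/cms/trivcat/store/user/$USER/ | tail -5  # your 5 most recently modified entries
$ stat /pnfs/psi.ch/cms/trivcat/store/user/$USER/somefile.root  # size, timestamps and permissions of one file
</pre>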
---+++ XROOTD

As said at the beginning of this page, the T3 SE features several Grid protocols accessible by several Grid tools ; this is flexible, but it's also a source of confusion, so unless you have a reason to use more than one tool or more than one protocol, always use =root://t3dcachedb.psi.ch:1094//= and the tools =xrdfs= and =xrdcp= ; this XROOTD knowledge will also be reusable in the larger [[https://twiki.cern.ch/twiki/bin/view/CMSPublic/WorkBookXrootdService][CMS AAA]] context.

---++++ The T3 Xrootd WAN service - I/O queue =xrootd=

   * It's a Read-Write Xrootd service exposed to the Internet ( "WAN" ) and reachable at =root://t3se01.psi.ch:1094//=
   * A maximum of *4* active connections are allowed in each of its *dedicated* [[https://www.dcache.org/][dCache]] I/O queues =xrootd=
   * Only the =/pnfs/psi.ch/cms/trivcat/= namespace subset is available, but in practical terms this is not a limitation.
   * Be aware that this service is intentionally NOT connected to the [[https://twiki.cern.ch/twiki/bin/view/CMSPublic/WorkBookXrootdService][CMS AAA]] =cms-xrd-global.cern.ch= service because of the 2016 CMS policy that permanently excludes all the T3 sites and dynamically excludes all the misbehaving T1/T2 sites ; =root://t3se01.psi.ch:1094//= is nevertheless reachable from the Internet because, again by [[https://twiki.cern.ch/twiki/bin/view/CMSPublic/WorkBookXrootdService][CMS AAA]] policy, it's connected to =cms-xrd-transit.cern.ch=, as shown by the following example :
<pre>
$ xrdfs cms-xrd-transit.cern.ch locate /store/mc/RunIIFall15MiniAODv2/ZprimeToWW_narrow_M-3500_13TeV-madgraph/MINIAODSIM/PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/00000/86A261F4-3BB8-E511-88EE-C81F66B73F37.root
[::192.33.123.24]:1095 Server Read

$ host 192.33.123.24
24.123.33.192.in-addr.arpa domain name pointer t3se01.psi.ch.

$ xrdcp --force root://cms-xrd-transit.cern.ch//store/mc/RunIIFall15MiniAODv2/ZprimeToWW_narrow_M-3500_13TeV-madgraph/MINIAODSIM/PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/00000/86A261F4-3BB8-E511-88EE-C81F66B73F37.root /dev/null
[32MB/32MB][100%][==================================================][32MB/s]
</pre>

---++++ The T3 Xrootd LAN service - I/O queue =regular= ( DEFAULT service to be used )

   * It's a Read-Write Xrootd service not exposed to the Internet ( "LAN" ) and reachable at =root://t3dcachedb.psi.ch:1094//=
   * A maximum of *100* active connections are allowed in each of the [[https://www.dcache.org/][dCache]] I/O queues =regular=, in constant competition with the =dcap= and =gsidcap= Active/Max/Queued connections.
   * The full T3 =/pnfs= namespace is available.

[[http://xrootd.org/doc/man/xrdfs.1.html][xrdfs]] executed on a UI against the Xrootd LAN service :
<pre>
$ xrdfs t3dcachedb03.psi.ch ls -l -u //pnfs/psi.ch/cms/trivcat/store/user/$USER/
...
-rw- 2015-03-15 22:03:41   5356235878 root://192.33.123.26:1094///pnfs/psi.ch/cms/trivcat/store/user/martinelli_f/xroot
-rw- 2015-03-15 22:06:04       131870 root://192.33.123.26:1094///pnfs/psi.ch/cms/trivcat/store/user/martinelli_f/xrootd.
-rw- 2015-03-15 22:06:45   1580023632 root://192.33.123.26:1094///pnfs/psi.ch/cms/trivcat/store/user/martinelli_f/%BLUE%ZllH.DiJetPt.Mar1.DY1JetsToLL_M-50_TuneZ2Star_8TeV-madgraph_procV2_mergeV1V2.root%ENDCOLOR%
...
</pre>

[[http://xrootd.org/doc/man/xrdcp.1.html][xrdcp]] executed on a UI against the Xrootd LAN service :
<pre>
$ xrdcp -d 1 root://t3dcachedb.psi.ch:1094///pnfs/psi.ch/cms/trivcat/store/user/$USER/%BLUE%ZllH.DiJetPt.Mar1.DY1JetsToLL_M-50_TuneZ2Star_8TeV-madgraph_procV2_mergeV1V2.root%ENDCOLOR% /dev/null -f
[1.472GB/1.472GB][100%][==================================================][94.18MB/s]
</pre>
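Since the LAN service is Read-Write, you can also upload files with =xrdcp= ; a minimal sketch, assuming a local file =/scratch/$USER/myfile.root= already exists :
<pre>
# upload a local file into your own SE dir through the LAN Xrootd door
$ xrdcp /scratch/$USER/myfile.root root://t3dcachedb.psi.ch:1094///pnfs/psi.ch/cms/trivcat/store/user/$USER/
</pre>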
---+++ =dcap= and =gsidcap= - I/O queue =regular= ( legacy tools )

=dcap= and =gsidcap= are a fast way to copy a file from the SE to a local disk on the T3. The protocol is blocked towards the outside, so you cannot use it from a machine outside of PSI ( like =lxplus= ) for downloading files.
<pre>
dccp dcap://%BLUE%t3se01.psi.ch%ENDCOLOR%:22125//pnfs/psi.ch/cms/testing/test100 /tmp/myfile
</pre>

You can't alter =/pnfs= by =dcap= ; to modify =/pnfs= use =gsidcap= instead :
<pre>
dccp gsidcap://%BLUE%t3se01.psi.ch%ENDCOLOR%:22128/pnfs/psi.ch/cms/testing/test100 /tmp/myfile
</pre>

---+++ ROOT

---++++ Using the standalone ROOT installations ( legacy )

See HowToWorkInCmsEnv#Using_StandAlone_ROOT_by_swshare

---++++ Reading a file in ROOT by =xrootd= - I/O queue =regular=

https://confluence.slac.stanford.edu/display/ds/Using+Xrootd+from+root
<pre>
$ root -l
root [1] TFile *_file0 = TFile::Open("root://%BLUE%t3dcachedb03.psi.ch%ENDCOLOR%:1094//pnfs/psi.ch/cms/trivcat/store/user/leo/whatever.root")
</pre>

---++++ Reading a file in ROOT by =dcap= - I/O queue =regular= ( legacy )
<pre>
$ root -l
root [1] TFile *_file0 = TFile::Open("dcap://%BLUE%t3se01.psi.ch%ENDCOLOR%:22125//pnfs/psi.ch/cms/trivcat/store/user/leo/whatever.root")
</pre>

---++++ Merging online two ROOT files by =hadd= and =gsidcap= - I/O queue =regular=

To merge *online* two ROOT files located at T3 you can use the ROOT tool [[http://root.cern.ch/drupal/content/how-merge-histogram-files][hadd]] :
%TWISTY%
<pre>
$ source /swshare/ROOT/root_v5.34.18_slc6_amd64_py26_pythia6/bin/thisroot.sh
$ which hadd
/swshare/ROOT/root_v5.34.18_slc6_amd64_py26_pythia6/bin/hadd
$ hadd %ORANGE%-f0%ENDCOLOR% %GREEN%gsidcap://t3se01:22128/pnfs/psi.ch/cms/trivcat/store/user/at3user/merged.root%ENDCOLOR% %BLUE%gsidcap://t3se01:22128/pnfs/psi.ch/cms/trivcat/store/user/at3user/babies/QCD-Pt300to475/QCD_Pt300to470_PU_S14_POSTLS170/treeProducerSusyFullHad_tree_12.root%ENDCOLOR% %BLUE%gsidcap://t3se01:22128/pnfs/psi.ch/cms/trivcat/store/user/at3user/babies/QCD-Pt300to475/QCD_Pt300to470_PU_S14_POSTLS170/treeProducerSusyFullHad_tree_13.root%ENDCOLOR%
hadd %ORANGE%Target%ENDCOLOR% file: %GREEN%gsidcap://t3se01:22128/pnfs/psi.ch/cms/trivcat/store/user/at3user/merged.root%ENDCOLOR%
hadd %ORANGE%Source%ENDCOLOR% file 1: %BLUE%gsidcap://t3se01:22128/pnfs/psi.ch/cms/trivcat/store/user/at3user/babies/QCD-Pt300to475/QCD_Pt300to470_PU_S14_POSTLS170/treeProducerSusyFullHad_tree_12.root%ENDCOLOR%
hadd %ORANGE%Source%ENDCOLOR% file 2: %BLUE%gsidcap://t3se01:22128/pnfs/psi.ch/cms/trivcat/store/user/at3user/babies/QCD-Pt300to475/QCD_Pt300to470_PU_S14_POSTLS170/treeProducerSusyFullHad_tree_13.root%ENDCOLOR%
hadd Target path: gsidcap://t3se01:22128/pnfs/psi.ch/cms/trivcat/store/user/at3user/merged.root:/
$ ll /pnfs/psi.ch/cms/trivcat/store/user/at3user/merged.root
-rw-r--r-- 1 at3user ethz-susy 87M Oct  3 16:41 /pnfs/psi.ch/cms/trivcat/store/user/at3user/merged.root
</pre>
%ENDTWISTY%
Since =gsidcap= is usually offered only as a LAN protocol at a Tier 1/2/3, you are supposed to run =hadd= from a =t3ui*= server, not from =lxplus= or some other external UI ; both the *online* input ROOT files and the merged result are necessarily stored at T3.
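After the merge you can cross-check the result from a UI with =xrdfs stat= ; a sketch, reusing the =merged.root= name from the example above and assuming the file sits in your own SE dir :
<pre>
$ xrdfs t3dcachedb03.psi.ch stat /pnfs/psi.ch/cms/trivcat/store/user/$USER/merged.root
</pre>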
---+++ =gfal-*= tools

*If your jobs are still working fine with the previous =lcg-*= tools then you can keep using those for as long as they keep working ; the T3 admins won't debug your =lcg-*= tool errors though.*

Troubleshooting :
   * =gfal2.GError: Unable to open the /usr/lib64/gfal2-plugins//libgfal_plugin_http.so plugin specified in the plugin directory, failure : /usr/lib64/libdavix_copy.so.0: undefined symbol: _ZNK5Davix13RequestParams11getCopyModeEv=
   * Use ="env -i X509_USER_PROXY=~/.x509up_u`id -u` gfal-command XXX"= or =LD_LIBRARY_PATH=/usr/lib64:$LD_LIBRARY_PATH gfal-ls gsiftp://t3se01.psi.ch/pnfs/=

In 2014 CERN released the [[https://dmc.web.cern.ch/projects/gfal-2/home][gfal-* CLIs and APIs]] as its new standard toolset to interact with all the Grid SEs and their several Grid protocols ; there is a [[https://indico.cern.ch/event/251192/session/2/contribution/13/material/slides/1.pdf][talk]] about that. %BLUE%Since the gfal-* tools are designed to be MULTI protocol%ENDCOLOR%, you can upload/download a file in several ways :
<pre>
$ gfal-copy --force %BLUE%root%ENDCOLOR%://t3dcachedb03.psi.ch/pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_11 file:////$PWD/
$ gfal-copy --force %BLUE%gsiftp%ENDCOLOR%://t3se01.psi.ch/pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_11 file:////$PWD/
$ gfal-copy --force %BLUE%srm%ENDCOLOR%://t3se01.psi.ch/pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_11 file:////$PWD/
$ gfal-copy --force %BLUE%gsidcap%ENDCOLOR%://t3se01.psi.ch/pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_11 file:////$PWD/
</pre>

If you're in doubt about what to use, then use =%BLUE%root%ENDCOLOR%://t3dcachedb03.psi.ch= and ignore =%BLUE%gsiftp srm gsidcap%ENDCOLOR%=.
<!-- The next table compares the outdated =lcg-*= tools with the new =gfal-*= tools :
| *lcg-* | *gfal-* |
| lcg-cp | %BLUE%gfal-copy%ENDCOLOR% |
| lcg-ls | %BLUE%gfal-ls%ENDCOLOR% |
| lcg-del | %BLUE%gfal-rm%ENDCOLOR% |
| lcg-lr | No equivalent CLI available, API is there |
| lcg-get-checksum | %BLUE%gfal-sum%ENDCOLOR% |
| lcg-getturls, lcg-gt | %BLUE%gfal-xattr%ENDCOLOR% |
| lcg-stmd | Not available |
| lcg-aa, lcg-cr, lcg-la, lcg-lg… and other catalog related cli | Partially available (%BLUE%gfal-xattr%ENDCOLOR%, %BLUE%gfal-copy%ENDCOLOR% and/or combination of commands) |
| No equivalent lcg-util command | %BLUE%gfal-save%ENDCOLOR%, %BLUE%gfal-cat%ENDCOLOR% |
-->

The =gfal-*= tools are available both on the UIs and on the WNs ( you don't have to specify =/usr/bin/= ) :
<pre>
$ /usr/bin/gfal-cat
$ /usr/bin/gfal-copy
$ /usr/bin/gfal-ls
$ /usr/bin/gfal-mkdir
$ /usr/bin/gfal-rm
$ /usr/bin/gfal-save
$ /usr/bin/gfal-sum
$ /usr/bin/gfal-xattr
</pre>

A =man= page is available for each of them, e.g. =$ man gfal-rm=.
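For instance, two of the listed commands not demonstrated below are =gfal-mkdir= and =gfal-sum= ; a minimal sketch ( the directory name =mynewdir= and the file name =myfile.root= are assumptions ) :
<pre>
# create a new dir in your SE area, then checksum one of your files
$ gfal-mkdir root://t3dcachedb03.psi.ch//pnfs/psi.ch/cms/trivcat/store/user/$USER/mynewdir
$ gfal-sum root://t3dcachedb03.psi.ch//pnfs/psi.ch/cms/trivcat/store/user/$USER/myfile.root ADLER32
</pre>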
The following session shows the =%BLUE%gfal-save%ENDCOLOR%= and =%BLUE%gfal-cat%ENDCOLOR%= commands in action :
<pre>
$ cat %GREEN%myfile%ENDCOLOR%
%ORANGE%Hello T3%ENDCOLOR%
$ cat %GREEN%myfile%ENDCOLOR% | %BLUE%gfal-save%ENDCOLOR% root://t3dcachedb03.psi.ch//pnfs/psi.ch/cms/trivcat/store/user/auser/%GREEN%myfile%ENDCOLOR%
$ %BLUE%gfal-cat%ENDCOLOR% root://t3dcachedb03.psi.ch//pnfs/psi.ch/cms/trivcat/store/user/auser/%GREEN%myfile%ENDCOLOR%
%ORANGE%Hello T3%ENDCOLOR%
</pre>

Further examples :
<pre>
$ %BLUE%gfal-copy%ENDCOLOR% root://t3dcachedb03.psi.ch//pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_11 file:/dev/null -f
Copying root://t3dcachedb03.psi.ch//pnfs/psi.ch/cms/t3-nagios/1MB-test-file_pool_t3fs14_cms_11   [DONE]  after 0s
$ %BLUE%gfal-ls%ENDCOLOR% root://t3dcachedb03.psi.ch//pnfs/psi.ch/cms/trivcat/store/user -l
dr-xr-xr-x   0 0     0          512 Feb 21  2013 alschmid
dr-xr-xr-x   0 0     0          512 Mar  8 14:31 amarini
dr-xr-xr-x   0 0     0          512 May 12  2015 andis
...
$ %BLUE%gfal-rm%ENDCOLOR% root://t3dcachedb03.psi.ch//pnfs/psi.ch/cms/trivcat/store/user/auser/%GREEN%myfile%ENDCOLOR%
</pre>

---+++ =gfalFS=

The tool [[http://linux.die.net/man/1/gfalfs][gfalFS]] allows you to mount a GridFTP server, or an SRM server, as a local dir ; it's useful to browse and remove dirs on that GridFTP server, or to remotely access a log file without downloading it.

Where to work on your =t3ui= :
<pre>
$ pwd
/scratch/martinelli_f   <---- use your account name
</pre>

Making a local dir on which to mount the GridFTP directory :
<pre>
$ mkdir t3
</pre>

Mounting the GridFTP directory :
<pre>
$ gfalFS t3 gsiftp://t3se01.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/martinelli_f   <---- use your account name instead
</pre>

Testing if =gfalFS= is working :
<pre>
$ ls -l t3
</pre>

%RED%Removing a file%ENDCOLOR% <--- pay the greatest attention when you're deleting !!! The PSI T3 will protect your files, but CSCS and all the other CMS Grid centres are very tolerant in terms of file deletion !
<pre>
$ rm -f t3/sgejob-5939967/mybigfile
$ echo $?
0
</pre>

Uploading a local file into the GridFTP directory by =gfalFS= :
<pre>
$ cp /etc/hosts t3/
</pre>

Transparent remote I/O ; don't do that for files >1GB, but it's useful for log files :
<pre>
$ cat t3/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
</pre>

%RED%Deleting several GridFTP directories and files%ENDCOLOR% : <--- pay the greatest attention when you delete !!!
<pre>
$ rm -rf t3/sgejob-59399*
$ echo $?
0
</pre>

Unmounting the GridFTP directory :
<pre>
$ gfalFS_umount -z t3
$ echo $?
0
</pre>

%BLUE%EXACTLY LIKE THE PREVIOUS EXAMPLE BUT EXECUTED vs THE CSCS T2%ENDCOLOR% :
<pre>
$ pwd
/scratch/martinelli_f   <---- use your account name
$ mkdir t2
$ gfalFS t2 gsiftp://storage01.lcg.cscs.ch//pnfs/lcg.cscs.ch/cms/trivcat/store/user/$USER
$ ls -l t2
...
</pre>
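All the Grid clients on this page ( =gfal-*=, =gfalFS=, =uberftp=, =globus-url-copy=, ... ) need a valid VOMS proxy ; a quick reminder sketch :
<pre>
$ voms-proxy-init --voms cms    # create a CMS VOMS proxy
$ voms-proxy-info --timeleft    # check how long it's still valid
</pre>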
---+++ =uberftp= - I/O queue =wan=

[[http://dims.ncsa.illinois.edu/set/uberftp/userdoc.html][uberftp]] is a GridFTP interactive client :

---++++ Interactively accessing a GridFTP server
<pre>
$ uberftp %BLUE%t3se01.psi.ch%ENDCOLOR%
220 GSI FTP door ready
200 User :globus-mapping: logged in
UberFTP (2.8)> %BLUE%cd%ENDCOLOR% /pnfs/psi.ch/cms/trivcat/store/user/martinelli_f
UberFTP (2.8)> %BLUE%ls%ENDCOLOR%
drwx------   1 martinelli_f martinelli_f   512 Nov 25  2014 sgejob-5939967
drwx------   1 martinelli_f martinelli_f   512 Nov 25  2014 sgejob-5939965
...
</pre>

---++++ Listing remote directories or files
<pre>
$ uberftp %BLUE%t3se01.psi.ch%ENDCOLOR% '%BLUE%ls%ENDCOLOR% /pnfs/psi.ch/cms/trivcat/store'
220 GSI FTP door ready
200 PASS command successful
drwx------   1 cmsuser  cmsuser   512 Apr 15 13:18 mc
drwx------   1 cmsuser  cmsuser   512 Aug 11  2009 relval
drwx------   1 cmsuser  cmsuser   512 Oct  2  2009 PhEDEx_LoadTest07
drwx------   1 cmsuser  cmsuser   512 Jun 17 12:19 data
drwx------   1 cmsuser  cmsuser   512 Jun  2 15:54 user
drwx------   1 cmsuser  cmsuser   512 May 10  2009 unmerged
</pre>

---++++ Copying a remote file locally
<pre>
$ uberftp %BLUE%t3se01.psi.ch%ENDCOLOR% '%BLUE%get%ENDCOLOR% /pnfs/psi.ch/cms/testing/test100'
</pre>

---++++ Copying a remote dir locally

Be aware that this is a serial copy, not a parallel one :
<pre>
$ uberftp %BLUE%t3se01.psi.ch%ENDCOLOR% '%BLUE%get -r%ENDCOLOR% /pnfs/psi.ch/cms/testing .'
</pre>

---++++ Erasing a remote dir
<pre>
$ uberftp t3se01.psi.ch '%RED%rm -r%ENDCOLOR% /pnfs/psi.ch/cms/trivcat/store/user/martinelli_f/%ORANGE%VHBBHeppyV12%ENDCOLOR%'
</pre>
or in debug mode :
<pre>
$ uberftp -debug 2 t3se01.psi.ch '%RED%rm -r%ENDCOLOR% /pnfs/psi.ch/cms/trivcat/store/user/martinelli_f/%ORANGE%VHBBHeppyV12%ENDCOLOR%'
</pre>
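=uberftp= can also create a remote dir, which is handy before an upload ; a minimal sketch ( the dir name =newdir= is hypothetical ; double quotes are used so that =$USER= gets expanded by the shell ) :
<pre>
$ uberftp t3se01.psi.ch "mkdir /pnfs/psi.ch/cms/trivcat/store/user/$USER/newdir"
</pre>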
---+++ globus-url-copy - I/O queue =wan=

---++++ Copying a dir between two GridFTP servers - serial method

The [[http://linux.die.net/man/1/globus-url-copy][globus-url-copy]] tool can copy a file, several files, and *recursively ( but serially ) a whole dir* from one GridFTP server to another ; the file transfer occurs directly between the two GridFTP servers ; you'll have to know the absolute paths both on the sender and the receiver side ; in the next example we're going to copy the dir :
   * =gsiftp://stormgf2.pi.infn.it%BLUE%/gpfs/ddn/srm/cms%ENDCOLOR%/store/user/arizzi/%ORANGE%VHBBHeppyV12%ENDCOLOR%/=
   * into :
   * =gsiftp://t3se01.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/martinelli_f/%ORANGE%VHBBHeppyV12%ENDCOLOR%/=

The path prefix %BLUE%/gpfs/ddn/srm/cms/%ENDCOLOR% was discovered by a =uberftp gsiftp://stormgf2.pi.infn.it= session ; if you're in doubt contact the T3 administrators and we'll help you identify this kind of prefix ; at T3 and T2 the absolute path prefixes are always =/pnfs/psi.ch/cms= and =/pnfs/lcg.cscs.ch/cms= respectively.

The dir copy example :
<pre>
$ globus-url-copy -continue-on-error -rst -nodcau -fast -vb -v -cd -r gsiftp://stormgf2.pi.infn.it%BLUE%/gpfs/ddn/srm/cms%ENDCOLOR%/store/user/arizzi/%ORANGE%VHBBHeppyV12%ENDCOLOR%/ gsiftp://t3se01.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/martinelli_f/%ORANGE%VHBBHeppyV12%ENDCOLOR%/

Source: gsiftp://stormgf2.pi.infn.it%BLUE%/gpfs/ddn/srm/cms%ENDCOLOR%/store/user/arizzi/%ORANGE%VHBBHeppyV12%ENDCOLOR%/
Dest:   gsiftp://t3se01.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/martinelli_f/%ORANGE%VHBBHeppyV12%ENDCOLOR%/
  DYJetsToLL_M-50_HT-%GREEN%100%ENDCOLOR%to%GREEN%200%ENDCOLOR%_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/

Source: gsiftp://stormgf2.pi.infn.it%BLUE%/gpfs/ddn/srm/cms%ENDCOLOR%/store/user/arizzi/%ORANGE%VHBBHeppyV12%ENDCOLOR%/
Dest:   gsiftp://t3se01.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/martinelli_f/%ORANGE%VHBBHeppyV12%ENDCOLOR%/
  DYJetsToLL_M-50_HT-%GREEN%200%ENDCOLOR%to%GREEN%400%ENDCOLOR%_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/
...
</pre>

---++++ Copying a dir between two GridFTP servers by GNU parallel

The tools [[http://toolkit.globus.org/toolkit/docs/latest-stable/gridftp/user/#globus-url-copy][globus-url-copy]], [[http://linux.die.net/man/1/uberftp][uberftp]] and [[http://www.gnu.org/software/parallel/][GNU parallel]] can be combined to copy a dir between two GridFTP servers *in parallel* ; in this example a %BLUE%C.Galloni%ENDCOLOR% /pnfs dir is copied into a %ORANGE%MDefranc%ENDCOLOR% /pnfs dir ; no files will be routed through the server running the globus-url-copy commands itself ( e.g. your UI, or a WN ) ; furthermore, since in a Grid environment each GridFTP server often acts as a transparent proxy to more than one GridFTP server, the copies will occur between a *matrix* 2x2 of GridFTP servers ; a bottleneck in the parallelism is more likely to come from the limited bandwidth available between the two data centres than from the total number of GridFTP servers involved.

It's not compulsory, but we recommend running all the globus-url-copy commands in a [[https://www.gnu.org/software/screen/manual/screen.html][screen -L]] session ( see the sketch below ), so the copies don't get interrupted just because the connection to the server where you started them is cut ; in any case it's safe to repeat the same globus-url-copy commands over and over again.
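A minimal =screen= sketch ( the session name =t3copy= is hypothetical ; =tobecopied= is the command file generated in the next examples ) :
<pre>
$ screen -L -S t3copy              # start a named screen session that logs to screenlog.0
$ cat tobecopied | parallel -j 10  # run the copies inside the screen session
# detach with Ctrl-a d ; reattach later with : screen -r t3copy
</pre>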
---+++++ Copying a T3 /pnfs dir into another T3 /pnfs dir ( a use case requested by the users just once )

First of all we generate the globus-url-copy commands to be passed as input to [[http://www.gnu.org/software/parallel/][GNU parallel]], saving them into the file =tobecopied= ; afterwards we start them in *parallel* ; we can arbitrarily choose how many parallel globus-url-copy commands to run by the [[http://www.gnu.org/software/parallel/][GNU parallel]] parameter =%RED%-j N%ENDCOLOR%= ; each globus-url-copy command will consume a CPU core on the server where you're running it, so don't set a =-j= parameter greater than the number of CPU cores available there :
<pre>
$ uberftp -ls -r gsiftp://t3se01.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/%BLUE%cgalloni%ENDCOLOR%//RunII/Ntuple_080316/ | grep .root$ | awk {' print "globus-url-copy -v -cd gsiftp://t3se01.psi.ch/"$8" gsiftp://t3se01.psi.ch/"$8}' | sed 's/%BLUE%cgalloni%ENDCOLOR%/%ORANGE%mdefranc%ENDCOLOR%/2' > tobecopied
$ # 10 parallel globus-url-copy
$ cat tobecopied | parallel %RED%-j 10%ENDCOLOR%
</pre>

---+++++ Copying a T2 /pnfs dir into a T3 /pnfs dir ( a recurring use case )

Because this time the source site is different from the destination site, we can increase the [[http://www.gnu.org/software/parallel/][GNU parallel]] parameter from =-j 10= to, for instance, =-j 30= ; for a copy from a T1/T2 to a T2 you might set =-j 50= ; regrettably it's impossible for an ordinary user to compute the ideal =-j= value ; again, you might want to start the copies in a [[https://www.gnu.org/software/screen/manual/screen.html][screen -L]] session, but it's not compulsory.
<pre>
$ uberftp -ls -r gsiftp://storage01.lcg.cscs.ch//pnfs/lcg.cscs.ch/cms/trivcat/store/user/%BLUE%cgalloni%ENDCOLOR%/Ntuple_290216/WJetsToQQ_HT-600ToInf_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/ | grep .root$ | awk {' print "globus-url-copy -v -cd gsiftp://storage01.lcg.cscs.ch//"$8" gsiftp://t3se01.psi.ch/"$8}' | sed 's/%BLUE%cgalloni%ENDCOLOR%/%ORANGE%mdefranc%ENDCOLOR%/2' | sed 's/lcg.cscs.ch/psi.ch/3' > tobecopied
$ # 30 parallel globus-url-copy
$ cat tobecopied | parallel %RED%-j 30%ENDCOLOR%
</pre>

---+++ =lcg-tools= - I/O queue =wan= ( legacy )

*This section is kept ONLY as a historical reference.*

These examples show the usage of the =lcg-*= commands for direct interaction with an SE, bypassing grid services like file catalogs and the information system ; they have =man= pages where you can find information about their usage. The =lcg-*= commands should be used instead of the =srmcp, srmls, srmrm= commands, which are known to be *too RAM demanding*. All of them are invoked with the =-b -D srmv2= options, which tell the commands not to contact the central information system to find out the protocol for our SE ; contacting it would not hurt, but it involves an unnecessary additional roundtrip over the net, while supplying the information on the command line makes things faster.

*Listing files or dirs* ( use the =-l= flag for getting details like file sizes ) :
<verbatim>
lcg-ls -b -D srmv2 [-l] srm://t3se01.psi.ch:8443/srm/managerv2?SFN=/pnfs/psi.ch/cms/trivcat/store/user
</verbatim>

*Uploading a file* :
<verbatim>
lcg-cp -b -D srmv2 my_file.dat srm://t3se01.psi.ch:8443/srm/managerv2?SFN=/pnfs/psi.ch/cms/trivcat/store/user/${USER}/testA01
</verbatim>

*Downloading a file* :
<verbatim>
lcg-cp -b -D srmv2 srm://t3se01.psi.ch:8443/srm/managerv2?SFN=/pnfs/psi.ch/cms/trivcat/store/user/${USER}/testA01 my_new_localfile.dat
</verbatim>

*Deleting a file. Please take note of the =-l= flag!* It is necessary and tells the command that this file is not associated with any grid catalog entries from which it should be removed as well ( this naturally assumes that the file you are deleting has not been registered by you or CRAB in a CMS catalog ; if it is registered, the result will be that there still is an entry in the catalog for this file, while the real copy has been deleted from our SE ) :
<verbatim>
lcg-del -b -D srmv2 -l srm://t3se01.psi.ch:8443/srm/managerv2?SFN=/pnfs/psi.ch/cms/trivcat/store/user/feichtinger/testA01
</verbatim>

If you developed your =lcg-cp= commands omitting the =-b -D srmv2= options and these commands suddenly stop working, it may be necessary to temporarily switch the information server from =bdii-fzk.gridka.de= to =lcg-bdii.cern.ch= ; the error symptom will look like :
<pre>
$ lcg-cp 'srm://storage01.lcg.cscs.ch:8443/srm/managerv2?SFN=/pnfs/lcg.cscs.ch/cms/trivcat/store/user/clange/GravitonToHH_4b_M-1000_TuneZ2star_8TeV-Madgraph_pythia6_Summer12_DR53X-PU_S10_START53_V7C-v1_AODSIM/CombinedSVV2RecoVertex_B_6_1_qpt.root' 'srm://t3se01.psi.ch:8443/srm/managerv2?SFN=/pnfs/psi.ch/cms/trivcat/store/user/martinelli_f/myfile2.root'
%RED%[GFAL][get_se_types_and_endpoints][] [BDII][g1_sd_get_se_types_and_endpoints]: No available information
lcg_cp: Invalid argument
%ENDCOLOR%
</pre>
and the workaround to fix it :
<pre>
$ export LCG_GFAL_INFOSYS=lcg-bdii.cern.ch:2170
$ lcg-cp ...
</pre>
---+++ !SRMv2 using srmcp, srmls, etc. - I/O queue =wan= ( legacy )

*This section is kept ONLY as a historical reference.*

*Note* : The =srmcp, srmls, srmrm= suite of commands uses Java, which has the tendency to allocate *big amounts of memory*. This causes problems with the memory limits on our systems and on the other Grid sites, and can cause *your jobs to get killed*. Try to use the gfal tools instead of the =srm*= commands.

Listing a file or dir :
<verbatim>
srmls srm://t3se01.psi.ch:8443/srm/managerv2?SFN=/pnfs/psi.ch/cms/trivcat/store/user
</verbatim>

Downloading a file :
<verbatim>
srmcp -2 -globus_tcp_port_range 20000,25000 --streams_num=1 srm://t3se01.psi.ch:8443/srm/managerv2?SFN=//pnfs/psi.ch/cms/testing/test100 file:////tmp/test100
</verbatim>

*Note:*
   1 The =-globus_tcp_port_range 20000,25000= argument may regrettably be necessary because of a failure of the SRM client to correctly interpret an environment variable.
   1 The =--streams_num=1= setting will transfer the file over one single connection ( the default of 10 parallel streams only makes sense in slow WAN environments, and the connectivity can also be problematic for the required back connections ). One stream is the safest setting.
   1 If you are a *tcsh* user, you need to put the URL in quotes, because the "?" will otherwise be interpreted as a wildcard and you will get a *No match* error.

---++ Getting data from remote SEs to the T3 SE

---+++ Official datasets

For official datasets/blocks that are *registered* in CMS DBS you *must* use the [[HowToOrderData][PhEDEx system]].

---+++ Private datasets

The recommended way to transfer private datasets ( non-CMSSW ntuples ) between sites is the File Transfer System (FTS) ( [[http://fts3-docs.web.cern.ch/fts3-docs/docs/cli/cli.html][documentation]] ). In a nutshell, you need to prepare a file that contains lines like =protocol://source protocol://destination=. An example file is the following :
<pre>
gsiftp://storage01.lcg.cscs.ch//pnfs/lcg.cscs.ch/cms/trivcat/store/user/jpata/tth/Sep29_v1/ttHTobb_M125_13TeV_powheg_pythia8/Sep29_v1/161003_055207/0000/tree_973.root gsiftp://t3se01.psi.ch//pnfs/psi.ch/cms/trivcat/store/user/jpata/tth/Sep29_v1/ttHTobb_M125_13TeV_powheg_pythia8/Sep29_v1/161003_055207/0000/tree_973.root
gsiftp://storage01.lcg.cscs.ch//pnfs/lcg.cscs.ch/cms/trivcat/store/user/jpata/tth/Sep29_v1/ttHTobb_M125_13TeV_powheg_pythia8/Sep29_v1/161003_055207/0000/tree_981.root gsiftp://t3se01.psi.ch//pnfs/psi.ch/cms/trivcat/store/user/jpata/tth/Sep29_v1/ttHTobb_M125_13TeV_powheg_pythia8/Sep29_v1/161003_055207/0000/tree_981.root
</pre>
Then you can submit the transfer with :
<pre>
$ fts-transfer-submit -s https://fts3-pilot.cern.ch:8446 -f files.txt
a360e11e-ab3b-11e6-8fe7-02163e00a39b
</pre>
You will get back an ID string, which you can use to monitor your transfer on the site https://fts3-pilot.cern.ch:8449/fts3/ftsmon/# ; the transfer will proceed in parallel, and you can also specify resubmission options for failed jobs.
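The same ID can also be queried from the command line with =fts-transfer-status= ; a sketch, reusing the example ID returned above :
<pre>
$ fts-transfer-status -s https://fts3-pilot.cern.ch:8446 a360e11e-ab3b-11e6-8fe7-02163e00a39b
</pre>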
The site prefixes can typically be found out from existing transfer logs on the grid, or by inspecting the files in :
<pre>
/cvmfs/cms.cern.ch/SITECONF/T2_CH_CSCS/JobConfig/site-local-config.xml
/cvmfs/cms.cern.ch/SITECONF/T2_CH_CSCS/PhEDEx/storage.xml
</pre>
Some useful examples are below :
<pre>
gsiftp://storage01.lcg.cscs.ch//pnfs/lcg.cscs.ch/cms/trivcat
gsiftp://t3se01.psi.ch//pnfs/psi.ch/cms/trivcat
gsiftp://eoscmsftp.cern.ch//eos/cms/
</pre>

---++ Job stageout from other remote sites

You can try to stage out your CRAB3 job outputs directly to T3_CH_PSI, but if these transfers turn out to be too slow and/or unreliable, then stage out first to T2_CH_CSCS and afterwards copy your files to T3_CH_PSI by Leonardo Sala's [[https://twiki.cern.ch/twiki/bin/view/Main/LSDataReplica][data_replica.py]] or any other Grid tool able to transfer files in parallel between two sites.

---++ =/pnfs= dirs and files permissions

At the time you requested a T3 account you provided your X509 =DN=, namely a string like =/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=accountname/CN=706134/CN=Name Surname=, which you can always retrieve by running on a UI :
<pre>
$ voms-proxy-info | grep identity
identity  : /DC=com/DC=quovadisglobal/DC=grid/DC=switch/DC=users/C=CH/O=Paul-Scherrer-Institut (PSI)/CN=Fabio Martinelli
</pre>
We also created your SE dir =/pnfs/psi.ch/cms/trivcat/store/user/%BLUE%accountname%ENDCOLOR%= and granted the write permission *just to you* ; this permission prevents the other users from deleting your files. By default even your group members won't be able to alter your own SE dir, as shown in the following example :
<pre>
$ srm-get-permissions srm://t3se01.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/%BLUE%username%ENDCOLOR%
# file  : srm://t3se01.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/%BLUE%username%ENDCOLOR%
# owner : 2980
owner:2980:R%GREEN%W%ENDCOLOR%X    <---- 2980 is the UID
user:2980:R%GREEN%W%ENDCOLOR%X
group:500:RX               <---- no group write ; 500 is the GID
other:RX
</pre>
The group write permission is switched on for the group dirs instead :
<pre>
$ srm-get-permissions srm://t3se01.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/%BLUE%b-physics%ENDCOLOR%
# file  : srm://t3se01.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/%BLUE%b-physics%ENDCOLOR%
# owner : 501
owner:501:R%GREEN%W%ENDCOLOR%X
user:501:R%GREEN%W%ENDCOLOR%X
group:500:R%GREEN%W%ENDCOLOR%X    <---- each member of the group can upload and delete files ; they can also create new subdirs
other:RX
</pre>
If you need to create a =/pnfs= dir where the group members can also write and delete files, you can proceed in this way :
<pre>
$ srmmkdir srm://t3se01.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/username/%BLUE%TESTDIR%ENDCOLOR%
$ srm-get-permissions srm://t3se01.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/username/%BLUE%TESTDIR%ENDCOLOR%
# file  : srm://t3se01.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/username/%BLUE%TESTDIR%ENDCOLOR%
# owner : 2980
owner:2980:R%GREEN%W%ENDCOLOR%X
user:2980:R%GREEN%W%ENDCOLOR%X
group:500:RX    <---- no group write, yet
other:RX
$ %RED%srm-set-permissions%ENDCOLOR% -type=ADD -group=R%GREEN%W%ENDCOLOR%X srm://t3se01.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/username/%BLUE%TESTDIR%ENDCOLOR%
$ srm-get-permissions srm://t3se01.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/username/%BLUE%TESTDIR%ENDCOLOR%
# file  : srm://t3se01.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/username/%BLUE%TESTDIR%ENDCOLOR%
# owner : 2980
owner:2980:R%GREEN%W%ENDCOLOR%X
user:2980:R%GREEN%W%ENDCOLOR%X
group:500:R%GREEN%W%ENDCOLOR%X    <---- now your group members can write files and dirs
other:RX
</pre>
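Since =/pnfs= is mounted Read-Only on the UIs, a quick cross-check of the resulting permissions can also be done with plain =ls= ; a sketch, reusing the =TESTDIR= name from the example above :
<pre>
$ ls -ld /pnfs/psi.ch/cms/trivcat/store/user/$USER/TESTDIR
</pre>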
For obvious security reasons no user can create or remove dirs and files directly inside the top-level user dir =/pnfs/psi.ch/cms/trivcat/store/user= ; e.g. this =srmmkdir= command correctly fails :
<pre>
$ srmmkdir srm://t3se01//pnfs/psi.ch/cms/trivcat/store/user/%BLUE%TESTDIR%ENDCOLOR%
Return code: SRM_AUTHORIZATION_FAILURE
Explanation: srm://t3se01//pnfs/psi.ch/cms/trivcat/store/user/%BLUE%TESTDIR%ENDCOLOR% : %RED%Permission denied%ENDCOLOR%
</pre>

---++ =/pnfs= dirs cleanup

---+++ T3_CH_PSI

Each T3 user *must* remove his/her old dirs from =/pnfs= ; in order to quickly select and delete the unnecessary dirs, log in on a UI ( this is crucial for the correct $USER resolution ) and prepare the file =/scratch/$USER/recursive.rm.pnfs= :
<pre>
# Extract your /pnfs dirs and save them in /scratch/$USER/recursive.rm.pnfs
$ curl http://t3mon.psi.ch/PSIT3-custom/v_pnfs_top_dirs.txt 2> /dev/null | egrep $USER | awk {' print "uberftp t3se01.psi.ch \047rm -r "$15"\047"}' > /scratch/$USER/recursive.rm.pnfs

# Erase from /scratch/$USER/recursive.rm.pnfs all the /pnfs dirs that you want to PRESERVE !! All the remaining /pnfs dirs will be recursively DELETED !! There are NO backups for /pnfs files !!
$ vim /scratch/$USER/recursive.rm.pnfs

# Run the recursive deletions ; a CMS proxy is needed
$ source /scratch/$USER/recursive.rm.pnfs
</pre>
You might start with a small fragment of =/scratch/$USER/recursive.rm.pnfs= and check how it behaves before running a big deletion campaign.

---+++ T2_CH_CSCS

Regularly monitor your disk usage :
<pre>
$ curl http://ganglia.lcg.cscs.ch/ganglia/files_cms.html | grep $USER
</pre>
or use the [[https://wiki.chipp.ch/twiki/pub/CmsTier3/HowToAccessSe/T2_Storage.py.txt][T2_Storage.py]] notebook/python script. In order to clean up, use =uberftp= as explained above.

---++ [[BasicUnderstandingOfTheDCacheForAdvancedUser][Understanding of dCache for advanced users]]