<!-- keep this as a security measure:
#uncomment if the subject should only be modifiable by the listed groups
#   * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.CMSAdminGroup
#   * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.CMSAdminGroup
#uncomment this if you want the page only be viewable by the listed groups
#   * Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.CMSAdminGroup,Main.CMSAdminReaderGroup
-->

*OBSOLETE INFORMATION*

---+ Basic understanding of dCache for advanced users (archived pages)

The Storage Element (SE) =t3se01.psi.ch= runs [[https://www.dcache.org/][dCache]], a Grid storage middleware that transparently combines the space provided by tens of fileservers into a single namespace called =/pnfs=. On top of it, [[https://www.dcache.org/][dCache]] offers the Grid protocols =dcap gsidcap root srm gsiftp= so that the Grid tools =lcg-cp xrdcp srm-cp dccp gfal-copy ...= can upload files to and download files from this single namespace.

Whenever a new file is written to =/pnfs=, either by a T3 user or by the PhEDEx service, [[https://www.dcache.org/][dCache]] randomly selects a filesystem to host it, regardless of the Grid protocol, the Grid tool, the user or the server used for the upload. Typically several files are written to the same =/pnfs= subdir, so the files inside that subdir end up randomly spread over all the available filesystems. This distribution balances the load across the available filesystems and helps to avoid I/O bottlenecks.

Over time, newer, bigger and faster filesystems and fileservers replace their older peers, but a [[https://www.dcache.org/][dCache]] administrator performs all these operations transparently behind the scenes. The T3 users will not notice this maintenance, but they will be affected by a [[https://www.dcache.org/][dCache]] software upgrade or by a major fault on a fileserver.

[[https://www.dcache.org/][dCache]] is not the first middleware to aggregate heterogeneous fileservers (see e.g. [[https://www.gluster.org/][Gluster]] or [[http://www.orangefs.org/][OrangeFS]]), nor probably the best one (e.g. it cannot split a file into distributed chunks as [[https://www.gluster.org/][Gluster]] can), but it supports the Grid context very well (VOMS authorizations, X509 management, Space Token support, Grid protocols, ...), so it is a good choice for our specific needs.

The versatility of the Grid protocols and Grid tools offered by a [[https://www.dcache.org/][dCache]] setup often confuses new T3 users:
   * it is not always well integrated with third-party software like ROOT / hadd / CMSSW ;
   * it behaves differently depending on whether the file access comes from the T3 LAN or from a remote Internet site (WAN access), whether the CMSSW environment is loaded, and whether the CRAB environment is loaded ;
   * requests are processed according to different policies set by the T3 admins.
A non-negligible learning period is therefore required to grasp all these protocols, tools and corner cases.

To get the best out of the T3 SE service, at least a basic understanding of the [[https://www.dcache.org/][dCache]] internals is needed (a comparable effort is needed to use a new batch system properly). The basic unit of a [[https://www.dcache.org/][dCache]] setup is a single filesystem, necessarily hosted on a single fileserver. By its nature, each filesystem can sustain a certain number of concurrent streaming operations, like downloading a 1GB =.root= file, and a higher number of concurrent interactive operations, like opening a =.root= file from a batch job to read a fraction of it, do some computing, and after a while read another fraction.

To differentiate these I/O cases, [[https://www.dcache.org/][dCache]] offers a [[https://en.wikipedia.org/wiki/FIFO_(computing_and_electronics)][FIFO]] I/O queue system per filesystem. It is up to the [[https://www.dcache.org/][dCache]] administrator to select a reasonable threshold for both the streaming and the interactive case; at T3 those are max 4 streaming operations and max ~100 concurrent interactive operations. Further I/O requests are queued in their specific I/O queue and will not start until an I/O slot becomes available. A T3 user will notice these "stuck" cases because his/her file request will not start as usual.
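
The slot accounting described above can be illustrated with a minimal sketch. It is only a toy model and not how [[https://www.dcache.org/][dCache]] is implemented: the function name =queue_state= is hypothetical, and the limits 4 and 100 are the T3 thresholds quoted above.

```shell
#!/bin/sh
# Toy model of ONE I/O queue on ONE filesystem (hypothetical helper,
# not a dCache command): at most MAX requests are active at the same
# time, every further request waits in the FIFO queue.
# Usage: queue_state REQUESTS MAX
queue_state() {
    requests=$1
    max=$2
    if [ "$requests" -lt "$max" ]; then
        active=$requests
    else
        active=$max
    fi
    queued=$((requests - active))
    echo "active=${active} queued=${queued}"
}

# 6 concurrent streaming requests against the 4 streaming slots
queue_state 6 4        # prints: active=4 queued=2
# 120 concurrent interactive requests against the ~100 interactive slots
queue_state 120 100    # prints: active=100 queued=20
```

A request counted as =queued= in this model is what a T3 user perceives as a "stuck" file request.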
If one of your file requests looks stuck like this, write immediately to =cms-tier3 AT lists.psi.ch= because 99% of the time that will be an error.

More than one Grid protocol can be mapped to the same I/O queue by the [[https://www.dcache.org/][dCache]] administrator; for instance, at T3 the =dcap gsidcap= Grid protocols share the same interactive I/O queue =regular=. This overlap mitigates the lack of a comprehensive I/O queue limits system, since it implicitly implements the limit "max 100 dcap OR gsidcap connections" in the I/O queue =regular= associated with a single filesystem. Ideally a [[https://www.dcache.org/][dCache]] administrator would instead create one I/O queue for each Grid protocol and define a list of constraints involving more than a single I/O queue. Presently [[https://www.dcache.org/][dCache]] *CAN'T* enforce limits like:
   1. max active I/O slots per filesystem across all the I/O queues using that filesystem ; it is only possible to define an isolated max active I/O slots per I/O queue
   1. max active I/O *user* slots for a specific I/O queue
   1. max active I/O *user* slots for all the I/O queues with the same name
   1. max active I/O *user* slots for all the I/O queues
   1. max active I/O *user* space in =/pnfs=
Because none of these limits can be applied, a single misbehaving user will affect the T3 SE service globally; case 2 in particular has occurred more than once in the past. Only the T3 administrators are able to fix these cases, usually by identifying the culprit, killing his/her computational jobs and explaining what was wrong.

*The I/O queues system and the Grid protocols mapping*

%TWISTY{ mode="div" showlink="Show..."
hidelink="Hide" showimgleft="%ICONURLPATH{toggleopen-small}%" hideimgleft="%ICONURLPATH{toggleclose-small}%" }%
| *Grid protocol* | *t3server* | *Filesystem I/O queue* | *Max active slots for that I/O queue* | *Grid protocol/t3server endpoint reachable from Internet?* |
| =dcap= | =t3se01.psi.ch= | =regular= | 100 | No |
| =gsidcap= | =t3se01.psi.ch= | =regular= | 100 | No |
| =root= | =t3se01.psi.ch= | =wan= | 4 | Yes |
| =gsiftp= | =t3se01.psi.ch= | =wan= | 2 | Yes |
| =srm= ( i.e. again =gsiftp= ) | =t3se01.psi.ch= | =wan= | 2 | Yes |
| =dcap= | =t3dcachedb03.psi.ch= | none | 0 | No |
| =gsidcap= | =t3dcachedb03.psi.ch= | none | 0 | No |
| =root= | =t3dcachedb03.psi.ch= | =regular= | 100 | No |
| =gsiftp= | =t3dcachedb03.psi.ch= | none | 0 | No |
| =srm= ( i.e. again =gsiftp= ) | =t3dcachedb03.psi.ch= | none | 0 | No |
%ENDTWISTY%

The following =watch= command (to be executed on a =t3ui= server) shows both the filesystems and their I/O queues. The =Movers= column reports the sum of the =Restores Stores P2P-Server P2P-Client %BLUE%regular wan xrootd%ENDCOLOR%= Active/Max/Queued counters; a T3 user is affected only by the =%BLUE%regular wan xrootd%ENDCOLOR%= traffic:<br />
%TWISTY{ mode="div" showlink="Show..."
hidelink="Hide" showimgleft="%ICONURLPATH{toggleopen-small}%" hideimgleft="%ICONURLPATH{toggleclose-small}%" }%
<pre>
$ watch --interval=1 --differences 'lynx -dump -width=800 http://t3dcachedb.psi.ch:2288/queueInfo | grep -v __________ | grep -v ops '
</pre>
%ENDTWISTY%

---+++ globus-url-copy

---++++ Copying a dir between two GridFTP servers - serial method

The [[http://linux.die.net/man/1/globus-url-copy][globus-url-copy]] tool can copy a single file, several files, or *recursively ( but serially ) a whole dir* from one GridFTP server to another. The file transfer occurs directly between the two GridFTP servers. You have to know the absolute paths on both the sender and the receiver side. In the next example we copy the dir:
   * =gsiftp://stormgf2.pi.infn.it%BLUE%/gpfs/ddn/srm/cms%ENDCOLOR%/store/user/arizzi/%ORANGE%VHBBHeppyV12%ENDCOLOR%/=
into:
   * =gsiftp://t3se01.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/martinelli_f/%ORANGE%VHBBHeppyV12%ENDCOLOR%/=

The path prefix %BLUE%/gpfs/ddn/srm/cms/%ENDCOLOR% was discovered with a =uberftp gsiftp://stormgf2.pi.infn.it= session. If you are in doubt, contact the T3 administrators and we will help you to identify this kind of prefix. At T3 and T2 the absolute paths are always =/pnfs/psi.ch/cms= and =/pnfs/lcg.cscs.ch/cms= respectively.

The dir copy example:
<pre>
$ globus-url-copy -continue-on-error -rst -nodcau -fast -vb -v -cd -r gsiftp://stormgf2.pi.infn.it%BLUE%/gpfs/ddn/srm/cms%ENDCOLOR%/store/user/arizzi/%ORANGE%VHBBHeppyV12%ENDCOLOR%/ gsiftp://t3se01.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/martinelli_f/%ORANGE%VHBBHeppyV12%ENDCOLOR%/
Source: gsiftp://stormgf2.pi.infn.it%BLUE%/gpfs/ddn/srm/cms%ENDCOLOR%/store/user/arizzi/%ORANGE%VHBBHeppyV12%ENDCOLOR%/
Dest:   gsiftp://t3se01.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/martinelli_f/%ORANGE%VHBBHeppyV12%ENDCOLOR%/
  DYJetsToLL_M-50_HT-%GREEN%100%ENDCOLOR%to%GREEN%200%ENDCOLOR%_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/
Source: gsiftp://stormgf2.pi.infn.it%BLUE%/gpfs/ddn/srm/cms%ENDCOLOR%/store/user/arizzi/%ORANGE%VHBBHeppyV12%ENDCOLOR%/
Dest:   gsiftp://t3se01.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/martinelli_f/%ORANGE%VHBBHeppyV12%ENDCOLOR%/
  DYJetsToLL_M-50_HT-%GREEN%200%ENDCOLOR%to%GREEN%400%ENDCOLOR%_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/
...
</pre>

---++++ Copying a dir between two GridFTP servers by GNU parallel

The tools [[http://toolkit.globus.org/toolkit/docs/latest-stable/gridftp/user/#globus-url-copy][globus-url-copy]], [[http://linux.die.net/man/1/uberftp][uberftp]] and [[http://www.gnu.org/software/parallel/][GNU parallel]] can be used together to copy, *in parallel*, a dir between two GridFTP servers; in this example a %BLUE%C.Galloni%ENDCOLOR% /pnfs dir is copied into a %ORANGE%MDefranc%ENDCOLOR% /pnfs dir. No files are routed through the server running the globus-url-copy commands (e.g. your UI, or a WN). Furthermore, since in a Grid environment each GridFTP server often acts as a transparent proxy to more than one GridFTP server, the copies occur between a 2x2 *matrix* of GridFTP servers; any bottleneck in the parallelism is more likely to come from the limited bandwidth available between the 2 data centres than from the total number of GridFTP servers involved.

It is not compulsory, but we recommend running all the globus-url-copy commands in a [[https://www.gnu.org/software/screen/manual/screen.html][screen -L]] session, so that the copies are not interrupted just because the connection to the server where you started them is cut. In any case, it is safe to repeat the same globus-url-copy commands over and over again.

---+++++ Copying a T3 /pnfs dir into another T3 /pnfs dir ( use case requested by the users just once )

First we generate the globus-url-copy commands to be passed as input to [[http://www.gnu.org/software/parallel/][GNU parallel]] and save them in the file =tobecopied=; afterwards we start them in *parallel*. The number of parallel globus-url-copy commands can be chosen freely with the [[http://www.gnu.org/software/parallel/][GNU parallel]] parameter =%RED%-j N%ENDCOLOR%=. Each globus-url-copy command consumes a CPU core on the server where you run it, so do not set a =-j= value greater than the number of CPU cores available there:
<pre>
$ # list the .root files, print one globus-url-copy command per file,
$ # and rewrite only the 2nd user name on each line ( the destination )
$ uberftp -ls -r gsiftp://t3se01.psi.ch/pnfs/psi.ch/cms/trivcat/store/user/%BLUE%cgalloni%ENDCOLOR%//RunII/Ntuple_080316/ | grep '\.root$' | awk '{ print "globus-url-copy -v -cd gsiftp://t3se01.psi.ch/"$8" gsiftp://t3se01.psi.ch/"$8 }' | sed 's/%BLUE%cgalloni%ENDCOLOR%/%ORANGE%mdefranc%ENDCOLOR%/2' > tobecopied
$ # 10 parallel globus-url-copy
$ cat tobecopied | parallel %RED%-j 10%ENDCOLOR%
</pre>

---+++++ Copying a T2 /pnfs dir into a T3 /pnfs dir ( recurring use case )

Because this time the source site is different from the destination site, we can increase the [[http://www.gnu.org/software/parallel/][GNU parallel]] parameter from =-j 10= to, for instance, =-j 30=; for a copy from a T1/T2 to a T2 you might set =-j 50=. Regrettably, an ordinary user has no way to compute the optimal =-j=. Again, you might want to start the copies in a [[https://www.gnu.org/software/screen/manual/screen.html][screen -L]] session, but it is not compulsory.
<pre>
$ # the 2nd sed rewrites only the 3rd lcg.cscs.ch on each line ( the destination path )
$ uberftp -ls -r gsiftp://storage01.lcg.cscs.ch//pnfs/lcg.cscs.ch/cms/trivcat/store/user/%BLUE%cgalloni%ENDCOLOR%/Ntuple_290216/WJetsToQQ_HT-600ToInf_TuneCUETP8M1_13TeV-madgraphMLM-pythia8/ | grep '\.root$' | awk '{ print "globus-url-copy -v -cd gsiftp://storage01.lcg.cscs.ch//"$8" gsiftp://t3se01.psi.ch/"$8 }' | sed 's/%BLUE%cgalloni%ENDCOLOR%/%ORANGE%mdefranc%ENDCOLOR%/2' | sed 's/lcg\.cscs\.ch/psi.ch/3' > tobecopied
$ # 30 parallel globus-url-copy
$ cat tobecopied | parallel %RED%-j 30%ENDCOLOR%
</pre>

-- Main.NinaLoktionova - 2018-11-20
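
The command-generation step and the =-j= rule of thumb from the two examples above can also be tried without touching any GridFTP server. The following is only a sketch under assumptions: the helper names =gen_copy_cmds= and =cap_jobs= are hypothetical, the file name =file1.root= is made up, the input is assumed to be one absolute =/pnfs= path per line (not raw =uberftp -ls= output), only the user dir is rewritten (same-site T3 case), and =nproc= from GNU coreutils is assumed to be installed.

```shell
#!/bin/sh
# Hypothetical helper: reads absolute /pnfs paths on stdin ( one per line ),
# keeps the .root files and prints one globus-url-copy command per file,
# replacing the source user dir with the destination user dir.
# Usage: gen_copy_cmds SRC_URL DST_URL SRC_USER DST_USER
gen_copy_cmds() {
    src_url=$1 dst_url=$2 src_user=$3 dst_user=$4
    grep '\.root$' | while IFS= read -r path; do
        dst_path=$(printf '%s\n' "$path" | sed "s|/${src_user}/|/${dst_user}/|")
        printf 'globus-url-copy -v -cd %s%s %s%s\n' \
            "$src_url" "$path" "$dst_url" "$dst_path"
    done
}

# Hypothetical helper: never ask for more parallel jobs than CPU cores
# ( assumes GNU coreutils nproc is available on this server ).
# Usage: cap_jobs WANTED
cap_jobs() {
    wanted=$1
    cores=$(nproc)
    if [ "$wanted" -gt "$cores" ]; then
        echo "$cores"
    else
        echo "$wanted"
    fi
}

# Dry run: only prints the command, nothing is copied ( file1.root is made up )
printf '%s\n' /pnfs/psi.ch/cms/trivcat/store/user/cgalloni/RunII/Ntuple_080316/file1.root \
    | gen_copy_cmds gsiftp://t3se01.psi.ch gsiftp://t3se01.psi.ch cgalloni mdefranc

# The real run would then be, for instance:
#   parallel -j "$(cap_jobs 10)" < tobecopied
```

Inspecting the printed commands before passing them to [[http://www.gnu.org/software/parallel/][GNU parallel]] is a cheap way to catch a wrong source or destination path.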
Topic revision: r3 - 2022-11-11 - DerekFeichtinger
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.