<!-- keep this as a security measure:
   * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.LCGAdminGroup
   * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.LCGAdminGroup
#uncomment this if you want the page only be viewable by the internal people
#   * Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.LCGAdminGroup
-->

%TOC%

%ICON{arrowleft}% Go to [[CMSSiteLogXX][previous page]] / [[CMSSiteLogXX][next page]] of CMS site log %M%

---+ 17. 09. 2009 lcg-cp stageout problems from CRAB jobs

*NOTE*: This problem was reported in [[https://hypernews.cern.ch/HyperNews/CMS/get/crabFeedback/2431.html][this hypernews item]]. It is tracked in [[https://savannah.cern.ch/support/index.php?109984][this Savannah support request]]. It was also submitted to the dcache support list on 2009-09-18 as tracker item #5109.

Andrea Rizzi and Andreas Schaetti reported stageout failures from their CRAB jobs. The relevant part of the CRAB log output is:

<pre %FILESTYLE%>
########## contents of SE interaction
2009-09-17 15:15:12.751466: Executed: lcg-ls -b -D srmv2 -t 2400 --verbose srm://storage01.lcg.cscs.ch:8443/srm/managerv2?SFN=/pnfs/lcg.cscs.ch/cms/trivcat/store/user/arizzi/WH_HTobb_Pt100_M115_GEN_v2/WH_HTobb_Pt100_M115_GEN_v2/3804f52f25a016d6eb88c4371b906f7b/hwbbar115_10TeV_GEN_MC_2.root
Done with exit code: 256 and output:
Warning: -t,--timeout is deprecated! Use --timeout-* options instead
/pnfs/lcg.cscs.ch/cms/trivcat/store/user/arizzi/WH_HTobb_Pt100_M115_GEN_v2/WH_HTobb_Pt100_M115_GEN_v2/3804f52f25a016d6eb88c4371b906f7b/hwbbar115_10TeV_GEN_MC_2.root: [SE][Ls][SRM_INVALID_PATH] could not get storage info by path : CacheException(rc=10001;msg=path /pnfs/fs/usr/cms/trivcat/store/user/arizzi/WH_HTobb_Pt100_M115_GEN_v2/WH_HTobb_Pt100_M115_GEN_v2/3804f52f25a016d6eb88c4371b906f7b/hwbbar115_10TeV_GEN_MC_2.root not found ( .(id)(hwbbar115_10TeV_GEN_MC_2.root) ))
SE type: SRMv2
2009-09-17 15:15:13.890772: Executed: lcg-cp --verbose --vo=cms -b -D srmv2 -t 2400 --verbose file:///home/egee/cms074/globus-tmp.wn36.5872.0/https_3a_2f_2fwms213.cern.ch_3a9000_2fvgkrUkMs0YPpBCGY4QTPjg/CMSSW_3_1_2/hwbbar115_10TeV_GEN_MC_2.root srm://storage01.lcg.cscs.ch:8443/srm/managerv2?SFN=/pnfs/lcg.cscs.ch/cms/trivcat/store/user/arizzi/WH_HTobb_Pt100_M115_GEN_v2/WH_HTobb_Pt100_M115_GEN_v2/3804f52f25a016d6eb88c4371b906f7b/hwbbar115_10TeV_GEN_MC_2.root
Done with exit code: 256 and output:
Warning: -t,--timeout is deprecated! Use --timeout-* options instead
Using grid catalog type: UNKNOWN
Using grid catalog : (null)
VO name: cms
Checksum type: None
Destination SE type: SRMv2
[SE][Mkdir][SRM_INVALID_PATH] srm://storage01.lcg.cscs.ch:8443/srm/managerv2?SFN=/pnfs/lcg.cscs.ch/cms/trivcat/store/user/arizzi/WH_HTobb_Pt100_M115_GEN_v2/WH_HTobb_Pt100_M115_GEN_v2/3804f52f25a016d6eb88c4371b906f7b/hwbbar115_10TeV_GEN_MC_2.root: srm://storage01.lcg.cscs.ch:8443/srm/managerv2?SFN=/pnfs/lcg.cscs.ch/cms/trivcat/store/user/arizzi/WH_HTobb_Pt100_M115_GEN_v2/WH_HTobb_Pt100_M115_GEN_v2/3804f52f25a016d6eb88c4371b906f7b : %GREEN%parent path or a component of the parent path does not exist
lcg_cp: No such file or directory%ENDCOLOR%
</pre>

Andrea Rizzi's user directory exists, but none of the subdirectories do. It seems that =lcg-cp= does not automatically create all the subdirectories required for a request. The jobs seem to run fine at T2_IT_Pisa.

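If =lcg-cp= really does not create the missing layers, one conceivable workaround is to create them explicitly (for example with =srmmkdir=, which is used further down this page) before copying. The following is only a minimal sketch of the path handling such a workaround would need. The function name =srm_parent_dirs= is made up for illustration, the function only manipulates strings and contacts no storage element, and the =srmmkdir= loop in the trailing comment is an untested assumption.

```shell
#!/bin/sh
# Sketch: list every directory that must exist below an already existing
# base directory before a file can be written to the given SRM URL.
# Pure string handling; nothing here talks to an SE.

# srm_parent_dirs BASE_URL TARGET_URL   (hypothetical helper)
#   BASE_URL    SRM URL of a directory that is known to exist
#   TARGET_URL  SRM URL of the file to be written somewhere below BASE_URL
# Prints one directory URL per line, outermost first.
srm_parent_dirs() {
    local base="${1%/}"
    local target="$2"
    local rel dir part

    # The target must live below the base directory.
    case "$target" in
        "$base"/*) ;;
        *) echo "target is not below base" >&2; return 1 ;;
    esac

    rel="${target#"$base"/}"    # e.g. derekdir1/derekdir2/lcg-cp-derek1
    dir="$base"
    # Walk all path components except the last one (the file name).
    while [ "$rel" != "${rel#*/}" ]; do
        part="${rel%%/*}"
        rel="${rel#*/}"
        [ -n "$part" ] || continue    # skip empty components from "//"
        dir="$dir/$part"
        echo "$dir"
    done
}

# Possible use (assumption, not tested against any SE):
#   srm_parent_dirs "$BASE" "$TARGET" | while read -r d; do srmmkdir "$d"; done
# srmmkdir is expected to complain about directories that already exist,
# so a real script would have to tolerate that error.
```
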
---++ lcg-cp refuses to create more than one subdirectory layer at T2_CH_CSCS - this seems intentional!

=lcg-cp= (executed from CSCS-UI) works when one subdirectory has to be created implicitly, but fails when two have to be created. This behavior seems to be intentional: dcache responds with a specific error message saying that it cannot create the nested directory because the parent directory is not there.

I was able to confirm the path creation behavior in a few tests. Note that our site was running dcache-1.9.3-3 at the time of these tests.

   * %Y% First I confirm that the path =/pnfs/lcg.cscs.ch/cms/local_tests= exists <pre>
lcg-ls -b -D srmv2 --srm-timeout 2400 --verbose srm://storage01.lcg.cscs.ch:8443/srm/managerv2?SFN=/pnfs/lcg.cscs.ch/cms/local_tests
SE type: SRMv2
/pnfs/lcg.cscs.ch/cms/local_tests/automatic_test-20080904-2021-8387-srm2b
/pnfs/lcg.cscs.ch/cms/local_tests/automatic_test-20081207-1239-8889-gftp
...
</pre>
   * %ICON{"choice-no"}% Now I try to copy a file into two nested subdirectories below this directory. This fails at the same =Mkdir= step as the CRAB job (here reported as =SRM_FAILURE=) <pre>
lcg-cp --verbose --vo=cms -b -D srmv2 -t 2400 --verbose file:///tmp/dcachetest-20090917-1352-3942/srcfile srm://storage01.lcg.cscs.ch:8443/srm/managerv2?SFN=/pnfs/lcg.cscs.ch/cms/local_tests/%GREEN%derekdir1/derekdir2/lcg-cp-derek1%ENDCOLOR%
Warning: -t,--timeout is deprecated! Use --timeout-* options instead
Using grid catalog type: UNKNOWN
Using grid catalog : (null)
VO name: cms
Checksum type: None
Destination SE type: SRMv2
[SE][Mkdir][SRM_FAILURE] srm://storage01.lcg.cscs.ch:8443/srm/managerv2?SFN=/pnfs/lcg.cscs.ch/cms/local_tests/derekdir1/derekdir2/lcg-cp-derek1: srm://storage01.lcg.cscs.ch:8443/srm/managerv2?SFN=/pnfs/lcg.cscs.ch/cms/local_tests/derekdir1/derekdir2 Failed to create, got error return code from pnfs: path /pnfs/fs/usr/cms/local_tests/derekdir1/derekdir2 not found ( .(id)(derekdir2) )
lcg_cp: Invalid argument
</pre>
   * %Y% Now I try the same copy, but with only one subdirectory in the request, and this succeeds <pre>
lcg-cp --verbose --vo=cms -b -D srmv2 -t 2400 --verbose file:///tmp/dcachetest-20090917-1352-3942/srcfile srm://storage01.lcg.cscs.ch:8443/srm/managerv2?SFN=/pnfs/lcg.cscs.ch/cms/local_tests/%GREEN%derekdir1/lcg-cp-derek1%ENDCOLOR%
Warning: -t,--timeout is deprecated! Use --timeout-* options instead
Using grid catalog type: UNKNOWN
Using grid catalog : (null)
VO name: cms
Checksum type: None
Destination SE type: SRMv2
Destination SRM Request Token: -2136239017
Source URL: file:/tmp/dcachetest-20090917-1352-3942/srcfile
File size: 51200
Source URL for copy: file:/tmp/dcachetest-20090917-1352-3942/srcfile
Destination URL: gsiftp://se16.lcg.cscs.ch:2811//pnfs/lcg.cscs.ch/cms/local_tests/derekdir1/lcg-cp-derek1
# streams: 1
51200 bytes 49.72 KB/sec avg 49.72 KB/sec inst
Transfer took 2020 ms
</pre>

---++ lcg-cp correctly creates multiple subdirectory layers at T2_IT_Pisa

Here I can confirm that the creation of two layers of subdirectories works at T2_IT_Pisa. =lcg-cp= is again executed from CSCS-UI, so any differences observed must be attributed to the SE.

   * %Y% Creation of a test user directory for my username <pre>
srmmkdir srm://cmsdcache.pi.infn.it:8443/srm/managerv2?SFN=/pnfs/pi.infn.it/data/cms/store/user/dfeichti
</pre>
   * %Y% Transfer of a simple file <pre>
lcg-cp --verbose --vo=cms -b -D srmv2 -t 2400 --verbose file:///tmp/dcachetest-20090917-1205-24206/srcfile srm://cmsdcache.pi.infn.it:8443/srm/managerv2?SFN=/pnfs/pi.infn.it/data/cms/store/user/dfeichti/lcg-cp-derek5
Warning: -t,--timeout is deprecated! Use --timeout-* options instead
Using grid catalog type: UNKNOWN
Using grid catalog : (null)
VO name: cms
Checksum type: None
Destination SE type: SRMv2
Destination SRM Request Token: -2141283420
Source URL: file:/tmp/dcachetest-20090917-1205-24206/srcfile
File size: 51200
Source URL for copy: file:/tmp/dcachetest-20090917-1205-24206/srcfile
Destination URL: gsiftp://cmsdcache10.pi.infn.it:2811//pnfs/pi.infn.it/data/cms/store/user/dfeichti/lcg-cp-derek5
# streams: 1
51200 bytes 42.96 KB/sec avg 42.96 KB/sec inst
Transfer took 2060 ms
</pre>
   * %Y% Transfer of a file with creation of two directory layers <pre>
lcg-cp --verbose --vo=cms -b -D srmv2 -t 2400 --verbose file:///tmp/dcachetest-20090917-1205-24206/srcfile srm://cmsdcache.pi.infn.it:8443/srm/managerv2?SFN=/pnfs/pi.infn.it/data/cms/store/user/dfeichti/subdir1/subdir2/lcg-cp-derek5
Warning: -t,--timeout is deprecated! Use --timeout-* options instead
Using grid catalog type: UNKNOWN
Using grid catalog : (null)
VO name: cms
Checksum type: None
Destination SE type: SRMv2
Destination SRM Request Token: -2141283364
Source URL: file:/tmp/dcachetest-20090917-1205-24206/srcfile
File size: 51200
Source URL for copy: file:/tmp/dcachetest-20090917-1205-24206/srcfile
Destination URL: gsiftp://cmsdcache7.pi.infn.it:2811//pnfs/pi.infn.it/data/cms/store/user/dfeichti/subdir1/subdir2/lcg-cp-derek5
# streams: 1
51200 bytes 44.34 KB/sec avg 44.34 KB/sec inst
Transfer took 2060 ms
</pre>

---++ srmcp succeeds in creating nested subdirectories at CSCS

In contrast to =lcg-cp=, =srmcp= has no problem creating the two implicit subdirectories. Executed from the CSCS UI:

<pre>
srmcp --debug -2 file:////tmp/dcachetest-20090917-1205-24206/srcfile srm://storage01.lcg.cscs.ch:8443/srm/managerv2?SFN=/pnfs/lcg.cscs.ch/cms/local_tests/%GREEN%dfsub1/dfsub2/df1%ENDCOLOR%
WARNING: SRM_PATH is defined, which might cause a wrong version of srm client to be executed
WARNING: SRM_PATH=/opt/d-cache/srm
Storage Resource Manager (SRM) Client version 2.1.2
Copyright (c) 2002-2008 Fermi National Accelerator Laboratory
SRM Configuration:
default_port=8443
debug=true
...
...
execution of CopyJob, source = file:////tmp/dcachetest-20090917-1205-24206/srcfile destination = gsiftp://se25.lcg.cscs.ch:2811//pnfs/lcg.cscs.ch/cms/local_tests/dfsub1/dfsub2/df1 completed
SRMClientV2 : srmPutDone , contacting service httpg://storage01.lcg.cscs.ch:8443/srm/managerv2
srmPutDone status code=SRM_SUCCESS
copy_jobs is empty
stopping copier
</pre>

---++ Differences between T2_CH_CSCS and T2_IT_PISA

As noted above, the =lcg-cp= tests were all executed from CSCS-UI, so a difference in the =lcg-cp= version cannot be responsible for the different behavior. My guess is that the cause is either the dcache version or the dcache configuration.

| | *T2_CH_CSCS* | *T2_IT_PISA* |
| Storage Manager | dcache-1.9.3 | dcache-1.8.0-15p5 |
| namespace | pnfs | pnfs |
| lcg-util version | 1.7.6-1 | 1.7.4-1 |
| GFAL-client | 1.11.8-1 | 1.11.6-2 |

---++ dcache configuration?

On the T2_CH_CSCS dcache, recursive directory creation is correctly enabled:

<pre %FILESTYLE%>
#  ---- Enable automatic creation of directories.
#
#   Allow automatic creation of directories via SRM
#
#   allow=true, disallow=false
#
RecursiveDirectoryCreation=true
</pre>

A look at the =srm.batch= file, which sets the property defaults, confirms this:

<pre %FILESTYLE%>
set context -c RecursiveDirectoryCreation true
</pre>

---++ The behavior at CSCS is inconsistent for lcg-cp (but not for srmcp)

It turns out that 2-layer directory creation sometimes succeeds at CSCS. I therefore used a small script to run a larger number of tests against each of a few SEs. All tests were run from the CSCS UI.

%TABLE{caption="lcg-cp 2-layer implicit directory creation"}%
| *SE* | *dcache version* | *namespace* | *Failures/Total tries* |
| CSCS | 1.9.3-3 | pnfs | 9/20 |
| Estonia | 1.9.3-3 | pnfs | 0/20 |
| PSI | 1.9.2-4 | pnfs | 0/20 |
| Pisa | 1.8.0-p15 | pnfs | 0/20 |

*Estonia runs the exact same dcache version as we do, and they also still have pnfs. All tests I did on their site succeeded, so this points to some local problem at CSCS*. My suspicion falls mostly on the pnfs namespace. Still, the fact that =lcg-cp= and =srmcp= behave so differently on our site is a bit unsettling.

1-layer directory creation always succeeds:

%TABLE{caption="lcg-cp 1-layer implicit directory creation"}%
| *SE* | *dcache version* | *Failures/Total tries* |
| CSCS | 1.9.3-3 | 0/20 |

Running the tests with =srmcp= against CSCS always succeeds:

%TABLE{caption="srmcp 2-layer implicit directory creation"}%
| *SE* | *dcache version* | *Failures/Total tries* |
| CSCS | 1.9.3-3 | 0/20 |

---++ 04. 02. 2010 Problem solved after updates to dcache 1.9.x

Rerunning the tests against the newer dcache versions at CSCS now always succeeds.

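The small script used for the repeated tests is not included on this page. The sketch below only illustrates the idea of counting failures over a number of 2-layer copies. The function name =run_copy_trials=, the =COPY_CMD= variable and the =trial-.../sub/testfile= naming are assumptions made for this example; the default =lcg-cp= flags are the ones that appear in the commands above.

```shell
#!/bin/sh
# Sketch of a repeated-copy test: copy one source file N times, each time
# into two directory layers that do not exist yet, and count the failures.

# Command used for a single copy. Override COPY_CMD to compare clients
# (for example with "srmcp -2"); COPY_CMD is a hypothetical hook.
: "${COPY_CMD:=lcg-cp --vo=cms -b -D srmv2}"

# run_copy_trials TRIES SRC_URL DEST_BASE_URL   (hypothetical helper)
# Prints "failures/tries", the format used in the tables of this log.
run_copy_trials() {
    local tries="$1"
    local src="$2"
    local dest_base="${3%/}"
    local i=1
    local failures=0
    local dest

    while [ "$i" -le "$tries" ]; do
        # Two fresh directory layers, unique per process and attempt.
        dest="$dest_base/trial-$$-$i/sub/testfile"
        # COPY_CMD is left unquoted on purpose so its flags split into words.
        if ! $COPY_CMD "$src" "$dest" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures/$tries"
}

# Possible use (assumption; SRC and DEST_BASE must be set by the caller):
#   run_copy_trials 20 "file:///tmp/srcfile" "$DEST_BASE"
```

The test files and directories left behind on the SE are not cleaned up by this sketch.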
-- Main.DerekFeichtinger - 2009-09-17

----------------

%ICON{arrowleft}% Go to [[CMSSiteLogXX][previous page]] / [[CMSSiteLogXX][next page]] of CMS site log %M%
Topic revision: r9 - 2010-02-04 - DerekFeichtinger
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.