Tags:
view all tags
<!-- keep this as a security measure: #uncomment if the subject should only be modifiable by the listed groups # * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.CMSAdminGroup # * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.CMSAdminGroup #uncomment this if you want the page only be viewable by the listed groups # * Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.CMSAdminGroup --> ---+ dCache Problem descriptions ---++ SRM problem *Description*: SRMv1 read from the SE fails, even though globus-url-copy of the same file from the SE as well as from the pool node directly works fine. *Other Symptoms*: The best information can be obtained from the pool cell of the affected pool: <pre> srmcp -debug srm://t3se01.psi.ch:8443/srm/managerv1?SFN=/pnfs/psi.ch/cms/automatic_test-20080823-0015-2165-srm1 file:////tmp/derek-srm1-1 10.43.10 Sending reply Pool=t3fs01_cms_1;NOTAGS;;PnfsId=000200000000000000001778;have=true;waiting=false 10.43.10 PoolIoFileMessage delivered to ioFile (method) 10.43.10 Dup Request : none <GFTP-t3fs01-Unknown-105@gridftp-t3fs01Domain:10002> 10.43.10 JOB queued 000200000000000000001778 10.43.10 JOB prepare 000200000000000000001778 10.43.10 Entry for 000200000000000000001778 found and is 000200000000000000001778 <C-------X--(0)[0]> 51200 si={cms:cms} 10.43.10 JOB run 000200000000000000001778 10.43.10 Trying to open /data1/t3fs01_cms_1/data/000200000000000000001778 10.43.10 (GFtp_2_nio/000200000000000000001778) version=1, role=Sender, mode=E, host=127.0.0.1:34210, buffer=0, passive=false, parallelism=10 10.43.10 (GFtp_2_nio/000200000000000000001778) Connecting to /127.0.0.1:34210 10.43.10 (GFtp_2_nio/000200000000000000001778) java.net.ConnectException: Connection refused </pre> A copy directly from the pool that works, looks like this <pre> globus-url-copy -dbg gsiftp://t3fs01.psi.ch/pnfs/psi.ch/cms/automatic_test-20080823-0015-2165-srm1 file:/tmp/mytest.tst 10.51.12 Sending reply Pool=t3fs01_cms_1;NOTAGS;;PnfsId=000200000000000000001778;have=true;waiting=false 10.51.12 PoolIoFileMessage delivered to ioFile (method) 10.51.12 Dup Request : none <GFTP-t3fs01-Unknown-106@gridftp-t3fs01Domain:10003> 10.51.12 JOB queued 000200000000000000001778 10.51.12 JOB prepare 000200000000000000001778 10.51.12 Entry for 000200000000000000001778 found and is 000200000000000000001778 <C-------X--(0)[0]> 51200 si={cms:cms} 10.51.12 JOB run 000200000000000000001778 10.51.12 Trying to open /data1/t3fs01_cms_1/data/000200000000000000001778 10.51.12 (GFtp_2_nio/000200000000000000001778) version=1, role=Sender, mode=S, host=t3fs01:32916, buffer=0, passive=false, parallelism=5 10.51.12 (GFtp_2_nio/000200000000000000001778) Connecting to t3fs01/192.33.123.41:32916 10.51.12 (GFtp_2_nio/000200000000000000001778) Opened Socket[addr=t3fs01/192.33.123.41,port=32916,localport=32917] 10.51.12 (GFtp_2_nio/000200000000000000001778) Entering event loop 10.51.12 (GFtp_2_nio/000200000000000000001778) Closing Socket[addr=t3fs01/192.33.123.41,port=32916,localport=32917] 10.51.12 (GFtp_2_nio/000200000000000000001778) Multiplexer shutting down 10.51.12 (GFtp_2_nio/000200000000000000001778) Left event loop and closing channels 10.51.12 (GFtp_2_nio/000200000000000000001778) Transfer finished: 51200 bytes transferred in 0.09 seconds = 0.568 MB/s 10.51.12 000200000000000000001778;length=51200;timer=Null Time elapsed 10.51.13 IO thread finished : Thread[Scheduler-IO-1-worker,5,t3fs01_cms_1-threads] </pre> In the SRM case, somehow the pool tries to connect to "/127.0.0.1:[port]" instead of to "t3fs01/192.33.123.41:[port]". The hostname is an empty string, it seems. Unclear what is used in the real connect. May also be that the listening socket is not for ALL addresses of the host (including localhost) but only for the named interface. To identify the issue, we attach a truss to the pool process on the pool node: <pre> truss -v connect -f -p 2954 &> /tmp/truss </pre> Looking through the output for both cases, we find that in the case of the SRM mediated copy, the connect indeed is tried to the localhost address, while in the direct gridftp case, the named interface was used: <pre> # SRM mediated copy 2954/82: connect(25, 0xD10B0850, 16, SOV_DEFAULT) Err#146 ECONNREFUSED 2954/82: AF_INET name = 127.0.0.1 port = 34235 # standalone gridftp copy 2954/82: connect(24, 0xD10B0850, 16, SOV_DEFAULT) = 0 2954/82: AF_INET name = 192.33.123.41 port = 32930 </pre> A reference installation on the Tier-2 shows that a Solaris-1 node also only listens to the named interface based on the netstat output: <pre> se05# netstat UDP: IPv4 Local Address Remote Address State -------------------- -------------------- ---------- se05.32772 mon.8671 Connected TCP: IPv4 Local Address Remote Address Swind Send-Q Rwind Recv-Q State -------------------- -------------------- ----- ------ ----- ------ ----------- se05.41053 se09.60559 49640 0 49640 0 TIME_WAIT se05.24411 wn08.52485 5840 0 49232 0 TIME_WAIT se05.24411 wn08.52486 5840 0 49232 0 TIME_WAIT se05.24411 wn08.52487 5840 0 49232 0 TIME_WAIT se05.24411 wn08.52488 5840 0 49232 0 TIME_WAIT se05.24411 wn08.52489 5840 0 49232 0 TIME_WAIT se05.24411 wn08.52490 5840 0 49232 0 TIME_WAIT se05.24411 wn08.52491 5840 0 49232 0 TIME_WAIT se05.24411 wn08.52492 5840 0 49232 0 TIME_WAIT se05.24411 wn08.52493 5840 0 49232 0 TIME_WAIT se05.24411 wn08.52494 5840 0 49232 0 TIME_WAIT se05.ssh ui.59621 9056 0 49232 0 ESTABLISHED se05.65360 storage01.52323 131072 0 49640 0 ESTABLISHED se05.65361 storage01.52323 131072 0 49640 0 ESTABLISHED se05.65362 storage01.52323 130944 0 49640 0 ESTABLISHED se05.65363 storage01.52323 131072 0 49640 0 ESTABLISHED </pre> -- Main.DerekFeichtinger - 28 Aug 2008
Edit
|
Attach
|
Watch
|
P
rint version
|
H
istory
:
r4
<
r3
<
r2
<
r1
|
B
acklinks
|
V
iew topic
|
Raw edit
|
More topic actions...
Topic revision: r1 - 2008-08-28
-
DerekFeichtinger
CmsTier3
Log In
CmsTier3 Web
Create New Topic
Index
Search
Changes
Notifications
Statistics
Preferences
User Pages
Main Page
Policies
Monitoring Storage Space
Monitoring Slurm Usage
Physics Groups
Steering Board Meetings
Admin Pages
AdminArea
Cluster Specs
Home
Site map
CmsTier3 web
LCGTier2 web
PhaseC web
Main web
Sandbox web
TWiki web
CmsTier3 Web
Create New Topic
Index
Search
Changes
Notifications
RSS Feed
Statistics
Preferences
P
View
Raw View
Print version
Find backlinks
History
More topic actions
Edit
Raw edit
Attach file or image
Edit topic preference settings
Set new parent
More topic actions
Account
Log In
Edit
Attach
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki?
Send feedback