Go to
next page of Tier3 site log
25. 09. 2008 Testing of PhEDEx service for the PSI Tier-3
I activated the PhEDEx service yesterday after resolving some problems caused by a missing protocol statement for SRMv2 in the trivial file catalog (TFC) definition.
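A check of this kind can be sketched in Python. The catalog fragment below is a hypothetical example in the usual `storage.xml` layout (host name and paths are invented, this is not the PSI configuration), and the helper names are illustrative only. The sketch lists the protocols that have at least one `lfn-to-pfn` rule and reports the required ones that are missing.

```python
import xml.etree.ElementTree as ET

# Hypothetical TFC fragment: host and paths are invented for illustration.
TFC_XML = """
<storage-mapping>
  <lfn-to-pfn protocol="direct" path-match="/+store/(.*)"
              result="/pnfs/example.org/data/cms/store/$1"/>
  <lfn-to-pfn protocol="srm" chain="direct" path-match="(.*)"
              result="srm://se.example.org:8443/srm/managerv1?SFN=$1"/>
</storage-mapping>
"""


def protocols_with_rules(xml_text, direction="lfn-to-pfn"):
    """Return the set of protocols that have at least one rule of this kind."""
    root = ET.fromstring(xml_text)
    return {rule.get("protocol") for rule in root.iter(direction)}


def missing_protocols(xml_text, required):
    """Return the required protocols for which no lfn-to-pfn rule exists."""
    return sorted(set(required) - protocols_with_rules(xml_text))


# The fragment above defines no srmv2 rule, so srmv2 is reported as missing.
print(missing_protocols(TFC_XML, ["direct", "srm", "srmv2"]))
```

With the fragment shown, only `srmv2` is reported, which is the situation described above.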
First tests were for transfers from FZK to PSI:
Throughput is quite good: we can fill the 1 Gb/s link if we want. However, one particular SRM error state causes many of the transfers to fail. At first sight it seems to be related to the source site:
Error message statistics per site:
===================================
*** ERRORS from T1_DE_FZK_Buffer:***
294 copy of [srm-URL] into [srm-URL] failed, status = SRM_FAILURE explanation=FAILED: at [date] state Failed : retrieval of "from" TURL failed with error java.lang.ArrayIndexOutOfBoundsException
258 copy of [srm-URL] into [srm-URL] failed, status = SRM_FAILURE explanation=FAILED: at [date] state Failed : retrieval of "from" TURL failed with error java.lang.ArrayIndexOutOfBoundsException: 4
219 copy of [srm-URL] into [srm-URL] failed, status = SRM_FAILURE explanation=FAILED: at [date] state Failed : retrieval of "from" TURL failed with error java.lang.ArrayIndexOutOfBoundsException: 3
126 copy of [srm-URL] into [srm-URL] failed, status = SRM_FAILURE explanation=FAILED: at [date] state Failed : retrieval of "from" TURL failed with error java.lang.ArrayIndexOutOfBoundsException: 2
36 copy of [srm-URL] into [srm-URL] failed, status = SRM_FAILURE explanation=FAILED: at [date] state Failed : retrieval of "from" TURL failed with error java.lang.ArrayIndexOutOfBoundsException: 1
9 copy did not complete or status unknown
5 no detail - validate failed: [unknown reason - inspect log]
SITE STATISTICS:
==================
first entry: 2008-09-24 12:27:50 last entry: 2008-09-25 12:26:45
T1_DE_FZK_Buffer (OK: 1320 Err: 947 Exp: 0 Canc: 0 Lost: 0) succ.: 58.2 % total: 3260.3 GB (37.8 MB/s)
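The derived figures in the statistics line can be reproduced from the raw numbers. The following is a minimal sketch; it assumes decimal units (1 GB = 1000 MB), which is a guess that happens to match the reported rate, and it takes the averaging window to be the span between the first and last log entry.

```python
from datetime import datetime

# Raw numbers copied from the site statistics above.
ok, err = 1320, 947
total_gb = 3260.3
first = datetime(2008, 9, 24, 12, 27, 50)
last = datetime(2008, 9, 25, 12, 26, 45)

seconds = (last - first).total_seconds()
success_pct = 100.0 * ok / (ok + err)

# Assumption: 1 GB = 1000 MB (decimal units).
avg_mb_s = total_gb * 1000.0 / seconds

# Fraction of a 1 Gb/s link (1000 Mb/s) used on average.
link_fraction = avg_mb_s * 8 / 1000.0

print(f"{success_pct:.1f} %  {avg_mb_s:.1f} MB/s  {link_fraction:.0%} of 1 Gb/s")
```

This gives 58.2 % and 37.8 MB/s, in agreement with the report. The 37.8 MB/s is an average over the whole day, roughly 30% of the link capacity.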
Impact of this test on total PSI traffic (week view):
Server throughput (2 fileservers used) over last day:
Error Mode: retrieval of "from" TURL failed
I found this page by Nicolo Magnini, which states that this error means the respective file is not yet available on the disk server (possibly not yet staged in):
https://twiki.cern.ch/twiki/bin/view/CMS/SAMLocalSRMv2
My own test copies from T2_CH_CSCS to T3_CH_PSI suggest that this error occurs when the source file does not exist in pnfs.
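The five TURL messages in the FZK statistics above differ only in the trailing array index, so they can be counted as a single error mode. A minimal Python sketch of that grouping follows; the function name is illustrative, and the input is the report text copied from above.

```python
import re
from collections import Counter

# Error lines copied verbatim from the FZK statistics above.
REPORT = """\
294 copy of [srm-URL] into [srm-URL] failed, status = SRM_FAILURE explanation=FAILED: at [date] state Failed : retrieval of "from" TURL failed with error java.lang.ArrayIndexOutOfBoundsException
258 copy of [srm-URL] into [srm-URL] failed, status = SRM_FAILURE explanation=FAILED: at [date] state Failed : retrieval of "from" TURL failed with error java.lang.ArrayIndexOutOfBoundsException: 4
219 copy of [srm-URL] into [srm-URL] failed, status = SRM_FAILURE explanation=FAILED: at [date] state Failed : retrieval of "from" TURL failed with error java.lang.ArrayIndexOutOfBoundsException: 3
126 copy of [srm-URL] into [srm-URL] failed, status = SRM_FAILURE explanation=FAILED: at [date] state Failed : retrieval of "from" TURL failed with error java.lang.ArrayIndexOutOfBoundsException: 2
36 copy of [srm-URL] into [srm-URL] failed, status = SRM_FAILURE explanation=FAILED: at [date] state Failed : retrieval of "from" TURL failed with error java.lang.ArrayIndexOutOfBoundsException: 1
9 copy did not complete or status unknown
5 no detail - validate failed: [unknown reason - inspect log]
"""


def group_errors(report):
    """Sum the counts per message, ignoring the trailing exception index."""
    counts = Counter()
    for line in report.strip().splitlines():
        count, message = line.split(maxsplit=1)
        # Drop the optional ": <index>" after the exception name.
        key = re.sub(r"(ArrayIndexOutOfBoundsException)(: \d+)?$", r"\1", message)
        counts[key] += int(count)
    return counts


counts = group_errors(REPORT)
total = sum(counts.values())
turl = counts.most_common(1)[0][1]
print(f"{turl} of {total} errors ({turl / total:.1%}) are TURL retrieval failures")
```

Grouped this way, 933 of the 947 errors (98.5%) belong to the TURL retrieval failure.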
Transfers from CSCS
At first I got the following errors from the logs:
*** ERRORS from T2_CH_CSCS:***
23 copy did not complete or status unknown
Closer manual investigation showed that the CSCS TFC rule for the generated LoadTest filename did not yield an existing file. The extracted error message is almost useless for diagnosing this condition.
Correcting the rule for the new CMS site names led to successful copies.
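How a stale rule can produce a path to a nonexistent file can be illustrated with a small Python sketch of first-match LFN-to-PFN mapping. Everything specific here is hypothetical: the storage element, the paths, the LoadTest directory names, and the rules themselves are invented and are not the CSCS catalog. The sketch also ignores TFC features such as rule chaining.

```python
import re


def lfn_to_pfn(lfn, protocol, rules):
    """Return the PFN from the first matching rule, or None if none matches."""
    for rule_protocol, path_match, result in rules:
        if rule_protocol != protocol:
            continue
        match = re.match(path_match, lfn)
        if match:
            # Substitute $1, $2, ... with the captured groups.
            return re.sub(r"\$(\d+)", lambda g: match.group(int(g.group(1))), result)
    return None


# Hypothetical storage element and rules, for illustration only.
SE = "srm://se.example.org:8443/srm/managerv2?SFN=/pnfs/example.org/cms"
OLD_RULES = [
    # LoadTest rule still written for an old-style site name.
    ("srmv2", r"/+store/PhEDEx_LoadTest07/LoadTest07_Debug_CSCS/(.*)",
     SE + "/loadtest/source_file"),
    # Generic fallback rule.
    ("srmv2", r"/+store/(.*)", SE + "/store/$1"),
]
NEW_RULES = [
    # Corrected LoadTest rule using the new-style site name.
    ("srmv2", r"/+store/PhEDEx_LoadTest07/LoadTest07_Debug_T2_CH_CSCS/(.*)",
     SE + "/loadtest/source_file"),
] + OLD_RULES

lfn = "/store/PhEDEx_LoadTest07/LoadTest07_Debug_T2_CH_CSCS/PSI/42/file"
print(lfn_to_pfn(lfn, "srmv2", OLD_RULES))  # falls through to the generic rule
print(lfn_to_pfn(lfn, "srmv2", NEW_RULES))  # hits the corrected LoadTest rule
```

In this made-up setup the old rules still return a PFN, but one under the generic `/store` tree where no such file exists. Resolving the LFN by hand and checking the resulting path is therefore more informative than the transfer error message.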
--
DerekFeichtinger - 25 Sep 2008