

25. 09. 2008 Testing of PhEDEx service for the PSI Tier-3

I activated the PhEDEx service yesterday, after fixing a problem caused by a missing protocol statement for SRMv2 in the trivial file catalog (TFC) definition.

First tests were for transfers from FZK to PSI:

Throughput is quite good: we can fill the 1 Gb/s link if we want. However, one particular SRM error state causes many of the transfers to fail. At first sight it seems to be source related:

Error message statistics per site:
===================================

 *** ERRORS from T1_DE_FZK_Buffer:***
    294   copy of [srm-URL] into [srm-URL] failed, status = SRM_FAILURE explanation=FAILED: at [date] state Failed : retrieval of "from" TURL failed with error java.lang.ArrayIndexOutOfBoundsException
    258   copy of [srm-URL] into [srm-URL] failed, status = SRM_FAILURE explanation=FAILED: at [date] state Failed : retrieval of "from" TURL failed with error java.lang.ArrayIndexOutOfBoundsException: 4
    219   copy of [srm-URL] into [srm-URL] failed, status = SRM_FAILURE explanation=FAILED: at [date] state Failed : retrieval of "from" TURL failed with error java.lang.ArrayIndexOutOfBoundsException: 3
    126   copy of [srm-URL] into [srm-URL] failed, status = SRM_FAILURE explanation=FAILED: at [date] state Failed : retrieval of "from" TURL failed with error java.lang.ArrayIndexOutOfBoundsException: 2
     36   copy of [srm-URL] into [srm-URL] failed, status = SRM_FAILURE explanation=FAILED: at [date] state Failed : retrieval of "from" TURL failed with error java.lang.ArrayIndexOutOfBoundsException: 1
      9   copy did not complete or status unknown
      5   no detail - validate failed: [unknown reason - inspect log]

SITE STATISTICS:
==================
                         first entry: 2008-09-24 12:27:50      last entry: 2008-09-25 12:26:45
    T1_DE_FZK_Buffer (OK: 1320  Err: 947  Exp:   0  Canc:   0  Lost:   0)   succ.: 58.2 %   total: 3260.3 GB  (37.8 MB/s)
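
As a rough cross-check, the success rate and average rate in the statistics line can be recomputed from the other values it reports. The sketch below is only an illustration: the variable names are mine, and it assumes 1 GB = 1000 MB, which is the convention that reproduces the reported 37.8 MB/s.

```python
from datetime import datetime

# Values copied from the SITE STATISTICS output above.
ok, err = 1320, 947
total_gb = 3260.3
fmt = "%Y-%m-%d %H:%M:%S"
first = datetime.strptime("2008-09-24 12:27:50", fmt)
last = datetime.strptime("2008-09-25 12:26:45", fmt)

elapsed_s = (last - first).total_seconds()

# Success rate: successful transfers over all attempted transfers.
success_pct = 100.0 * ok / (ok + err)

# Average rate over the whole window (assumption: 1 GB = 1000 MB).
rate_mb_s = total_gb * 1000.0 / elapsed_s

# Same rate expressed in Mb/s, for comparison with the 1 Gb/s link.
rate_mbit_s = rate_mb_s * 8

print(f"success: {success_pct:.1f} %")
print(f"average: {rate_mb_s:.1f} MB/s ({rate_mbit_s:.0f} Mb/s)")
```

Note that this is an average over roughly 24 hours, including failed and idle periods, so it says nothing about the peak rate reached while transfers were running.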

Impact of this test on total PSI traffic (week view):
PSI-traffic-week-20080925.png

Server throughput (2 fileservers used) over last day:
PSI-traffic-servers-20080925.png

Error Mode: retrieval of "from" TURL failed

I found this page by Nicolo Magnini, which states that this error occurs when the respective file is not yet available on the disk server (possibly not yet staged in): https://twiki.cern.ch/twiki/bin/view/CMS/SAMLocalSRMv2

My own tests, copying from T2_CH_CSCS to T3_CH_PSI, suggest that the same error also occurs if the source file does not exist in pnfs.
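
To see how dominant this error mode is, the FZK error counts listed earlier can be grouped after dropping the trailing array index from the exception text. This is a minimal sketch: the counts are copied from the statistics above, the messages are shortened to their distinguishing tail, and the `normalize` helper is my own, not part of any PhEDEx tool.

```python
import re
from collections import Counter

TURL = 'retrieval of "from" TURL failed with error java.lang.ArrayIndexOutOfBoundsException'

# (count, message) pairs from the T1_DE_FZK_Buffer error statistics.
errors = [
    (294, TURL),
    (258, TURL + ": 4"),
    (219, TURL + ": 3"),
    (126, TURL + ": 2"),
    (36, TURL + ": 1"),
    (9, "copy did not complete or status unknown"),
    (5, "no detail - validate failed: [unknown reason - inspect log]"),
]

def normalize(msg):
    """Strip an optional trailing ': <index>' so all exception variants collapse."""
    return re.sub(r"(ArrayIndexOutOfBoundsException)(: \d+)?$", r"\1", msg)

grouped = Counter()
for count, msg in errors:
    grouped[normalize(msg)] += count

total = sum(grouped.values())
for msg, count in grouped.most_common():
    print(f"{count:5d}  {100.0 * count / total:5.1f} %  {msg}")
```

Grouped this way, the "from" TURL failure accounts for 933 of the 947 errors, so essentially all failures from FZK belong to this one error mode.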

Transfers from CSCS

At first I got the following errors from the logs:

 *** ERRORS from T2_CH_CSCS:***
     23   copy did not complete or status unknown

Closer manual investigation showed that the TFC rule of CSCS for the generated LoadTest filename did not yield an existing file. However, the extracted error message is almost useless for diagnosing this condition. Correcting the rule for the new CMS site names led to successful copies.
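
The kind of failure described here can be illustrated with a small model of how TFC lfn-to-pfn rules are applied: the first rule whose protocol and path-match fit the LFN produces the PFN. Everything in the sketch is hypothetical (host name, directory layout, LoadTest path and both rule sets); it is not the actual CSCS catalog, and anchoring the pattern with `re.fullmatch` is a simplifying assumption.

```python
import re

def lfn_to_pfn(rules, protocol, lfn):
    """Return the PFN from the first matching rule, or None if no rule applies."""
    for rule_protocol, path_match, result in rules:
        if rule_protocol != protocol:
            continue
        m = re.fullmatch(path_match, lfn)
        if m:
            # Replace $1, $2, ... in the result template with the captured groups.
            return re.sub(r"\$(\d+)", lambda g: m.group(int(g.group(1))), result)
    return None

# Hypothetical storage prefix, invented for illustration.
BASE = "srm://se.example.org/pnfs/example.org/cms"

# Rule set still keyed on an old-style site name.
old_rules = [
    ("srmv2", r"/+store/LoadTest/OLDNAME/(.*)", BASE + "/loadtest/$1"),
    ("srmv2", r"/+store/(.*)", BASE + "/store/$1"),
]
# Rule set corrected for the new-style site name.
new_rules = [
    ("srmv2", r"/+store/LoadTest/T2_CH_CSCS/(.*)", BASE + "/loadtest/$1"),
    ("srmv2", r"/+store/(.*)", BASE + "/store/$1"),
]

lfn = "/store/LoadTest/T2_CH_CSCS/file_01"
pfn_old = lfn_to_pfn(old_rules, "srmv2", lfn)
pfn_new = lfn_to_pfn(new_rules, "srmv2", lfn)
print("old rules:", pfn_old)
print("new rules:", pfn_new)
print("no srmv1 rule:", lfn_to_pfn(new_rules, "srmv1", lfn))
```

In this model the stale rule does not fail outright: the LFN falls through to the generic rule and resolves to a path where no file exists, which would explain why the transfer log shows only an uninformative error. A protocol with no rule at all returns nothing, which corresponds to the missing SRMv2 statement mentioned at the top of this entry.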

-- DerekFeichtinger - 25 Sep 2008



Topic revision: r3 - 2008-09-26 - DerekFeichtinger
 