Arrow left Go to previous page / next page of CMS site log MOVED TO...

18-20. 2. 2008 Update and reconfiguration of dCache

Since ATLAS needs space token functionality, dCache had to be reconfigured. This caused a lot of problems, and a solution that also satisfied CMS was only possible by using an undocumented option (undocumented probably because it is based on a faulty implementation) in the /opt/d-cache/etc/LinkGroupAuthorization.conf file:

LinkGroup cms-linkGroup
/cms/Role=*
cmssgm/Role=*
cmsprd/Role=*
cms001/Role=*

In the afternoon of Feb 20th dCache was functional again and we had successful transfers. For the first time I was also able to observe many successful transfers from ASGC. The total rate was the highest I had seen up to this point.

fileservers-nw-report-20080221-0957-day.gif

SITE STATISTICS:
==================
                         first entry: 2008-02-21 02:08:16      last entry: 2008-02-21 06:26:34
site: T1_ASGC_Buffer (OK: 180   Err: 0   Exp/Canceled: 122)	succ. rate: 100.0 %   total: 505.1 GB
site: T1_FNAL_Buffer (OK: 39   Err: 39   Exp/Canceled: 4)	succ. rate: 50.0 %   total: 107.0 GB
site: T1_FZK_Buffer (OK: 158   Err: 12   Exp/Canceled: 46)	succ. rate: 92.9 %   total: 390.1 GB
site: T1_IN2P3_Buffer (OK: 69   Err: 72   Exp/Canceled: 6)	succ. rate: 48.9 %   total: 170.4 GB
site: T1_PIC_Buffer (OK: 15   Err: 0   Exp/Canceled: 0)	succ. rate: 100.0 %   total: 15.5 GB
site: T1_RAL_Buffer (OK: 119   Err: 20   Exp/Canceled: 0)	succ. rate: 85.6 %   total: 331.1 GB

TOTAL SUMMARY:
==================
                         first entry: 2008-02-21 02:08:16      last entry: 2008-02-21 06:26:34
total transferred: 1414.9 GB  in 4.3 hours
avg. total rate: 93.5 MB/s = 747.9 Mb/s  = 7888.1 GB/day
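
The summary figures can be cross-checked from the two timestamps and the transferred volume. The sketch below is only an assumption about how the report derives its rates, not the actual script: it treats 1 GB as 1024 MB, which reproduces the numbers above to within rounding. The function name `summarize` is hypothetical.

```python
from datetime import datetime

def summarize(first: str, last: str, total_gb: float) -> dict:
    """Recompute average rates from the log timestamps and the total volume.

    Assumption: 1 GB = 1024 MB, as the reported figures suggest.
    """
    fmt = "%Y-%m-%d %H:%M:%S"
    elapsed = datetime.strptime(last, fmt) - datetime.strptime(first, fmt)
    seconds = elapsed.total_seconds()
    mb_per_s = total_gb * 1024 / seconds
    return {
        "hours": seconds / 3600,
        "MB/s": mb_per_s,
        "Mb/s": mb_per_s * 8,               # megabits per second
        "GB/day": mb_per_s * 86400 / 1024,  # back to GB, scaled to 24 hours
    }

stats = summarize("2008-02-21 02:08:16", "2008-02-21 06:26:34", 1414.9)
for key, value in stats.items():
    print(f"{key}: {value:.1f}")
```

Small differences in the last digit are expected, because the input volume is itself rounded to one decimal place.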

A few moments later I took a look at the errors from IN2P3 and FNAL:

 *** ERRORS from T1_IN2P3_Buffer:***
     73   Failed SOURCE error during PREPARATION phase: [REQUEST_TIMEOUT] failed to prepare source file in 180 seconds

 *** ERRORS from T1_FNAL_Buffer:***
     46   transfer timed out after 10845 seconds with signal 9
      4   Canceled Job canceled
      1   Canceled TRANSFER error during TRANSFER phase: [REQUEST_TIMEOUT] failed to complete srmcopy transfer request in 3600 seconds

Nonetheless, the most frequent error appears on the FZK link:

 *** ERRORS from T1_FZK_Buffer:***
     33   Failed TRANSFER error during TRANSFER phase: [GRIDFTP] the server sent an error response: 426 426 Transfer aborted (Unexpected Exception : java.lang.InterruptedException)

28. 3. 2008 Good throughput of real data, but many transfer errors

I increased the number of WAN movers per CMS pool to 8. Some of the transfers are due to our having ordered part of the FastSim data sets.

SITE STATISTICS:
==================
                         first entry: 2008-03-28 00:38:53      last entry: 2008-03-28 12:30:13
      T1_ASGC_Buffer (OK:  95  Err:   5  Exp/Cancl:   0 )   succ.: 95.0 %   total:  266.5 GB  ( 6.2 MB/s)
      T1_CERN_Buffer (OK: 391  Err: 161  Exp/Cancl: 1272 )   succ.: 70.8 %   total:  572.5 GB  (13.4 MB/s)
      T1_FNAL_Buffer (OK: 458  Err: 140  Exp/Cancl: 272 )   succ.: 76.6 %   total:  531.8 GB  (12.5 MB/s)
       T1_FZK_Buffer (OK: 558  Err: 140  Exp/Cancl: 1018 )   succ.: 79.9 %   total: 1211.6 GB  (28.4 MB/s)
     T1_IN2P3_Buffer (OK: 341  Err: 229  Exp/Cancl: 126 )   succ.: 59.8 %   total:  755.4 GB  (17.7 MB/s)
       T1_RAL_Buffer (OK:  66  Err:  26  Exp/Cancl:   0 )   succ.: 71.7 %   total:  183.5 GB  ( 4.3 MB/s)

TOTAL SUMMARY:
==================
                         first entry: 2008-03-28 00:38:53      last entry: 2008-03-28 12:30:13
total transferred: 3279.4 GB  in 11.9 hours
avg. total rate: 78.7 MB/s = 629.5 Mb/s  = 6638.8 GB/day
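
The per-site totals add up to more than the "total transferred" figure. The ratio is close to 2**30 / 10**9, so one plausible reading is that the site column uses decimal gigabytes while the summary uses 2**30-byte units. This is an inference from the numbers, not something the report states. A quick check under that assumption:

```python
# Per-site "total" column from the table above, in GB as printed.
site_totals_gb = [266.5, 572.5, 531.8, 1211.6, 755.4, 183.5]
reported_total = 3279.4  # "total transferred" from the summary

decimal_sum = sum(site_totals_gb)
# Assumption: site totals are 10**9-byte GB, the summary is in 2**30-byte units.
as_binary = decimal_sum * 1e9 / 2**30

print(f"sum of site totals:          {decimal_sum:.1f}")
print(f"converted to 2**30-byte GB:  {as_binary:.1f}")
print(f"reported total:              {reported_total:.1f}")
```

The converted sum agrees with the reported total to within the rounding of the six inputs.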

fileservers-nw-report-20080328-1512-day.gif

Note that the total throughput shown in the above graph also covers the internal traffic between the fileservers. Since each fileserver is configured as a GridFTP door, it can accept transfers, but will usually proxy them to another server.

31. 3. 2008 High quality throughput, minimal errors, 44 MB/s from FZK

SITE STATISTICS:
==================
                         first entry: 2008-03-30 19:48:10      last entry: 2008-03-31 07:39:39
      T1_ASGC_Buffer (OK: 112  Err:   0  Exp/Cancl:   0 )   succ.: 100.0 %   total:  314.3 GB  ( 7.4 MB/s)
      T1_CERN_Buffer (OK:  74  Err:   5  Exp/Cancl:   0 )   succ.: 93.7 %   total:  198.6 GB  ( 4.7 MB/s)
      T1_FNAL_Buffer (OK: 176  Err:  16  Exp/Cancl:   0 )   succ.: 91.7 %   total:  483.3 GB  (11.3 MB/s)
       T1_FZK_Buffer (OK: 756  Err:  42  Exp/Cancl: 193 )   succ.: 94.7 %   total: 1867.6 GB  (43.7 MB/s)
     T1_IN2P3_Buffer (OK: 103  Err:   0  Exp/Cancl:   0 )   succ.: 100.0 %   total:  254.4 GB  ( 6.0 MB/s)
       T1_RAL_Buffer (OK:  94  Err:   1  Exp/Cancl:   0 )   succ.: 98.9 %   total:  261.4 GB  ( 6.1 MB/s)

TOTAL SUMMARY:
==================
                         first entry: 2008-03-30 19:48:10      last entry: 2008-03-31 07:39:39
total transferred: 3147.6 GB  in 11.9 hours
avg. total rate: 75.5 MB/s = 604.0 Mb/s  = 6370.6 GB/day
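
The `succ.` column in these tables is consistent with OK / (OK + Err), with expired or cancelled transfers left out of the denominator. A minimal sketch of that check, assuming the line layout shown above; the regular expression and the `success_rate` helper are illustrative and not part of the original report script.

```python
import re
from typing import Tuple

# Matches the counters in lines such as:
#   T1_FZK_Buffer (OK: 756  Err:  42  Exp/Cancl: 193 )   succ.: 94.7 % ...
LINE_RE = re.compile(
    r"(?P<site>T\d_\w+)\s+\(OK:\s*(?P<ok>\d+)\s+Err:\s*(?P<err>\d+)\s+"
    r"Exp/Cancl:\s*(?P<exp>\d+)\s*\)"
)

def success_rate(line: str) -> Tuple[str, float]:
    """Return (site, success rate in percent) for one statistics line."""
    m = LINE_RE.search(line)
    if m is None:
        raise ValueError(f"unrecognized line: {line!r}")
    ok, err = int(m["ok"]), int(m["err"])
    attempts = ok + err  # expired/cancelled transfers are not counted
    rate = 100.0 * ok / attempts if attempts else 0.0
    return m["site"], rate

sample = (
    "       T1_FZK_Buffer (OK: 756  Err:  42  Exp/Cancl: 193 )"
    "   succ.: 94.7 %   total: 1867.6 GB  (43.7 MB/s)"
)
site, rate = success_rate(sample)
print(f"{site}: {rate:.1f} %")
```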

-- DerekFeichtinger - 28 Mar 2008

Topic revision: r4 - 2008-06-04 - DerekFeichtinger
 