General impression

Most of CCRC08 went quite well for T2_CH_CSCS. No major technical problems were observed, except during the first week of June, when a middleware update caused many jobs to abort.

CSCS serves three experiments: ATLAS, CMS and LHCb. For most of this month only CMS significantly exercised the system. This is particularly true of the storage, where only the CMS jobs accessed data via the dcap protocol. The total generated load was not enough to really stress our resources, and the usage patterns are surely not representative of what we will see in a few months.


Running jobs per experiment:

Job quality was rather good over the period of May:



Overview of the datasets used: Dashboard-datasets-May.jpg

During the first week of June the middleware update caused a lot of problems: Dashboard-activity-June.jpg


PhEDEx downloads were of much better quality than in previous months (probably because everybody was much more attentive), but the system was also under much lower stress because the Debug instance's LoadTest transfers had been turned off. This is why the plots below are mostly empty.

storage_free_cms.jpg fileservers-IO.jpg

PhEDEx Transfer Quality:

PhEDEx Transfer Rate:

PhEDEx Transfer Volume:

2. 6. - 5. 6. 2008 Analysis Latency Tests

Each exercise consists of subscribing to a dataset and then running a sample analysis job with CRAB over all its files.

On Monday and Tuesday CSCS had a number of problems caused by a middleware upgrade of the CE and nodes which, contrary to expectations, made many jobs abort. Nonetheless, I was able to download a dataset from PIC at ~90 MB/s.

Tuesday evening and night went fine, but there was only time for one cycle, using the following dataset:

Dataset name: /Njet_5j_400_5600-alpgen/CMSSW_1_6_7-CSA07-1206675842/RECO  from T1_PIC

| Time | Time delta | Comment |
| 15:00 | 0 | Dataset subscription approved |
| 15:25 | 0:25 | download beginning |
| 20:45 | 5:45 | dataset completely on site (280880 Events, 281 Files from T1_ES_PIC, 1.4TB, 0 Errors, avg. 69.1 MB/s) |
| 21:40 | 6:40 | DBS still only shows 41 files at CSCS |
| 21:43 | 6:43 | DBS shows whole dataset to be on site |
| 21:45 | 6:45 | crab -submit |
| 21:49 | 6:49 | all 29 Jobs running at T2_CH_CSCS |
| 04:34 | 13:34 | last job finished (no Errors) |
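
The "Time delta" column above can be rechecked with a few lines of Python. This is only a minimal sketch for verifying the arithmetic, not part of the test procedure itself. The helper name `elapsed_since_start` is made up for illustration, and it assumes that the timestamps are listed in chronological order and that no two consecutive entries are more than 24 hours apart, so that a time earlier than its predecessor means the clock passed midnight.

```python
from datetime import datetime, timedelta


def elapsed_since_start(times):
    """Return 'H:MM' strings giving the time elapsed since the first entry.

    `times` is a list of 'HH:MM' wall-clock strings in chronological order.
    Assumption: consecutive entries are less than 24 h apart, so a time that
    is earlier than the previous one is taken to lie on the next day.
    """
    start = datetime.strptime(times[0], "%H:%M")
    day_offset = timedelta(0)
    previous = start
    deltas = []
    for text in times:
        current = datetime.strptime(text, "%H:%M") + day_offset
        if current < previous:
            # The clock wrapped around midnight: shift this and later entries.
            day_offset += timedelta(days=1)
            current += timedelta(days=1)
        previous = current
        minutes = int((current - start).total_seconds() // 60)
        deltas.append(f"{minutes // 60}:{minutes % 60:02d}")
    return deltas


# Wall-clock times copied from the table above.
times = ["15:00", "15:25", "20:45", "21:40", "21:43", "21:45", "21:49", "04:34"]
deltas = elapsed_since_start(times)
for clock, delta in zip(times, deltas):
    print(clock, delta)
```

The last entry (04:34) falls on the following day, which is why the midnight handling is needed to arrive at the total latency of 13:34 from subscription approval to the last finished job.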

Wednesday brought more bad luck: FZK's network had severe problems, causing all transfers for /WW_incl/CMSSW_1_6_7-CSA07-1196178448/RECO to fail. FZK recovered sometime in the afternoon, but the CSCS PhEDEx download agent had got stuck on a blocking glite-transfer-query and a blocking glite-transfer-submit request, which I only noticed on Thursday morning. PhEDEx transfers from FZK began to trickle in by 13:30, but the cluster is now completely filled with user jobs, which will add considerable latency to the eventual CRAB run.

-- DerekFeichtinger - 04 Jun 2008

