
CMS data ordering by Rucio

CMS uses Rucio for data management across sites. In Rucio, our Tier-3 is named T3_CH_PSI and the Swiss Tier-2 is named T2_CH_CSCS.

Users can submit their own requests through Rucio. Our site data managers want to retain the ability to manage rules, but user rules can only be managed by the global CMS Rucio administrators. Our scripts therefore duplicate each such request into a new request under the special site account t3_ch_psi_local_users; the original rule request is denied once the new rule has been created. Alternatively, you can send us (admin mailing list cms-tier3@lists.psi.ch) a list of datasets or blocks together with the intended expiry date, and we will order them on your behalf.

  • You must specify an expiry date. Rules without an expiry date will not be accepted. Please pick the earliest date that works for you.
    • Transfers are fast: TBs can be transferred to the Tier-3 within hours if the data are on disk somewhere on the Grid (expect a longer wait if the data first need to be staged from tape). Data can therefore be refetched if needed.
    • Ideally, specify just 1-3 months.
  • The Tier-3 has limited storage (~1.5 PB), and most of it is consumed by user data. Please limit the volume that you bring to the Tier-3.
  • The Tier-2 is reasonably well connected (just 5 ms latency), so you could also bring some data there and process it remotely from the Tier-3 or from other Grid sites; the Tier-2 provides better bandwidth to other sites than the Tier-3.
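
As a rough sketch of what a request with an expiry could look like: the classic Rucio command line client takes a rule lifetime in seconds (rucio add-rule --lifetime). The helper names lifetime_seconds and print_rule_request below are hypothetical, the command is only printed and not executed, and the use of --ask-approval is an assumption about how user requests are submitted at CMS. Newer Rucio clients may use a different subcommand syntax.

```shell
#!/bin/sh
# Hypothetical helper: convert a rule lifetime in days into seconds.
lifetime_seconds() {
    echo $(( $1 * 86400 ))
}

# Hypothetical helper: print (do not execute) a rule request for one DID.
# Arguments: DID, RSE name, lifetime in days.
# Assumption: the request is submitted with --ask-approval.
print_rule_request() {
    printf 'rucio add-rule --lifetime %s --ask-approval %s 1 %s\n' \
        "$(lifetime_seconds "$3")" "$1" "$2"
}

# Example: one copy of a dataset at the Tier-3 for 60 days.
print_rule_request 'cms:/MuonEG/Run2017F-31Mar2018-v1/NANOAOD' T3_CH_PSI 60
```

Printing the command first makes it easy to check the lifetime and the target site (T3_CH_PSI or T2_CH_CSCS) before anything is submitted.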

You will receive an email from us containing the rule IDs that have been generated on your behalf for the t3_ch_psi_local_users account.

Rucio rule deletion campaign (mostly OBSOLETE in 2024 - was for anonymous data without expiry dates from PhEDEx days)

This initially means some work, since many of our datasets still date from PhEDEx days, and many of the initial rules were entered into the system without any expiration time.

In the folder /t3home/T3-INFO/rucio-delete/ you will find files listing the Rucio rules selected for deletion. An email announcing the deletion campaign will specify when the deletions are scheduled to be executed. Each line contains a rule ID and a data identifier.

Example:

head /t3home/T3-INFO/rucio-delete/rucio-sync_t3_ch_psi-20221115.txt

7ae06b76de8d44f188c97b193cc1840a cms:/WJetsToLNu_HT-400To600_TuneCP5_13TeV-madgraphMLM-pythia8/RunIIFall17NanoAOD-PU2017_12Apr2018_94X_mc2017_realistic_v14-v1/NANOAODSIM#32c93a5d-7c9d-4d3d-b769-0f68dae58e53
50e0b0f9e5914a41bbbf8746a122f74e cms:/WJetsToLNu_HT-400To600_TuneCP5_13TeV-madgraphMLM-pythia8/RunIIFall17NanoAOD-PU2017_12Apr2018_94X_mc2017_realistic_v14-v1/NANOAODSIM#5bc24552-5745-11e8-84f1-02163e018fca
c499fb4bd6fb48ab8a93dc083fce2c42 cms:/WJetsToLNu_HT-400To600_TuneCP5_13TeV-madgraphMLM-pythia8/RunIIFall17NanoAOD-PU2017_12Apr2018_94X_mc2017_realistic_v14-v1/NANOAODSIM#c9778e32-5456-11e8-84f1-02163e018fca
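
Since these files list one block per line, a short summary can help to see which datasets are affected. The following is a minimal sketch, assuming the two-column format shown above (rule ID, then data identifier with the block after the # sign); the function names summarise_deletion_list and select_dids are hypothetical.

```shell
#!/bin/sh
# Count the listed blocks per dataset (the part of the identifier before '#').
summarise_deletion_list() {
    awk '{ split($2, parts, "#"); count[parts[1]]++ }
         END { for (d in count) print count[d], d }' "$1" | sort -k2
}

# Print the full data identifiers whose name matches a pattern.
# Arguments: list file, awk regular expression.
select_dids() {
    awk -v pat="$2" '$2 ~ pat { print $2 }' "$1"
}

# Usage (paths as in the example above):
# summarise_deletion_list /t3home/T3-INFO/rucio-delete/rucio-sync_t3_ch_psi-20221115.txt
# select_dids /t3home/T3-INFO/rucio-delete/rucio-sync_t3_ch_psi-20221115.txt 'WJetsToLNu'
```

The output of select_dids contains only the data identifiers, one per line, without the rule IDs.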

We will create new rules belonging to the special site account t3_ch_psi_local_users for data you want to keep. To tell us which data you want to keep, please do the following:

  1. Create a folder RUCIO-KEEP in your home directory
    mkdir ~/RUCIO-KEEP
    
  2. In your RUCIO-KEEP folder, place files consisting of a header that defines the expiry date, an optional comment, and then one line per data identifier. Do not use your own user container definitions. Use exactly the datasets from the cms: scope as listed in the source files. Example:
    #EXPIRY: 2022-12-24
    #COMMENT: I need these data for my analysis XYZ
    cms:/MuonEG/Run2017F-31Mar2018-v1/NANOAOD#102d6373-d6ab-4bb4-8e4a-b1f3f42a6dbf
    cms:/SingleElectron/Run2017C-31Mar2018-v1/NANOAOD#c837e60a-487d-11e8-8c33-02163e01877e
    cms:/Tau/Run2017C-31Mar2018-v1/NANOAOD#4a97700c-3bb9-44ab-8d18-bf4a795fbf3b
    
    Note: each such file will be turned into a single new rule, so it can only have a single expiry time and a single comment. Please use exactly the date format shown (YYYY-MM-DD).
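
Before the deadline it may be worth checking a keep file against the format described above. The following is a minimal sketch of such a check, not the script we run: the function name check_keep_file is hypothetical, and it assumes that the #EXPIRY line is the first line of the file and that lines carry no leading whitespace.

```shell
#!/bin/sh
# Hypothetical checker for a RUCIO-KEEP file.
# Prints one message per problem and returns non-zero if any was found.
check_keep_file() {
    awk '
        # Assumption: the expiry header is on line 1, in YYYY-MM-DD format.
        NR == 1 && $0 !~ /^#EXPIRY: [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/ {
            print "line 1: expected #EXPIRY: YYYY-MM-DD"; bad = 1; next }
        NR == 1 { next }
        # The optional comment header and empty lines are skipped.
        /^#COMMENT:/ { next }
        /^[ \t]*$/ { next }
        # Every remaining line must be an identifier from the cms: scope.
        $0 !~ /^cms:\/[^ ]+$/ {
            print "line " NR ": not a cms: scope identifier"; bad = 1 }
        END { exit bad }
    ' "$1"
}

# Usage (the file name is only an example):
# check_keep_file ~/RUCIO-KEEP/my-analysis.txt
```

The check only looks at the form of each line; it does not verify that an identifier actually appears in one of the deletion lists.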

User scope containers will not be accepted, since we need to manage all data under the same Rucio account, and we only have a single quota for that account. Users could add datasets to their containers at any later date, but we cannot put a quota on a container. This is a regrettable limitation of how CMS uses Rucio.

When the deadline for deletion arrives, we will collect the files in your ~/RUCIO-KEEP folders and generate Rucio requests on behalf of the t3_ch_psi_local_users account. We will delete the original rules only once the new rules are in place.

Topic revision: r14 - 2024-06-25 - DerekFeichtinger
 