<!-- keep this as a security measure:
* Set ALLOWTOPICCHANGE = TWikiAdminGroup,Main.LCGAdminGroup,Main.EgiGroup
* Set ALLOWTOPICRENAME = TWikiAdminGroup,Main.LCGAdminGroup
#uncomment this if you want the page only be viewable by the internal people
#* Set ALLOWTOPICVIEW = TWikiAdminGroup,Main.LCGAdminGroup,Main.ChippComputingBoardGroup
-->

ATLAS resources federation

* Zoom link: https://ethz.zoom.us/j/91556179990


Plan to federate two ATLAS sites into one.

CSCS-LCG2 (dCache), UNIBE-LHEP (DPM) => CHIPP-CH (DPM)

Note: the CSCS storage will physically remain at CSCS

Step 1 storage:

  • Drain internally one dCache "storage unit" at CSCS, re-install it as a DPM "storage unit" and attach it to the Bern DPM head node
  • Operate in this mode for a minimum of 6-8 weeks
  • If no blocking issues are discovered: transition all storage pools from dCache to the DPM head node in Bern.
  • The strategy is still to be defined, but we hope to drain (most of) the pools internally and re-install them as DPM one by one, if possible, to reduce the load on DDM ops. This will progressively shrink the CSCS storage and grow the Bern storage.
  • At some point during this procedure, move CSCS Panda queues to the Bern storage
  • Make CSCS storage RO and finalise its draining and pool transition to the Bern storage

Step 2 create the new ATLAS site:

  • Transition the Bern DPM endpoints to the new site

Step 3 Panda queues:

  • Move the Panda sites CSCS-LCG2 and UNIBE-LHEP (or create new ones) to the ATLAS site CHIPP-CH

Technical meeting with the following goals:

Understand the Federation layout

Understand CSCS storage layout for dCache and how to map it to DPM
Understand the network layout between the two sites

  • https://traffic.lan.switch.ch/vip/swiss-map/index.html
  • https://traffic.lan.switch.ch/vip/international-map/
  • Direct link between Bern and Lugano with 100G capacity, currently limited to 40G at the Bern border

  • Bern SE to CSCS SE path:
    [root@dpm ~]# traceroute se33.cscs.ch
    traceroute to se33.cscs.ch (148.187.19.183), 30 hops max, 60 byte packets
     1  beethoven-67.unibe.ch (130.92.67.1)  0.342 ms  0.876 ms  0.276 ms
     2  castorfw-inside.unibe.ch (130.92.0.36)  0.825 ms  0.815 ms  0.849 ms
     3  castor-inside.unibe.ch (130.92.244.3)  1.008 ms  0.997 ms  1.366 ms
     4  swiBE3-40GE-0-1-0-0-0.switch.ch (195.176.3.1)  1.317 ms  1.305 ms  1.254 ms
     5  swiLG1-100GE-0-0-0-3.switch.ch (130.59.36.102)  4.201 ms  4.200 ms  4.157 ms
     6  100G-C-IPv4.cscs.ch (148.187.0.10)  5.976 ms  3.717 ms  3.782 ms
     7  se33.cscs.ch (148.187.19.183)  3.542 ms  3.509 ms  3.306 ms

  • Bern SE to CSCS CE path (probably behind firewall):
    [root@dpm ~]# traceroute arc04.lcg.cscs.ch
    traceroute to arc04.lcg.cscs.ch (148.187.19.136), 30 hops max, 60 byte packets
     1  beethoven-67.unibe.ch (130.92.67.1)  0.317 ms  0.310 ms  0.271 ms
     2  castorfw-inside.unibe.ch (130.92.0.36)  0.830 ms  0.831 ms  0.930 ms
     3  castor-inside.unibe.ch (130.92.244.3)  1.464 ms  1.551 ms  1.527 ms
     4  swiBE3-40GE-0-1-0-0-0.switch.ch (195.176.3.1)  1.110 ms  1.547 ms  1.049 ms
     5  swiLG1-100GE-0-0-0-3.switch.ch (130.59.36.102)  4.303 ms  4.297 ms  4.271 ms
     6  * * *

  • Bern SE to the outside, e.g. NDGF:
    [root@dpm ~]# traceroute piggy.ndgf.org
    traceroute to piggy.ndgf.org (109.105.124.142), 30 hops max, 60 byte packets
     1  beethoven-67.unibe.ch (130.92.67.1)  0.362 ms  0.314 ms  0.283 ms
     2  castorfw-inside.unibe.ch (130.92.0.36)  0.234 ms  0.258 ms  0.189 ms
     3  castor-inside.unibe.ch (130.92.244.3)  0.824 ms  0.810 ms  0.882 ms
     4  swiBE3-40GE-0-1-0-0-0.switch.ch (195.176.3.1)  0.703 ms  1.678 ms  1.675 ms
     5  swiCE4-100GE-0-0-0-2.switch.ch (130.59.37.146)  3.687 ms  3.363 ms  3.693 ms
     6  swiCE1-B4.switch.ch (130.59.36.69)  3.650 ms  5.370 ms  5.344 ms
     7  switch.mx1.gen.ch.geant.net (62.40.124.21)  3.108 ms  3.107 ms  3.103 ms
     8  ae6.mx1.par.fr.geant.net (62.40.98.183)  12.194 ms  12.023 ms  11.990 ms
     9  ae5.mx1.lon2.uk.geant.net (62.40.98.178)  16.713 ms  16.657 ms  16.860 ms
    10  ae6.mx1.lon.uk.geant.net (62.40.98.36)  17.574 ms  17.694 ms  17.381 ms
    11  nordunet-gw.mx1.lon.uk.geant.net (62.40.124.130)  17.545 ms  17.531 ms  17.517 ms
    12  dk-uni.nordu.net (109.105.97.126)  37.190 ms  37.187 ms  36.999 ms
    13  dk-ore.nordu.net (109.105.97.132)  37.446 ms  38.076 ms  37.996 ms
    14  dk-ore2.nordu.net (109.105.102.119)  53.793 ms  49.192 ms  49.131 ms
    15  piggy.ndgf.org (109.105.124.142)  37.613 ms  37.508 ms  37.412 ms
  • Work in progress: move the Bern ARC CEs and SE nodes into a DMZ

  • CSCS SE to Bern SE path:
    ...
  • CSCS SE to Bern path:
    ...
  • CSCS SE to outside, e.g. NDGF:
    ...
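
To compare the paths above hop by hop, the traceroute output can be reduced to one mean round-trip time per hop. The following is a minimal sketch, assuming the standard Linux `traceroute` text format shown in the traces; the function name `parse_hops` and the sample constant are illustrative only, and the sample lines are copied from the Bern SE to CSCS SE trace.

```python
import re
from statistics import mean

# Sample copied from the Bern SE to CSCS SE trace above (hops 4 to 7).
SAMPLE = """\
 4  swiBE3-40GE-0-1-0-0-0.switch.ch (195.176.3.1)  1.317 ms  1.305 ms  1.254 ms
 5  swiLG1-100GE-0-0-0-3.switch.ch (130.59.36.102)  4.201 ms  4.200 ms  4.157 ms
 6  100G-C-IPv4.cscs.ch (148.187.0.10)  5.976 ms  3.717 ms  3.782 ms
 7  se33.cscs.ch (148.187.19.183)  3.542 ms  3.509 ms  3.306 ms
"""

# "<hop>  <host> (<IPv4>)" at the start of a hop line.
HOP_RE = re.compile(r"^\s*(\d+)\s+(\S+)\s+\((\d+(?:\.\d+){3})\)")
# A hop that returned no replies, e.g. " 6  * * *".
SILENT_RE = re.compile(r"^\s*(\d+)\s+\*")
# One probe result, e.g. "3.542 ms".
RTT_RE = re.compile(r"(\d+(?:\.\d+)?) ms")


def parse_hops(text):
    """Return a list of (hop, host, mean RTT in ms); host and RTT are None for silent hops."""
    hops = []
    for line in text.splitlines():
        m = HOP_RE.match(line)
        if m:
            # Only look for RTTs after the "(IP)" part of the line.
            rtts = [float(x) for x in RTT_RE.findall(line[m.end():])]
            hops.append((int(m.group(1)), m.group(2), mean(rtts) if rtts else None))
            continue
        m = SILENT_RE.match(line)
        if m:
            hops.append((int(m.group(1)), None, None))
    return hops


for hop, host, rtt in parse_hops(SAMPLE):
    label = "no reply" if rtt is None else f"{rtt:.3f} ms"
    print(f"{hop:2d}  {host or '*'}  {label}")
```

Hops that answer with `* * *`, as in the Bern SE to CSCS CE trace, are kept in the result with no host and no RTT, so a filtered hop remains visible in the summary.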

Understand the current and expected network rates for SE to compute and WAN
  • Network Run-2:
    • Assume analysis on an HT-Core (job-slot) consumes 1.2 MBytes/sec
    • Implies job-slots need that level of network bandwidth to storage
    • WAN access to remote storage at 20% (ATLAS avg now)
      • Nominal Tier-2: 5000 job slots => 6 GBytes/sec, WAN 9.6 Gbits/sec
      • Leadership Tier-2: 10000 job slots => 12 GBytes/sec, WAN 19.2 Gbits/sec
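
The Run-2 figures above follow from two multiplications, sketched here so the assumptions can be changed easily. The constants are the ones quoted in the list (1.2 MBytes/sec per job slot, 20% WAN share). The function names are illustrative, and the use of decimal units (1 GByte = 1000 MBytes) is an assumption.

```python
# Inputs taken from the Run-2 list above.
MB_PER_SEC_PER_SLOT = 1.2   # analysis I/O per HT-core (job slot), MBytes/sec
WAN_FRACTION = 0.20         # share of data read from remote storage over the WAN


def storage_rate_gbytes(job_slots: int) -> float:
    """Aggregate job-slot to storage rate in GBytes/sec (assumes 1 GByte = 1000 MBytes)."""
    return job_slots * MB_PER_SEC_PER_SLOT / 1000.0


def wan_rate_gbits(job_slots: int) -> float:
    """WAN share of the storage rate, converted to Gbits/sec (8 bits per byte)."""
    return storage_rate_gbytes(job_slots) * WAN_FRACTION * 8.0


for name, slots in (("Nominal Tier-2", 5000), ("Leadership Tier-2", 10000)):
    print(f"{name}: {slots} job slots => "
          f"{storage_rate_gbytes(slots):.1f} GBytes/sec, "
          f"WAN {wan_rate_gbits(slots):.1f} Gbits/sec")
```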

  • NOTE: Run-3 will have 3-4 times the data, so either the number of cores has to increase or the average software throughput has to improve by that factor

  • Network Run-3:
    • Add a burst capability
      • Nominal Tier-2 WAN: 9.6 Gbps x 3 = 28.8 Gbps => 40G link
      • Leadership Tier-2 (10000 job slots) WAN: 19.2 Gbps x 3 = 57.6 Gbps => 80G link
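
The Run-3 link sizes can be reproduced from the Run-2 WAN rates (9.6 and 19.2 Gbits/sec) with the factor of 3 used in the list. This is a rough sketch: rounding up to the next multiple of 40G is an assumption chosen because it matches the quoted 40G and 80G links, and the names below are illustrative.

```python
import math

# Run-2 WAN rates in Gbits/sec, from the Run-2 estimate.
RUN2_WAN_GBITS = {"Nominal Tier-2": 9.6, "Leadership Tier-2": 19.2}
RUN3_FACTOR = 3        # multiplier applied in the Run-3 list
LINK_UNIT_GBITS = 40   # ASSUMPTION: links are provisioned in 40G steps


def run3_wan_gbits(run2_wan_gbits: float) -> float:
    """Run-3 WAN requirement: the Run-2 rate scaled by the Run-3 factor."""
    return run2_wan_gbits * RUN3_FACTOR


def link_size_gbits(required_gbits: float) -> int:
    """Smallest multiple of LINK_UNIT_GBITS that covers the requirement."""
    return math.ceil(required_gbits / LINK_UNIT_GBITS) * LINK_UNIT_GBITS


for name, run2 in RUN2_WAN_GBITS.items():
    need = run3_wan_gbits(run2)
    print(f"{name} WAN: {run2} Gbps x {RUN3_FACTOR} = {need:.1f} Gbps "
          f"=> {link_size_gbits(need)}G link")
```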

Plan the first step concretely

  • Drain internally one dCache "storage unit" at CSCS, re-install it as a DPM "storage unit" and attach it to the Bern DPM head node

Lay out a tentative plan for the following step

AOB

Topic attachments

  • DPM-internal-layout.pdf (r1, 839.2 K, 2020-07-21 08:47, GianfrancoSciacca): DPM internal layout
  • Swiss-ATLAS-Federation-layout.pdf (r1, 235.0 K, 2020-07-20 08:08, GianfrancoSciacca): Swiss ATLAS Federation layout

Topic revision: r6 - 2020-07-21 - GianfrancoSciacca
 