Phenix updated installation and configuration status - 2nd March 2007

  • Minutes of the phone call held on 2nd March 2007
  • Participants:
    • Sergio Maffioletti

Status of installation and configuration

  • WN (X2200)
    • SLC4.4 [to be finalized and integrated into cfengine]

  • CE (X4200)
    • 1 X4200 used for PNFS + postgres [ok]

  • Thumper
    • 1 Thumper integrated as pool node [ok]
    • Configuration of pools (Alessandro will report)

  • Problems encountered, solutions and workarounds

  • Tests (tentative dates)
    • Reliability tests on the whole system (Derek + Tom + Alessandro)
    • Performance test from WNs to Thumpers via dcache (Derek + Sigve + Alessandro)
    • Test different configurations of ZFS [ok]

  • Organisation of the dCache tests
    • functionality tests
    • VO codes
    • local load tests (mainly dcap):
      • writing files in parallel from multiple nodes
      • reading the same file from multiple nodes
      • trying to write a file that is being written by another process
      • erasing a file that is being read by another process
      • measuring I/O rates as a function of the number of parallel clients
    • WAN protocol tests (SRM, gridftp)
    • CMS PhEDEx transfers
    • Storage access profile of CMS jobs -> they will use dcap protocol
    • Storage access profile of Atlas jobs -> for those using ARC, access is mainly through SRM and/or Gridftp
    • each VO should prepare its own specific tests
    • A general test suite (local and WAN tests) will be prepared by Derek
    • Sigve will forward the test description to the Atlas contact to check if Atlas will need additional tests
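The parallel-write load test above can be sketched as follows. This is a local stand-in that uses plain file writes (an assumption for illustration); on the real system each writer would instead run dccp against the dcap door, and the clients would be spread over multiple WNs rather than threads on one box:

```python
# Sketch of the "writing files in parallel" load test: N parallel writers,
# aggregate MB/s reported as a function of the number of clients.
# Plain local file writes stand in for dccp/dcap transfers (assumption).
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor

def write_file(path, mb):
    """Write `mb` megabytes of zeros to `path`; return the MB written."""
    chunk = b"\0" * (1024 * 1024)
    with open(path, "wb") as f:
        for _ in range(mb):
            f.write(chunk)
    return mb

def aggregate_rate(n_clients, mb_per_file=8):
    """Run n_clients parallel writers and return the aggregate rate in MB/s."""
    tmp = tempfile.mkdtemp()
    start = time.time()
    with ThreadPoolExecutor(max_workers=n_clients) as ex:
        written = sum(ex.map(
            lambda i: write_file(os.path.join(tmp, f"f{i}"), mb_per_file),
            range(n_clients)))
    elapsed = max(time.time() - start, 1e-6)
    for name in os.listdir(tmp):
        os.remove(os.path.join(tmp, name))
    os.rmdir(tmp)
    return written / elapsed

for n in (1, 2, 4):
    print(f"{n} clients: {aggregate_rate(n):.1f} MB/s")
```

The same harness, with the write function swapped for a dccp call, would produce the I/O-rate-vs-clients curve the test plan asks for.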

SE dcache configuration scenario

    • Is it necessary to mount PNFS on the Thumper? Apparently yes, if the Thumper is running Gridftp (thanks to Lionel Schwarz)
    • What is necessary for a WN to use the dcap protocol to access the dcache pools? Apparently the client-only dcache package (Alessandro will check)
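As a sketch of the second point: once the client-only package is installed, a WN reads via dcap with the `dccp` client. The door host and path below are placeholders, not the actual CSCS setup (22125 is the conventional dcap port):

```python
# Build the dccp invocation a WN would use to read a file over dcap.
# The door URL and PNFS path are placeholders (assumption), not CSCS values.
def dccp_read_command(pnfs_path, local_path,
                      door="dcap://se.example.ch:22125"):
    """Return the dccp argv for copying a dcap URL to a local file."""
    return ["dccp", door + pnfs_path, local_path]

print(dccp_read_command("/pnfs/example.ch/data/cms/f1", "/tmp/f1"))
```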

What bandwidth can we expect:

  • We may have performance problems from the Dalco WNs to the Thumpers due to the single Gb link from switch to switch
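To put a rough number on the concern: a single 1 Gbit/s inter-switch link tops out around 125 MB/s aggregate, shared by all Dalco WNs talking to the Thumpers (back-of-the-envelope, ignoring protocol overhead; the WN count below is illustrative):

```python
# Ceiling imposed by one 1 Gbit/s switch-to-switch link, overhead ignored.
LINK_MB_S = 1000 / 8  # 1 Gbit/s ~= 125 MB/s

def per_wn_mb_s(n_wns):
    """Fair-share bandwidth per WN when n_wns stream concurrently."""
    return LINK_MB_S / n_wns

print(per_wn_mb_s(20))  # e.g. 20 concurrent WNs -> 6.25 MB/s each
```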

Deadlines

  • next week --> tests
  • 15th March --> production

AOB

  • update UI machine (Strange java exception errors)?
    • UI will be re-installed in the next two weeks
    • Proposal to migrate to a server box (to gain reliability)

  • VOBoxes
    • Should we reinstall them as true LCG VO-Boxes? This would provide gsissh and easier myproxy management
    • We are planning to migrate these boxes anyway

  • responsibility for Twiki areas (CSCS will take care)
    • Create 1 page per VO
    • add a page with logs of problems
    • VOBoxes page with info about how to start services


Resume of the Configuration

  • 1 X2200 = cluster management system
  • 1 X2200 = SRM + dcache domains + LCG SE related software
  • 1 X4200 = PNFS + postgres
  • 1 X4500 Thumper = Gridftpd + dcache pool node
  • 1 X4500 Thumper = Gridftpd + dcache pool node

  • ZFS configuration (Proposal):
    • 1 Thumper to test 4 RAID groups with 2 parity disks each = 16TB + 4 spare disks
    • 1 Thumper to test 4 RAID groups with 1 parity disk each = 18TB + 4 spare disks
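As a sanity check on the two quoted totals, assuming the X4500 holds 48 x 500 GB disks and each of the 4 RAID groups is 10 disks wide (neither figure is stated in the minutes, but both reproduce the 16TB/18TB numbers):

```python
# Capacity check for the two proposed ZFS layouts.
# Assumed (not in the minutes): 500 GB disks, 4 groups of 10 disks each.
DISK_TB = 0.5
GROUPS = 4
DISKS_PER_GROUP = 10

def usable_tb(parity_per_group):
    """Usable capacity when each group loses `parity_per_group` disks to parity."""
    data_disks = GROUPS * (DISKS_PER_GROUP - parity_per_group)
    return data_disks * DISK_TB

print(usable_tb(2))  # 2 parity disks per group -> 16.0 TB
print(usable_tb(1))  # 1 parity disk per group  -> 18.0 TB
```

40 disks in groups plus 4 spares also leaves 4 of the 48 bays for the system disks, which is consistent with the proposal.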

  • 1 ZFS pool per Thumper

  • 1 Filesystem per VO per Thumper
  • each VO gets space on both Thumpers

  • Thumper dcache configuration
    • each Thumper will have 1 dcache pool per VO/FS (CMS, Atlas, dteam)
    • 1 Thumper will also have FS for lhcb and hone


-- SergioMaffioletti - 2 March 2007
