Phenix installation and configuration status update - 2 March 2007

  • Minutes of the phone call held on 2 March 2007
  • Participants:
    • Sergio Maffioletti
    • Alessandro Usai
    • Tom Guptil
    • Derek Feichtinger
    • Sigve Haug

Status of installation and configuration

  • WN (X2200)
    • SLC4.4 [to be finalized and integrated into cfengine]

  • CE (X4200)
    • 1 X4200 used for PNFS + postgres [ok]

  • SE (X2200)
    • 1 X2200 used as dcache admin node (SRM + dcache packages) [ok]

  • Thumper
    • 1 Thumper integrated as pool node [ok]
    • Configuration of pools
      • 1 write-only pool and 1 read-only pool per VO

  • Problems encountered, solutions and workarounds
    • se02 is not registering properly with the dcache admin node: its pools are seen as inactive
    • Derek will test the dcache installation procedure from scratch on se02
    • The MonAMI monitor may not be usable, as it queries tables removed in the latest dcache release
    • The final system will need some ad-hoc scripts/cron jobs to update the CA list and the grid-mapfiles (see the sketch below)
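
A minimal sketch of such a cron-driven update script, assuming the standard fetch-crl and edg-mkgridmap tools; the config and output paths shown are illustrative and must be checked against our nodes:

    #!/usr/bin/env python
    # Hypothetical cron helper for the CA-list / grid-mapfile updates above.
    # fetch-crl and edg-mkgridmap are the standard tools; the paths and
    # flags here are illustrative assumptions, not our verified setup.
    import subprocess
    import sys
    import time

    COMMANDS = [
        # refresh the CA CRLs under /etc/grid-security/certificates
        ["fetch-crl"],
        # regenerate the grid-mapfile from the VO membership lists
        ["edg-mkgridmap",
         "--conf", "/opt/edg/etc/edg-mkgridmap.conf",
         "--output", "/etc/grid-security/grid-mapfile"],
    ]

    failures = 0
    for cmd in COMMANDS:
        if subprocess.call(cmd) != 0:
            failures += 1
            sys.stderr.write("%s: %s failed\n"
                             % (time.strftime("%Y-%m-%d %H:%M:%S"), cmd[0]))
    sys.exit(1 if failures else 0)

Such a script would be run from cron every few hours on each node that needs fresh CRLs and grid-mapfiles.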

  • Tests (tentative dates)
    • Reliability tests on the whole system (Derek + Tom + Alessandro) --> 5 - 9 March
    • Performance tests from the WNs to the Thumpers via dcache (Derek + Sigve + Alessandro) --> 2 - 9 March
    • Test different configurations of ZFS [ok]

  • Organisation of the dCache tests
    • functionality tests
    • VO codes
    • local load tests (mainly dcap); a sketch follows this list:
      • writing files in parallel from multiple nodes
      • reading the same file from multiple nodes
      • trying to write a file that is being written by another process
      • erasing a file that is being read by another process
      • measuring I/O rates as a function of the number of parallel clients
    • WAN protocol tests (SRM, gridftp)
    • CMS PhEDEx transfers
    • Storage access profile of CMS jobs --> they will use the dcap protocol
    • Storage access profile of Atlas jobs --> for those using ARC, access is mainly through SRM and/or GridFTP
    • Each VO should prepare its own specific tests
    • A general test suite (local and WAN tests) will be prepared by Derek
    • Sigve will forward the test description to the Atlas contact to check whether Atlas needs additional tests
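
A minimal sketch of the parallel-write part of the local load tests, meant to be launched simultaneously on several WNs; it assumes the dccp client is installed, and the door host and PNFS path are placeholders to be replaced:

    #!/usr/bin/env python
    # Sketch of the dcap parallel-write load test described above.
    # DOOR and PNFS_DIR are placeholders for our admin node and test area.
    import subprocess
    import time

    DOOR = "dcap://<admin-node>:22125"           # 22125 = default dcap port
    PNFS_DIR = "/pnfs/<site-domain>/data/dteam"  # hypothetical test area
    LOCAL_FILE = "/tmp/testfile-1gb"             # pre-created input file

    def parallel_writes(n_clients):
        """Start n_clients dccp writes at once; return (seconds, successes)."""
        start = time.time()
        procs = []
        for i in range(n_clients):
            dest = "%s%s/loadtest-%d-%d" % (DOOR, PNFS_DIR, int(start), i)
            procs.append(subprocess.Popen(["dccp", LOCAL_FILE, dest]))
        results = [p.wait() for p in procs]
        return time.time() - start, results.count(0)

    # measure the I/O rate as a function of the number of parallel clients
    for n in (1, 2, 4, 8):
        elapsed, ok = parallel_writes(n)
        print("%2d clients: %d ok, %.1f s total" % (n, ok, elapsed))

The read tests would be the mirror image: dccp copies of the same PNFS file to /dev/null from many nodes at once.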

SE dcache open issues

    • Is it necessary to mount PNFS on the Thumpers? Yes, if the Thumper runs GridFTP (thanks to Lionel Schwarz)
    • What do the WNs need in order to use the dcap protocol to access the dcache pools? The WNs will point to the dcache admin node, which redirects them to the right Thumper (a reachability check is sketched below)
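
Before the dcap tests, a trivial reachability check of the dcap door from a WN; 22125 is dCache's default dcap port, and the host name is a placeholder:

    #!/usr/bin/env python
    # Quick check that a WN can reach the dcap door on the dcache admin node.
    # <admin-node> is a placeholder; 22125 is the default dcap port.
    import socket

    HOST = "<admin-node>"
    PORT = 22125

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(5)
    try:
        s.connect((HOST, PORT))
        print("dcap door reachable at %s:%d" % (HOST, PORT))
    except socket.error:
        print("cannot reach dcap door at %s:%d" % (HOST, PORT))
    s.close()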

What bandwidth can we expect?

  • We may have performance problems from the Dalco WNs to the Thumpers due to the single Gb link between the two switches (a rough estimate follows below)
  • Tom will check whether the two switches can be trunked together
  • This is an important issue to keep in mind when planning the extension
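
A back-of-the-envelope view of what the single inter-switch link means per WN, assuming the nominal 1 Gb/s figure and ignoring protocol overhead (the WN counts are only examples):

    #!/usr/bin/env python
    # Rough per-WN share of the single 1 Gb/s switch-to-switch link.
    # Nominal figure only; real throughput will be lower.
    LINK_MBYTES = 1000.0 / 8   # 1 Gb/s = 125 MB/s
    for n_wns in (8, 16, 32):  # example numbers of concurrent clients
        print("%2d concurrent WNs --> %.1f MB/s each"
              % (n_wns, LINK_MBYTES / n_wns))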

Deadlines

  • next week --> tests
  • 15th March --> production

AOB

  • update UI machine
    • 5 - 9 March reinstall UI
    • announce a half-day downtime

  • VOBoxes --> the Atlas VOBox could be moved to a Xen node

  • Responsibility for TWiki areas (CSCS will take care of them)
    • create one page per VO
    • add a page with logs of problems
    • a VOBoxes page with info about how to start the services
    • there will be one page per service/server involved

  • We will produce a checklist of things to verify before entering production
    • what needs to be available at CSCS
    • which configurations need to be modified at the Tier-1 level
    • for every VO there will be a list of configurations to check

  • We need to prepare a message for all users announcing the scheduled entry into production

Summary of the configuration

  • 1 X2200 = cluster management system
  • 1 X2200 = SRM + dcache domains + LCG SE related software
  • 1 X4200 = PNFS + postgres
  • 2 X4500 Thumpers = Gridftpd + dcache pool nodes

  • ZFS configuration (proposal; the arithmetic is checked in the sketch below):
    • 1 Thumper to test with 4 RAID groups and 2 parity disks per group = 16 TB + 4 spare disks
    • 1 Thumper to test with 4 RAID groups and 1 parity disk per group = 18 TB + 4 spare disks
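
Both capacity figures are consistent with the X4500's 48 x 500 GB disks arranged as four RAID-Z groups of ten disks each; this group size is an assumption to be confirmed, but the arithmetic checks out:

    #!/usr/bin/env python
    # Sanity check of the two proposed ZFS layouts. Assumes 500 GB disks
    # in 4 RAID-Z groups of 10 disks each, which matches both figures
    # above; the actual group size must be confirmed on the Thumpers.
    DISK_TB = 0.5
    GROUPS = 4
    DISKS_PER_GROUP = 10

    for parity in (2, 1):  # raidz2 vs raidz1
        data_disks = GROUPS * (DISKS_PER_GROUP - parity)
        print("%d parity disk(s) per group --> %.0f TB usable"
              % (parity, data_disks * DISK_TB))
    # --> 16 TB (raidz2) and 18 TB (raidz1), matching the proposal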

  • 1 ZFS pool per Thumper

  • 1 Filesystem per VO per Thumper
  • each VO gets space on both Thumpers

  • Thumper dcache configuration
    • each Thumper will have 1 dcache pool per VO/FS (CMS, Atlas, dteam)
    • 1 Thumper will also have filesystems for lhcb and hone


-- SergioMaffioletti - 2 March 2007
