
Steering Board Meeting 2009-12-01

Meeting Materials:

Agenda

  1. Welcome
  2. Information about present status of T3 cluster at PSI
  3. Policy proposals for resources
    • CPU
    • storage allocation and management, ...
    • cluster access, accounts, ...
  4. Discussion:
    • statements / comments by all parties
    • groups ?
  5. Recommendations:
    • Draft for resource sharing recommendations
    • rules, implementation, publication
    • validity (time),...
  6. AOB

Meeting Minutes

present: Christoph Grab, Derek Feichtinger, Leonardo Sala, Urs Langenegger, Alexander Schmidt

via phone: Ernest Aguilo

Items:

  • Welcome by Christoph
  • Derek presents status of T3 cluster at PSI. Some key figures:
    • 8 worker nodes, each with 2 × Xeon E5410 processors (8 cores per node)
    • 6 fileservers with 100 TByte of space in total
    • major upgrade in December/January consisting of 20 worker nodes with 2 × Xeon X5560 CPUs each and 5 fileservers with 200 TByte
  • guest users: should visitors be allowed to obtain guest accounts at the T3?
    • Pro: collaboration within a physics/detector group is simplified if the same computing architecture is used (e.g. to share ntuples)
    • Contra: if everybody is allowed, this leads to an uncontrolled increase in the number of users
    • Agreement: allow a fixed, limited contingent (e.g. 2 guests) per group/institution (?), to be allocated by the respective group responsible. Further guests can only be admitted once previous guest accounts have been deleted.
  • distribution of computing resources:
    • allocation of CPU time is easy and can be changed on the fly. Fairshare algorithms can be adjusted at any time as necessary.
    • More important and more difficult is the distribution of storage space. Three types of storage space are foreseen:
      • private storage space for users
      • group storage space for hosting official CMS datasets as well as group skims, ntuples etc.
      • central space for maintenance etc.
    • Agreed on the following policy: storage space is divided equally among the users. Each user has x TByte and belongs to only one group. The space is added up within a group, and the group then decides how its space is divided. Guest users don't count. There is no quota, but the group responsible needs to take care of the storage space limits.
    • Action item: make a list of groups, users and the corresponding amount of storage. Define the names of the group responsibles.
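
The storage policy above amounts to simple arithmetic, which could be sketched as follows. This is only an illustration under stated assumptions: the function name, the group names, the numbers, and the idea of subtracting a central reserve before dividing are all hypothetical and were not decided in the meeting.

```python
from collections import Counter


def group_storage(total_tb, central_tb, members):
    """Return (per-user share x, {group: share}) in TByte.

    members is a list of (user, group, is_guest) tuples.
    Assumption: the central space is set aside before dividing.
    """
    # Guest users do not count towards the division.
    regular = [(user, group) for user, group, is_guest in members if not is_guest]
    if not regular:
        raise ValueError("no regular users to divide the space among")

    # Each user belongs to only one group.
    users = [user for user, _ in regular]
    if len(set(users)) != len(users):
        raise ValueError("each user may belong to only one group")

    # Equal share x per regular user.
    per_user = (total_tb - central_tb) / len(regular)

    # The shares are added up within each group.
    counts = Counter(group for _, group in regular)
    return per_user, {group: n * per_user for group, n in counts.items()}


# Hypothetical example: 100 TByte in total, 10 TByte kept as central space.
members = [
    ("alice", "groupA", False),
    ("bob", "groupA", False),
    ("carol", "groupA", False),
    ("dave", "groupB", False),
    ("erin", "groupB", False),
    ("guest1", "groupB", True),  # guest: not counted
]
per_user, shares = group_storage(100, 10, members)
print(per_user, shares)
```

Since there is no enforced quota, a table like this would only serve as the reference against which the group responsibles check actual usage.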

  • commissioning dataset (first LHC collisions): it is currently not distributed to the PSI T3, because it is still more efficient to analyze the small number of events at CAF. This dataset can be requested and hosted in T3 central space (not in group space) if needed.

  • there is no backup of home directories; disk failure protection is provided by RAID6

  • queues: the average run time of CMS jobs is 4 hours. The 24h queue is too long and will be changed to 8h or 12h. The old computing nodes will be used for the short (test) 1h queue.

  • comment from Urs: we are very happy with the T3 and can't live without it. Derek and his team are doing a great job.

  • three days of T3 downtime are necessary to prepare the upgrade, probably before Christmas. A convenient date needs to be found.

Comments/Corrections

  • Derek
    • Guest users: we agreed to allow one guest user per physics group.

Topic revision: r4 - 2009-12-08 - DerekFeichtinger