Cluster Overview

OLD Cluster Overview

Cluster Composition and Services

| Type | Hosts | Hardware | Services |
| CmsFrontier | t3frontier01, DNS alias t3frontier = t3frontier01 | PSI DMZ VMWare cluster | CMS-Frontier and CVMFS Squid cache |
| CmsVoBox | t3cmsvobox ( t3cmsvobox01 ) | PSI DMZ VMWare cluster | PhEDEx 4.2.1 |
| ComputingElement | t3ce02 | PSI DMZ VMWare cluster | Sun Grid Engine 6.2u5 |
| dCacheSiteBDII | t3bdii0[1,2], DNS alias t3bdii = t3bdii02 | PSI DMZ VMWare cluster | Site BDII |
| dCacheSolaris | [t3fs02 - t3fs04, t3fs07 - t3fs11] READ-ONLY !! | SUN X4500 (2*Opt 290, 16GB RAM, 48*500GB SATA) / SUN X4540 (2*Opt 2435, 32GB RAM, 48*1TB SATA + 16GB Flash) | dcache pool cells, gridftp, dcap, gsidcap |
| dCachet3fs13t3fs14 | t3fs[13,14] READ-WRITE !! | HP Proliant DL380 G7 | dcache pool cells, gridftp, dcap, gsidcap |
| JumpStart | t3jumpstart01 | PSI DMZ VMWare cluster | jet, ssh |
| MeGt3fs16 | t3fs16 | HP DL380 Gen9 | file server |
| Mon | t3mon01, DNS alias t3mon = t3mon01 | PSI DMZ VMWare cluster | ganglia collector, ganglia web front end |
| NFSServerZFSBackupANDdCache | t3nfs02 | HP DL380 G9 | NFSv4 service based on ZoL + dCache Pool |
| OLDNFShomeServer | t3fs06 - OUTDATED ! | SUN X4500 (2*Opt 290, 16GB RAM, 48*500GB SATA) | NFS (user home area), backup on t3fs05 |
| OLDNFSServer | t3fs05 - OUTDATED ! | SUN X4500 (2*Opt 290, 16GB RAM, 48*500GB SATA) | central SW NFS service, backup on t3fs06 |
| Ossec | t3ossec | PSI DMZ VMWare cluster | ossec daemon |
| SyslogNg | t3service01 | PSI VM DMZ cluster | Syslog-ng 2.1.4-9 Central Logging Service |
| UIs | t3ui0[1,2,3] | Dalco | ssh, freeNX |
| WNsIntelMeG | T3WN[60..63] | DALCO r2264i6t 4 nodes in 2U module chassis with 2x20 Intel(R) Xeon(R) E5-2698 v4 @ 2.20GHz; RAM - 256 GB | SGE |
| WNsIntelS2600TP | t3wn[51-59] | Dalco r2264i5t - Intel S2600TP | Sun Grid Engine 6.2u5 execution hosts |
| WNsSunBlade | t3wn[10-29] | SUN Blade 6270 (2*Xeon 5560, 24GB RAM, 2*146 GB SAS) / SUN X4150 (2*Xeon E5440, 16GB RAM, 2*146 GB SAS) | Sun Grid Engine 6.2u5 execution hosts |
| WNsSuperMicro | t3wn[41-50] | SuperMicro 1u got from CSCS | Sun Grid Engine 6.2u5 execution hosts |

Cluster Specs

Original Requirements

Our requirements document (q.v. our internal CMS-Tier3-Project-description.doc) specified the following for the two install phases.

| Phase | Year | CPU / kCINT2000 | Disk / TB |
| A | 2008 | 180 | 75 |
| B | 2009 | 500 | 250 |
| C | 2012 | ?? | ?? |

Phase A - CPU

CINT = SPECint benchmark value (average baseline value taken from the SPEC published results). For 1000 CINT2000 one frequently uses the abbreviation kSI2k.

| No. of WNs | Processors | Cores/node | CINT2006/core | kCINT2000/core | No. of Cores | CINT2006 | kCINT2000 |
| 8 | 2*Xeon E5410 | 8 | 18.8 | 3.34 | 64 | 1203.2 | 213.76 |

Note: For CINT2000 values that are not tabulated, we use this conversion: CINT2000base = -1373.9673 + 250.8226 * CINT2006base
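
The conversion in the note can be checked against the Phase A table with a few lines of Python. This is only a sketch: the coefficients are copied from the note, while the function name and the rounding to two decimals are assumptions made for illustration.

```python
# Linear fit quoted in the note above; coefficients copied from the text.
INTERCEPT = -1373.9673
SLOPE = 250.8226


def cint2006_to_cint2000(cint2006_base: float) -> float:
    """Estimate a CINT2000 base score from a CINT2006 base score."""
    return INTERCEPT + SLOPE * cint2006_base


# Xeon E5410: 18.8 CINT2006 per core, 64 cores in total (Phase A table).
per_core_kcint2000 = cint2006_to_cint2000(18.8) / 1000.0
print(f"{per_core_kcint2000:.2f} kCINT2000 per core")

# The table multiplies the rounded per-core value by the number of cores.
print(f"{64 * round(per_core_kcint2000, 2):.2f} kCINT2000 for 64 cores")
```

With the rounded per-core value of 3.34 this gives the 213.76 kCINT2000 listed in the table; using the unrounded value would give a slightly larger total.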

Phase A - Storage

| No. of Fileservers | Type | Space/Node (TB) | Total Space (TB) |
| 6 | SUN X4500 | 17.5 | 105 |

Note: 48 500 GB disks per fileserver:

  • 4 raidz pools with 9 disks, 1 raidz pool with 8 disks: (4*8 + 1*7) * 500 GB = 21 TB usable raw: measured with fs overhead: ca 17.5 (83%)
  • 2 disks mirrored OS
  • 2 disks as hot spare

Phase B - CPU

For 1000 CINT2000 one frequently uses the abbreviation kSI2k.

T3_CH_PSI CPU resources and performance benchmarks 2010/2011
| No. of WNs | Processors | Cores/node | HS06/node | HS06/core | kCINT2000/core | total No. of Cores | total HS06 | total kCINT2000 |
| 7 | 2*Xeon E5410 | 8 | 91.97 | 11.5 | 2.95 | 56 | 644 | 165 |
| 20 | 2*Xeon X5560 | 8 | 117.53 | 14.69 | 3.77 | 160 | 2350 | 603 |
| 27 | | | | | | 216 | 2994 | 768 |

Note: hepspec06/kSi2k = 3.9 ± 0.2 => kSi2k = hepspec06/3.9
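
As a rough illustration of the note above, the following Python sketch converts HS06 totals to kSI2k and propagates the quoted ±0.2 uncertainty on the factor. The function and constant names are hypothetical; only the factor 3.9 ± 0.2 and the HS06 totals come from this page.

```python
HS06_PER_KSI2K = 3.9      # conversion factor from the note above
HS06_PER_KSI2K_ERR = 0.2  # quoted uncertainty on that factor


def hs06_to_ksi2k(hs06: float, factor: float = HS06_PER_KSI2K) -> float:
    """Convert a HEP-SPEC06 score to kSI2k (kCINT2000)."""
    return hs06 / factor


total_hs06 = 2994  # total HS06 from the 2010/2011 table
central = hs06_to_ksi2k(total_hs06)
# A larger factor gives a smaller kSI2k value, and vice versa.
low = hs06_to_ksi2k(total_hs06, HS06_PER_KSI2K + HS06_PER_KSI2K_ERR)
high = hs06_to_ksi2k(total_hs06, HS06_PER_KSI2K - HS06_PER_KSI2K_ERR)
print(f"{central:.0f} kSI2k (range {low:.0f} to {high:.0f})")
```

The central value reproduces the 768 kCINT2000 in the totals row; the range shows how much the result moves within the stated uncertainty.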

The PSI Tier-3 has 48 active users (August 2011).

Phase B - Storage

T3_CH_PSI SE Storage resources 2010/2011
| No. of Fileservers | Type | Space/Node (TB) | Total Space (TB) |
| 4 | SUN X4500 | 16.10 | 64 |
| 5 | SUN X4540 | 34.09 | 170 |
| 9 | | | 234 |

Note: 48 1TB disks per X4540 fileserver:

  • 5 raidz pools with 9 disks: 5*9 * 1TB = 45 TB usable raw. df yields 34093714636800 bytes = 34.09 TB = 31.008 TiB
    • Current dcache config: 2 pools of 14000 GiB = 15.03 TB
  • 3 disks as hot spare
  • OS sits on flash storage
  • Note: Two of our 6 X4500 are now used for home directories, SW areas and backup and therefore were taken out of this table.
  • For old X4500: 16097967341568 bytes = 16.10 TB = 14.64 TiB
    • Current dcache config: 1 pool of 14000 GiB = 15.03 TB
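
The TB, TiB and GiB figures in the notes above mix decimal and binary units. A minimal Python sketch of the conversions, using the byte counts reported by df (the helper names are assumptions for illustration):

```python
TB = 10**12   # decimal terabyte, in bytes
TIB = 2**40   # binary tebibyte, in bytes
GIB = 2**30   # binary gibibyte, in bytes


def bytes_to_tb(n_bytes: int) -> float:
    return n_bytes / TB


def bytes_to_tib(n_bytes: int) -> float:
    return n_bytes / TIB


x4540_bytes = 34093714636800  # df output for an X4540
x4500_bytes = 16097967341568  # df output for an old X4500
pool_bytes = 14000 * GIB      # one dCache pool of 14000 GiB

print(f"X4540: {bytes_to_tb(x4540_bytes):.2f} TB = {bytes_to_tib(x4540_bytes):.3f} TiB")
print(f"X4500: {bytes_to_tb(x4500_bytes):.2f} TB = {bytes_to_tib(x4500_bytes):.2f} TiB")
print(f"pool:  {bytes_to_tb(pool_bytes):.2f} TB")
```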

Phase C - CPU

T3_CH_PSI CPU resources and performance benchmarks 2012
| No. of WNs | Processors | Cores/node | HS06/node | HS06/core | kCINT2000/core | total No. of Cores | total HS06 | total kCINT2000 |
| 20 | 2*Xeon X5560 | 8 | 117.53 | 14.69 | 3.77 | 160 | 2350 | 603 |
| 11 | 2*E5-2670 2.60GHz | 16 | 263 | 16.44 | 4.22 | 176 | 2893 | 743 |
| 31 | | | | | | 336 | 5243 | 1346 |

Note: hepspec06/kSi2k = 3.9 ± 0.2 => kSi2k = hepspec06/3.9

The PSI Tier-3 has 53 active users (June 2012).

Phase C - Storage

T3_CH_PSI SE Storage resources 2012
| No. of Fileservers | Type | Space/Node (TB) | Total Space (TB) |
| 4 | SUN X4500 | 16.10 | 64 |
| 5 | SUN X4540 | 34.09 | 170 |
| 2 | HP Proliant DL380 G7 | 130 | 260 |
| 11 | | | 494 |

Note 1: 48 1TB disks per X4540 fileserver:

  • 5 raidz pools with 9 disks: 5*9 * 1TB = 45 TB usable raw. df yields 34093714636800 bytes = 34.09 TB = 31.008 TiB
    • Current dcache config: 1 unique pool of ~31 TiB
  • 3 disks as hot spare
  • OS sits on flash storage
  • Note: Two of our 6 X4500 are now used for home directories, SW areas and backup and therefore were taken out of this table.
  • For old X4500: 16097967341568 bytes = 16.10 TB = 14.64 TiB
    • Current dcache config: 1 pool of 14000 GiB = 15.03 TB
Note 2: 120 disks of 3 TB hosted in an SGI IS5500, shared by the two HP Proliant DL380 G7 fileservers:
  • The SGI IS5500 is formatted as 12 RAID6 groups of 8+2 disks each; 6 groups are offered to the first HP Proliant DL380 G7 and 6 to the other.
  • This gives 6 dCache cms pools of 22 TB each per HP Proliant DL380 G7 fileserver => 130 TB per fileserver => 260 TB for both.
  • 2 hot spares, in addition to the RAID6 protection.
  • Generally speaking, the SGI IS5500 can be expanded by simply adding disk trays.
  • Just an idea: the cheapest way to expand this storage could be to add another expansion to the SGI IS5500 with 12*5 disks of 3 TB, organized as 6 RAID6 groups, for a net gain of 130 TB. We would attach these 6 new volumes to t3fs13, bond its 2*10 Gbit/s Ethernet interfaces, and connect t3fs13 to the last 10 Gbit/s uplink available in our 3 network switches.
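
The RAID6 arithmetic in Note 2 can be sketched in Python as follows. Two assumptions are made here: the 3 TB disk size is taken as decimal (vendor) terabytes, and filesystem overhead is ignored. Under those assumptions the results come out close to the 22 TB per pool and 130 TB per fileserver quoted above when expressed in TiB, which may indicate that those figures are in binary units; that reading is a guess, not something stated on this page.

```python
TIB = 2**40              # binary tebibyte, in bytes
DISK_BYTES = 3 * 10**12  # one 3 TB disk, assuming decimal terabytes


def raid6_usable_bytes(groups: int, data_disks: int, disk_bytes: int) -> int:
    """Usable capacity of RAID6 groups; the 2 parity disks per group are excluded."""
    return groups * data_disks * disk_bytes


# One 8+2 RAID6 group corresponds to one dCache pool.
per_pool = raid6_usable_bytes(groups=1, data_disks=8, disk_bytes=DISK_BYTES)
# Each fileserver is offered 6 of the 12 groups.
per_server = raid6_usable_bytes(groups=6, data_disks=8, disk_bytes=DISK_BYTES)

print(f"per pool:   {per_pool / 10**12:.0f} TB = {per_pool / TIB:.1f} TiB")
print(f"per server: {per_server / 10**12:.0f} TB = {per_server / TIB:.0f} TiB")
```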

Phase D - CPU

On 10th Oct 2013 the CPU resources are the same as in Phase C - CPU (2012).

Phase D - Storage

T3_CH_PSI SE Storage Resources 2013
| No. of Fileservers | Type | Space/Node (TB) | Total Space (TB) | Total Space (TiB) |
| 4 | SUN X4500 | 16 | 64 | 58 |
| 5 | SUN X4540 | 33 | 165 | 150 |
| 2 | HP Proliant DL380 G7 | 282 | 564 | 513 |
| 11 | | | 793 | 721 |

Note 1: During summer 2013 we connected a new NetApp E5400 (360 TB raw) to the 2012 HP Proliant DL380 G7 fileservers, so the net storage was easily doubled.

Phase E - CPU

On 18th Aug 2014 the CPU resources are:

T3_CH_PSI CPU resources and performance benchmarks 2014
| No. of WNs | Processors | Cores/node | HS06/node | HS06/core | kCINT2000/core | total No. of Cores | total HS06 | total kCINT2000 |
| 20 | 2*Xeon X5560 | 8 | 117.53 | 14.69 | 3.77 | 160 | 2350 | 603 |
| 11 | 2*E5-2670 2.60GHz | 16 | 263 | 16.44 | 4.22 | 176 | 2893 | 743 |
| 4 | 2*AMD 6272 2.40GHz | 32 | 241 | 7.53 | 1.93 | 128 | 964 | 247 |
| 35 | | | | | | 464 | 6207 | 1593 |

| No. of UIs | Processors | Cores/node | HS06/node | HS06/core | kCINT2000/core | total No. of Cores | total HS06 | total kCINT2000 |
| 6 | 2*AMD 6272 2.40GHz | 32 | 241 | 7.53 | 1.93 | 192 | 1446 | 371 |
| 6 | | | | | | 192 | 1446 | 371 |

Phase E - Storage

Same storage as in Phase D - Storage.

Phase F - CPU

In April 2016 the CPU resources are:

T3_CH_PSI CPU resources and performance benchmarks 2016
| No. of WNs | Processors | Cores/node | HS06/node | HS06/core | kCINT2000/core | total No. of Cores | total HS06 | total kCINT2000 |
| 9 | 2*Xeon E5-2698v3 | 64 | 700 | 10.94 | 2.81 | 576 | 6301 | 1619 |
| 20 | 2*Xeon X5560 | 8 | 117.53 | 14.69 | 3.77 | 160 | 2350 | 603 |
| 11 | 2*E5-2670 2.60GHz | 16 | 263 | 16.44 | 4.22 | 176 | 2893 | 743 |
| 4 | 2*AMD 6272 2.40GHz | 32 | 241 | 7.53 | 1.93 | 128 | 964 | 247 |
| 44 | | | | | | 1040 | 12508 | 3212 |

| No. of UIs | Processors | Cores/node | HS06/node | HS06/core | kCINT2000/core | total No. of Cores | total HS06 | total kCINT2000 |
| 6 | 2*AMD 6272 2.40GHz | 32 | 241 | 7.53 | 1.93 | 192 | 1446 | 371 |
| 6 | | | | | | 192 | 1446 | 371 |

Phase F - Storage

Same as Phase E

Phase G - CPU

In Nov 2016 the CPU resources are:

T3_CH_PSI CPU resources and performance benchmarks 2016
| No. of WNs | Processors | Cores/node | HS06/node | HS06/core | kCINT2000/core | total No. of Cores | total HS06 | total kCINT2000 |
| 9 | 2*Xeon E5-2698v3 | 64 | 700 | 10.94 | 2.81 | 576 | 6301 | 1619 |
| 20 | 2*Xeon X5560 | 8 | 117.53 | 14.69 | 3.77 | 160 | 2350 | 603 |
| 11 | 2*E5-2670 2.60GHz | 16 | 263 | 16.44 | 4.22 | 176 | 2893 | 743 |
| 10 | 2*AMD 6272 2.40GHz | 32 | 241 | 7.53 | 1.93 | 320 | 2410 | 618 |
| 50 | | | | | | 1232 | 13954 | 3583 |

| No. of UIs | Processors | Cores/node | HS06/node | HS06/core | kCINT2000/core | total No. of Cores | total HS06 | total kCINT2000 |
| 3 | 2*E5-2697 v4 @ 2.30GHz | 72 | 700 | 9.72 | 2.49 | 216 | 2100 | 538 |
| 3 | | | | | | 216 | 2100 | 538 |

Phase G - Storage

T3_CH_PSI SE Storage Resources Nov 2016
| No. of Fileservers | Type | Space/Node (TB) | Total Space (TB) | Total Space (TiB) |
| 4 | SUN X4500 Only Redundant Copies | 16 | 64 | 58 |
| 5 | SUN X4540 Only Redundant Copies | 33 | 165 | 150 |
| 2 | HP Proliant DL380 G7 | 282 | 564 | 513 |
| 1 | HP Proliant DL380 G9 | 200 | 200 | 182 |
| 12 | | | 993 | 903 |
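
The totals row of the Nov 2016 storage table can be re-derived from the per-row values. The sketch below is illustrative only: the row data are copied from the table, while the variable and function names are assumptions.

```python
TB = 10**12   # decimal terabyte, in bytes
TIB = 2**40   # binary tebibyte, in bytes

# (type, number of fileservers, space per node in TB), copied from the table above.
ROWS = [
    ("SUN X4500", 4, 16),
    ("SUN X4540", 5, 33),
    ("HP Proliant DL380 G7", 2, 282),
    ("HP Proliant DL380 G9", 1, 200),
]


def tb_to_tib(tb: float) -> float:
    """Convert decimal terabytes to binary tebibytes."""
    return tb * TB / TIB


total_servers = sum(n for _, n, _ in ROWS)
total_tb = sum(n * per_node for _, n, per_node in ROWS)

for name, n, per_node in ROWS:
    row_tb = n * per_node
    print(f"{name}: {row_tb} TB = {tb_to_tib(row_tb):.0f} TiB")
print(f"total: {total_servers} fileservers, {total_tb} TB = {tb_to_tib(total_tb):.0f} TiB")
```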

Rack layout

T3 HW Pictures

Phase A:

Phase B:

  • a lot of crates and packing material
  • Tier-3 phase B system:
    20100514_005.jpg

2 * HP Proliant DL380 G7

1 * SGI IS5500

  • Front:
    IS5500-front.jpg
  • Back (ignore the Infiniband expansion):
    IS5500-back.jpg
  • SGI IS5500 360TB front picture + 2 HP Proliant:
    IMG_0605.JPG

11 * Intel S2600JF

  • 11 Intel S2600JF installed with mdadm raid1 (OS) and mdadm raid0 (/scratch):
    IMG_0600_2.png

10 * Supermicro 1U H8DGU-F

  • 6 SL6 UIs + 4 SL6 WNs installed with mdadm raid1+0; 32 cores, 100 GB RAM, 1700 GB /scratch:
    New10UIsSummer2014.JPG

Topic attachments

| Attachment | History | Size | Date | Who | Comment |
| 16122009529-small.jpg | r1 | 142.1 K | 2009-12-23 - 09:53 | DerekFeichtinger | phase B crates |
| 20100514_005.jpg | r1 | 736.6 K | 2010-05-14 - 13:20 | DerekFeichtinger | Tier-3 phase B |
| 22072008074.jpg | r1 | 515.6 K | 2008-08-23 - 09:55 | DerekFeichtinger | cluster photo |
| CHIPP-Meeting-20080909.odp | r1 | 965.2 K | 2008-09-01 - 16:18 | DerekFeichtinger | |
| IMG_0484.JPG | r1 | 94.5 K | 2012-01-11 - 10:23 | FabioMartinelli | 1 Hot Swap Power Supply |
| IMG_0486.JPG | r1 | 129.0 K | 2012-01-11 - 10:30 | FabioMartinelli | Back Side with 4 1Gbit/s E, 4 8Gbit/s FC, 2 10Gbit/s E |
| IMG_0487.JPG | r1 | 116.5 K | 2012-01-11 - 10:33 | FabioMartinelli | Internal |
| IMG_0488.JPG | r1 | 127.6 K | 2012-01-11 - 10:34 | FabioMartinelli | RAM Left Side |
| IMG_0489.JPG | r1 | 149.8 K | 2012-01-11 - 10:34 | FabioMartinelli | RAM Right Side |
| IMG_0490.JPG | r1 | 77.7 K | 2012-01-11 - 10:17 | FabioMartinelli | 1 Hot Swap Fan |
| IMG_0491.JPG | r1 | 93.1 K | 2012-01-11 - 10:20 | FabioMartinelli | Back Plane |
| IMG_0492.JPG | r1 | 83.6 K | 2012-01-11 - 10:20 | FabioMartinelli | Chassis Lock |
| IMG_0493.JPG | r1 | 125.3 K | 2012-01-11 - 10:41 | FabioMartinelli | Front Side Left |
| IMG_0494.JPG | r1 | 131.1 K | 2012-01-11 - 10:40 | FabioMartinelli | Front Side Right |
| IMG_0495.JPG | r1 | 86.4 K | 2012-01-11 - 10:43 | FabioMartinelli | 1 Hot Swap SAS disk Up Side |
| IMG_0496.JPG | r1 | 70.9 K | 2012-01-11 - 10:45 | FabioMartinelli | 1 Hot Swap SAS disk Back Side |
| IMG_0600_2.png | r2 r1 | 5765.2 K | 2012-05-30 - 08:04 | FabioMartinelli | 11 Intel S2600JF + mdadm raid 0/1 |
| IMG_0605.JPG | r1 | 2599.2 K | 2012-05-24 - 13:54 | FabioMartinelli | SGI IS5500 360TB front picture + 2 HP Proliant |
| IS5500-back.jpg | r1 | 2531.9 K | 2012-02-14 - 09:32 | FabioMartinelli | SGI IS5500 360TB back picture |
| IS5500-front.jpg | r1 | 2955.6 K | 2012-02-14 - 09:32 | FabioMartinelli | SGI IS5500 360TB front picture |
| New10UIsSummer2014.JPG | r1 | 1325.9 K | 2014-07-27 - 08:27 | FabioMartinelli | 6 UIs + 4 WNs 32cores 100GB RAM 1700 GB scratch |
| T3-Racklayout.png | r3 r2 r1 | 37.1 K | 2010-07-09 - 12:10 | DerekFeichtinger | Tier3 Rack layout |
| photo_1.JPG | r1 | 2690.8 K | 2015-02-23 - 14:28 | FabioMartinelli | 6 UIs + 4 WNs 32cores 100GB RAM 1700 GB scratch |
Topic revision: r50 - 2018-08-21 - NinaLoktionova