
Hardware Card for Storage Elements (Thors and Thumpers)

The RAID setup on the Thors and Thumpers

RAID configuration

The RAID arrays on the Storage Elements are set up as follows.

First, the Thumpers:

yum install -y xfsprogs

#########################################
# create disk layout for thumpers

sfdisk --force /dev/sda<< EOF
# partition table of /dev/sda
unit: sectors

/dev/sda1 : start=       63, size=976768002, Id=83
/dev/sda2 : start=        0, size=        0, Id= 0
/dev/sda3 : start=        0, size=        0, Id= 0
/dev/sda4 : start=        0, size=        0, Id= 0
EOF

for i in {b..x};do
   mdadm --zero-superblock /dev/sd$i
   sfdisk -d /dev/sda| sfdisk  --force /dev/sd$i
done

   sfdisk -d /dev/sda| sfdisk  --force /dev/sdz
   sfdisk -d /dev/sda| sfdisk  --force /dev/sdaa
   sfdisk -d /dev/sda| sfdisk  --force /dev/sdab

for i in {d..v};do
   mdadm --zero-superblock /dev/sda$i
   sfdisk -d /dev/sda| sfdisk  --force /dev/sda$i
done

echo y | mdadm --create --verbose /dev/md2 --level=6 --raid-devices=11 /dev/sda1 /dev/sdb1 /dev/sdi1 /dev/sdj1 /dev/sdq1 /dev/sdr1 /dev/sdz1 /dev/sdag1 /dev/sdah1 /dev/sdao1 /dev/sdap1
echo y | mdadm --create --verbose /dev/md3 --level=6 --raid-devices=11 /dev/sdc1 /dev/sdd1 /dev/sdk1 /dev/sdl1 /dev/sds1 /dev/sdt1 /dev/sdaa1 /dev/sdab1 /dev/sdai1 /dev/sdaj1 /dev/sdaq1
echo y | mdadm --create --verbose /dev/md4 --level=6 --raid-devices=11 /dev/sde1 /dev/sdm1 /dev/sdn1 /dev/sdu1 /dev/sdv1 /dev/sdad1 /dev/sdak1 /dev/sdal1 /dev/sdar1 /dev/sdas1 /dev/sdat1
echo y | mdadm --create --verbose /dev/md5 --level=6 --raid-devices=11 /dev/sdf1 /dev/sdg1 /dev/sdh1 /dev/sdo1 /dev/sdp1 /dev/sdw1 /dev/sdx1 /dev/sdae1 /dev/sdaf1 /dev/sdam1 /dev/sdau1

sleep 120

mdadm /dev/md4 -a /dev/sdan1
mdadm /dev/md5 -a /dev/sdav1
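
Each mdadm --create line above lists its member partitions by hand, so a miscounted or repeated device is easy to miss. The following is a minimal sketch of a pre-flight check, not part of the original scripts; the helper name check_members is hypothetical.

```shell
#!/bin/bash
# check_members EXPECTED_COUNT DEVICE...
# Hypothetical helper: succeeds only if exactly EXPECTED_COUNT devices
# are given and no device appears more than once.
check_members() {
    local expected=$1
    shift
    local dupes
    if [ "$#" -ne "$expected" ]; then
        echo "expected $expected devices, got $#" >&2
        return 1
    fi
    # uniq -d prints only the entries that occur more than once
    dupes=$(printf '%s\n' "$@" | sort | uniq -d)
    if [ -n "$dupes" ]; then
        echo "duplicate devices: $dupes" >&2
        return 1
    fi
    return 0
}
```

It could be chained in front of a create command with the same count and device list, for example: check_members 11 /dev/sda1 /dev/sdb1 (and the other nine members) && mdadm --create with the same arguments as above.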

And for the Thors:

sfdisk --force /dev/sda<< EOF
# partition table of /dev/sda
unit: sectors

/dev/sda1 : start=       63, size=1953520002, Id=fd
/dev/sda2 : start=        0, size=        0, Id= 0
/dev/sda3 : start=        0, size=        0, Id= 0
/dev/sda4 : start=        0, size=        0, Id= 0
EOF

for i in {b..z};do
   mdadm --zero-superblock /dev/sd$i
   sfdisk -d /dev/sda| sfdisk  --force /dev/sd$i
done

for i in {a..v};do
   mdadm --zero-superblock /dev/sda$i
   sfdisk -d /dev/sda| sfdisk  --force /dev/sda$i
done

mdadm --create --verbose /dev/md0 --level=6 --raid-devices=9 /dev/sda1 /dev/sdb1 /dev/sdi1 /dev/sdj1 /dev/sdr1 /dev/sdq1  /dev/sdz1  /dev/sdah1 /dev/sdap1
mdadm --create --verbose /dev/md1 --level=6 --raid-devices=9 /dev/sdc1 /dev/sdk1 /dev/sds1 /dev/sdaa1 /dev/sdab1 /dev/sdaj1 /dev/sdai1 /dev/sdar1 /dev/sdaq1
mdadm --create --verbose /dev/md2 --level=6 --raid-devices=9 /dev/sde1 /dev/sdd1 /dev/sdm1 /dev/sdl1 /dev/sdu1 /dev/sdt1 /dev/sdac1 /dev/sdak1 /dev/sdas1
mdadm --create --verbose /dev/md3 --level=6 --raid-devices=9 /dev/sdf1 /dev/sdn1 /dev/sdv1 /dev/sdae1 /dev/sdad1 /dev/sdam1 /dev/sdal1 /dev/sdau1 /dev/sdat1
mdadm --create --verbose /dev/md4 --level=6 --raid-devices=9 /dev/sdh1 /dev/sdg1 /dev/sdo1 /dev/sdp1 /dev/sdx1 /dev/sdw1 /dev/sdaf1 /dev/sdan1 /dev/sdav1

sleep 120

mdadm /dev/md0 -a /dev/sdy1
mdadm /dev/md2 -a /dev/sdag1
mdadm /dev/md4 -a /dev/sdao1
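
After the spares have been added with mdadm -a, it can be useful to confirm that they show up as spares. /proc/mdstat marks spare members with a trailing (S). The sketch below lists them per array; the function name list_spares is hypothetical, and it takes the file to read as an optional argument so that it can be tried on a saved copy of /proc/mdstat.

```shell
#!/bin/bash
# list_spares [MDSTAT_FILE]
# Hypothetical helper: print "<array> <device>" for every member that
# /proc/mdstat marks as a spare, e.g. "sdao1[9](S)".
list_spares() {
    local mdstat=${1:-/proc/mdstat}
    awk '/^md[0-9]+ :/ {
        for (f = 1; f <= NF; f++) {
            if ($f ~ /\(S\)$/) {
                dev = $f
                # strip the "[N](S)" suffix, leaving the device name
                sub(/\[[0-9]+\]\(S\)$/, "", dev)
                print $1, dev
            }
        }
    }' "$mdstat"
}
```

Running list_spares with no argument reads the live /proc/mdstat.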

Once these commands have been run on a Thor or a Thumper, run the following to create /etc/mdadm.conf:

# Refuse to run while a RAID recovery is in progress
if grep -q recovery /proc/mdstat; then
        echo "[INFO] There is a RAID being recovered right now, it's not safe to run $0. Run it afterwards. Quitting"
        exit 1
fi
                                                                                                                                                                                                             
echo "DEVICE" /dev/disk/by-id/scsi*part1 > /etc/mdadm.conf
echo "MAILADDR grid@cscs.ch" >> /etc/mdadm.conf
mdadm --detail --scan --verbose >> /etc/mdadm.conf
# Put every array into the "global" spare group
sed -i -e "s/UUID/spare-group=global UUID/g" /etc/mdadm.conf
# No "hd" device in the df output (grep exits with 1) is taken to mean a Thumper
df | grep hd > /dev/null
is_thumper=$?
if [ "$is_thumper" -eq 1 ]; then
        echo "[INFO] Working on a Thumper system, not adding spare-group=global to md0 and md1 raids"
        sed -i -e "s/\(md[01].*\)spare-group=global.*\(UUID\)/\1\2/g" /etc/mdadm.conf # we remove md0 and md1 from the spare-group global
fi
                                                                                                                                                                                                             
# Replace each kernel device name (e.g. sda1) in mdadm.conf with its /dev/disk/by-id name
for i in $(mdadm --detail --scan --verbose | grep -v num-devices | awk --field-separator="=" '{print $2}' | awk --field-separator="," '{ for (f=1; f <= NF; f++) {print $f} }' | grep -v ' ' | awk --field-separator="/" '{print $3}'); do
  echo "$i"
  DISK=$(ls -l /dev/disk/by-id/ | grep "$i" | awk '{print $9}')
  echo "$DISK"
  sed -i -e "s/$i/$DISK/" /etc/mdadm.conf
done

# The loop leaves paths of the form /dev/scsi*; restore the /dev/disk/by-id/ prefix
sed -i -e "s/dev\/scsi/dev\/disk\/by-id\/scsi/g" /etc/mdadm.conf

mdmonitor reads /etc/mdadm.conf to find out where to send its reports (the MAILADDR entry, here grid@cscs.ch) and what the disk layout is.
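
The script that builds /etc/mdadm.conf is meant to leave only /dev/disk/by-id names in the file. A quick way to check that no kernel device names such as /dev/sda1 are left over is sketched below; the function name find_raw_devices is hypothetical and not part of the original script.

```shell
#!/bin/bash
# find_raw_devices [CONF_FILE]
# Hypothetical helper: print (with line numbers) every line of the config
# that still contains a kernel device name such as /dev/sda1.
# Exit status is 0 if at least one such line was found, 1 if none.
find_raw_devices() {
    local conf=${1:-/etc/mdadm.conf}
    grep -En '/dev/sd[a-z]+[0-9]*' "$conf"
}
```

For example: if find_raw_devices; then echo "[WARN] raw device names left in /etc/mdadm.conf"; fi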

Monitoring

Instructions about monitoring the hardware

Issues

Information about issues found with this hardware, and how to deal with them

Issue1

Mismatch_cnt not zero issue
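
This page does not yet describe how the issue was handled. As a starting point only, the Linux md driver exposes the counter per array in sysfs as /sys/block/mdX/md/mismatch_cnt. The sketch below reports the arrays whose counter is not zero; the function name report_mismatch is hypothetical, and the sysfs root is an optional argument so that the function can be tried against a test directory.

```shell
#!/bin/bash
# report_mismatch [SYSFS_ROOT]
# Hypothetical helper: print "<array> <count>" for every md array whose
# mismatch_cnt is not zero. SYSFS_ROOT defaults to /sys.
report_mismatch() {
    local sysroot=${1:-/sys}
    local f count array
    for f in "$sysroot"/block/md*/md/mismatch_cnt; do
        [ -r "$f" ] || continue          # no md arrays, or unreadable file
        count=$(cat "$f")
        array=${f#"$sysroot"/block/}     # strip the leading path
        array=${array%%/*}               # keep only the mdX component
        if [ "$count" -ne 0 ]; then
            echo "$array $count"
        fi
    done
}
```

The counter is updated when a check or repair pass is run on the array through the sync_action file in the same sysfs directory; whether a repair is the appropriate response here is not covered by this page.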

Issue2

HardwareCardForm
Model name

Manufacturer

Used for

Number in production

First purchase date

CPU performance in HS06

Disk performance in MB/s

Power consumption in Watts

Topic revision: r2 - 2011-04-18 - JasonTemple
 