Hardware Card for Storage Elements (Thors and Thumpers)
The RAID setup on the Thors and Thumpers.
RAID configuration
Here is how the RAIDs are set up on the Storage Elements.
First, the Thumpers:
yum install -y xfsprogs
#########################################
# create disk layout for thumpers
sfdisk --force /dev/sda << EOF
# partition table of /dev/sda
unit: sectors
/dev/sda1 : start= 63, size=976768002, Id=83
/dev/sda2 : start= 0, size= 0, Id= 0
/dev/sda3 : start= 0, size= 0, Id= 0
/dev/sda4 : start= 0, size= 0, Id= 0
EOF
for i in {b..x}; do
    mdadm --zero-superblock /dev/sd$i
    sfdisk -d /dev/sda | sfdisk --force /dev/sd$i
done
sfdisk -d /dev/sda | sfdisk --force /dev/sdz
sfdisk -d /dev/sda | sfdisk --force /dev/sdaa
sfdisk -d /dev/sda | sfdisk --force /dev/sdab
for i in {d..v}; do
    mdadm --zero-superblock /dev/sda$i
    sfdisk -d /dev/sda | sfdisk --force /dev/sda$i
done
echo y | mdadm --create --verbose /dev/md2 --level=6 --raid-devices=11 /dev/sda1 /dev/sdb1 /dev/sdi1 /dev/sdj1 /dev/sdq1 /dev/sdr1 /dev/sdz1 /dev/sdag1 /dev/sdah1 /dev/sdao1 /dev/sdap1
echo y | mdadm --create --verbose /dev/md3 --level=6 --raid-devices=11 /dev/sdc1 /dev/sdd1 /dev/sdk1 /dev/sdl1 /dev/sds1 /dev/sdt1 /dev/sdaa1 /dev/sdab1 /dev/sdai1 /dev/sdaj1 /dev/sdaq1
echo y | mdadm --create --verbose /dev/md4 --level=6 --raid-devices=11 /dev/sde1 /dev/sdm1 /dev/sdn1 /dev/sdu1 /dev/sdv1 /dev/sdad1 /dev/sdak1 /dev/sdal1 /dev/sdar1 /dev/sdas1 /dev/sdat1
echo y | mdadm --create --verbose /dev/md5 --level=6 --raid-devices=11 /dev/sdf1 /dev/sdg1 /dev/sdh1 /dev/sdo1 /dev/sdp1 /dev/sdw1 /dev/sdx1 /dev/sdae1 /dev/sdaf1 /dev/sdam1 /dev/sdau1
sleep 120   # give the new arrays time to settle before adding the spares
mdadm /dev/md4 -a /dev/sdan1
mdadm /dev/md5 -a /dev/sdav1
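Once the spares have been added it is worth confirming that none of the arrays came up degraded. A minimal sketch: `degraded_arrays` is our helper name, not part of mdadm, and it reads mdstat-formatted text from a file argument (defaulting to /proc/mdstat) so it can be exercised without touching a live system.

```shell
#!/bin/sh
# degraded_arrays [FILE]: print the names of md arrays whose member
# map contains an '_' (a missing or rebuilding device). A healthy
# 11-disk RAID-6 shows [UUUUUUUUUUU]; FILE defaults to /proc/mdstat.
degraded_arrays() {
    grep -B1 '\[U*_' "${1:-/proc/mdstat}" | awk '/^md/ {print $1}'
}
```

An empty output means all arrays have their full complement of members.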
And for the Thors:
sfdisk --force /dev/sda << EOF
# partition table of /dev/sda
unit: sectors
/dev/sda1 : start= 63, size=1953520002, Id=fd
/dev/sda2 : start= 0, size= 0, Id= 0
/dev/sda3 : start= 0, size= 0, Id= 0
/dev/sda4 : start= 0, size= 0, Id= 0
EOF
for i in {b..z}; do
    mdadm --zero-superblock /dev/sd$i
    sfdisk -d /dev/sda | sfdisk --force /dev/sd$i
done
for i in {a..v}; do
    mdadm --zero-superblock /dev/sda$i
    sfdisk -d /dev/sda | sfdisk --force /dev/sda$i
done
mdadm --create --verbose /dev/md0 --level=6 --raid-devices=9 /dev/sda1 /dev/sdb1 /dev/sdi1 /dev/sdj1 /dev/sdr1 /dev/sdq1 /dev/sdz1 /dev/sdah1 /dev/sdap1
mdadm --create --verbose /dev/md1 --level=6 --raid-devices=9 /dev/sdc1 /dev/sdk1 /dev/sds1 /dev/sdaa1 /dev/sdab1 /dev/sdaj1 /dev/sdai1 /dev/sdar1 /dev/sdaq1
mdadm --create --verbose /dev/md2 --level=6 --raid-devices=9 /dev/sde1 /dev/sdd1 /dev/sdm1 /dev/sdl1 /dev/sdu1 /dev/sdt1 /dev/sdac1 /dev/sdak1 /dev/sdas1
mdadm --create --verbose /dev/md3 --level=6 --raid-devices=9 /dev/sdf1 /dev/sdn1 /dev/sdv1 /dev/sdae1 /dev/sdad1 /dev/sdam1 /dev/sdal1 /dev/sdau1 /dev/sdat1
mdadm --create --verbose /dev/md4 --level=6 --raid-devices=9 /dev/sdh1 /dev/sdg1 /dev/sdo1 /dev/sdp1 /dev/sdx1 /dev/sdw1 /dev/sdaf1 /dev/sdan1 /dev/sdav1
sleep 120   # give the new arrays time to settle before adding the spares
mdadm /dev/md0 -a /dev/sdy1
mdadm /dev/md2 -a /dev/sdag1
mdadm /dev/md4 -a /dev/sdao1
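xfsprogs is installed at the top of this card, but the formatting step itself is never shown. A hedged sketch of what it could look like: `fs_commands` and the `/data/<md>` mount points are our assumptions, not taken from this card. The helper only prints the commands so they can be reviewed before piping them to `sh`.

```shell
#!/bin/sh
# fs_commands MD...: emit mkfs.xfs and mount commands for the given
# arrays. The /data/<md> mount points are hypothetical; adjust them
# to the SE's real layout, then pipe the output to 'sh' to execute.
fs_commands() {
    for md in "$@"; do
        echo "mkfs.xfs -f /dev/$md"
        echo "mkdir -p /data/$md && mount /dev/$md /data/$md"
    done
}
# On a Thor:    fs_commands md0 md1 md2 md3 md4
# On a Thumper: fs_commands md2 md3 md4 md5
```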
Then, when these commands have been run on either a Thor or a Thumper, run the following to create /etc/mdadm.conf:
if grep -q recovery /proc/mdstat; then
    echo "[INFO] A RAID is being recovered right now; it is not safe to run $0. Run it afterwards. Quitting."
    exit 1
fi
echo "DEVICE" /dev/disk/by-id/scsi*part1 > /etc/mdadm.conf
echo "MAILADDR grid@cscs.ch" >> /etc/mdadm.conf
mdadm --detail --scan --verbose >> /etc/mdadm.conf
sed -i -e "s/UUID/spare-group=global UUID/g" /etc/mdadm.conf
df | grep -q hd
is_thumper=$?   # no hd* devices in the df output means this is a Thumper
if [ "$is_thumper" -eq 1 ]; then
    echo "[INFO] Working on a Thumper system, removing spare-group=global from the md0 and md1 raids"
    sed -i -e "s/\(md[01].*\)spare-group=global.*\(UUID\)/\1\2/g" /etc/mdadm.conf # md0 and md1 hold the OS and must stay out of the global spare group
fi
for i in $(mdadm --detail --scan --verbose | grep -v num-devices \
    | awk -F= '{print $2}' | awk -F, '{for (f = 1; f <= NF; f++) print $f}' \
    | grep -v ' ' | awk -F/ '{print $3}'); do
    echo $i
    DISK=$(ls -l /dev/disk/by-id/ | grep $i | awk '{print $9}')
    echo $DISK
    sed -i -e "s/$i/$DISK/" /etc/mdadm.conf
done
sed -i -e "s/dev\/scsi/dev\/disk\/by-id\/scsi/g" /etc/mdadm.conf
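The end result is an /etc/mdadm.conf along these lines (the UUID, the disk id, and the trailing ellipses below are invented for illustration; on a Thumper the md0 and md1 lines would carry no spare-group):

```
DEVICE /dev/disk/by-id/scsi-SATA_HITACHI_HDS7250S_KRVN1234-part1 ...
MAILADDR grid@cscs.ch
ARRAY /dev/md2 level=raid6 num-devices=11 spare-group=global UUID=4dd510fc:93b77a31:7a06a1c5:a74627a1
   devices=/dev/disk/by-id/scsi-SATA_HITACHI_HDS7250S_KRVN1234-part1,...
```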
The mdmonitor service reads /etc/mdadm.conf to learn the disk layout and where to send its reports (grid@cscs.ch).
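To verify the chain end to end, mdadm's monitor mode can send a one-off test alert for every array it finds (`mdadm --monitor --scan --oneshot --test`). A small sketch to double-check which address will receive it; `mail_target` is our helper name, not part of mdadm, and the CONF argument exists mainly so it can be tried against a copy of the file.

```shell
#!/bin/sh
# mail_target [CONF]: print the address mdmonitor will notify, taken
# from the MAILADDR line of CONF (default /etc/mdadm.conf).
mail_target() {
    awk '/^MAILADDR/ {print $2}' "${1:-/etc/mdadm.conf}"
}
# A one-off test alert (one TestMessage mail per array, needs root):
#   mdadm --monitor --scan --oneshot --test
```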
Monitoring
Instructions about monitoring the hardware
Issues
Information about issues found with this hardware, and how to deal with them
Issue1
Mismatch_cnt not zero issue
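Until this section is fleshed out: the counter itself lives in sysfs. A hedged sketch for reading it; `mismatch_of` is our helper name, and the SYSFS argument exists only so the function can be exercised against a fake tree.

```shell
#!/bin/sh
# mismatch_of MD [SYSFS]: print the mismatch_cnt of array MD.
# The counter is only refreshed by a check pass:
#   echo check  > /sys/block/MD/md/sync_action
# and a persistent non-zero value can be corrected with:
#   echo repair > /sys/block/MD/md/sync_action
mismatch_of() {
    cat "${2:-/sys}/block/$1/md/mismatch_cnt"
}
```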
Issue2