Service Card for GPFS
Disk failure/replacement
If a disk fails, a cron script (run every five minutes) should detect it and remove it with
mmdeldisk
automatically. When a disk is removed this way, the filesystem re-replicates the affected data (so that there are again two copies of every file), so a second disk failure poses no risk.
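The check itself can be sketched as follows. This is a hypothetical illustration of what such a cron script presumably does; the helper name find_down_disks and the dry-run echo are mine, not the production script:

```shell
# Filter `mmlsdisk <fs>` output for disks whose availability column
# (field 8) reads "down", printing their names.
find_down_disks() {
  awk '$8 == "down" { print $1 }'
}

# The cron job would then do something equivalent to (dry run shown):
#   mmlsdisk gpfs | find_down_disks | while read d; do echo mmdeldisk gpfs "$d"; done
```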
Recovering a deleted disk
To bring the disk back, you need to:
- Delete the disk (done automatically by the cron script), and then delete its NSD from GPFS
- Create a new NSD (with the same name), and then add the disk back to GPFS
Gathering information
- Find the failure group of the disk by comparing it, with
mmlsdisk
, to its neighbours (disks from the same enclosure share the same failure group)
- Determine which server is the primary and which the secondary; the NSD name tells you (in the example below, oss11j1r6c5 has oss11 as its primary)
- Make sure the host keys of all GPFS nodes, including the worker nodes, are accepted
After you have gathered all the necessary information, do:
# Check the free (deleted) disks:
mmlsnsd -F
# Delete the NSD
mmdelnsd oss11j1r6c5
# If the disk is not new, then also do (from the corresponding OSS):
dd if=/dev/zero of=/dev/j1r6c5 bs=1024k count=10k
# And then create the NSD and the disk (format= device:primaryserver,secondaryserver::dataOnly:failuregroup:nsdname:)
echo "j1r6c5:oss11(pri).ib.lcg.cscs.ch,oss12(sec).ib.lcg.cscs.ch::dataOnly:FAILURE_GROUP:oss11j1r6c5:" > /tmp/new_nsd
mmcrnsd -F /tmp/new_nsd
mmadddisk gpfs -F /tmp/new_nsd # may take some time
Rebalancing the filesystem
After adding a new disk, you may want to rebalance the filesystem so that all disks end up with the same amount of free space. This is an expensive operation (an hour or two) and is not really needed in production, since every newly written file is already placed correctly across the disks, and our scratch has a very fast file turnover. In any case, this is how you do it:
mmrestripefs tmpgpfs -b
Metadata server reboot
If, for any reason, a Virident card is taken out of the filesystem (each card is only visible from one of the machines), you need to re-enable it. This may also happen when an MDS is rebooted:
mmchdisk tmpgpfs start -d virident3
Starting a down disk
If, for any reason, a Virident card is taken offline (marked down), here is how to bring it back:
[root@mds1:gen]# mmstartup
Fri Nov 11 10:44:18 CET 2011: mmstartup: Starting GPFS ...
[root@mds1:gen]# mmlsdisk gpfs -d "virident1"
disk driver sector failure holds holds storage
name type size group metadata data status availability pool
------------ -------- ------ ------- -------- ----- ------------- ------------ ------------
virident1 nsd 512 1009 yes no ready down system
[root@mds1:gen]# mmchdisk gpfs start -d "virident1"
Scanning file system metadata, phase 1 ...
Scan completed successfully.
Scanning file system metadata, phase 2 ...
Scan completed successfully.
Scanning file system metadata, phase 3 ...
Scan completed successfully.
Scanning file system metadata, phase 4 ...
Scan completed successfully.
Scanning user file metadata ...
100.00 % complete on Fri Nov 11 10:45:06 2011
Scan completed successfully.
[root@mds1:gen]# mmlsdisk gpfs -d "virident1"
disk driver sector failure holds holds storage
name type size group metadata data status availability pool
------------ -------- ------ ------- -------- ----- ------------- ------------ ------------
virident1 nsd 512 1009 yes no ready up system
Procedure to follow when GPFS is blocked
In order to collect all the information available from GPFS, we need to create a dump for IBM:
# mmfsadm dump all
# mmfsadm dump waiters
# mmfsadm dump kthreads
Clean all CREAM stalled files on tmpdir_slurm
Running the following on a WN will manually delete all directories on
/gpfs/tmpdir_slurm/CREAM_FQDN
that belong to jobs not currently running on the system for that CREAM CE:
# CREAM_TAG=cre02
# CREAM=cream02.lcg.cscs.ch
# squeue -t R --noheader |grep ${CREAM_TAG} | awk '{print $1}' |sort > /tmp/running_jobs.${CREAM_TAG}.txt
# cd /gpfs/tmpdir_slurm/${CREAM}/
# ls | egrep -vf /tmp/running_jobs.${CREAM_TAG}.txt | xargs -n 1 echo rm -rf > /tmp/delete_stalled_${CREAM_TAG}.txt
# bash -x /tmp/delete_stalled_${CREAM_TAG}.txt
What to do when receiving memory errors when doing ls
If you see errors like 'Insufficient memory'
when running ls, you have most likely run out of metadata space, even if
df -i
shows that there is space available on GPFS. Look at this example:
# mmdf gpfs
[...]
virident1 293937152 1005 yes no 0 ( 0%) 21504 ( 0%)
virident2 293937152 1006 yes no 0 ( 0%) 20800 ( 0%)
ssd1 390711360 1007 yes no 1024 ( 0%) 44256 ( 0%)
ssd2 390711360 1008 yes no 0 ( 0%) 41568 ( 0%)
------------- -------------------- -------------------
(pool total) 187932952384 168898436096 ( 90%) 4428987840 ( 2%)
============= ==================== ===================
(data) 186563655360 168898435072 ( 91%) 4428859712 ( 2%)
(metadata) 1369297024 1024 ( 0%) 128128 ( 0%)
============= ==================== ===================
(total) 187932952384 168898436096 ( 90%) 4428987840 ( 2%)
Inode Information
-----------------
Number of used inodes: 94736998
Number of free inodes: 55889306
Number of allocated inodes: 150626304
Maximum number of inodes: 150626304
As you can see, there is no metadata space left although the number of used inodes has not reached the limit. This happens because the space calculation was not done correctly, and more inodes were assigned to the filesystem than the metadata disks can actually hold.
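A quick way to spot this condition is to watch the free percentage on the "(metadata)" summary line of mmdf. The helper below is a sketch, not an existing tool; the field positions are taken from the example output above:

```shell
# Read `mmdf <fs>` output on stdin and print the free percentage of the
# metadata pool, computed from the "(metadata)" summary line
# (field 2 = total KB, field 3 = free KB).
meta_free_pct() {
  awk '$1 == "(metadata)" { printf "%d\n", 100 * $3 / $2 }'
}

# mmdf gpfs | meta_free_pct
```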
There are only two ways of solving this:
- Clean the filesystem via massive rm or GPFS policies
- Physically add more metadata disks to the system, and then add the proper NSDs and disks to GPFS:
# cat nsdssd3
/dev/sdc:mds2.lcg.cscs.ch::metadataOnly:1008:ssd3:
/dev/sdc:mds1.lcg.cscs.ch::metadataOnly:1007:ssd4:
# mmcrnsd -F nsdssd3 -v no
# vim /var/mmfs/etc/nsddevices # and make sure /dev/sdc (in this case) is allowed
# mmlsnsd -M
# mmadddisk gpfs -F ./nsdssd3 # Running mmcrnsd will modify this file and make it ready to add the disks just by using the file.
# mmlsdisk
# mmdf gpfs
Filesystem Creation
To create a new GPFS filesystem, do the following:
First, make sure that you can ssh in both directions between all servers and clients; mainly this involves typing "yes" everywhere to accept the host keys. We are doing everything through the IB network, so make sure that the hosts file has the name.ib addresses resolvable.
mmcrcluster -N gpfs01.ib.lcg.cscs.ch:manager-quorum,gpfs02.ib.lcg.cscs.ch:manager-quorum,gpfs03.ib.lcg.cscs.ch:quorum \
-p gpfs01.ib.lcg.cscs.ch -s gpfs02.ib.lcg.cscs.ch -r /usr/bin/ssh -R /usr/bin/scp -C scratch
mmchlicense server --accept -N gpfs01.ib.lcg.cscs.ch,gpfs02.ib.lcg.cscs.ch,gpfs03.ib.lcg.cscs.ch
cp -f orig/sdk.dsc.all sdk.dsc.all
mmcrnsd -F sdk.dsc.all -v no
mmstartup -a
#edit this to make a different filesystem
#A - automount on startup
#B - blocksize
#E - exact mtime reporting
#j - block allocation map type, just keep "cluster"
#k - supported authorization types, just keep "all"
#K - strict replication enforcement
#m - default number of metadata replicas
#M - maximum number of metadata replicas
#n - estimated number of nodes in the cluster, keep 25
#T - where to mount it, leave as /gpfs
#mmcrfs scratch -F sdk.dsc.all -A yes -B 1M -D posix -E no -j cluster -k all -K whenpossible -m 1 -M 2 -n 25 -T /gpfs
mmcrfs scratch -F sdk.dsc.all -A yes -B 1M -D posix -E no -j cluster -k all -K no -m 1 -M 1 -n 25 -T /gpfs
mmaddnode -N nodefile
mmchlicense client --accept -N nodefile
The file sdk.dsc.all has the following contents:
gpfsdata00:gpfs01.ib.lcg.cscs.ch:gpfs02.ib.lcg.cscs.ch:dataAndMetadata:-1:storage3lun0:
gpfsdata01:gpfs01.ib.lcg.cscs.ch:gpfs03.ib.lcg.cscs.ch:dataAndMetadata:-1:storage3lun1:
gpfsdata02:gpfs01.ib.lcg.cscs.ch:gpfs02.ib.lcg.cscs.ch:dataAndMetadata:-1:storage3lun2:
gpfsdata03:gpfs01.ib.lcg.cscs.ch:gpfs03.ib.lcg.cscs.ch:dataAndMetadata:-1:storage3lun3:
gpfsdata04:gpfs02.ib.lcg.cscs.ch:gpfs01.ib.lcg.cscs.ch:dataAndMetadata:-1:storage3lun4:
gpfsdata05:gpfs02.ib.lcg.cscs.ch:gpfs03.ib.lcg.cscs.ch:dataAndMetadata:-1:storage3lun5:
gpfsdata06:gpfs02.ib.lcg.cscs.ch:gpfs01.ib.lcg.cscs.ch:dataAndMetadata:-1:storage3lun6:
gpfsdata07:gpfs02.ib.lcg.cscs.ch:gpfs03.ib.lcg.cscs.ch:dataAndMetadata:-1:storage3lun7:
gpfsdata08:gpfs03.ib.lcg.cscs.ch:gpfs01.ib.lcg.cscs.ch:dataAndMetadata:-1:storage3lun8:
gpfsdata09:gpfs03.ib.lcg.cscs.ch:gpfs02.ib.lcg.cscs.ch:dataAndMetadata:-1:storage3lun9:
gpfsdata10:gpfs03.ib.lcg.cscs.ch:gpfs01.ib.lcg.cscs.ch:dataAndMetadata:-1:storage3lun10:
gpfsdata11:gpfs03.ib.lcg.cscs.ch:gpfs02.ib.lcg.cscs.ch:dataAndMetadata:-1:storage3lun11:
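The alternating primary/secondary assignments above follow a regular pattern (the primary rotates every four LUNs, the secondary alternates between the two remaining servers), so the file can be regenerated with a small script. This is a sketch inferred from the file contents, not how it was originally produced:

```shell
# Emit the twelve NSD descriptor lines for sdk.dsc.all.
gen_nsd_stanzas() {
  local servers=(gpfs01 gpfs02 gpfs03)
  local i s p sec
  for i in $(seq 0 11); do
    p=${servers[$((i / 4))]}           # primary: rotates every 4 LUNs
    local others=()
    for s in "${servers[@]}"; do
      [ "$s" != "$p" ] && others+=("$s")
    done
    sec=${others[$((i % 2))]}          # secondary: alternates between the rest
    printf 'gpfsdata%02d:%s.ib.lcg.cscs.ch:%s.ib.lcg.cscs.ch:dataAndMetadata:-1:storage3lun%d:\n' \
      "$i" "$p" "$sec" "$i"
  done
}

# gen_nsd_stanzas > sdk.dsc.all
```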
I named the individual disks using udev rules:
Aug 17 09:59 [root@gpfs01:gpfs]# cat /etc/udev/rules.d/10-local.rules
ACTION=="add", SUBSYSTEM=="block", IMPORT{program}="/lib/udev/rename_device", ID=="0:2:1:0", NAME="gpfsmeta"
ACTION=="add", SUBSYSTEM=="block", IMPORT{program}="/lib/udev/rename_device", ID=="7:0:0:0", NAME="gpfsdata00"
ACTION=="add", SUBSYSTEM=="block", IMPORT{program}="/lib/udev/rename_device", ID=="7:0:0:1", NAME="gpfsdata01"
ACTION=="add", SUBSYSTEM=="block", IMPORT{program}="/lib/udev/rename_device", ID=="7:0:0:2", NAME="gpfsdata02"
ACTION=="add", SUBSYSTEM=="block", IMPORT{program}="/lib/udev/rename_device", ID=="7:0:0:3", NAME="gpfsdata03"
ACTION=="add", SUBSYSTEM=="block", IMPORT{program}="/lib/udev/rename_device", ID=="7:0:0:4", NAME="gpfsdata04"
ACTION=="add", SUBSYSTEM=="block", IMPORT{program}="/lib/udev/rename_device", ID=="7:0:0:5", NAME="gpfsdata05"
ACTION=="add", SUBSYSTEM=="block", IMPORT{program}="/lib/udev/rename_device", ID=="7:0:0:6", NAME="gpfsdata06"
ACTION=="add", SUBSYSTEM=="block", IMPORT{program}="/lib/udev/rename_device", ID=="7:0:0:7", NAME="gpfsdata07"
ACTION=="add", SUBSYSTEM=="block", IMPORT{program}="/lib/udev/rename_device", ID=="7:0:0:8", NAME="gpfsdata08"
ACTION=="add", SUBSYSTEM=="block", IMPORT{program}="/lib/udev/rename_device", ID=="7:0:0:9", NAME="gpfsdata09"
ACTION=="add", SUBSYSTEM=="block", IMPORT{program}="/lib/udev/rename_device", ID=="7:0:0:10", NAME="gpfsdata10"
ACTION=="add", SUBSYSTEM=="block", IMPORT{program}="/lib/udev/rename_device", ID=="7:0:0:11", NAME="gpfsdata11"
The following settings were applied to tune the disk access parameters (for each data device):
for dev in gpfsdata{00..11}; do
    echo "Updating params for $dev ..."
    echo 4 > /sys/block/${dev}/queue/nr_requests
    echo noop > /sys/block/${dev}/queue/scheduler
    echo 1024 > /sys/block/${dev}/queue/max_sectors_kb
    echo 64 > /sys/block/${dev}/device/queue_depth
    echo 512 > /sys/block/${dev}/queue/read_ahead_kb
done
These were added to the /etc/sysctl.conf file to optimize memory usage
kernel.shmall = 4294967296
vm.mmap_min_addr=65536
vm.min_free_kbytes=16901008
This is the current working configuration that I did to the filesystem:
Aug 17 10:03 [root@gpfs01:gpfs]# mmlsconfig
Configuration data for cluster scratch.ib.lcg.cscs.ch:
------------------------------------------------------
clusterName scratch.ib.lcg.cscs.ch
clusterId 10717238835674925567
autoload yes
minReleaseLevel 3.3.0.2
dmapiFileHandleSize 32
pagepool 2048M
nsdbufspace 30
nsdMaxWorkerThreads 36
maxMBpS 1600
maxFilesToCache 10000
worker1Threads 48
subnets 148.187.70.0 148.187.71.0
prefetchThreads 72
verbsRdma enable
verbsPorts mlx4_0
nsdThreadsPerDisk 3
minMissedPingTimeout 240
adminMode central
File systems in cluster scratch.ib.lcg.cscs.ch:
-----------------------------------------------
/dev/scratch
You will also want to add this to the IB configuration:
RENICE_IB_MAD=yes
This prevents GPFS from overwhelming the IB communication: if the kernel is too busy to respond to IB pings, the GPFS server will assume that the node is dead and expel it from the filesystem, even when it isn't.
NOTES
dsh -w wn[120-136] -w wn[139-142] mmstartup
dsh -w wn[120-136] -w wn[139-142] mmmount tmpgpfs
dsh -w wn[120-136] -w wn[139-142] mmumount tmpgpfs
dsh -w wn[120-136] -w wn[139-142] mmshutdown
on oss11: mmchconfig minMissedPingTimeout=60
/var/adm/ras/mmfs.log.latest
mmchdisk virident1 start -N oss12
mmlsdisk tmpgpfs
mmlsnsd
mmlsnsd -M
http://141.85.107.254/Documentatie/Sisteme_Paralele_si_Distribuite/IBM_HPC/GPFS/a7604134.pdf
GPFS Policies
The cleanup policy is located in
/opt/cscs/libexec/gpfs-policies/empty_tmpdir_slurm_usertmp_home.policy
and deletes all files older than 6 days on tmpdir_slurm, gridhome and usertemp:
RULE 'gpfswipe' DELETE FROM POOL 'system' WHERE (PATH_NAME like '/gpfs/tmpdir_slurm/%' OR PATH_NAME like '/gpfs/home/%' OR PATH_NAME like '/gpfs/usertmp/%') AND (CURRENT_TIMESTAMP - MODIFICATION_TIME > INTERVAL '6' DAYS)
The policy runs each time on a single node and is scheduled as follows:
- wn65 runs it every Sunday at 12:20
- wn23 runs it every Tuesday at 00:20
- wn18 runs it every Thursday at 12:20
NOTE: Please be aware that this policy does not delete directories; these need to be removed by hand!
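The hand cleanup can be scripted. The helper below is a hypothetical sketch (not the production tooling) that removes empty directories older than 6 days under one of the cleaned paths:

```shell
# Remove directories that are empty (i.e. the policy already deleted
# their files) and older than 6 days, deepest first.
prune_empty_dirs() {
  find "$1" -mindepth 1 -depth -type d -empty -mtime +6 -exec rmdir {} \;
}

# prune_empty_dirs /gpfs/tmpdir_slurm
```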
How to refresh a reinstalled client to be back in the GPFS cluster
- Copy /var/mmfs/gen/mmsdrfs from other node (any) to the reinstalled system.
[root@wn11:/] scp wn45:/var/mmfs/gen/mmsdrfs /var/mmfs/gen/
- Run mmrefresh -f on the reinstalled node.
[root@wn11:gen]# mmrefresh -f
[root@wn11:gen]# mmgetstate
Node number Node name GPFS state
------------------------------------------
11 wn11 down
[root@wn11:gen]# mmstartup
CLIENT INSTALLATION
- Stop all GRID services.
grid-service stop
- Remove all the grid users from the system
/opt/cscs/sbin/clean_grid_accounts.bash
- Install the kernel 2.6.18-274.3.1.el5
yum install kernel-2.6.18-274.3.1.el5 kernel-headers-2.6.18-274.3.1.el5 --disableexcludes=main
- Install RPMS available in
xen11:/nfs/gpfs_clean
mount xen11:/nfs /media
cd /media/gpfs_clean
./gpfs_cleanslate.sh
cd -
umount /media
- Reboot the machine.
- Add to
/etc/hosts
the hostname.10 IP addresses
- From mds1
scp /var/mmfs/gen/mmsdrfs $client:/var/mmfs/gen/
mmdelnode -N $client
mmaddnode $client.lcg.cscs.ch
mmchlicense client --accept -N $client.lcg.cscs.ch
- On mds1
mmlscluster
- Startup GPFS on each client.
mmstartup
GPFS Repo
I have set up a repo on puppet to install GPFS using only yum; the base URLs are
http://148.187.64.40:81/gpfs/$releasever/base/ and
http://148.187.64.40:81/gpfs/$releasever/updates/
First you need to install the initial release of the base package.
This is because updated base packages only provide a delta and do not list the initial release package as a dependency, but instead check for it with a pre-install script in the RPM. Thanks, IBM...
yum localinstall http://phoenix1.lcg.cscs.ch:81/gpfs/el6/base/gpfs.base-3.5.0-0.x86_64.rpm
OK, let's install the other packages we need.
yum install gpfs.docs gpfs.msg gpfs.base
Now all we need is the kernel module, if it is not already available.
cd /usr/lpp/mmfs/src
make LINUX_DISTRIBUTION=REDHAT_AS_LINUX Autoconfig
make World
make rpm
rpm -ivh /root/rpmbuild/RPMS/x86_64/gpfs.gplbin-*.rpm
modprobe mmfs26
Let's make this available via yum for future installs. Note this was built against the GPFS 3.5.0-10 packages.
scp /root/rpmbuild/RPMS/x86_64/gpfs.gplbin-*.rpm phoenix1:/cm/www/html/gpfs/el6/updates/
#On Phoenix1
cd /cm/www/html/gpfs/el6/updates/
createrepo --update -p .
We can now install via yum on the client; obviously, make sure the correct kernel version is used.
yum clean all
yum install gpfs.gplbin-$(uname -r)
Other information - 2 GPFS CLUSTERS (obsolete)
We have created two clusters, one for the I/O servers and one for the clients. Here's what I did:
Servers - create the cluster the same way as last time (I just left the old one)
May 08 11:26 [root@oss11:gpfs_fs_creation]# mmlscluster
GPFS cluster information
========================
GPFS cluster name: gpfs.lcg.cscs.ch
GPFS cluster id: 10717232453390325744
GPFS UID domain: gpfs.lcg.cscs.ch
Remote shell command: /usr/bin/ssh
Remote file copy command: /usr/bin/scp
GPFS cluster configuration servers:
-----------------------------------
Primary server: mds1.lcg.cscs.ch
Secondary server: mds2.lcg.cscs.ch
Node Daemon node name IP address Admin node name Designation
-----------------------------------------------------------------------------------------------
1 mds1.lcg.cscs.ch 148.187.66.34 mds1.lcg.cscs.ch quorum-manager
2 mds2.lcg.cscs.ch 148.187.66.35 mds2.lcg.cscs.ch quorum-manager
3 oss11.lcg.cscs.ch 148.187.66.3 oss11.lcg.cscs.ch quorum
4 oss12.lcg.cscs.ch 148.187.66.4 oss12.lcg.cscs.ch quorum
5 oss21.lcg.cscs.ch 148.187.66.9 oss21.lcg.cscs.ch quorum
6 oss22.lcg.cscs.ch 148.187.66.10 oss22.lcg.cscs.ch quorum
7 oss31.lcg.cscs.ch 148.187.66.15 oss31.lcg.cscs.ch quorum
8 oss32.lcg.cscs.ch 148.187.66.16 oss32.lcg.cscs.ch
9 oss41.lcg.cscs.ch 148.187.66.21 oss41.lcg.cscs.ch
10 oss42.lcg.cscs.ch 148.187.66.22 oss42.lcg.cscs.ch
mmcrcluster -N mds1.lcg.cscs.ch:manager-quorum,mds2.lcg.cscs.ch:manager-quorum,oss11.lcg.cscs.ch:quorum,oss12.lcg.cscs.ch:quorum,oss21.lcg.cscs.ch:quorum,oss22.lcg.cscs.ch:quorum,oss31.lcg.cscs.ch:quorum,oss32.lcg.cscs.ch:quorum,oss41.lcg.cscs.ch:quorum,oss42.lcg.cscs.ch:quorum -p mds1.lcg.cscs.ch -s mds2.lcg.cscs.ch -r /usr/bin/ssh -R /usr/bin/scp -C gpfs
mmchlicense server --accept -N mds1.lcg.cscs.ch,mds2.lcg.cscs.ch,oss11.lcg.cscs.ch,oss12.lcg.cscs.ch,oss21.lcg.cscs.ch,oss22.lcg.cscs.ch,oss31.lcg.cscs.ch,oss32.lcg.cscs.ch,oss41.lcg.cscs.ch,oss42.lcg.cscs.ch
mmcrfs gpfs -F sdk.dsc.all -A yes -B 1M -D posix -E no -j scatter -k all -K always -m 2 -M 2 -r 2 -R 2 -n 200 -v no -T /gpfs
Clients - create a new cluster:
mmcrcluster -N wn46.lcg.cscs.ch:manager-quorum,wn36.lcg.cscs.ch:manager-quorum,wn26.lcg.cscs.ch:quorum,wn16.lcg.cscs.ch:quorum,wn13.lcg.cscs.ch:quorum,wn40.lcg.cscs.ch:quorum,wn28.lcg.cscs.ch:quorum -p wn46.lcg.cscs.ch -s wn36.lcg.cscs.ch -r /usr/bin/ssh -R /usr/bin/scp -C gpfsclients
mmchlicense server --accept -N wn46.lcg.cscs.ch,wn36.lcg.cscs.ch,wn26.lcg.cscs.ch,wn13.lcg.cscs.ch,wn40.lcg.cscs.ch,wn28.lcg.cscs.ch,wn16.lcg.cscs.ch
mmaddnode -N nodes
mmchlicense client --accept -N nodes
May 08 11:29 [root@wn46:~]# mmlscluster
GPFS cluster information
========================
GPFS cluster name: gpfsclients.lcg.cscs.ch
GPFS cluster id: 10717231405418994821
GPFS UID domain: gpfsclients.lcg.cscs.ch
Remote shell command: /usr/bin/ssh
Remote file copy command: /usr/bin/scp
GPFS cluster configuration servers:
-----------------------------------
Primary server: wn46.lcg.cscs.ch
Secondary server: wn36.lcg.cscs.ch
Node Daemon node name IP address Admin node name Designation
-----------------------------------------------------------------------------------------------
1 wn46.lcg.cscs.ch 148.187.65.46 wn46.lcg.cscs.ch quorum-manager
2 wn36.lcg.cscs.ch 148.187.65.36 wn36.lcg.cscs.ch quorum-manager
3 wn26.lcg.cscs.ch 148.187.65.26 wn26.lcg.cscs.ch quorum
4 wn16.lcg.cscs.ch 148.187.65.16 wn16.lcg.cscs.ch quorum
5 wn13.lcg.cscs.ch 148.187.65.13 wn13.lcg.cscs.ch quorum
6 wn40.lcg.cscs.ch 148.187.65.40 wn40.lcg.cscs.ch quorum
7 wn28.lcg.cscs.ch 148.187.65.28 wn28.lcg.cscs.ch quorum
8 wn03.lcg.cscs.ch 148.187.65.3 wn03.lcg.cscs.ch
9 wn04.lcg.cscs.ch 148.187.65.4 wn04.lcg.cscs.ch
10 wn07.lcg.cscs.ch 148.187.65.7 wn07.lcg.cscs.ch
11 wn08.lcg.cscs.ch 148.187.65.8 wn08.lcg.cscs.ch
12 wn09.lcg.cscs.ch 148.187.65.9 wn09.lcg.cscs.ch
13 wn11.lcg.cscs.ch 148.187.65.11 wn11.lcg.cscs.ch
14 wn12.lcg.cscs.ch 148.187.65.12 wn12.lcg.cscs.ch
15 wn14.lcg.cscs.ch 148.187.65.14 wn14.lcg.cscs.ch
16 wn15.lcg.cscs.ch 148.187.65.15 wn15.lcg.cscs.ch
17 wn17.lcg.cscs.ch 148.187.65.17 wn17.lcg.cscs.ch
18 wn18.lcg.cscs.ch 148.187.65.18 wn18.lcg.cscs.ch
19 wn19.lcg.cscs.ch 148.187.65.19 wn19.lcg.cscs.ch
20 wn20.lcg.cscs.ch 148.187.65.20 wn20.lcg.cscs.ch
21 wn21.lcg.cscs.ch 148.187.65.21 wn21.lcg.cscs.ch
22 wn22.lcg.cscs.ch 148.187.65.22 wn22.lcg.cscs.ch
23 wn23.lcg.cscs.ch 148.187.65.23 wn23.lcg.cscs.ch
24 wn24.lcg.cscs.ch 148.187.65.24 wn24.lcg.cscs.ch
25 wn25.lcg.cscs.ch 148.187.65.25 wn25.lcg.cscs.ch
26 wn27.lcg.cscs.ch 148.187.65.27 wn27.lcg.cscs.ch
27 wn29.lcg.cscs.ch 148.187.65.29 wn29.lcg.cscs.ch
28 wn30.lcg.cscs.ch 148.187.65.30 wn30.lcg.cscs.ch
29 wn31.lcg.cscs.ch 148.187.65.31 wn31.lcg.cscs.ch
30 wn32.lcg.cscs.ch 148.187.65.32 wn32.lcg.cscs.ch
31 wn33.lcg.cscs.ch 148.187.65.33 wn33.lcg.cscs.ch
32 wn34.lcg.cscs.ch 148.187.65.34 wn34.lcg.cscs.ch
33 wn35.lcg.cscs.ch 148.187.65.35 wn35.lcg.cscs.ch
34 wn37.lcg.cscs.ch 148.187.65.37 wn37.lcg.cscs.ch
35 wn38.lcg.cscs.ch 148.187.65.38 wn38.lcg.cscs.ch
36 wn39.lcg.cscs.ch 148.187.65.39 wn39.lcg.cscs.ch
37 wn41.lcg.cscs.ch 148.187.65.41 wn41.lcg.cscs.ch
38 wn42.lcg.cscs.ch 148.187.65.42 wn42.lcg.cscs.ch
39 wn43.lcg.cscs.ch 148.187.65.43 wn43.lcg.cscs.ch
40 wn44.lcg.cscs.ch 148.187.65.44 wn44.lcg.cscs.ch
41 wn45.lcg.cscs.ch 148.187.65.45 wn45.lcg.cscs.ch
42 cream01.lcg.cscs.ch 148.187.66.43 cream01.lcg.cscs.ch
43 cream02.lcg.cscs.ch 148.187.66.44 cream02.lcg.cscs.ch
44 arc01.lcg.cscs.ch 148.187.67.10 arc01.lcg.cscs.ch
45 arc02.lcg.cscs.ch 148.187.66.40 arc02.lcg.cscs.ch
Now, set up the connection to the remote cluster, on both sides:
Client Side:
mmauth genkey new
mmchconfig cipherList=AUTHONLY
scp /var/mmfs/ssl/id_rsa.pub oss11:/root/gpfs_fs_creation/id_rsa.gpfsclients.pub
mmremotecluster add gpfs.lcg.cscs.ch -k id_rsa.gpfs.pub -n mds1.lcg.cscs.ch,mds2.lcg.cscs.ch
mmremotefs add gpfs -f /dev/gpfs -C gpfs.lcg.cscs.ch -T /gpfs
May 08 11:30 [root@wn46:~]# mmremotecluster show all
Cluster name: gpfs.lcg.cscs.ch
Contact nodes: mds1.lcg.cscs.ch,mds2.lcg.cscs.ch
SHA digest: 6d732758ebedeedd0c73eb87cf0d00cf1df9ef0f
File systems: gpfs (gpfs)
Server Side:
mmauth genkey new
scp /var/mmfs/ssl/id_rsa.pub wn46:~/id_rsa.gpfs.pub
mmremotecluster add gpfsclients.lcg.cscs.ch -k id_rsa.gpfsclients.pub -n wn46.lcg.cscs.ch,wn36.lcg.cscs.ch
mmauth add gpfsclients.lcg.cscs.ch -k id_rsa.gpfsclients.pub
mmauth grant gpfsclients.lcg.cscs.ch -f gpfs
GPFS Metrics
We can use the mmpmon command to gather metrics about GPFS. A simple use would be as follows; note that these counters are cumulative!
We echo the request into the command because, if we don't provide an input file, we get an interactive prompt within mmpmon.
Oct 31 14:55 [root@wn01:~]# echo io_s | mmpmon -s
mmpmon node 148.187.65.1 name wn01 io_s OK
timestamp: 1383227732/587764
bytes read: 27845668669
bytes written: 19746297868
opens: 56460
closes: 55757
reads: 13513
writes: 10478999
readdir: 7276233
inode updates: 837741
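Since the io_s counters are cumulative, a throughput figure has to be derived from two samples taken a known interval apart. The helper below is a sketch (not an existing tool); the numbers in the usage line are illustrative:

```shell
# Convert two cumulative byte counters into a MB/s rate.
rate_mbs() {  # usage: rate_mbs <bytes_before> <bytes_after> <dt_seconds>
  awk -v b0="$1" -v b1="$2" -v dt="$3" \
    'BEGIN { printf "%.1f\n", (b1 - b0) / dt / (1024 * 1024) }'
}

# The counters would come from two runs of `echo io_s | mmpmon -s`
# (the "bytes read:" / "bytes written:" fields), e.g. 2 s apart:
#   rate_mbs 27845668669 27866640317 2
```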
For command line usage you can use the following to gather interactive metrics; these counters are relative to the previous sample.
Oct 31 14:54 [root@wn01:~]# gpfs_getio_s.ksh
Started: Thu Oct 31 14:54:21 CET 2013
Sample Interval: 2 Seconds
Timestamp ReadMB/s WriteMB/s F_open f_close reads writes rdir inode
1383227663 0.0 0.0 0 0 0 0 0 0
1383227665 0.0 0.0 0 0 0 0 0 0
For metrics there is a script (/opt/cscs/libexec/gmetric-scripts/gpfs/gpfs_stats.sh) that is run every minute that feeds data to ganglia.
Some notes about mmpmon
- Timestamps are in Unix epoch time
- When using an input file (-i flag), the separator for options is a newline
- You can use fs_io_s rather than io_s to gather per-filesystem metrics, useful if you mount more than one GPFS filesystem on a host
--
JasonTemple - 2011-08-17