Node Type: Mon
Firewall requirements
| local port | open to | reason |
| 8649/udp | 192.33.123.0/24 | ganglia collector |
| 8670/udp | 192.33.123.0/24 | ganglia collector |
| 8671/udp | 192.33.123.0/24 | ganglia collector |
| 80/tcp | * | ganglia web server |
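The firewall table above could be expressed as iptables rules roughly as follows. This is only a sketch: it assumes the node filters traffic with iptables on the INPUT chain, and the function name `ganglia_rules` and variable `MON_NET` are hypothetical. The script only prints the rules so they can be reviewed before anything is applied.

```shell
#!/bin/sh
# Print (do not apply) iptables rules matching the firewall table.
# Assumption: the host uses iptables and an INPUT chain; adapt as needed.

MON_NET="192.33.123.0/24"   # subnet allowed to reach the ganglia collectors

ganglia_rules() {
    # UDP ports of the ganglia collectors, restricted to MON_NET
    for port in 8649 8670 8671; do
        echo "iptables -A INPUT -p udp -s $MON_NET --dport $port -j ACCEPT"
    done
    # Ganglia web server, open to everyone
    echo "iptables -A INPUT -p tcp --dport 80 -j ACCEPT"
}

ganglia_rules
```

Once the printed rules look right, they could be applied by piping the output to `sh` as root.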
Regular Maintenance work
Emergency Measures
Installation
Fabio uses the aliases below; the Puppet recipes are in the directory that puppetdirnodes changes to:
alias kscustom57='cd /afs/psi.ch/software/linux/dist/scientific/57/custom'
alias kscustom64='cd /afs/psi.ch/software/linux/dist/scientific/64/custom'
alias ksdir='cd /afs/psi.ch/software/linux/kickstart/configs'
alias puppetdir='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/'
alias puppetdirnodes='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/manifests/'
alias puppetdirredhat='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/modules/Tier3/files/RedHat'
alias puppetdirsolaris='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/modules/Tier3/files/Solaris/5.10'
alias yumdir5='cd /afs/psi.ch/software/linux/dist/scientific/57/scripts'
alias yumdir6='cd /afs/psi.ch/software/linux/dist/scientific/6/scripts'
- SL5_mon.pp
- tier3-baseclasses.pp
Services
Ganglia
Installation details can be found here.
Starting up gmetad can sometimes fail with this log entry:
May 11 10:49:56 t3ce01 /usr/sbin/gmetad[5651]: Please make sure that /var/lib/ganglia/rrds is owned by nobody
I am not yet sure what causes /var/lib/ganglia/rrds to be owned by root after reboots.
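Until the root cause is known, a workaround along these lines could restore the ownership that the log message asks for before gmetad is started. This is a minimal sketch, not a tested procedure: the helper names `rrd_owner` and `fix_rrd_owner` are hypothetical, `stat -c` assumes GNU coreutils, and the `service gmetad start` line in the comment assumes the init script is called gmetad.

```shell
#!/bin/sh
# Workaround sketch: make sure the ganglia RRD directory is owned by the
# user named in the gmetad log message before starting gmetad.

RRD_DIR="${RRD_DIR:-/var/lib/ganglia/rrds}"
EXPECTED_OWNER="${EXPECTED_OWNER:-nobody}"

rrd_owner() {
    # Print the name of the user owning the given directory (GNU stat)
    stat -c '%U' "$1"
}

fix_rrd_owner() {
    # $1 = directory, $2 = expected owner; chown only when they differ
    if [ "$(rrd_owner "$1")" != "$2" ]; then
        chown -R "$2" "$1"
    fi
}

# Example use as root (the service name is an assumption):
#   fix_rrd_owner "$RRD_DIR" "$EXPECTED_OWNER" && service gmetad start
```

This only treats the symptom; it does not explain why the directory ends up owned by root after a reboot.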
Additional monitoring
See the AdditionalMonitoring page.
Backups
OS snapshots are taken nightly by the PSI VMware team (e.g. Peter Huesser). In addition, we have LinuxBackupsByLegato to recover a single file.