<!-- keep this as a security measure:
   * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.LCGAdminGroup
   * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.LCGAdminGroup
#uncomment this if you want the page only be viewable by the internal people
#   * Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.LCGAdminGroup
-->
---+!! Phoenix Ganglia configuration
%TOC%

Note: All configuration files, non-standard startup files and _gmetric_ scripts needed for this ganglia cluster are kept in the CSCS SVN at %SVNBASE%/monitoring/ganglia. The configuration files are deployed mainly through cfengine. Please consult the CfEngine#Ganglia page to learn how to implement changes.

---++ Ganglia main server: mon.lcg.cscs.ch
This node runs the central ganglia services. There are a number of collector processes for the principal node groups, and a daemon storing the information in round robin databases. It also runs a web server for displaying the information.

---+++ gmond
There are three =gmond= processes running as collectors. They listen to UDP transmissions from the various nodes to be monitored. Each of these nodes runs a =gmond= process configured to send UDP packets to the respective collector port. The collector =gmonds= on mon.lcg.cscs.ch use these configuration files:
   * =/etc/gmond-wn-collector.conf=
   * =/etc/gmond-service-collector.conf=
   * =/etc/gmond-fileserver-collector.conf=

The standard service startup file =/etc/init.d/gmond= is modified to start/stop all three of these services (the init file can be found at %SVNBASE%/monitoring/ganglia/ganglia-config/mon-box/init.d/gmond).

---+++ gmetad
The =gmetad= records the history of ganglia monitoring information in round robin databases (located under =/var/lib/ganglia/rrds=). Its configuration file =/etc/gmetad.conf= contains directives for polling the three =gmond= collectors. The =gmetad= web pages reside under =/var/www/html/ganglia= and also contain a small configuration file =conf.php=.
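When =gmetad= polls a collector, it connects to the TCP port on which that =gmond= offers its current state as an XML dump. The following is a minimal sketch of that polling step, useful for checking by hand that a collector answers. It is not part of the CSCS setup: the function names are made up for illustration, and the port numbers in the usage note are placeholders, since the real ports are defined in the collector configuration files and in =/etc/gmetad.conf=.

```python
import socket
import xml.etree.ElementTree as ET


def read_gmond_xml(host, port, timeout=5.0):
    """Read the complete XML dump that a gmond sends on its TCP port."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # gmond closes the connection after the dump
                break
            chunks.append(data)
    # Return bytes so the XML parser can honour the declared encoding.
    return b"".join(chunks)


def hosts_per_cluster(xml_dump):
    """Return {cluster name: number of hosts} for a gmond XML dump."""
    root = ET.fromstring(xml_dump)
    return {
        cluster.get("NAME"): len(cluster.findall("HOST"))
        for cluster in root.iter("CLUSTER")
    }
```

For example, =hosts_per_cluster(read_gmond_xml("localhost", 8649))= would report how many hosts a collector listening on port 8649 currently knows about. The port is an assumption; substitute the one configured for the collector in question.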
*Note:* To get pie charts, the *php-gd* package must be installed!

---+++ ramdisk / tmpfs for RRD files
The =gmetad= writes the ganglia monitoring information to RRD (_round robin database_) files, using one file per sensor. For a large cluster this leads to a high frequency of I/O operations on a large number of files, always in the same disk area. The CPU will tend to be in I/O-wait states most of the time, and people have reported fast degradation of the hard disk.

Therefore the RRD files are hosted in a tmpfs area in memory (earlier we used a ram disk), and the contents of this area are synchronized every few minutes to a disk area to prevent information loss in case of a system breakdown.

*Note: The standard location for the ganglia RRDs is a symbolic link to the tmpfs area*:<br>
=/var/lib/ganglia/rrds -> /dev/shm/ganglia/rrds=

The tmpfs area is managed as a service by the custom =/etc/init.d/tmpfs-sync-area= init script. The script resides in the CSCS SVN at %SVNBASE%/monitoring/ganglia/ganglia-config/mon-box/init.d. It is started before =gmetad= and does the following:
   * upon start
      * initializes the tmpfs area with the contents of the disk area
      * does sanity checks on every important operation
      * installs a cron job for synchronizing the tmpfs area to the disk area at regular intervals
   * upon stop
      * makes sure that dependent services (gmetad) are stopped first
      * synchronizes the tmpfs area to the disk area
      * uninstalls the cron job

Note: The same functionality is available in the form of a ramdisk based service (%SVNBASE%/monitoring/ganglia/ganglia-config/mon-box/init.d/ramdisk). The scripts are quite generic and read their configuration from an appropriate file in =/etc/sysconfig/=.

---+++ httpd
The web server needs to allow running *php* scripts in the ganglia web directory.

---+++ CSCS custom graphs
The script =/root/CSCS_custom_graphs/custom_rrd_cscs.pl= produces the CSCS custom graphs. It is executed by the cron job =/etc/cron.d/CSCS_custom_graphs=.
The custom graphs are stored in the =/var/www/html/ganglia/CSCS-custom= directory. They are used for the PhoenixMonOverview page and other statistics pages. A similar script pulls down the pie charts for the subclusters for display on the monitoring page. The sources for the scripts can be found under %SVNBASE%/monitoring/ganglia/custom_graphs.

---++ Client nodes

---+++ gmond
For every class of node there is a specific =gmond.conf= configuration file. These can be found in our SVN at %SVNBASE%/monitoring/ganglia/ganglia-config. The file needs to be copied to =/etc/gmond.conf= on the node in order to work with the standard init procedure. There are three classes of nodes:
   * *worker-nodes*
   * *fileservers*: the dcache pool servers
   * *service-nodes*: all remaining nodes, including the dcache head and database nodes

---+++ gmetric scripts
Some nodes send additional information by using ganglia's =gmetric= utility. Every node making use of this feature has the same kind of basic configuration:
   * =/root/gmetric=: contains the specific scripts issuing the =gmetric= command lines. This is a direct checkout from the corresponding %SVNBASE%/monitoring/ganglia/gmetric-scripts subdirectory. Keep this up to date when you make changes.
   * =/etc/cron.d/gmetric=: cron job to regularly run the scripts

The following nodes have gmetric scripts:
   * *CE*: sends queue length and running jobs per VO
   * *SE head node*: collects information from dCache. NOTE: The =dCache_gmetric.py= script is now located in =/opt/cscs/libexec/gmetric-scripts/dcache=!
   * *WN*: (not installed) There is a script that collects the jobID and user for each node and displays them as a string.
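To illustrate what such a script does, here is a minimal sketch of the CE case: it turns per-VO job counts into =gmetric= command lines. It is not the script from the SVN. The function names and the metric naming scheme (=queued_jobs_&lt;VO&gt;=) are assumptions made for this example, and obtaining the job counts from the batch system is left out. Only the long options =--name=, =--value=, =--type= and =--units= of =gmetric= are used.

```python
import subprocess


def gmetric_command(name, value, metric_type="uint32", units="", gmetric="gmetric"):
    """Build the argument list for one gmetric call."""
    cmd = [gmetric, "--name", name, "--value", str(value), "--type", metric_type]
    if units:
        cmd += ["--units", units]
    return cmd


def queue_metrics(jobs_per_vo, state):
    """Turn {VO: job count} into gmetric commands, one metric per VO.

    The metric name pattern "<state>_jobs_<vo>" is hypothetical.
    """
    return [
        gmetric_command(f"{state}_jobs_{vo}", count, units="jobs")
        for vo, count in sorted(jobs_per_vo.items())
    ]


def publish(commands, dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Calling =publish(queue_metrics({"atlas": 40, "cms": 12}, "queued"))= only prints the two command lines, which makes it easy to check the metric names before a cron job runs the script with =dry_run=False=.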
---++ Ganglia 3.1.7 on ganglia.lcg.cscs.ch

---+++ Build
   * =yum install apr apr-devel pango pango-devel pcre-devel=
   * download [[http://savannah.nongnu.org/download/confuse/confuse-2.7.tar.gz][confuse]]
      * =./configure CFLAGS=-fPIC --disable-nls=
      * =make && make install=
   * download [[http://oss.oetiker.ch/rrdtool/pub/rrdtool-1.4.3.tar.gz][rrdtool]]
      * =./configure --prefix=/usr=
      * =make && make install=
   * configure Ganglia: =./configure --with-gmetad --with-librrd=/usr/lib --sysconfdir=/etc/ganglia=
   * =cp gmond/gmond.init /etc/init.d/gmond=
   * =cp gmetad/gmetad.init /etc/init.d/gmetad=
   * =cp -r web/ /var/www/ganglia=

-- Main.PeterOettl - 2010-05-04

-- Main.DerekFeichtinger - 30 Jan 2008
Topic revision: r15 - 2011-01-20 - DerekFeichtinger
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.