Node Type: ComputingElement

Firewall requirements

local port	open to	reason

Regular Maintenance work

Check out our t3nagios

Emergency Measures

If you have really corrupted this VM, ask Peter to restore yesterday's snapshot.

Installation

Fabio uses the aliases below; the Puppet recipes are in the puppetdirnodes directory:

alias kscustom57='cd /afs/psi.ch/software/linux/dist/scientific/57/custom'
alias kscustom64='cd /afs/psi.ch/software/linux/dist/scientific/64/custom'
alias ksdir='cd /afs/psi.ch/software/linux/kickstart/configs'
alias puppetdir='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/'
alias puppetdirnodes='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/manifests/nodes'
alias puppetdirredhat='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/modules/Tier3/files/RedHat'
alias puppetdirsolaris='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/modules/Tier3/files/Solaris/5.10'
alias yumdir5='cd /afs/psi.ch/software/linux/dist/scientific/57/scripts'
alias yumdir6='cd /afs/psi.ch/software/linux/dist/scientific/6/scripts'

SL5_ce.pp
tier3-baseclasses.pp

Services

[root@t3ce02 ~]# netstat -tpl
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       PID/Program name   
tcp        0      0 *:nfs                       *:*                         LISTEN      -                   
tcp        0      0 *:7937                      *:*                         LISTEN      3148/nsrexecd       
tcp        0      0 *:962                       *:*                         LISTEN      20711/rpc.mountd    <--- t3ui* mount RO /gridware/sge/default/common
tcp        0      0 *:5666                      *:*                         LISTEN      16337/nrpe          
tcp        0      0 *:7938                      *:*                         LISTEN      3148/nsrexecd       
tcp        0      0 *:7939                      *:*                         LISTEN      3148/nsrexecd       
tcp        0      0 *:smc-http                  *:*                         LISTEN      3276/java           
tcp        0      0 *:7940                      *:*                         LISTEN      3148/nsrexecd       
tcp        0      0 *:smc-https                 *:*                         LISTEN      3276/java           
tcp        0      0 *:rpasswd                   *:*                         LISTEN      20520/rpc.statd     
tcp        0      0 localhost.localdomain:smux  *:*                         LISTEN      16151/snmpd         
tcp        0      0 *:8649                      *:*                         LISTEN      3031/gmond          
tcp        0      0 *:mysql                     *:*                         LISTEN      20233/mysqld        <--- local DB for accounting
tcp        0      0 *:34571                     *:*                         LISTEN      2715/sge_qmaster    
tcp        0      0 *:6444                      *:*                         LISTEN      2715/sge_qmaster    
tcp        0      0 *:6446                      *:*                         LISTEN      2715/sge_qmaster    
tcp        0      0 *:sunrpc                    *:*                         LISTEN      2326/portmap        
tcp        0      0 localhost.localdomain:33714 *:*                         LISTEN      3276/java           
tcp        0      0 *:948                       *:*                         LISTEN      20696/rpc.rquotad   
tcp        0      0 *:ssh                       *:*                         LISTEN      16448/sshd          
tcp        0      0 localhost.lo:x11-ssh-offset *:*                         LISTEN      17412/0             
tcp        0      0 localhost.localdomain:6011  *:*                         LISTEN      25438/2             
tcp        0      0 *:58940                     *:*                         LISTEN      -                   
[root@t3ce02 ~]# netstat -upl
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       PID/Program name   
udp        0      0 *:768                       *:*                                     20520/rpc.statd     
udp        0      0 *:nfs                       *:*                                     -                   
udp        0      0 localhost.locald:syslog     *:*                                     26284/syslog-ng     
udp        0      0 *:7938                      *:*                                     3148/nsrexecd       
udp        0      0 *:rtip                      *:*                                     20520/rpc.statd     
udp        0      0 *:snmp                      *:*                                     16151/snmpd         
udp        0      0 *:945                       *:*                                     20696/rpc.rquotad   
udp        0      0 *:959                       *:*                                     20711/rpc.mountd    <--- t3ui* mount RO /gridware/sge/default/common
udp        0      0 *:bootpc                    *:*                                     2209/dhclient       
udp        0      0 *:48608                     *:*                                     -                   
udp        0      0 *:sunrpc                    *:*                                     2326/portmap        
udp        0      0 t3ce02.psi.ch:ntp           *:*                                     15996/ntpd          
udp        0      0 localhost.localdomain:ntp   *:*                                     15996/ntpd          
udp        0      0 *:ntp                       *:*                                     15996/ntpd
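
To condense listings like the ones above into a short port/program list (for instance when filling in the firewall table), the netstat output can be filtered. This is only a minimal sketch: the function name listening_ports is made up for illustration, and it assumes the column layout shown above.

```shell
#!/bin/bash
# Hypothetical helper: reads `netstat -tpl` style output on stdin and prints
# "<port> <program>" for every listening TCP socket.
listening_ports() {
    awk '$1 ~ /^tcp/ && $6 == "LISTEN" {
        n = split($4, addr, ":")      # port (or service name) is the last ":"-separated part
        prog = $7
        sub(/^[0-9]+\//, "", prog)    # drop the leading "PID/"
        print addr[n], prog
    }'
}

# Example with two lines in the same layout as the listing above
printf '%s\n' \
  'tcp        0      0 *:7937      *:*      LISTEN      3148/nsrexecd' \
  'tcp        0      0 *:nfs       *:*      LISTEN      -' | listening_ports
```

On the node itself the input would come from netstat -tpln | listening_ports; the -n flag prints numeric ports instead of service names such as smc-http.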

Sun Grid Engine - old doc

I should reorganize this information because it is still valuable in many respects; in the meantime, please skim SGE6dot2u5andARCOMySQLhostedonZFS but treat it as outdated.

Sun Grid Engine

It is installed from RPMs in /gridware/sge.

See also Tier3Policies#Batch_system_policies.

Sun Grid Engine does not take Unix secondary groups into account!

The SGE queue short.q.validation@t3wn10.psi.ch accepts only users whose primary group is cms.

During my tests the account martinelli_f belonged to the group cms, but NOT as its primary group, which was ethz-ecal instead, so the job shown below was not admitted to that queue.

SGE Man page about ACL

[martinelli_f@t3ui10 QSUB_TESTs]$ qstat -j 3642032
==============================================================
job_number:                 3642032
exec_file:                  job_scripts/3642032
submission_time:            Mon May  6 16:35:48 2013
owner:                      martinelli_f
uid:                        2980
group:                      ethz-ecal
gid:                        529
sge_o_home:                 /shome/martinelli_f
sge_o_log_name:             martinelli_f
sge_o_path:                 /bin:/opt/d-cache/srm/bin:/opt/d-cache/dcap/bin:/gridware/sge/bin/lx24-amd64:/usr/kerberos/bin:/usr/local/bin:/usr/bin:/swshare/psit3/bin:/shome/martinelli_f/shellutils:/shome/martinelli_f/bin:/shome/martinelli_f/eclipse-IDE/
sge_o_shell:                /bin/bash
sge_o_workdir:              /shome/martinelli_f/QSUB_TESTs
sge_o_host:                 t3ui10
account:                    sge
cwd:                        /shome/martinelli_f/QSUB_TESTs
mail_list:                  martinelli_f@t3ui10.psi.ch
notify:                     FALSE
job_name:                   hostname.sh
jobshare:                   0
hard_queue_list:            short.q.validation@t3wn10.psi.ch
env_list:                   
script_file:                hostname.sh
scheduling info:            queue instance "all.q@t3wn35.psi.ch" dropped because it is full
                            queue instance "all.q@t3wn36.psi.ch" dropped because it is full
                            queue instance "all.q@t3wn34.psi.ch" dropped because it is full
                            queue instance "all.q@t3wn32.psi.ch" dropped because it is full
...
                            cannot run in queue "debug.q" because it is not contained in its hard queue list (-q)
                            cannot run in queue "short.q" because it is not contained in its hard queue list (-q)

                            has no permission for cluster queue "short.q.validation"

                            cannot run in queue "all.q" because it is not contained in its hard queue list (-q)
                            cannot run in queue "long.q" because it is not contained in its hard queue list (-q)
                            cannot run in queue "all.q.admin" because it is not contained in its hard queue list (-q)
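
Before submitting to a queue restricted by group, it can help to check which group is the primary one. The following is a minimal sketch using the standard id command; the function name has_primary_group is hypothetical and not part of SGE.

```shell
#!/bin/bash
# Hypothetical helper: succeeds only if GROUP is the *primary* group of USER.
# Secondary memberships (listed by `id -Gn`) are deliberately ignored,
# mirroring the SGE behaviour described above.
has_primary_group() {
    local user=$1 group=$2
    [ "$(id -gn "$user")" = "$group" ]
}

# Report for the current user; on a UI one would pass e.g. martinelli_f and cms
me=$(id -un)
if has_primary_group "$me" "$(id -gn)"; then
    echo "$me: primary group is $(id -gn), all groups: $(id -Gn)"
fi
```

For the account in the qstat output above, has_primary_group martinelli_f cms would fail because the primary group is ethz-ecal, even though cms appears among the secondary groups.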

Sun Grid Engine MySQL DB - ARCO

Besides running qacct on the command line, an SGE admin can check cluster usage by running SELECT statements against the ARCO MySQL DB hosted on t3ce02; this produces more detailed reports than qacct. The ARCO DB is constantly updated with new rows (both raw values and values derived from them) and purged of old rows; both operations are performed by the Java daemon sgedbwriter.
See also the official Oracle Grid Engine website.

Note that we usually query the ARCO DB through a direct mysql session, without going through the ARCO Web Console, which is very old and slow, so there is no need to fully understand the Web Console logic.

sgedbwriter is started as a normal init service. In the distant past it was often found dead, as reported by https://t3nagios.psi.ch/nagios/cgi-bin/extinfo.cgi?type=2&host=t3ce02&service=SGE+ARCO+file+dbwriter+log ; if that happens again, restart it:

/etc/init.d/sgedbwriter.p6444 start

sgedbwriter uses the following files:

/gridware/sge/dbwriter/lib/mysql-connector-java.jar   <--- lets Java connect to MySQL
/gridware/sge/default/common/reporting   <--- Sun Grid Engine creates this file and constantly appends new usage info to it; sgedbwriter parses it, fills the ARCO DB accordingly and finally deletes it.
/gridware/sge/default/common/dbwriter.conf
/gridware/sge/dbwriter/database/mysql/dbwriter.xml
/gridware/sge/default/spool/dbwriter/dbwriter.log   <--- Nagios constantly checks its freshness to determine whether sgedbwriter is alive.
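
The freshness check on dbwriter.log can be approximated by hand when Nagios is not at hand. This is a minimal sketch, not the actual Nagios plugin: the function name log_is_stale and the 900-second threshold are assumptions chosen for illustration, and GNU stat is required.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if FILE was last modified more than MAX_AGE seconds ago.
# Relies on GNU stat (-c %Y prints the modification time in epoch seconds).
log_is_stale() {
    local file=$1 max_age=$2 now mtime
    now=$(date +%s)
    mtime=$(stat -c %Y "$file") || return 2   # file missing or unreadable
    [ $(( now - mtime )) -gt "$max_age" ]
}

log=/gridware/sge/default/spool/dbwriter/dbwriter.log
# 900 seconds is an arbitrary example threshold, not the value configured in Nagios
if [ -e "$log" ] && log_is_stale "$log" 900; then
    echo "dbwriter.log looks stale; consider: /etc/init.d/sgedbwriter.p6444 start"
fi
```

The script only prints a hint and leaves the restart to the admin, since a stale log may have other causes than a dead daemon.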

How to run a SQL query

If everything works, you can run a query like:

[root@t3ce02 ~]#  mysql --defaults-extra-file=/root/arco_read_my.cnf -u arco_read -D sge_arco -h t3ce02 --execute="SELECT date_format(time, '%Y-%m-%d') AS day, sum(completed) AS jobs FROM view_jobs_completed WHERE time > (current_timestamp - interval 1 year) GROUP BY day"
+------------+-------+
| day        | jobs  |
+------------+-------+
| 2012-07-17 |  5501 | 
| 2012-07-18 |  1161 | 
| 2012-07-19 |  1165 | 
| 2012-07-20 |  2848 | 
| 2012-07-21 |  1097 | 
| 2012-07-22 |   805 | 
...
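
The same query is often wanted for a different time window. The sketch below only builds the SQL text for a given MySQL interval; the function name jobs_per_day_sql is hypothetical, while the view and column names are the ones used in the query above.

```shell
#!/bin/bash
# Hypothetical helper: prints the jobs-per-day query for a given MySQL interval,
# e.g. "1 month" or "7 day".
jobs_per_day_sql() {
    local interval=$1
    # '%s' keeps printf from interpreting the %Y-%m-%d inside the SQL text
    printf '%s\n' "SELECT date_format(time, '%Y-%m-%d') AS day, sum(completed) AS jobs \
FROM view_jobs_completed \
WHERE time > (current_timestamp - interval ${interval}) GROUP BY day"
}

sql=$(jobs_per_day_sql "1 month")
echo "$sql"
# On t3ce02 the text can then be passed to the same mysql invocation as above:
#   mysql --defaults-extra-file=/root/arco_read_my.cnf -u arco_read -D sge_arco -h t3ce02 --execute="$sql"
```

The interval is inserted verbatim into the statement, so the helper should only be fed trusted values typed by the admin.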

/var/spool/arco/queries

These are some default ARCO queries; you only need to extract the SQL part from them:

/var/spool/arco/queries/1_Month_CPU_Time_per_day_per_user.xml
/var/spool/arco/queries/1_Month_SUM_Wall_Time_per_User.xml
/var/spool/arco/queries/1_Month_SUM_Wall_time_and_SUM_CPU_Time_per_User.xml
/var/spool/arco/queries/1_day_CPU_User_and_System_usage.xml
/var/spool/arco/queries/24HoursJobs.xml
/var/spool/arco/queries/AR_Attributes.xml
/var/spool/arco/queries/AR_Log.xml
/var/spool/arco/queries/AR_Reserved_Time_Usage.xml
/var/spool/arco/queries/AR_by_User.xml
/var/spool/arco/queries/Accounting_per_AR.xml
/var/spool/arco/queries/Accounting_per_Department.xml
/var/spool/arco/queries/Accounting_per_Project.xml
/var/spool/arco/queries/Accounting_per_User.xml
/var/spool/arco/queries/Average_Job_Turnaround_Time.xml
/var/spool/arco/queries/Average_Job_Wait_Time.xml
/var/spool/arco/queries/Average_job_length_per_user_per_month.xml
/var/spool/arco/queries/DBWriter_Performance.xml
/var/spool/arco/queries/Failed_overlong_jobs_per_user.xml
/var/spool/arco/queries/Host_Load.xml
/var/spool/arco/queries/JOBs_MORE_3GB_RAM_LAST_2_MONTHS.xml
/var/spool/arco/queries/Job_Log.xml
/var/spool/arco/queries/Job_efficiency_per_user.xml
/var/spool/arco/queries/Job_length_histogram.xml
/var/spool/arco/queries/Jobs_per_a_specific_hour_per_users.xml
/var/spool/arco/queries/Jobs_per_hours_per_users.xml
/var/spool/arco/queries/Jobs_shorter_than_1h.xml
/var/spool/arco/queries/Jobs_shorter_that_1h_per_user.xml
/var/spool/arco/queries/Number_of_Jobs_Completed_per_AR.xml
/var/spool/arco/queries/Number_of_Jobs_completed.xml
/var/spool/arco/queries/Queue_Consumables.xml
/var/spool/arco/queries/Statistic_History.xml
/var/spool/arco/queries/Statistics.xml
/var/spool/arco/queries/Wallclock_time.xml
/var/spool/arco/queries/average2.xml
/var/spool/arco/queries/cumul_walltime_vs_job_walltime.xml
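
Extracting the SQL part can be scripted. The sketch below rests on an assumption that should be checked against a real file first: that each query file stores its statement in a single <sql>...</sql> element with the usual XML entities. The function name extract_sql is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: prints the SQL text stored in an ARCO query file.
# ASSUMPTION: the statement sits in exactly one <sql>...</sql> element;
# CDATA sections or several <sql> elements are not handled.
extract_sql() {
    local flat
    flat=$(tr '\n' ' ' < "$1")                       # join lines so sed sees one record
    printf '%s\n' "$flat" \
      | sed -n 's/.*<sql>\(.*\)<\/sql>.*/\1/p' \
      | sed -e 's/&lt;/</g' -e 's/&gt;/>/g' -e 's/&amp;/\&/g'   # decode basic XML entities
}

query=/var/spool/arco/queries/Number_of_Jobs_completed.xml
if [ -r "$query" ]; then
    extract_sql "$query"
fi
```

Some queries may contain placeholders that the Web Console fills in at run time; those would still need to be replaced by hand before the statement is given to mysql.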

/gridware/sge/default/common/reporting

For Sun Grid Engine to generate this file, reporting=true must be set in reporting_params:

[root@t3ce02 ~]# qconf -sconf |grep reporting_params
reporting_params             accounting=true reporting=true
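
The check above can be wrapped so that it yields a yes/no answer, for example in a monitoring script. This is a minimal sketch; the function name reporting_enabled is made up, and it only inspects text in the format printed by qconf -sconf.

```shell
#!/bin/bash
# Hypothetical helper: reads `qconf -sconf` output on stdin and succeeds only
# if the reporting_params line contains the token reporting=true.
reporting_enabled() {
    awk '$1 == "reporting_params" {
             for (i = 2; i <= NF; i++) if ($i == "reporting=true") found = 1
         }
         END { exit(found ? 0 : 1) }'
}

# Example with the line shown above; on the qmaster host use:
#   qconf -sconf | reporting_enabled
if echo 'reporting_params             accounting=true reporting=true' | reporting_enabled; then
    echo "reporting file generation is enabled"
fi
```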

/gridware/sge/default/common/reporting.not.deleted.by.dbwriter

By default sgedbwriter deletes /gridware/sge/default/common/reporting once it has been processed, and regrettably this cannot be changed. To keep a copy for the future, a permanent tail runs in the background; it is started at the end of the init sequence (see /etc/rc.local below):

# ll /gridware/sge/default/common/reporting*
-rw-r--r-- 1 root root     8337 Jul 16 14:59 /gridware/sge/default/common/reporting
-rw-r--r-- 1 root root 25221477 Jul  4  2011 /gridware/sge/default/common/reporting.4-Jul-2001_15:42
lrwxrwxrwx 1 root root       42 Apr 24 18:34 /gridware/sge/default/common/reporting.not.deleted.by.dbwriter -> /mnt/sdb/reporting.not.deleted.by.dbwriter

# cat /etc/rc.local <--- last commands executed during the initial init sequence
#!/bin/sh
# Puppet Managed File
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.

touch /var/lock/subsys/local

#http://yoshinorimatsunobu.blogspot.com/2009/04/linux-io-scheduler-queue-size-and.html
echo 100000 > /sys/block/sdb/queue/nr_requests
echo deadline > /sys/block/sdb/queue/scheduler

# by martinelli to start Sun Web Console + SGE ARCO 
/usr/sbin/smcwebserver stop
/usr/sbin/smcwebserver start

# 2 May 2013 - F.Martinelli 
# needed by VMWare I/O path failover,
# if you add an other disk then add an other line here 
echo 180 > /sys/block/sda/device/timeout
echo 180 > /sys/block/sdb/device/timeout

nohup tail --pid=$(pidof sge_qmaster) -n 0 -F /gridware/sge/default/common/accounting >>  /gridware/sge/default/common/accounting.not.deleted.by.logrotate &
nohup tail --pid=$(pidof sge_qmaster) -n 0 -F /gridware/sge/default/common/reporting >> /gridware/sge/default/common/reporting.not.deleted.by.dbwriter &
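
The two tail lines work because -F follows the file by name and reopens it after it has been deleted and recreated, so the copy keeps growing across sgedbwriter's deletions. Note also that --pid makes tail exit when sge_qmaster dies, so after a qmaster restart the tails presumably have to be started again by hand. The demo below reproduces the mechanism in a temporary directory; the file names inside it are stand-ins, and the sleeps are generous guesses to let tail notice each change.

```shell
#!/bin/bash
# Self-contained demo in a temporary directory; nothing under /gridware is touched.
workdir=$(mktemp -d)
src="$workdir/reporting"            # stands in for the live reporting file
copy="$workdir/reporting.copy"      # stands in for reporting.not.deleted.by.dbwriter
: > "$src"

# Same tail options as in rc.local, minus --pid (there is no sge_qmaster here)
tail -n 0 -F "$src" >> "$copy" 2>/dev/null &
tail_pid=$!
sleep 1

echo "first record" >> "$src"       # written while the file exists
sleep 2
rm -f "$src"                        # what sgedbwriter does after processing
sleep 2
echo "second record" >> "$src"      # the file reappears under the same name
sleep 2

kill "$tail_pid"
wait "$tail_pid" 2>/dev/null || true
result=$(cat "$copy")
rm -rf "$workdir"
printf '%s\n' "$result"
```

If the mechanism works as described, the copy should contain both records even though the source file was removed in between.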

/gridware/sge/default/common/accounting.not.deleted.by.logrotate

See the previous section.

/gridware/sge/default/common/dbwriter.conf

DBWRITER_USER_PW=:)
DBWRITER_USER=arco_write
READ_USER=arco_read
READ_USER_PW=
DBWRITER_URL=jdbc:mysql://localhost:3306/sge_arco
DB_SCHEMA=n/a
TABLESPACE=n/a
TABLESPACE_INDEX=n/a
DBWRITER_CONTINOUS=true
DBWRITER_INTERVAL=180
DBWRITER_DRIVER=com.mysql.jdbc.Driver
DBWRITER_REPORTING_FILE=/gridware/sge/default/common/reporting
DBWRITER_CALCULATION_FILE=/gridware/sge/dbwriter/database/mysql/dbwriter.xml
DBWRITER_SQL_THRESHOLD=3
SPOOL_DIR=/gridware/sge/default/spool/dbwriter
DBWRITER_DEBUG=INFO
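
Single settings can be read from this file without opening it in an editor, which also avoids displaying the password lines. A minimal sketch, assuming the plain KEY=VALUE layout shown above; the function name conf_get is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: prints the value of KEY from a KEY=VALUE file such as dbwriter.conf.
# The "=" in the pattern keeps DBWRITER_USER from also matching DBWRITER_USER_PW.
conf_get() {
    local key=$1 file=$2
    sed -n "s/^${key}=//p" "$file"
}

conf=/gridware/sge/default/common/dbwriter.conf
if [ -r "$conf" ]; then
    echo "reporting file: $(conf_get DBWRITER_REPORTING_FILE "$conf")"
    echo "JDBC URL:       $(conf_get DBWRITER_URL "$conf")"
fi
```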

/gridware/sge/dbwriter/database/mysql/dbwriter.xml

..
   average queue utilization per hour
   Not really correct value, as each entry for slot usage is weighted equally.
   It would be necessary to have time_start and time_end per value and weight
   the values by time.
...      
number of jobs finished per host
...   
number of jobs finished per user
...
number of jobs finished per project
...  
build daily values from hourly ones
...
=========== Statistic Rules ==========================================
        SELECT sge_host, sge_queue, sge_user, sge_group, sge_project, sge_department,
        sge_host_values, sge_queue_values, sge_user_values, sge_group_values, sge_project_values, sge_department_values, 
        sge_job, sge_job_log, sge_job_request, sge_job_usage, sge_statistic, sge_statistic_values,
        sge_share_log, sge_ar, sge_ar_attribute, sge_ar_usage, sge_ar_log, sge_ar_resource_usage
        FROM (SELECT count(*) AS sge_host FROM sge_host) AS c_host, 
        (SELECT count(*) AS sge_queue FROM sge_queue) AS c_queue, 
        (SELECT count(*) AS sge_user FROM sge_user) AS c_user, 
        (SELECT count(*) AS sge_group FROM sge_group) AS c_group, 
        (SELECT count(*) AS sge_project FROM sge_project) AS c_project,
        (SELECT count(*) AS sge_department FROM sge_department) AS c_department,
        (SELECT count(*) AS sge_host_values FROM sge_host_values) AS c_host_values, 
        (SELECT count(*) AS sge_queue_values FROM sge_queue_values) AS c_queue_values, 
        (SELECT count(*) AS sge_user_values FROM sge_user_values) AS c_user_values, 
        (SELECT count(*) AS sge_group_values FROM sge_group_values) AS c_group_values, 
        (SELECT count(*) AS sge_project_values FROM sge_project_values) AS c_project_values,
        (SELECT count(*) AS sge_department_values FROM sge_department_values) AS c_department_values,
        (SELECT count(*) AS sge_job FROM sge_job) AS c_job, 
        (SELECT count(*) AS sge_job_log FROM sge_job_log) AS c_job_log, 
        (SELECT count(*) AS sge_job_request FROM sge_job_request) AS c_job_request, 
        (SELECT count(*) AS sge_job_usage FROM sge_job_usage) AS c_job_usage, 
        (SELECT count(*) AS sge_share_log FROM sge_share_log) AS c_share_log,
        (SELECT count(*) AS sge_statistic FROM sge_statistic) AS c_sge_statistic,
        (SELECT count(*) AS sge_statistic_values FROM sge_statistic_values) AS c_sge_statistic_values,
        (SELECT count(*) AS sge_ar FROM sge_ar) AS c_sge_ar,
        (SELECT count(*) AS sge_ar_attribute FROM sge_ar_attribute) AS c_sge_ar_attribute,
        (SELECT count(*) AS sge_ar_usage FROM sge_ar_usage) AS c_sge_ar_usage,
        (SELECT count(*) AS sge_ar_log FROM sge_ar_log) AS c_sge_ar_log,
        (SELECT count(*) AS sge_ar_resource_usage FROM sge_ar) AS c_sge_ar_resource_usage
 =========== Deletion Rules ==========================================
      keep host raw values only 7 days
...
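
The statistic rule above counts the rows of every ARCO table in one statement. When only a few tables are of interest, a shorter statement can be generated by hand. This is a minimal sketch: the function name row_count_sql and the output column names table_name and row_count are made up, while the table names come from the excerpt above.

```shell
#!/bin/bash
# Hypothetical helper: prints one SELECT that returns a (table_name, row_count)
# row for each table given as an argument.
row_count_sql() {
    local sep="" table
    for table in "$@"; do
        printf "%sSELECT '%s' AS table_name, count(*) AS row_count FROM %s" "$sep" "$table" "$table"
        sep=" UNION ALL "
    done
    printf '\n'
}

# Build the statement for three of the tables listed in the excerpt
row_count_sql sge_job sge_job_usage sge_job_log
```

The generated text can be passed to mysql with --execute in the same way as the query in the section on running SQL queries.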

Backups

OS snapshots are taken nightly by the PSI VMWare Team (e.g. Peter Huesser); in addition, LinuxBackupsByLegato can be used to recover a single file.

The SGE configuration can also be saved with:

[root@t3ce02 gridware]# /gridware/sge/util/upgrade_modules/save_sge_config.sh /gridware/sge_backup
Configuration successfully saved to /gridware/sge_backup directory.