
Node Type: dCacheSolaris

Firewall requirements

Local port        Open to            Reason
2811/tcp          *                  gridftp control connection
22125/tcp         192.33.123.0/24    unauthenticated dcap (read only)
22128/tcp         192.33.123.0/24    gsidcap (GSI authenticated dcap)
20000-25000/tcp   *                  Globus port range for gridftp/xrootd data streams


Regular Maintenance work

Emergency Measures

During a reboot don't use the SOL console redirection

Odd as it seems, the last time we rebooted an X4540 File Server it only came back up when we were NOT using the SOL console redirection. If you have to reboot, try it this way first.

Broken 16GB Compact Flash Card

  • Fabio left a fully installed and tested t3fs10 Compact Flash Card inside the X4540 mounted above the t3fs11 server; take it and use it to recover the failed X4540.
  • Delete the related Puppet keys on psi-puppet4.psi.ch and run Puppet on the restored X4540 to install the correct X509 cert and key, or simply copy them from t3admin01:/root/clusteradmin/etc/hostkeys/switch-QuoVadis
  • dCache won't start automatically; start it with /data1/dcache/bin/dcache start

Broken 16GB Compact Flash Card -> We have to reinstall Solaris 10

  • In an emergency, stop Nagios first to avoid a flood of false alarm e-mails: ssh root@t3nagios /etc/init.d/nagios stop
  • If the 16GB Compact Flash Card is broken:
    • there are 3 16GB Compact Flash Cards in the 3 spare X4540 servers mounted in our last rack, and 2 more inside the spare X4540s stored in the AIT warehouse close to Derek's office.
    • once the new 16GB Compact Flash Card is inserted, use the Solaris VM t3jumpstart to reinstall Solaris 10 1/13 from scratch and configure it automatically as described in the following chapters. You can use the VM t3fs15 to quickly test the Solaris 10 1/13 installation performed by t3jumpstart; this validates the installation procedure and avoids unnecessary writes to the target 16GB Compact Flash Card, which is meant to be written rarely. Inserting a new 16GB Compact Flash Card into the X4540 may also reshuffle the boot disk order in the BIOS, which prevents the Solaris 10 1/13 installation from completing (Fabio hit this case). If that happens, reboot, enter the server BIOS and reorder the boot disks so that the Compact Flash Card is the 1st boot device.
  • If possible, run zpool export data1 on the failed Solaris installation before running zpool import data1 on the new one; in any case you can always force the import with zpool import -f data1
  • Always inform the users by sending an e-mail to cms-tier3-users@lists.psi.ch; to produce the list of affected files you can use the v_pnfs views made by Fabio.
  • If you accidentally altered or erased a Solaris file located in / you may be able to recover it either with a puppetd -t -v run or by searching for it among the ZFS snapshots (zfs list -t snapshot).
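
The export/import step in the list above can be sketched as follows. This is a minimal sketch, not a tested procedure: the run wrapper, the DRY_RUN variable and the function names are hypothetical helpers introduced here, and falling back to a forced import right after a failed plain import is an assumption about what you want. With DRY_RUN=1 (the default) the commands are only printed.

```shell
# Hypothetical helpers, not part of the original setup.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# On the failed installation, if it still boots: release the pool cleanly.
export_data1() {
    run zpool export data1
}

# On the fresh installation: plain import first, forced import as fallback.
import_data1() {
    run zpool import data1 || run zpool import -f data1
}
```

Call export_data1 on the old system and import_data1 on the new one; set DRY_RUN=0 only once the printed commands look right.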

Broken HW ( e.g. a 1TB disk )

fmadm faulty, run as root on the file server (e.g. [root@t3fs08 ~]# fmadm faulty), will tell you which component is broken; Nagios also has a check related to that.

dCache 2.10

dCache now runs as the user dcache, no longer as root, so you might hit permission denied errors.

dCache 2.10 is no longer distributed as a Solaris package but as a .tgz bundle.
See: http://www.dcache.org/downloads/1.9/index.shtml#server-2.10

/data1/dcache

[root@t3fs01 ~]# ll /data1/dcache                                                                                                                                                                                                                                                                                                                                                           
lrwxrwxrwx 1 root root 14 Mar 19 23:37 /data1/dcache -> dcache-2.10.21/
[root@t3fs01 ~]# ll /data1/dcache-2.10.21/                                                                                                                                                                                                                                                                                                                                                  
total 21
drwxr-xr-x  2 root dcache  5 Mar 12 09:22 bin/
drwxr-xr-x  3 root dcache  3 Feb 24 07:55 doc/
drwxr-xr-x  4 root dcache 10 Mar 20 15:22 etc/
drwxr-xr-x  3 root dcache  3 Feb 24 07:55 man/
drwxr-xr-x  2 root dcache  5 Mar 12 09:22 sbin/
drwxr-xr-x 18 root dcache 18 Feb 24 07:55 share/
drwxr-xr-x 14 root dcache 14 Mar 12 09:22 var/

dCache CSCS page

LCGTier2/ServiceDcache is listed just as a reference; Fabio never uses it and, furthermore, CSCS doesn't run Solaris.

Important files in a nutshell

[root@t3fs01 ~]# ll /data1/dcache/etc/
total 46
drwxr-xr-x 3 root dcache     3 Feb 24 07:55 admin/
-r--r--r-- 1 root dcache  1892 Mar 25 16:47 dcache.conf  <-- main dCache conf, it should be the same on each node
-rw-r--r-- 1 root dcache   152 Mar 10 07:46 gplazma.conf
-rw-r--r-- 1 root dcache  9303 Mar 10 07:46 info-provider.xml
drwxr-xr-x 2 root dcache     3 Mar 20 15:44 layouts/  <-- specific node conf will be deployed here
-rw-r--r-- 1 root dcache  8309 Mar 10 07:46 logback.xml  <-- to tune the logging verbosity
-rw-r--r-- 1 root dcache 10043 Mar 10 07:46 tc-config.xml

# dCache optional plugins dir is /usr/share/dcache/plugins
[root@t3fs01 ~]# ll /usr/share/dcache/plugins/monitor-5.0.8                                                                                                                                                                                                                                                                                                                                 
lrwxrwxrwx 1 root root 20 Mar 20 15:14 /usr/share/dcache/plugins/monitor-5.0.8 -> /data1/monitor-5.0.8/
[root@t3fs01 ~]# ll /usr/share/dcache/plugins/monitor-5.0.8/                                                                                                                                                                                                                                                                                                                                
total 83
-rw-r--r-- 1 root dcache 42804 Sep 22  2014 LICENSE.txt
-rw-r--r-- 1 root dcache  1908 Sep 22  2014 README.md
-rw-r--r-- 1 root dcache 35272 Sep 23  2014 monitor-5.0.8.jar
-r--r--r-- 1 root dcache   133 Feb 19 14:59 myplugin.properties

# dCache Logs
/var/log/dcache/t3fs01-Domain-gsidcap.log
/var/log/dcache/t3fs01-Domain-gsiftp.log
/var/log/dcache/t3fs01-Domain-pool-t3fs01_ops.log
/var/log/dcache/t3fs01-Domain-pool-t3fs01_cms.log
/var/log/dcache/t3fs01-Domain-pool.log
/var/log/dcache/t3fs01-Domain-dcap.log
/var/log/dcache/t3fs01-Domain-gridftp.log


# dCache GSI layer
root@t3fs01 $ ls -l /etc/grid-security/  
total 267
drwxr-xr-x   2 root     root        1202 Nov 26 13:58 certificates  <-- CRLs must be kept up to date
-rw-r--r--   1 dcache   root        1880 Apr  5  2013 hostcert.pem
-r--------   1 dcache   root        1679 Jul 21  2009 hostkey.pem
drwxr-x---   2 root     nagios         3 Nov 20 13:53 nagios  <-----   hostcert.pem -> /etc/grid-security/hostcert.pem

# Nagios checks
/opt/csw/etc/nrpe.cfg.d/check_dcache_cms_queues.cfg  <-- each dCache file server checks the status of its I/O queues
/opt/csw/etc/nrpe.cfg.d/check_fmd_output.cfg
/opt/csw/etc/nrpe.cfg.d/check_dsvclockd.cfg
/opt/csw/etc/nrpe.cfg.d/check_statd.cfg
/opt/csw/etc/nrpe.cfg.d/check_nfsd.cfg
/opt/csw/etc/nrpe.cfg.d/check_file_age_cern_crl.cfg
/opt/csw/etc/nrpe.cfg.d/check_X509.cfg
/opt/csw/etc/nrpe.cfg.d/check_rpc.bootparamd.cfg
/opt/csw/etc/nrpe.cfg.d/check_gmond.cfg
/opt/csw/etc/nrpe.cfg.d/check_syslogd.cfg
/opt/csw/etc/nrpe.cfg.d/check_rpcbind.cfg
/opt/csw/etc/nrpe.cfg.d/check_in.dhcpd.cfg
/opt/csw/etc/nrpe.cfg.d/check_mountd.cfg
/opt/csw/etc/nrpe.cfg.d/check_zfs_rpool.cfg
/opt/csw/etc/nrpe.cfg.d/check_cron.cfg
/opt/csw/etc/nrpe.cfg.d/check_root_disk.cfg
/opt/csw/etc/nrpe.cfg.d/check_mem.cfg
/opt/csw/etc/nrpe.cfg.d/check_swap.cfg
/opt/csw/etc/nrpe.cfg.d/check_zfs_data1.cfg
/opt/csw/etc/nrpe.cfg.d/check_ntp_time.cfg

/data1/dcache/etc/dcache.conf

The same file as NodeTypedCacheStorageElement#etc_dcache_dcache_conf

/data1/dcache/etc/layouts/t3fs01.conf


# Puppet Managed File 

dcache.log.level.file=debug

[${host.name}-Domain-pool-t3fs01_cms]
#dcache.log.file=${dcache.log.dir}/${dcache.domain.name}.${pool.name}.log
# t3fs01_cms
[${host.name}-Domain-pool-t3fs01_cms/pool]
pool.name=t3fs01_cms
pool.path=/data1/t3fs01_cms
pool.wait-for-files=${pool.path}/data

[${host.name}-Domain-pool-t3fs01_ops]
# t3fs01_ops
[${host.name}-Domain-pool-t3fs01_ops/pool]
pool.name=t3fs01_ops
pool.path=/data1/t3fs01_ops
pool.wait-for-files=${pool.path}/data

[${host.name}-Domain-dcap]
[${host.name}-Domain-dcap/dcap]
dcap.authz.anonymous-operations=READONLY

[${host.name}-Domain-gsidcap]
[${host.name}-Domain-gsidcap/dcap]
dcap.authn.protocol=gsi

[${host.name}-Domain-gsiftp]
[${host.name}-Domain-gsiftp/ftp]
ftp.authn.protocol=gsi
ftp.enable.overwrite=false
ftp.mover.queue=wan

/usr/share/dcache/plugins/monitor-5.0.8/myplugin.properties

Xrootd traffic is published to xrootd.t2.ucsd.edu by the following plugin :
[root@t3fs01 ~]# find /usr/share/dcache/plugins/monitor-5.0.8/                                                                                                                                                                                                                                                                                                                              
/usr/share/dcache/plugins/monitor-5.0.8/
/usr/share/dcache/plugins/monitor-5.0.8/README.md
/usr/share/dcache/plugins/monitor-5.0.8/myplugin.properties
/usr/share/dcache/plugins/monitor-5.0.8/LICENSE.txt
/usr/share/dcache/plugins/monitor-5.0.8/monitor-5.0.8.jar

# cat /usr/share/dcache/plugins/monitor-5.0.8/myplugin.properties
pool.mover.xrootd.plugins=edu.uchicago.monitor
site=T3_CH_PSI
summary=xrootd.t2.ucsd.edu:9931:60
detailed=xrootd.t2.ucsd.edu:9930:60

lsof -Pnl +M -i4

COMMAND   PID     USER   FD   TYPE             DEVICE   SIZE/OFF NODE NAME
gmond     262    60001    4u  IPv4 0xffffffffb702d200  0xb115da4  UDP 192.33.123.41:32800
gmond     262    60001    5u  IPv4 0xffffffffb5a5ae00  0xb115d64  UDP 192.33.123.41:32804
rpcbind   477        1    3u  IPv4 0xffffffffb702de00        0t0  UDP *:111[rpcbind]
rpcbind   477        1    4u  IPv4 0xffffffffb702dc00        0t0  UDP *:*
rpcbind   477        1    5u  IPv4 0xffffffffb702da00        0t0  UDP *:32773
rpcbind   477        1    6u  IPv4 0xffffffffa15e04c0        0t0  TCP *:111[rpcbind] (LISTEN)
rpcbind   477        1    7u  IPv4 0xffffffffa15dfe00        0t0  TCP *:* (IDLE)
statd     495        1    4u  IPv4 0xffffffffb4371200        0t0  UDP *:*
statd     495        1    5u  IPv4 0xffffffffb4371400        0t0  UDP *:32774[status]
statd     495        1    7u  IPv4 0xffffffffa15e1240        0t0  TCP *:32771[status] (LISTEN)
lockd     511        1    5u  IPv4 0xffffffff9f00ba00        0t0  UDP *:4045[nlockmgr]
lockd     511        1    7u  IPv4 0xffffffffa15df080        0t0  TCP *:4045[nlockmgr] (LISTEN)
inetd     555        0   16u  IPv4 0xffffffff9f00b800        0t0  UDP *:32775[rstatd]
inetd     555        0   20u  IPv4 0xffffffffa15d3e40        0t0  TCP *:32774[rusersd] (LISTEN)
inetd     555        0   22u  IPv4 0xffffffffb4371800        0t0  UDP *:32776[rusersd]
inetd     555        0   24u  IPv4 0xffffffffb702d600        0t0  UDP *:32777[rquotad]
inetd     555        0   26u  IPv4 0xfffffe851552bdc0        0t0  TCP *:6481 (LISTEN)
smcboot   576        0    5u  IPv4 0xffffffffa15d5940        0t0  TCP 127.0.0.1:5987 (LISTEN)
smcboot   576        0    6u  IPv4 0xffffffffa15d5280        0t0  TCP 127.0.0.1:898 (LISTEN)
smcboot   576        0    9u  IPv4 0xffffffffa15d4bc0        0t0  TCP 127.0.0.1:5988 (LISTEN)
smcboot   577        0    5u  IPv4 0xffffffffa15df740        0t0  TCP 127.0.0.1:32772 (LISTEN)
smcboot   578        0    5u  IPv4 0xffffffffa15d4500        0t0  TCP 127.0.0.1:32773 (LISTEN)
snmpdx    687        0    4u  IPv4 0xffffffffb5d10e00        0t0  UDP *:16161
snmpdx    687        0    5u  IPv4 0xffffffffb5d10c00        0t0  UDP *:32781
snmpdx    687        0    6u  IPv4 0xffffffffb5d10a00        0t0  UDP *:32782
dmispd    697        0    3u  IPv4 0xffffffffb5d10800        0t0  UDP *:32783[300598]
dmispd    697        0    4u  IPv4 0xffffffffa15c9880        0t0  TCP *:32775[300598] (LISTEN)
java      748    60002    7u  IPv4 0xffffffffa15c8440        0t0  TCP 127.0.0.1:6788 (LISTEN)
java      748    60002    8u  IPv4 0xffffffffa15c7000        0t0  TCP *:32786 (BOUND)
java      748    60002   10u  IPv4 0xffffffffbb1c0200        0t0  TCP 127.0.0.1:6789 (LISTEN)
java      748    60002   16u  IPv4 0xffffffffbb1bf480        0t0  TCP 127.0.0.1:32784 (LISTEN)
xntpd     751        0   19u  IPv4 0xffffffffb4371c00        0t0  UDP *:123
xntpd     751        0   20u  IPv4 0xffffffffb5a5ac00        0t0  UDP 127.0.0.1:123
xntpd     751        0   21u  IPv4 0xffffffffb5a5aa00        0t0  UDP 192.33.123.41:123
sshd      755        0    9u  IPv4 0xfffffe85b9a00240        0t0  TCP 127.0.0.1:6010 (LISTEN)
java     3386      513    5u  IPv4 0xfffffe851a78c500        0t0  TCP 192.33.123.41:57201->192.33.123.26:9867 (ESTABLISHED)
java     3386      513    6u  IPv4 0xffffffffb5a5a800        0t0  UDP *:39722
java     3386      513    8u  IPv4 0xfffffe85487901c0 0xaa1f6de1  TCP 192.33.123.41:57205->192.33.123.24:11111 (ESTABLISHED)
java     3386      513    9u  IPv4 0xfffffe84dff3ce00        0t0  TCP *:22160 (LISTEN)
java     3386      513   50u  IPv4 0xffffffff9f00b200        0t0  UDP *:39726
java     3386      513   75u  IPv4 0xfffffe852a47be00        0t0  UDP *:39727
java     3386      513   80u  IPv4 0xfffffe85252344c0        0t0  TCP *:21840 (LISTEN)
java     3432      513    5u  IPv4 0xfffffe85ca30bb40        0t0  TCP 192.33.123.41:57198->192.33.123.26:9867 (ESTABLISHED)
java     3432      513    6u  IPv4 0xfffffe84de9be800        0t0  UDP *:39719
java     3432      513    8u  IPv4 0xfffffe851a78be40 0x174a0489  TCP 192.33.123.41:57202->192.33.123.24:11111 (ESTABLISHED)
java     3432      513    9u  IPv4 0xfffffe8536d198c0        0t0  TCP *:23243 (LISTEN)
java     3432      513   50u  IPv4 0xffffffff9f00b400        0t0  UDP *:39724
java     3432      513   75u  IPv4 0xfffffe852a47b800        0t0  UDP *:39725
java     3477      513    5u  IPv4 0xfffffe851552cb40        0t0  TCP 192.33.123.41:57199->192.33.123.26:9867 (ESTABLISHED)
java     3477      513    6u  IPv4 0xffffffffb4371a00        0t0  UDP *:39720
java     3477      513    8u  IPv4 0xfffffe84dff3e900        0t0  TCP *:22125 (LISTEN)
java     3477      513    9u  IPv4 0xfffffe85296a98c0 0x19304a4e  TCP 192.33.123.41:57203->192.33.123.24:11111 (ESTABLISHED)
java     3524      513    5u  IPv4 0xffffffffbdd05440        0t0  TCP 192.33.123.41:57196->192.33.123.26:9867 (ESTABLISHED)
java     3524      513    6u  IPv4 0xfffffe852a47b000        0t0  UDP *:39718
java     3524      513    8u  IPv4 0xfffffe850dd07240 0x196148e3  TCP 192.33.123.41:57197->192.33.123.24:11111 (ESTABLISHED)
java     3524      513    9u  IPv4 0xfffffe852ade9e40        0t0  TCP *:22128 (LISTEN)
java     3571      513    5u  IPv4 0xffffffffbdd05b00        0t0  TCP 192.33.123.41:57200->192.33.123.26:9867 (ESTABLISHED)
java     3571      513    6u  IPv4 0xffffffffb5a5a000        0t0  UDP *:39721
java     3571      513    8u  IPv4 0xffffffffbdd061c0 0x2b39b230  TCP 192.33.123.41:57204->192.33.123.24:11111 (ESTABLISHED)
java     3571      513    9u  IPv4 0xfffffe85b2bf7b80        0t0  TCP *:2811 (LISTEN)
java     3571      513   10u  IPv4 0xfffffe857be09b00    0t18085  TCP 192.33.123.41:2811->192.33.123.135:52972 (ESTABLISHED)
java     3571      513   14u  IPv4 0xfffffe857be09440    0t17289  TCP 192.33.123.41:2811->192.33.123.135:59831 (ESTABLISHED)
java     3571      513   15u  IPv4 0xfffffe85ba0818c0        0t0  TCP 192.33.123.41:23479 (LISTEN)
syslogd  4941        0    4u  IPv4 0xffffffffb5d10400        0t0  UDP *:514
syslogd  4941        0    6u  IPv4 0xfffffe84de9be600        0t0  UDP *:60404
syslogd  4941        0    7u  IPv4 0xfffffe852a47b200        0t0  UDP *:60405
syslogd  4941        0    8u  IPv4 0xffffffffb5d10200        0t0  UDP *:60406
nrpe    18273      101    5u  IPv4 0xffffffff9d704b40        0t0  TCP *:5666 (LISTEN)  <-- Nagios NRPE daemon
snmpd   19823        0   15u  IPv4 0xffffffffb702d000        0t0  UDP *:161
snmpd   19823        0   16u  IPv4 0xfffffe856ddaee00        0t0  UDP *:43402
snmpd   19823        0   17u  IPv4 0xffffffffb4371e00        0t0  UDP *:*
nfs4cbd 20571        1    5u  IPv4 0xfffffe850312eb00        0t0  TCP *:46841[1073741824] (LISTEN)
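
To compare the listing above with the firewall table at the top of the page, a small filter can report which of the expected service ports have no listener. This is a sketch: the function name missing_listen_ports is hypothetical, and it only matches the plain ":PORT (LISTEN)" form that lsof prints for the dCache doors in the listing above.

```shell
# Read `lsof -Pnl -i4` style output on stdin and print every port given as
# an argument that has no "(LISTEN)" socket. Hypothetical helper.
missing_listen_ports() {
    lsof_output=$(cat)
    for port in "$@"; do
        printf '%s\n' "$lsof_output" | grep ":$port (LISTEN)" >/dev/null || echo "$port"
    done
}
```

For example, lsof -Pnl +M -i4 | missing_listen_ports 2811 22125 22128 prints nothing when all three doors are listening. The 20000-25000 data range is left out because those ports are allocated dynamically.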

/data1/dcache/bin/dcache services

DOMAIN                        SERVICE CELL            LOG                                               
t3fs01-Domain-pool-t3fs01_cms pool    t3fs01_cms      /var/log/dcache/t3fs01-Domain-pool-t3fs01_cms.log <-- a domain per pool
t3fs01-Domain-pool-t3fs01_ops pool    t3fs01_ops      /var/log/dcache/t3fs01-Domain-pool-t3fs01_ops.log 
t3fs01-Domain-dcap            dcap    DCap-t3fs01     /var/log/dcache/t3fs01-Domain-dcap.log            
t3fs01-Domain-gsidcap         dcap    DCap-gsi-t3fs01 /var/log/dcache/t3fs01-Domain-gsidcap.log         
t3fs01-Domain-gsiftp          ftp     GFTP-t3fs01     /var/log/dcache/t3fs01-Domain-gsiftp.log   

/data1/dcache/bin/dcache pool ls

POOL       DOMAIN                        META SIZE   FREE  PATH              
t3fs01_cms t3fs01-Domain-pool-t3fs01_cms db   14000G 7163G /data1/t3fs01_cms 
t3fs01_ops t3fs01-Domain-pool-t3fs01_ops db   250G   7163G /data1/t3fs01_ops 

Solaris Installation - Example For t3fs08

Solaris 10 1/13 installation

The last Puppet installation was described by tier3-baseclasses.pp plus Sol10_fs26.pp; nowadays these servers run in READ-ONLY mode and we have stopped the Puppet development effort. Fabio uses the aliases below; the Puppet recipes are in puppetdirnodes and the Solaris files are in puppetdirsolaris:
alias dcache='ssh -2 -l admin -p 22224 t3dcachedb.psi.ch'
alias kscustom64='cd /afs/psi.ch/software/linux/dist/scientific/64/custom'
alias ksdir='cd /afs/psi.ch/software/linux/kickstart/configs'
alias puppetdir='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/'
alias puppetdirnodes='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/manifests/nodes'
alias puppetdirredhat='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/modules/Tier3/files/RedHat'
alias puppetdirsolaris='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/modules/Tier3/files/Solaris/5.10'
alias yumdir6='cd /afs/psi.ch/software/linux/dist/scientific/6/scripts'

Remember to erase the existing Puppet keys associated with the X4540 that you're reinstalling from scratch, e.g.:

$ ssh  -XY martinelli_f@psi-puppet4.psi.ch
[martinelli_f@psi-puppet3 ~]$ sudo /usr/sbin/puppetca --clean  t3fs08.psi.ch
t3fs08.psi.ch
notice: Removing file Puppet::SSL::Certificate t3fs08.psi.ch at '/var/puppet/ssl/ca/signed/t3fs08.psi.ch.pem'

If everything works as designed you just have to follow the instructions on NodeTypeJumpStart to reinstall an X4540.

Once Solaris 10 1/13 is installed on the new 16GB Compact Flash Card, tune ZFS with:


zfs set atime=off rpool           
zfs set sync=always rpool

Nagios

To restart the local NRPE daemon after a configuration change, kill it with kill -9; as the session below shows, it comes back with a new PID:
root@t3fs02 $ ps -ef | grep nrpe
  nagios  6793     1   0   Nov 20 ?           0:14 /opt/csw/bin/nrpe -c /opt/csw/etc/nrpe.cfg -d

root@t3fs02 $ kill -9 6793

root@t3fs02 $ ps -ef | grep nrpe
  nagios  15477     1   0 10:41:35 ?           0:00 /opt/csw/bin/nrpe -c /opt/csw/etc/nrpe.cfg -d
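
The PID lookup in the session above can be scripted. A minimal sketch, assuming the daemon path /opt/csw/bin/nrpe shown above; the function name nrpe_pids is hypothetical. It reads ps -ef output on stdin and prints only the PIDs of the NRPE daemon; a plain "grep nrpe" line does not match because the pattern requires the full daemon path.

```shell
# Print the PID column (field 2 of `ps -ef`) for every line whose command
# is the NRPE daemon. Hypothetical helper.
nrpe_pids() {
    awk '/\/opt\/csw\/bin\/nrpe / { print $2 }'
}
```

Typical use is ps -ef | nrpe_pids, followed by the same kill -9 as in the session above.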

SSHd keys

Puppet will upload the server's previous SSH keys; this avoids host key complaints from the ssh clients.

SSHd and TcpWrapper

We prevent SSH logins from unauthorized hosts with TCP wrappers:
cat /etc/hosts.allow
# Puppet Managed File
sshd: t3admin01.psi.ch fabiom-mac.psi.ch wmgt01.psi.ch wmgt02.psi.ch dflt1w.psi.ch localhost t3ossec.psi.ch t3nagios.psi.ch t3fs01.psi.ch t3fs02.psi.ch t3fs03.psi.ch t3fs04.psi.ch t3fs07.psi.ch t3fs08.psi.ch t3fs09.psi.ch t3fs10.psi.ch t3fs11.psi.ch 

ZFS setup for data1 partition

Warning ! Warning !
CREATE data1 ONLY IF data1 DOESN'T EXIST !
IN REAL LIFE data1 WILL ALREADY EXIST SO RUN zpool import data1 INSTEAD AND NEITHER CREATE data1 NOR ALTER ITS ZFS PROPERTIES !!


zpool create -f data1 raidz2  c1t0d0 c1t5d0 c2t2d0 c2t7d0 c3t4d0 c4t1d0 c4t6d0 c5t3d0 c6t0d0
zpool add -f data1 raidz2 c1t1d0 c1t6d0 c2t3d0 c3t0d0 c3t5d0 c4t2d0 c4t7d0 c5t4d0 c6t1d0
zpool add -f data1 raidz2 c1t2d0 c1t7d0 c2t4d0 c3t1d0 c3t6d0 c4t3d0 c5t0d0 c5t5d0 c6t2d0
zpool add -f data1 raidz2 c1t3d0 c2t0d0 c2t5d0 c3t2d0 c3t7d0 c4t4d0 c5t1d0 c5t6d0 c6t3d0
zpool add -f data1 raidz2 c1t4d0 c2t1d0 c2t6d0 c3t3d0 c4t0d0 c4t5d0 c5t2d0 c5t7d0 c6t4d0
zpool add -f data1 spare c6t7d0 c6t6d0 c6t5d0
# ZFS tuning 
zfs create data1/t3fs08_cms
zfs create data1/t3fs08_ops
zfs set quota=30TB data1/t3fs08_cms
zfs set quota=1GB data1/t3fs08_ops
zfs set recordsize=1024K data1
zfs set devices=off data1
zfs set atime=off data1
# exec=on is needed only if you relocate /opt/csw into /data1 to spare the weak Compact Flash Cards;
# otherwise exec=off is safer because /data1 shouldn't contain executables
zfs set exec=on data1
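
The warning above boils down to checking the state of data1 before touching it. The following sketch reports that state; the function name data1_state and its three output words are hypothetical, and the parsing assumes that zpool import without arguments lists importable pools on lines of the form "pool: NAME". It was not tested on Solaris 10.

```shell
# Report whether the pool is imported, importable or absent.
# Hypothetical helper; it never creates or modifies anything.
data1_state() {
    pool=${1:-data1}
    if zpool list -H -o name "$pool" >/dev/null 2>&1; then
        echo imported        # already there: do not create, do not alter
    elif zpool import 2>/dev/null | grep "pool: $pool\$" >/dev/null; then
        echo importable      # run: zpool import data1
    else
        echo absent          # only now are the zpool create lines relevant
    fi
}
```

Run data1_state before anything else in this chapter; only the answer "absent" justifies the zpool create commands.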

GRID PKI infrastructure

A Puppet run will upload :
  • /etc/grid-security/hostcert.pem
  • /etc/grid-security/hostkey.pem
  • and the tool /opt/fetch-crl/fetch-crl needed to update the lcg-CA CRLs daily.

Only the first time, to upload the lcg-CA files into /etc/grid-security/certificates, connect to t3admin01 and run /root/clusteradmin/sync_cacerts_tofs.singlet3fs.sh t3fs08

The CA CRL files transferred from t3admin01 will be up to date, because on t3admin01 a cron job regularly refreshes them and a Nagios check verifies their freshness. After this first manual upload, a root crontab entry created by Puppet invokes /opt/fetch-crl/fetch-crl daily on t3fs08; it is shown further down on this page.

NTP time server

Make sure that the time service is running correctly; t3nagios checks it constantly. The automatic Solaris 10 1/13 installation made by t3jumpstart takes care of ntp:
-bash-3.2# svcs ntp
STATE          STIME    FMRI
online         Oct_02   svc:/network/ntp:default

The configuration file for xntpd is found at /etc/inet/ntp.conf :

/etc/inet/ntp.conf

# NOTE: This file is managed through puppet
# If you edit this file locally, it will be replaced in
# the next puppet run
#
# File is located at
# $Id: NodeTypedCacheSolaris.txt,v 1.55 2016/11/04 11:07:36 fabiom Exp $
# $URL: svn+ssh://savannah01.psi.ch/repos/tier3/tier3/puppet/TRUNK/modules/Tier3/files/Solaris/5.10/etc/inet/ntp.conf $
# as produced by a fresh Solaris 10 jumpstart install
#server 192.33.126.10 prefer
driftfile /var/ntp/ntp.drift
statsdir /var/ntp/ntpstats
filegen peerstats file peerstats type day enable
filegen loopstats file loopstats type day enable
filegen clockstats file clockstats type day enable
server   dmztime1.psi.ch
restrict dmztime1.psi.ch noquery nomodify
server   dmztime2.psi.ch
restrict dmztime2.psi.ch noquery nomodify

Java JDK

JDK7 is a requirement for dCache 2.10; the automatic Solaris 10 1/13 installation performed by t3jumpstart will take care of JDK7:
-bash-3.2# which java
/usr/bin/java
-bash-3.2# ls -l /usr/bin/java
lrwxrwxrwx   1 root     other         16 Oct  2 15:57 /usr/bin/java -> ../java/bin/java
[root@t3fs01 ~]# ll /usr/java                                                                                                                                                                                                                                                                                                                                                               
lrwxrwxrwx 1 root root 27 Mar 20 13:29 /usr/java -> /usr/jdk/instances/jdk1.7.0/
[root@t3fs01 ~]# ll  /usr/jdk/instances/jdk1.7.0/                                                                                                                                                                                                                                                                                                                                           
total 19749
-rw-r--r-- 1 root bin     3339 Dec 19 03:39 COPYRIGHT
-rw-r--r-- 1 root bin       40 Dec 19 03:39 LICENSE
-rw-r--r-- 1 root bin      114 Dec 19 03:39 README.html
-rw-r--r-- 1 root bin   173559 Dec 19 03:40 THIRDPARTYLICENSEREADME.txt
drwxr-xr-x 2 root bin       46 Mar 20 12:35 bin/
drwxr-xr-x 4 root bin        9 Mar 20 12:35 db/
drwxr-xr-x 3 root bin        9 Mar 20 12:35 include/
drwxr-xr-x 5 root bin       10 Mar 20 12:35 jre/
drwxr-xr-x 4 root bin       11 Mar 20 12:35 lib/
drwxr-xr-x 6 root bin        6 Mar 20 12:35 man/
-rw-r--r-- 1 root bin 19914779 Dec 19 03:40 src.zip
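
A quick way to confirm the JDK7 requirement after a reinstall is to look at the version banner. A minimal sketch, assuming the usual 'java version "1.7.0_NN"' banner that java -version prints on stderr; the function name java_is_jdk7 is hypothetical.

```shell
# Succeed if the `java -version` banner read from stdin reports a 1.7 JDK.
# Hypothetical helper.
java_is_jdk7() {
    grep 'version "1\.7\.' >/dev/null
}
```

Usage: java -version 2>&1 | java_is_jdk7 && echo "JDK7 OK"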

pkgutil

The Solaris 10 1/13 installation made by t3jumpstart automatically takes care of pkgutil and a set of useful packages. pkgutil (http://www.opencsw.org/package/pkgutil/) is a must on Solaris if you want your daily Linux tools there too.

If you need to install the packages by hand, this is the list:

/opt/csw/bin/pkgutil -i -y CSWnagiosp 
/opt/csw/bin/pkgutil -i -y CSWnrpe
/opt/csw/bin/pkgutil -i -y CSWruby
/opt/csw/bin/pkgutil -i -y CSWsmartmontools
/opt/csw/bin/pkgutil -i -y CSWwatch
/opt/csw/bin/pkgutil -i -y CSWpstree
/opt/csw/bin/pkgutil -i -y CSWtop 
/opt/csw/bin/pkgutil -i -y CSWiftop
/opt/csw/bin/pkgutil -i -y CSWnfswatch
/opt/csw/bin/pkgutil -i -y CSWnano
/opt/csw/bin/pkgutil -i -y CSWalternatives
/opt/csw/bin/pkgutil -i -y CSWaudiofile
/opt/csw/bin/pkgutil -i -y CSWaugeas
/opt/csw/bin/pkgutil -i -y CSWbash
/opt/csw/bin/pkgutil -i -y CSWbdb47
/opt/csw/bin/pkgutil -i -y CSWbdb48
/opt/csw/bin/pkgutil -i -y CSWbonobo2
/opt/csw/bin/pkgutil -i -y CSWbzip2
/opt/csw/bin/pkgutil -i -y CSWcacertificates
/opt/csw/bin/pkgutil -i -y CSWcas-cpsampleconf
/opt/csw/bin/pkgutil -i -y CSWcas-cptemplates
/opt/csw/bin/pkgutil -i -y CSWcas-crontab
/opt/csw/bin/pkgutil -i -y CSWcas-etcservices
/opt/csw/bin/pkgutil -i -y CSWcas-etcshells
/opt/csw/bin/pkgutil -i -y CSWcas-inetd
/opt/csw/bin/pkgutil -i -y CSWcas-initsmf
/opt/csw/bin/pkgutil -i -y CSWcas-migrateconf
/opt/csw/bin/pkgutil -i -y CSWcas-postmsg
/opt/csw/bin/pkgutil -i -y CSWcas-preserveconf
/opt/csw/bin/pkgutil -i -y CSWcas-pycompile
/opt/csw/bin/pkgutil -i -y CSWcas-texinfo
/opt/csw/bin/pkgutil -i -y CSWcas-usergroup
/opt/csw/bin/pkgutil -i -y CSWcommon
/opt/csw/bin/pkgutil -i -y CSWcoreutils
/opt/csw/bin/pkgutil -i -y CSWcswclassutils
/opt/csw/bin/pkgutil -i -y CSWdbusglib
/opt/csw/bin/pkgutil -i -y CSWelinks
/opt/csw/bin/pkgutil -i -y CSWemacs
/opt/csw/bin/pkgutil -i -y CSWemacsbincommon
/opt/csw/bin/pkgutil -i -y CSWemacschooser
/opt/csw/bin/pkgutil -i -y CSWemacscommon
/opt/csw/bin/pkgutil -i -y CSWesound
/opt/csw/bin/pkgutil -i -y CSWexpat
/opt/csw/bin/pkgutil -i -y CSWfconfig
/opt/csw/bin/pkgutil -i -y CSWfindutils
/opt/csw/bin/pkgutil -i -y CSWfontconfig
/opt/csw/bin/pkgutil -i -y CSWfreeglut
/opt/csw/bin/pkgutil -i -y CSWftype2
/opt/csw/bin/pkgutil -i -y CSWgawk
/opt/csw/bin/pkgutil -i -y CSWgcc3corert
/opt/csw/bin/pkgutil -i -y CSWgconf2
/opt/csw/bin/pkgutil -i -y CSWgcpio
/opt/csw/bin/pkgutil -i -y CSWgcrypt
/opt/csw/bin/pkgutil -i -y CSWgdbm
/opt/csw/bin/pkgutil -i -y CSWgdkpixbuf
/opt/csw/bin/pkgutil -i -y CSWggettext
/opt/csw/bin/pkgutil -i -y CSWggettext-data
/opt/csw/bin/pkgutil -i -y CSWggettextrt
/opt/csw/bin/pkgutil -i -y CSWggrep
/opt/csw/bin/pkgutil -i -y CSWgio-fam-backend
/opt/csw/bin/pkgutil -i -y CSWgit
/opt/csw/bin/pkgutil -i -y CSWgit-emacs
/opt/csw/bin/pkgutil -i -y CSWgit-gui
/opt/csw/bin/pkgutil -i -y CSWglib2
/opt/csw/bin/pkgutil -i -y CSWgnomekeyring
/opt/csw/bin/pkgutil -i -y CSWgnomevfs2
/opt/csw/bin/pkgutil -i -y CSWgnupg
/opt/csw/bin/pkgutil -i -y CSWgpg-error
/opt/csw/bin/pkgutil -i -y CSWgpgerr
/opt/csw/bin/pkgutil -i -y CSWgsed
/opt/csw/bin/pkgutil -i -y CSWgtar
/opt/csw/bin/pkgutil -i -y CSWgtk2
/opt/csw/bin/pkgutil -i -y CSWgtk2-printbackends-file
/opt/csw/bin/pkgutil -i -y CSWgtk2-printbackends-papi
/opt/csw/bin/pkgutil -i -y CSWgvim
/opt/csw/bin/pkgutil -i -y vim 
/opt/csw/bin/pkgutil -i -y CSWgzip
/opt/csw/bin/pkgutil -i -y CSWhicoloricontheme
/opt/csw/bin/pkgutil -i -y CSWiconv
/opt/csw/bin/pkgutil -i -y CSWiftop
/opt/csw/bin/pkgutil -i -y CSWiozone
/opt/csw/bin/pkgutil -i -y CSWipython
/opt/csw/bin/pkgutil -i -y CSWisaexec
/opt/csw/bin/pkgutil -i -y CSWjbigkit
/opt/csw/bin/pkgutil -i -y CSWjpeg
/opt/csw/bin/pkgutil -i -y CSWkrb5lib
/opt/csw/bin/pkgutil -i -y CSWlsof
# Perl
/opt/csw/bin/pkgutil -i -y CSWpm-compress-raw-bzip2
/opt/csw/bin/pkgutil -i -y CSWpm-compress-raw-zlib
/opt/csw/bin/pkgutil -i -y CSWpm-html-parser
/opt/csw/bin/pkgutil -i -y CSWpm-html-tagset
/opt/csw/bin/pkgutil -i -y CSWpm-io-compress
/opt/csw/bin/pkgutil -i -y CSWpm-libwww-perl
/opt/csw/bin/pkgutil -i -y CSWpm-mime-base64
/opt/csw/bin/pkgutil -i -y CSWpm-uri
/opt/csw/bin/pkgutil -i -y CSWpmbutils
/opt/csw/bin/pkgutil -i -y CSWpmdatemanip
/opt/csw/bin/pkgutil -i -y CSWpmfontafm
/opt/csw/bin/pkgutil -i -y CSWpmhtmlfmt
/opt/csw/bin/pkgutil -i -y CSWpmhtmlformat
/opt/csw/bin/pkgutil -i -y CSWpmhtmlparser
/opt/csw/bin/pkgutil -i -y CSWpmhtmltagset
/opt/csw/bin/pkgutil -i -y CSWpmhtmltree
/opt/csw/bin/pkgutil -i -y CSWpmiocompress
/opt/csw/bin/pkgutil -i -y CSWpmmimebase64
/opt/csw/bin/pkgutil -i -y CSWpmuri

Cron Jobs

We regularly update the /etc/grid-security/certificates folder by using the tool /opt/fetch-crl/fetch-crl:
crontab -l 
10 3 * * * /usr/sbin/logadm
15 3 * * 0 /usr/lib/fs/nfs/nfsfind
30 3 * * * [ -x /usr/lib/gss/gsscred_clean ] && /usr/lib/gss/gsscred_clean

1 2 * * * [ -x /usr/sbin/rtc ] && /usr/sbin/rtc -c > /dev/null 2>&1
43 3 * * * [ -x /opt/csw/bin/gupdatedb ] && /opt/csw/bin/gupdatedb --prunepaths="/dev /devices /proc /tmp /var/tmp" 1>/dev/null 2>&1 # Added by CSWfindutils
# Puppet Name: fetch-crl
10 22 * * * /opt/fetch-crl/fetch-crl -c /opt/fetch-crl/fetch-crl.cnf -v  2>&1 | /usr/bin/tee /var/cron/fetch-crl.log 2>&1
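
To spot a fetch-crl job that has silently stopped working, you can look for CRL files that were not rewritten recently. A minimal sketch, assuming the CRLs are the *.r0 files in /etc/grid-security/certificates; the function name stale_crls and the 2-day default are hypothetical.

```shell
# List CRL files (*.r0) in DIR that are older than DAYS days.
# find's "-mtime +N" matches files whose age in whole days exceeds N.
# Hypothetical helper.
stale_crls() {
    dir=${1:-/etc/grid-security/certificates}
    days=${2:-2}
    find "$dir" -name '*.r0' -mtime +"$days" -print
}
```

An empty output from stale_crls means every CRL was refreshed within the window; any path printed deserves a look at /var/cron/fetch-crl.log.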

Backups

We take ZFS snapshots only for the OS; /data1/ is not protected by snapshots.
NodeTypeForm
Hostnames                     [t3fs02 - t3fs04, t3fs07 - t3fs11] READ-ONLY !!
Services                      dcache pool cells, gridftp, dcap, gsidcap
Hardware                      SUN X4500 (2*Opt 290, 16GB RAM, 48*500GB SATA) / SUN X4540 (2*Opt 2435, 32GB RAM, 48*1TB SATA + 16GB Flash)
Install Profile               dcachefs
Guarantee/maintenance until   t3fs01-04: 2011-06-02, t3fs07-12: 2011-12-17 (X4540 only 2 years)
Topic revision: r55 - 2016-11-04 - FabioMartinelli
 