Firewall requirements

| local port | open to | reason |
| 3128/tcp | 192.33.123.0/24 | T3 local Squid access |
| 3401/udp | 128.142.0.0/16, 188.185.0.0/17 | CMS central Squid monitoring based on SNMP |
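Assuming a plain iptables host firewall (the actual PSI ruleset may differ), the table translates to rules along these lines:

```shell
# Hypothetical iptables rules matching the table above; adapt the chain
# name and rule order to the local firewall policy before use.
iptables -A INPUT -p tcp --dport 3128 -s 192.33.123.0/24 -j ACCEPT  # T3 local Squid access
iptables -A INPUT -p udp --dport 3401 -s 128.142.0.0/16  -j ACCEPT  # CERN SNMP monitoring
iptables -A INPUT -p udp --dport 3401 -s 188.185.0.0/17  -j ACCEPT  # CERN SNMP monitoring
```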
Regular Maintenance work
Updating Frontier
t3nagios checks whether new Frontier RPMs are available. If there are, then during a T3 downtime stop squid and update the package:
[root@t3frontier01 ~]# yum --disablerepo=* --enablerepo=cern-frontier update
Loaded plugins: downloadonly, priorities, security, versionlock
Setting up Update Process
Resolving Dependencies
--> Running transaction check
---> Package frontier-squid.x86_64 11:2.7.STABLE9-20.1 will be updated
---> Package frontier-squid.x86_64 11:2.7.STABLE9-21.1 will be an update
--> Finished Dependency Resolution
Dependencies Resolved
================================================================================================================================================================================================================================================================================
Package Arch Version Repository Size
================================================================================================================================================================================================================================================================================
Updating:
frontier-squid x86_64 11:2.7.STABLE9-21.1 cern-frontier 835 k
Transaction Summary
================================================================================================================================================================================================================================================================================
Upgrade 1 Package(s)
Total download size: 835 k
Is this ok [y/N]:
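Putting the steps together, the whole procedure would look roughly like this (a sketch; run as root during a T3 downtime, using the frontier-squid init script this service ships with):

```shell
# Sketch of the update procedure: stop squid, update only from the
# cern-frontier repo, restart, then verify the processes are back.
/etc/init.d/frontier-squid stop
yum --disablerepo='*' --enablerepo=cern-frontier update frontier-squid
/etc/init.d/frontier-squid start
pgrep -fl squid
```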
Emergency Measures
If t3frontier01 goes down, the CMS jobs will fall back to the CERN backup Squid frontiers and the CVMFS clients will keep working from their local caches, so there should be enough time to fix this VM.
Services
[root@t3frontier01 ~]# lsof -u squid -P
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
squid 13103 squid cwd DIR 8,2 4096 130564 /root
squid 13103 squid rtd DIR 8,2 4096 2 /
squid 13103 squid txt REG 8,2 811848 167040 /usr/sbin/squid (deleted)
squid 13103 squid mem REG 8,2 156928 263334 /lib64/ld-2.12.so
squid 13103 squid mem REG 8,2 1926680 263335 /lib64/libc-2.12.so
squid 13103 squid mem REG 8,2 145896 263336 /lib64/libpthread-2.12.so
squid 13103 squid mem REG 8,2 599384 263357 /lib64/libm-2.12.so
squid 13103 squid mem REG 8,3 217016 189 /var/db/nscd/group
squid 13103 squid mem REG 8,3 217016 190 /var/db/nscd/hosts
squid 13103 squid 0u CHR 1,3 0t0 3656 /dev/null
squid 13103 squid 1u CHR 1,3 0t0 3656 /dev/null
squid 13103 squid 2u CHR 1,3 0t0 3656 /dev/null
squid 13103 squid 3u unix 0xffff8803381e96c0 0t0 3500924 socket
squid 13106 squid cwd DIR 8,32 4096 268435584 /home/dbfrontier/squid/squid_cache
squid 13106 squid rtd DIR 8,2 4096 2 /
squid 13106 squid txt REG 8,2 811848 167040 /usr/sbin/squid (deleted)
squid 13106 squid mem REG 8,2 156928 263334 /lib64/ld-2.12.so
squid 13106 squid mem REG 8,2 1926680 263335 /lib64/libc-2.12.so
squid 13106 squid mem REG 8,2 145896 263336 /lib64/libpthread-2.12.so
squid 13106 squid mem REG 8,2 599384 263357 /lib64/libm-2.12.so
squid 13106 squid mem REG 8,3 217016 191 /var/db/nscd/services
squid 13106 squid mem REG 8,3 217016 189 /var/db/nscd/group
squid 13106 squid mem REG 8,3 217016 190 /var/db/nscd/hosts
squid 13106 squid 0u CHR 1,3 0t0 3656 /dev/null
squid 13106 squid 1u CHR 1,3 0t0 3656 /dev/null
squid 13106 squid 2u CHR 1,3 0t0 3656 /dev/null
squid 13106 squid 3u unix 0xffff880338909380 0t0 3500932 socket
squid 13106 squid 4u REG 0,9 0 3654 anon_inode
squid 13106 squid 5u REG 8,7 68 1310729 /home/dbfrontier/squid_logs/cache.log
squid 13106 squid 6u IPv4 3500939 0t0 UDP *:51795 http://linuxplayer.org/2012/02/why-squid-listen-on-high-udp-port-number
squid 13106 squid 7w REG 8,7 29636757 1310737 /home/dbfrontier/squid_logs/access.log
squid 13106 squid 8r FIFO 0,8 0t0 3500940 pipe
squid 13106 squid 9w REG 8,32 26837640 271800258 /home/dbfrontier/squid/squid_cache/swap.state
squid 13106 squid 10u IPv4 3500942 0t0 TCP *:3128 (LISTEN)
squid 13106 squid 11w FIFO 0,8 0t0 3500941 pipe
squid 13106 squid 12u IPv4 3500943 0t0 UDP *:3401 http://etutorials.org/Server+Administration/Squid.+The+definitive+guide/Chapter+14.+Monitoring+Squid/14.3+Using+SNMP/
squid 13106 squid 13u IPv4 15800738 0t0 TCP t3frontier01.psi.ch:3128->t3wn41.psi.ch:39457 (ESTABLISHED)
squid 13106 squid 14u IPv4 15800794 0t0 TCP t3frontier01.psi.ch:3128->t3wn35.psi.ch:49764 (ESTABLISHED)
squid 13106 squid 15u IPv4 15800829 0t0 TCP t3frontier01.psi.ch:3128->t3wn28.psi.ch:44577 (ESTABLISHED)
squid 13106 squid 16u IPv4 15800873 0t0 TCP t3frontier01.psi.ch:3128->t3wn13.psi.ch:41743 (ESTABLISHED)
squid 13106 squid 18u IPv4 15800839 0t0 TCP t3frontier01.psi.ch:37468->cvmfs02.racf.bnl.gov:80 (ESTABLISHED)
squid 13106 squid 19u IPv4 15800932 0t0 TCP t3frontier01.psi.ch:3128->t3ui12.psi.ch:59661 (ESTABLISHED)
squid 13106 squid 21u IPv4 15800934 0t0 TCP t3frontier01.psi.ch:54272->front15.cern.ch:80 (ESTABLISHED)
unlinkd 13107 squid cwd DIR 8,2 4096 130564 /root
unlinkd 13107 squid rtd DIR 8,2 4096 2 /
unlinkd 13107 squid txt REG 8,2 4952 145185 /usr/libexec/squid/unlinkd (deleted)
unlinkd 13107 squid mem REG 8,2 156928 263334 /lib64/ld-2.12.so
unlinkd 13107 squid mem REG 8,2 1926680 263335 /lib64/libc-2.12.so
unlinkd 13107 squid 0r FIFO 0,8 0t0 3500941 pipe
unlinkd 13107 squid 1w FIFO 0,8 0t0 3500940 pipe
unlinkd 13107 squid 2u CHR 1,3 0t0 3656 /dev/null
Installation
Squid Installation
Read the CERN central wiki first. Fabio uses these aliases; do the same. The Puppet recipes are in the directory reached by puppetdirnodes:
alias kscustom64='cd /afs/psi.ch/software/linux/dist/scientific/64/custom'
alias ksdir='cd /afs/psi.ch/software/linux/kickstart/configs'
alias puppetdir='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/'
alias puppetdirnodes='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/manifests/nodes'
alias puppetdirredhat='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/modules/Tier3/files/RedHat'
alias puppetdirsolaris='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/modules/Tier3/files/Solaris/5.10'
alias yumdir6='cd /afs/psi.ch/software/linux/dist/scientific/6/scripts'
Puppet recipes, ordered top-down:
- SL6_frontier.pp
- SL6.pp
- tier3-baseclasses.pp
Squid conf /etc/squid/squid.conf covers:
- CERN monitoring connections by SNMP
- T3 file requests
- the local cache
# grep -v \# /etc/squid/squid.conf | strings
acl NET_LOCAL src 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 192.33.123.0/24 127.0.0.1/32
acl HOST_MONITOR src 127.0.0.1/32 128.142.0.0/16 188.184.128.0/17 188.185.128.0/17 131.225.240.232/32
acl snmppublic snmp_community public
acl all src all
acl manager proto cache_object
acl localhost src 127.0.0.1/32
acl to_localhost dst 127.0.0.0/8 0.0.0.0/32
acl SSL_ports port 443
acl CONNECT method CONNECT
http_access allow manager localhost
http_access deny manager
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
http_access allow NET_LOCAL
http_access allow localhost
http_access deny all
icp_access allow localnet
icp_access deny all
http_port 3128
hierarchy_stoplist cgi-bin
cache_mem 500 MB
maximum_object_size_in_memory 128 KB
cache_dir ufs /home/dbfrontier/squid_cache 25000 16 256
maximum_object_size 1048576 KB
logformat awstats %>a %ui %un [%{%d/%b/%Y:%H:%M:%S}tl.%03tu %{%z}tl] "%rm %ru HTTP/%rv" %Hs %h %{cvmfs-info}>h" "%{Referer}>h" "%{User-Agent}>h"
access_log /var/log/squid/access.log awstats
logfile_daemon /usr/libexec/squid/logfile-daemon
cache_log /var/log/squid/cache.log
cache_store_log none
mime_table /etc/squid/mime.conf
pid_filename /var/run/squid/squid.pid
strip_query_terms off
unlinkd_program /usr/libexec/squid/unlinkd
refresh_pattern ^ftp: 1440 20% 10080
refresh_pattern ^gopher: 1440 0% 1440
refresh_pattern -i /cgi-bin/ 0 0% 0
refresh_pattern . 0 20% 4320
negative_ttl 1 minute
acl shoutcast rep_header X-HTTP09-First-Line ^ICY.[0-9]
upgrade_http0.9 deny shoutcast
acl apache rep_header Server ^Apache
broken_vary_encoding allow apache
collapsed_forwarding on
cache_mgr squid
cache_effective_user squid
cache_effective_group squid
umask 022
snmp_access allow snmppublic HOST_MONITOR
snmp_access deny all
icp_port 0
icon_directory /usr/share/squid/icons
error_directory /usr/share/squid/errors/English
ignore_ims_on_miss on
coredump_dir /home/dbfrontier/squid_cache
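For reference, granting an additional client subnet would mean extending the NET_LOCAL acl; a hypothetical fragment (192.33.124.0/24 is an invented example, not a real PSI range):

```shell
# /etc/squid/squid.conf fragment -- hypothetical extra subnet appended
acl NET_LOCAL src 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 192.33.123.0/24 192.33.124.0/24 127.0.0.1/32
http_access allow NET_LOCAL
```

After editing, the change can be applied without a restart via squid -k reconfigure.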
Registering the local frontier to the central CMS operations
Read https://twiki.cern.ch/twiki/bin/view/CMS/SquidForCMS#Register_Your_Server
Registering the local frontier to the central WLCG operations
Read https://twiki.cern.ch/twiki/bin/view/LCG/WLCGSquidRegistration
The outcome is http://wlcg-squid-monitor.cern.ch/snmpstats/mrtgall/T3_CH_PSI_t3frontier.psi.ch/index.html
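The SNMP endpoint that the central monitor polls can also be queried by hand from an allowed host; .1.3.6.1.4.1.3495.1 is Squid's registered enterprise MIB subtree:

```shell
# Walk squid's SNMP agent on port 3401 with the public community.
snmpwalk -v 2c -c public t3frontier01.psi.ch:3401 .1.3.6.1.4.1.3495.1
```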
Read first http://cernvm.cern.ch/portal/filesystem/techinformation
The T3 CMS Frontier
t3frontier01 also acts as the Squid server for our CVMFS clients (t3ui1*, t3wn*, t3se01, t3cmsvobox02). On their side, the clients use the following files plus the automount service to mount /cvmfs/cms.cern.ch only when it is accessed:
[martinelli_f@t3ui19 nodes]$ grep cvmfs /etc/{passwd,group}
/etc/passwd:cvmfs:x:495:495:CernVM-FS service account:/var/lib/cvmfs:/sbin/nologin
/etc/group:fuse:x:296:cvmfs
/etc/group:cvmfs:x:495:
[martinelli_f@t3ui19 nodes]$ /etc/init.d/autofs status
automount (pid 645) is running...
[martinelli_f@t3ui19 nodes]$ tail -1 /etc/auto.master
/cvmfs /etc/auto.cvmfs
[martinelli_f@t3ui19 nodes]$ cat /etc/cvmfs/config.d/cms.cern.ch.conf
export CMS_LOCAL_SITE <-- important for CMS Jobs
[martinelli_f@t3ui19 nodes]$ cat /etc/cvmfs/default.local
#CVMFS_REPOSITORIES=atlas.cern.ch,atlas-condb.cern.ch,cms.cern.ch,lhcb.cern.ch,hone.cern.ch,grid.cern.ch
CVMFS_REPOSITORIES=cms
#CVMFS_HTTP_PROXY="http://cvmfs.lcg.cscs.ch:3128|http://ppcvmfs.lcg.cscs.ch:3128"
CVMFS_HTTP_PROXY="http://t3frontier.psi.ch:3128"
CVMFS_CACHE_BASE=/scratch/cvmfs_local
CVMFS_QUOTA_LIMIT=30000
CMS_LOCAL_SITE=/cvmfs/cms.cern.ch/SITECONF/T3_CH_PSI <-- CVMFS uses the var CMS_LOCAL_SITE to resolve the Tier1/2/3 agnostic paths /cvmfs/cms.cern.ch/SITECONF/local/{JobConfig,PhEDEx}
[martinelli_f@t3ui19 nodes]$ pgrep -u cvmfs -fl
26697 /usr/bin/cvmfs2 __cachemgr__ . 7 8 31457280000 15728640000 0 3 -1 :
26699 /usr/bin/cvmfs2 __cachemgr__ . 7 8 31457280000 15728640000 0 3 -1 :
26705 /usr/bin/cvmfs2 -o rw,fsname=cvmfs2,allow_other,grab_mountpoint,uid=495,gid=495 cms.cern.ch /cvmfs/cms.cern.ch
26709 /usr/bin/cvmfs2 -o rw,fsname=cvmfs2,allow_other,grab_mountpoint,uid=495,gid=495 cms.cern.ch /cvmfs/cms.cern.ch
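On a client node, the mount and the proxy setting can be verified with cvmfs_config, which ships with the CVMFS client:

```shell
# Probe the repository (this triggers the automount) and show the
# effective proxy and cache parameters.
cvmfs_config probe cms.cern.ch
cvmfs_config showconfig cms.cern.ch | grep -E 'CVMFS_HTTP_PROXY|CVMFS_CACHE_BASE'
```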
t3nagios constantly checks these services.
Service startup/stop
/etc/init.d/frontier-squid start
Testing whether service is running:
pgrep -fl squid
3245 /usr/sbin/squid -DF
3248 (squid) -DF
awstats statistics
Not strictly required by CMS or CVMFS but, like all statistics, useful. Run locally:
[root@t3frontier01 ~]# firefox http://localhost.localdomain/awstats/awstats.pl
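Independently of awstats, a rough per-client request count can be pulled straight from access.log; the awstats logformat above puts the client address in the first field. Sketched here against a fabricated sample file so it can be run anywhere:

```shell
# Count requests per client address (field 1 of the log line).
# The sample log lines below are fabricated for illustration only.
cat > /tmp/sample_access.log <<'EOF'
192.33.123.41 - - [17/Sep/2014:09:46:38 +0200] "GET http://cmsfrontier.cern.ch/x HTTP/1.1" 200
192.33.123.41 - - [17/Sep/2014:09:46:39 +0200] "GET http://cmsfrontier.cern.ch/y HTTP/1.1" 200
192.33.123.35 - - [17/Sep/2014:09:46:40 +0200] "GET http://cmsfrontier.cern.ch/x HTTP/1.1" 200
EOF
awk '{count[$1]++} END {for (ip in count) print count[ip], ip}' /tmp/sample_access.log | sort -rn
```

On the real server, point the same one-liner at /home/dbfrontier/squid_logs/access.log.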
Squid Testing
To be moved into t3nagios.
Read first https://twiki.cern.ch/twiki/bin/viewauth/CMS/SAMSquidByHand ; then, in nagios@t3wn41:/opt/nagios/test_squid, there is a .sh script that has to return OK:
-bash-4.1$ /opt/nagios/test_squid/test_squid.py.sh
node: t3wn41.psi.ch
SiteLocalConfig: /swshare/cms/SITECONF/local/JobConfig/site-local-config.xml
Contents of site-local-config.xml are:
<site-local-config>
<site name="T3_CH_PSI">
<event-data>
<catalog url="trivialcatalog_file:/swshare/cms/SITECONF/local/PhEDEx/storage.xml?protocol=dcap"/>
</event-data>
<source-config>
<cache-hint value="application-only"/>
<read-hint value="auto-detect"/>
<statistics-destination name="cms-udpmon-collector.cern.ch:9331" />
</source-config>
<local-stage-out>
<command value="srmv2"/>
<catalog url="trivialcatalog_file:/experiment-software/cms/SITECONF/local/PhEDEx/storage.xml?protocol=srmv2"/>
<se-name value="t3se01.psi.ch"/>
<option value="-debug"/>
<phedex-node value="T3_CH_PSI"/>
</local-stage-out>
<calib-data>
<frontier-connect>
<proxy url="http://t3frontier.psi.ch:3128"/>
<backupproxy url="http://cmsbpfrontier.cern.ch:3128"/>
<backupproxy url="http://cmsbproxy.fnal.gov:3128"/>
<server url="http://cmsfrontier.cern.ch:8000/FrontierInt"/>
<server url="http://cmsfrontier1.cern.ch:8000/FrontierInt"/>
<server url="http://cmsfrontier2.cern.ch:8000/FrontierInt"/>
<server url="http://cmsfrontier3.cern.ch:8000/FrontierInt"/>
</frontier-connect>
</calib-data>
</site>
</site-local-config>
site: T3_CH_PSI
loadtag: None
script version: $Id: NodeTypeCmsFrontier.txt,v 1.20 2014/12/16 18:08:55 fabiom Exp $
Using Frontier URL: http://cmsfrontier.cern.ch:8080/FrontierProd/Frontier
Query: SELECT 1 FROM DUAL
Query started: 09/17/14 09:46:38 CEST
squid: http://t3frontier.psi.ch:3128
Frontier Request:
http://cmsfrontier.cern.ch:8080/FrontierProd/Frontier?type=frontier_request:1:DEFAULT&encoding=BLOB&p1=eNoLdvVxdQ5RMFRwC/L3VXAJdfQBACyLBKw=
Query ended: 09/17/14 09:46:38 CEST
Query time: 0.04 [seconds]
Query result:
<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE frontier SYSTEM "http://frontier.fnal.gov/frontier.dtd">
<frontier version="3.33" xmlversion="1.0">
<transaction payloads="1">
<payload type="frontier_request" version="1" encoding="BLOB">
<data>BgAAAAExBgAAAAZOVU1CRVIHBgAAAAExBw==</data>
<quality error="0" md5="2e47f41c56b898fb582b7ecf1e8686cc" records="1" full_size="25"/>
</payload>
</transaction>
</frontier>
Fields:
1 NUMBER
Records:
1
OK
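The same round trip can be approximated with a plain HTTP client forced through the local squid, a by-hand variant of the test above (the query URL is the Frontier request shown in the transcript):

```shell
# Send the Frontier test query through the local squid; a working proxy
# returns the same small XML payload seen in the transcript above.
export http_proxy=http://t3frontier.psi.ch:3128
curl 'http://cmsfrontier.cern.ch:8080/FrontierProd/Frontier?type=frontier_request:1:DEFAULT&encoding=BLOB&p1=eNoLdvVxdQ5RMFRwC/L3VXAJdfQBACyLBKw='
```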
Remote Monitoring vs PSI
Backups
OS snapshots are taken nightly by the PSI VMware team (e.g. Peter Huesser); in addition, LinuxBackupsByLegato lets us recover single files.