Firewall requirements

| local port | open to | reason |
| 3128/tcp | 192.33.123.0/24 | T3 local Squid access |
| 3401/udp | 128.142.0.0/16, 188.185.0.0/17 | CMS central Squid monitoring based on SNMP |
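Assuming a plain iptables host firewall (the actual PSI ruleset may differ), the table translates to rules along these lines:

```shell
# Hypothetical iptables rules matching the table above; adapt the chain
# name and rule order to the local firewall policy before use.
iptables -A INPUT -p tcp --dport 3128 -s 192.33.123.0/24 -j ACCEPT  # T3 local Squid access
iptables -A INPUT -p udp --dport 3401 -s 128.142.0.0/16  -j ACCEPT  # CERN SNMP monitoring
iptables -A INPUT -p udp --dport 3401 -s 188.185.0.0/17  -j ACCEPT  # CERN SNMP monitoring
```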
Regular Maintenance work
Updating Frontier
t3nagios checks whether new Frontier RPMs are available. If there are, then during a T3 downtime stop squid and update the package:
[root@t3frontier01 ~]# yum --disablerepo=* --enablerepo=cern-frontier update
Loaded plugins: downloadonly, priorities, security, versionlock
Setting up Update Process
Resolving Dependencies
--> Running transaction check
---> Package frontier-squid.x86_64 11:2.7.STABLE9-20.1 will be updated
---> Package frontier-squid.x86_64 11:2.7.STABLE9-21.1 will be an update
--> Finished Dependency Resolution
Dependencies Resolved
================================================================================================================================================================================================================================================================================
Package Arch Version Repository Size
================================================================================================================================================================================================================================================================================
Updating:
frontier-squid x86_64 11:2.7.STABLE9-21.1 cern-frontier 835 k
Transaction Summary
================================================================================================================================================================================================================================================================================
Upgrade 1 Package(s)
Total download size: 835 k
Is this ok [y/N]:
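Putting the steps together, the whole procedure would look roughly like this (a sketch; run as root during a T3 downtime, using the frontier-squid init script this service ships with):

```shell
# Sketch of the update procedure: stop squid, update only from the
# cern-frontier repo, restart, then verify the processes are back.
/etc/init.d/frontier-squid stop
yum --disablerepo='*' --enablerepo=cern-frontier update frontier-squid
/etc/init.d/frontier-squid start
pgrep -fl squid
```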
Emergency Measures
If t3frontier01 goes down, the CMS jobs will fall back to the CERN backup Squid frontiers and the CVMFS clients will keep working from their local caches, so there should be enough time to fix this VM.
Services
[root@t3frontier01 ~]# lsof -u squid -P
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
squid 13103 squid cwd DIR 8,2 4096 130564 /root
squid 13103 squid rtd DIR 8,2 4096 2 /
squid 13103 squid txt REG 8,2 811848 167040 /usr/sbin/squid (deleted)
squid 13103 squid mem REG 8,2 156928 263334 /lib64/ld-2.12.so
squid 13103 squid mem REG 8,2 1926680 263335 /lib64/libc-2.12.so
squid 13103 squid mem REG 8,2 145896 263336 /lib64/libpthread-2.12.so
squid 13103 squid mem REG 8,2 599384 263357 /lib64/libm-2.12.so
squid 13103 squid mem REG 8,3 217016 189 /var/db/nscd/group
squid 13103 squid mem REG 8,3 217016 190 /var/db/nscd/hosts
squid 13103 squid 0u CHR 1,3 0t0 3656 /dev/null
squid 13103 squid 1u CHR 1,3 0t0 3656 /dev/null
squid 13103 squid 2u CHR 1,3 0t0 3656 /dev/null
squid 13103 squid 3u unix 0xffff8803381e96c0 0t0 3500924 socket
squid 13106 squid cwd DIR 8,32 4096 268435584 /home/dbfrontier/squid/squid_cache
squid 13106 squid rtd DIR 8,2 4096 2 /
squid 13106 squid txt REG 8,2 811848 167040 /usr/sbin/squid (deleted)
squid 13106 squid mem REG 8,2 156928 263334 /lib64/ld-2.12.so
squid 13106 squid mem REG 8,2 1926680 263335 /lib64/libc-2.12.so
squid 13106 squid mem REG 8,2 145896 263336 /lib64/libpthread-2.12.so
squid 13106 squid mem REG 8,2 599384 263357 /lib64/libm-2.12.so
squid 13106 squid mem REG 8,3 217016 191 /var/db/nscd/services
squid 13106 squid mem REG 8,3 217016 189 /var/db/nscd/group
squid 13106 squid mem REG 8,3 217016 190 /var/db/nscd/hosts
squid 13106 squid 0u CHR 1,3 0t0 3656 /dev/null
squid 13106 squid 1u CHR 1,3 0t0 3656 /dev/null
squid 13106 squid 2u CHR 1,3 0t0 3656 /dev/null
squid 13106 squid 3u unix 0xffff880338909380 0t0 3500932 socket
squid 13106 squid 4u REG 0,9 0 3654 anon_inode
squid 13106 squid 5u REG 8,7 68 1310729 /home/dbfrontier/squid_logs/cache.log
squid 13106 squid 6u IPv4 3500939 0t0 UDP *:51795 http://linuxplayer.org/2012/02/why-squid-listen-on-high-udp-port-number
squid 13106 squid 7w REG 8,7 29636757 1310737 /home/dbfrontier/squid_logs/access.log
squid 13106 squid 8r FIFO 0,8 0t0 3500940 pipe
squid 13106 squid 9w REG 8,32 26837640 271800258 /home/dbfrontier/squid/squid_cache/swap.state
squid 13106 squid 10u IPv4 3500942 0t0 TCP *:3128 (LISTEN)
squid 13106 squid 11w FIFO 0,8 0t0 3500941 pipe
squid 13106 squid 12u IPv4 3500943 0t0 UDP *:3401 http://etutorials.org/Server+Administration/Squid.+The+definitive+guide/Chapter+14.+Monitoring+Squid/14.3+Using+SNMP/
squid 13106 squid 13u IPv4 15800738 0t0 TCP t3frontier01.psi.ch:3128->t3wn41.psi.ch:39457 (ESTABLISHED)
squid 13106 squid 14u IPv4 15800794 0t0 TCP t3frontier01.psi.ch:3128->t3wn35.psi.ch:49764 (ESTABLISHED)
squid 13106 squid 15u IPv4 15800829 0t0 TCP t3frontier01.psi.ch:3128->t3wn28.psi.ch:44577 (ESTABLISHED)
squid 13106 squid 16u IPv4 15800873 0t0 TCP t3frontier01.psi.ch:3128->t3wn13.psi.ch:41743 (ESTABLISHED)
squid 13106 squid 18u IPv4 15800839 0t0 TCP t3frontier01.psi.ch:37468->cvmfs02.racf.bnl.gov:80 (ESTABLISHED)
squid 13106 squid 19u IPv4 15800932 0t0 TCP t3frontier01.psi.ch:3128->t3ui12.psi.ch:59661 (ESTABLISHED)
squid 13106 squid 21u IPv4 15800934 0t0 TCP t3frontier01.psi.ch:54272->front15.cern.ch:80 (ESTABLISHED)
unlinkd 13107 squid cwd DIR 8,2 4096 130564 /root
unlinkd 13107 squid rtd DIR 8,2 4096 2 /
unlinkd 13107 squid txt REG 8,2 4952 145185 /usr/libexec/squid/unlinkd (deleted)
unlinkd 13107 squid mem REG 8,2 156928 263334 /lib64/ld-2.12.so
unlinkd 13107 squid mem REG 8,2 1926680 263335 /lib64/libc-2.12.so
unlinkd 13107 squid 0r FIFO 0,8 0t0 3500941 pipe
unlinkd 13107 squid 1w FIFO 0,8 0t0 3500940 pipe
unlinkd 13107 squid 2u CHR 1,3 0t0 3656 /dev/null
Installation
Squid Installation
Read the CERN central wiki first. Fabio uses these aliases; do the same. The Puppet recipes are in the directory reached by puppetdirnodes:
alias kscustom64='cd /afs/psi.ch/software/linux/dist/scientific/64/custom'
alias ksdir='cd /afs/psi.ch/software/linux/kickstart/configs'
alias puppetdir='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/'
alias puppetdirnodes='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/manifests/nodes'
alias puppetdirredhat='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/modules/Tier3/files/RedHat'
alias puppetdirsolaris='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/modules/Tier3/files/Solaris/5.10'
alias yumdir6='cd /afs/psi.ch/software/linux/dist/scientific/6/scripts'
Puppet recipes, ordered top-down:
- SL6_frontier.pp
- SL6.pp
- tier3-baseclasses.pp
Squid conf /etc/squid/squid.conf covers:
- CERN monitoring connections by SNMP
- T3 file requests
- the local cache
# grep -v \# /etc/squid/squid.conf | strings
acl NET_LOCAL src 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 192.33.123.0/24 127.0.0.1/32
acl HOST_MONITOR src 127.0.0.1/32 128.142.0.0/16 188.184.128.0/17 188.185.128.0/17 131.225.240.232/32
acl snmppublic snmp_community public
acl all src all
acl manager proto cache_object
acl localhost src 127.0.0.1/32
acl to_localhost dst 127.0.0.0/8 0.0.0.0/32
acl SSL_ports port 443
acl CONNECT method CONNECT
http_access allow manager localhost
http_access deny manager
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
http_access allow NET_LOCAL
http_access allow localhost
http_access deny all
icp_access allow localnet
icp_access deny all
http_port 3128
hierarchy_stoplist cgi-bin
cache_mem 500 MB
maximum_object_size_in_memory 128 KB
cache_dir ufs /home/dbfrontier/squid_cache 25000 16 256
maximum_object_size 1048576 KB
logformat awstats %>a %ui %un [%{%d/%b/%Y:%H:%M:%S}tl.%03tu %{%z}tl] "%rm %ru HTTP/%rv" %Hs %h %{cvmfs-info}>h" "%{Referer}>h" "%{User-Agent}>h"
access_log /var/log/squid/access.log awstats
logfile_daemon /usr/libexec/squid/logfile-daemon
cache_log /var/log/squid/cache.log
cache_store_log none
mime_table /etc/squid/mime.conf
pid_filename /var/run/squid/squid.pid
strip_query_terms off
unlinkd_program /usr/libexec/squid/unlinkd
refresh_pattern ^ftp: 1440 20% 10080
refresh_pattern ^gopher: 1440 0% 1440
refresh_pattern -i /cgi-bin/ 0 0% 0
refresh_pattern . 0 20% 4320
negative_ttl 1 minute
acl shoutcast rep_header X-HTTP09-First-Line ^ICY.[0-9]
upgrade_http0.9 deny shoutcast
acl apache rep_header Server ^Apache
broken_vary_encoding allow apache
collapsed_forwarding on
cache_mgr squid
cache_effective_user squid
cache_effective_group squid
umask 022
snmp_access allow snmppublic HOST_MONITOR
snmp_access deny all
icp_port 0
icon_directory /usr/share/squid/icons
error_directory /usr/share/squid/errors/English
ignore_ims_on_miss on
coredump_dir /home/dbfrontier/squid_cache
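For reference, granting an additional client subnet would mean extending the NET_LOCAL acl; a hypothetical fragment (192.33.124.0/24 is an invented example, not a real PSI range):

```shell
# /etc/squid/squid.conf fragment -- hypothetical extra subnet appended
acl NET_LOCAL src 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 192.33.123.0/24 192.33.124.0/24 127.0.0.1/32
http_access allow NET_LOCAL
```

After editing, the change can be applied without a restart via squid -k reconfigure.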
Registering the local frontier to the central CMS operations
Read https://twiki.cern.ch/twiki/bin/view/CMS/SquidForCMS#Register_Your_Server
Registering the local frontier to the central WLCG operations
Read https://twiki.cern.ch/twiki/bin/view/LCG/WLCGSquidRegistration
The outcome is http://wlcg-squid-monitor.cern.ch/snmpstats/mrtgall/T3_CH_PSI_t3frontier.psi.ch/index.html
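The SNMP endpoint that the central monitor polls can also be queried by hand from an allowed host; .1.3.6.1.4.1.3495.1 is Squid's registered enterprise MIB subtree:

```shell
# Walk squid's SNMP agent on port 3401 with the public community.
snmpwalk -v 2c -c public t3frontier01.psi.ch:3401 .1.3.6.1.4.1.3495.1
```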
Read first http://cernvm.cern.ch/portal/filesystem/techinformation
The T3 CMS Frontier
t3frontier01 also acts as the Squid server for our CVMFS clients (t3ui1*, t3wn*, t3se01, t3cmsvobox02). On their side, the clients use the following files plus the automount service to mount /cvmfs/cms.cern.ch only when it is accessed:
[martinelli_f@t3ui19 nodes]$ grep cvmfs /etc/{passwd,group}
/etc/passwd:cvmfs:x:495:495:CernVM-FS service account:/var/lib/cvmfs:/sbin/nologin
/etc/group:fuse:x:296:cvmfs
/etc/group:cvmfs:x:495:
[martinelli_f@t3ui19 nodes]$ /etc/init.d/autofs status
automount (pid 645) is running...
[martinelli_f@t3ui19 nodes]$ tail -1 /etc/auto.master
/cvmfs /etc/auto.cvmfs
[martinelli_f@t3ui19 nodes]$ cat /etc/cvmfs/config.d/cms.cern.ch.conf
export CMS_LOCAL_SITE <-- important for CMS Jobs
[martinelli_f@t3ui19 nodes]$ cat /etc/cvmfs/default.local
#CVMFS_REPOSITORIES=atlas.cern.ch,atlas-condb.cern.ch,cms.cern.ch,lhcb.cern.ch,hone.cern.ch,grid.cern.ch
CVMFS_REPOSITORIES=cms
#CVMFS_HTTP_PROXY="http://cvmfs.lcg.cscs.ch:3128|http://ppcvmfs.lcg.cscs.ch:3128"
CVMFS_HTTP_PROXY="http://t3frontier.psi.ch:3128"
CVMFS_CACHE_BASE=/scratch/cvmfs_local
CVMFS_QUOTA_LIMIT=30000
CMS_LOCAL_SITE=/cvmfs/cms.cern.ch/SITECONF/T3_CH_PSI <-- CVMFS uses the var CMS_LOCAL_SITE to resolve the Tier1/2/3 agnostic paths /cvmfs/cms.cern.ch/SITECONF/local/{JobConfig,PhEDEx}
[martinelli_f@t3ui19 nodes]$ pgrep -u cvmfs -fl
26697 /usr/bin/cvmfs2 __cachemgr__ . 7 8 31457280000 15728640000 0 3 -1 :
26699 /usr/bin/cvmfs2 __cachemgr__ . 7 8 31457280000 15728640000 0 3 -1 :
26705 /usr/bin/cvmfs2 -o rw,fsname=cvmfs2,allow_other,grab_mountpoint,uid=495,gid=495 cms.cern.ch /cvmfs/cms.cern.ch
26709 /usr/bin/cvmfs2 -o rw,fsname=cvmfs2,allow_other,grab_mountpoint,uid=495,gid=495 cms.cern.ch /cvmfs/cms.cern.ch
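On a client node, the mount and the proxy setting can be verified with cvmfs_config, which ships with the CVMFS client:

```shell
# Probe the repository (this triggers the automount) and show the
# effective proxy and cache parameters.
cvmfs_config probe cms.cern.ch
cvmfs_config showconfig cms.cern.ch | grep -E 'CVMFS_HTTP_PROXY|CVMFS_CACHE_BASE'
```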
t3nagios constantly checks these services.
Service startup/stop
/etc/init.d/frontier-squid start
Testing whether service is running:
pgrep -fl squid
3245 /usr/sbin/squid -DF
3248 (squid) -DF
awstats statistics
Not strictly required by CMS or CVMFS but, like all statistics, useful. Run locally:
[root@t3frontier01 ~]# firefox http://localhost.localdomain/awstats/awstats.pl
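Independently of awstats, a rough per-client request count can be pulled straight from access.log; the awstats logformat above puts the client address in the first field. Sketched here against a fabricated sample file so it can be run anywhere:

```shell
# Count requests per client address (field 1 of the log line).
# The sample log lines below are fabricated for illustration only.
cat > /tmp/sample_access.log <<'EOF'
192.33.123.41 - - [17/Sep/2014:09:46:38 +0200] "GET http://cmsfrontier.cern.ch/x HTTP/1.1" 200
192.33.123.41 - - [17/Sep/2014:09:46:39 +0200] "GET http://cmsfrontier.cern.ch/y HTTP/1.1" 200
192.33.123.35 - - [17/Sep/2014:09:46:40 +0200] "GET http://cmsfrontier.cern.ch/x HTTP/1.1" 200
EOF
awk '{count[$1]++} END {for (ip in count) print count[ip], ip}' /tmp/sample_access.log | sort -rn
```

On the real server, point the same one-liner at /home/dbfrontier/squid_logs/access.log.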
Squid Testing
To be moved into t3nagios.
Read first https://twiki.cern.ch/twiki/bin/viewauth/CMS/SAMSquidByHand ; then, in nagios@t3wn41:/opt/nagios/test_squid, there is a .sh script that has to return OK:
-bash-4.1$ /opt/nagios/test_squid/test_squid.py.sh
node: t3wn41.psi.ch
SiteLocalConfig: /swshare/cms/SITECONF/local/JobConfig/site-local-config.xml
Contents of site-local-config.xml are:
<site-local-config>
<site name="T3_CH_PSI">
<event-data>
<catalog url="trivialcatalog_file:/swshare/cms/SITECONF/local/PhEDEx/storage.xml?protocol=dcap"/>
</event-data>
<source-config>
<cache-hint value="application-only"/>
<read-hint value="auto-detect"/>
<statistics-destination name="cms-udpmon-collector.cern.ch:9331" />
</source-config>
<local-stage-out>
<command value="srmv2"/>
<catalog url="trivialcatalog_file:/experiment-software/cms/SITECONF/local/PhEDEx/storage.xml?protocol=srmv2"/>
<se-name value="t3se01.psi.ch"/>
<option value="-debug"/>
<phedex-node value="T3_CH_PSI"/>
</local-stage-out>
<calib-data>
<frontier-connect>
<proxy url="http://t3frontier.psi.ch:3128"/>
<backupproxy url="http://cmsbpfrontier.cern.ch:3128"/>
<backupproxy url="http://cmsbproxy.fnal.gov:3128"/>
<server url="http://cmsfrontier.cern.ch:8000/FrontierInt"/>
<server url="http://cmsfrontier1.cern.ch:8000/FrontierInt"/>
<server url="http://cmsfrontier2.cern.ch:8000/FrontierInt"/>
<server url="http://cmsfrontier3.cern.ch:8000/FrontierInt"/>
</frontier-connect>
</calib-data>
</site>
</site-local-config>
site: T3_CH_PSI
loadtag: None
script version: $Id: NodeTypeCmsFrontier.txt,v 1.20 2014/12/16 18:08:55 fabiom Exp $
Using Frontier URL: http://cmsfrontier.cern.ch:8080/FrontierProd/Frontier
Query: SELECT 1 FROM DUAL
Query started: 09/17/14 09:46:38 CEST
squid: http://t3frontier.psi.ch:3128
Frontier Request:
http://cmsfrontier.cern.ch:8080/FrontierProd/Frontier?type=frontier_request:1:DEFAULT&encoding=BLOB&p1=eNoLdvVxdQ5RMFRwC/L3VXAJdfQBACyLBKw=
Query ended: 09/17/14 09:46:38 CEST
Query time: 0.04 [seconds]
Query result:
<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE frontier SYSTEM "http://frontier.fnal.gov/frontier.dtd">
<frontier version="3.33" xmlversion="1.0">
<transaction payloads="1">
<payload type="frontier_request" version="1" encoding="BLOB">
<data>BgAAAAExBgAAAAZOVU1CRVIHBgAAAAExBw==</data>
<quality error="0" md5="2e47f41c56b898fb582b7ecf1e8686cc" records="1" full_size="25"/>
</payload>
</transaction>
</frontier>
Fields:
1 NUMBER
Records:
1
OK
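The same round trip can be approximated with a plain HTTP client forced through the local squid, a by-hand variant of the test above (the query URL is the Frontier request shown in the transcript):

```shell
# Send the Frontier test query through the local squid; a working proxy
# returns the same small XML payload seen in the transcript above.
export http_proxy=http://t3frontier.psi.ch:3128
curl 'http://cmsfrontier.cern.ch:8080/FrontierProd/Frontier?type=frontier_request:1:DEFAULT&encoding=BLOB&p1=eNoLdvVxdQ5RMFRwC/L3VXAJdfQBACyLBKw='
```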
Remote Monitoring vs PSI
Backups
OS snapshots are taken nightly by the PSI VMware team (e.g. Peter Huesser); in addition, LinuxBackupsByLegato lets us recover single files.