CMS Box
| Real machine name | cms02.lcg.cscs.ch |
Firewall requirements
| port | open to | reason |
| 3128/tcp | WN | access from WNs to FroNtier squid proxy |
| 1094/tcp | * | access to Xrootd redirector that forwards requests to the native dCache Xrootd door |
| 3401/udp | 128.142.0.0/16, 188.185.0.0/17 | central SNMP monitoring of FroNtier service |
For further, though older, information see CmsVObox.
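The firewall table can be turned into iptables rule lines with a small helper. This is only a sketch: `fw_rule` and `print_cms_rules` are names made up here, the WN network is passed in as an argument because the table does not give an address range, and the rules are simpler than the `-m state --state NEW` form used further down this page.

```shell
#!/bin/bash
# Print one iptables rule line (iptables-save syntax).
# Usage: fw_rule PROTO PORT [SOURCE]; a missing or "*" SOURCE means "any".
fw_rule() {
    local proto="$1" port="$2" src="${3:-*}"
    if [ "$src" = "*" ]; then
        echo "-A INPUT -p $proto --dport $port -j ACCEPT"
    else
        echo "-A INPUT -s $src -p $proto --dport $port -j ACCEPT"
    fi
}

# Print the rules for the table above.
# $1 is the worker node network (hypothetical value, not given in the table).
print_cms_rules() {
    fw_rule tcp 3128 "$1"              # FroNtier squid, WNs only
    fw_rule tcp 1094 "*"               # Xrootd redirector, open to all
    fw_rule udp 3401 128.142.0.0/16    # central SNMP monitoring
    fw_rule udp 3401 188.185.0.0/17
}
```

For example, `print_cms_rules 10.0.0.0/8` prints four lines that can be reviewed before being added to /etc/sysconfig/iptables.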
Setup
3rd party tasks
CMS GitLab SITECONF
Be aware of
https://twiki.cern.ch/twiki/bin/view/CMSPublic/SiteConfInGitlab and always keep both
https://gitlab.cern.ch/SITECONF/T2_CH_CSCS/ and
https://gitlab.cern.ch/SITECONF/T2_CH_CSCS_HPC up to date. Clone the repos into a directory under /tmp or /opt, create a new branch and try your changes there, because Puppet constantly checks out the master branch!
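The scratch-branch workflow could look like the sketch below. `siteconf_sandbox_cmds` is a hypothetical helper and the branch name is whatever you choose. It only prints the git commands, so nothing is touched until you run them yourself.

```shell
#!/bin/bash
# Print the git commands for trying a SITECONF change on a private branch.
# Usage: siteconf_sandbox_cmds REPO_URL BRANCH [WORKDIR]   (WORKDIR defaults to /tmp)
siteconf_sandbox_cmds() {
    local repo="${1%/}" branch="$2" workdir="${3:-/tmp}"
    local name="${repo##*/}"     # last path component, e.g. T2_CH_CSCS
    name="${name%.git}"
    echo "git clone $repo $workdir/$name"
    echo "cd $workdir/$name"
    echo "git checkout -b $branch"
}
```

For example, `siteconf_sandbox_cmds https://gitlab.cern.ch/SITECONF/T2_CH_CSCS/ my-test-branch /opt` prints the three commands for a clone under /opt.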
Monitoring the current installation
LCGTier2/CMSMonitoring
Dino Conciatore's Puppet recipes ( after Dec 2016 )
Full Puppet installation
To simplify the CMS VO box installation here at CSCS, we prepared a Puppet recipe that fully installs and configures the CMS VO box.
Hiera config
We use Hiera to keep the Puppet code more dynamic and to look up some key variables.
---
role: role_wlcg_cms_vobox
environment: dev1
cluster: phoenix4
profile_cscs_base::network::interfaces_hash:
  eth0:
    enable_dhcp: true
    hwaddr:
    mtu: '1500'
  eth1:
    ipaddress: 148.187.66.63
    netmask: 255.255.252.0
    mtu: '1500'
    gateway: 148.187.64.2
profile_cscs_base::network::hostname: 'cms02.lcg.cscs.ch'
profile_monitoring::ganglia::gmond_cluster_options:
  cluster_name: 'PHOENIX-services'
  udp_send_channel: [{'bind_hostname' : 'yes', 'port' : '8693', 'host' : 'ganglia.lcg.cscs.ch'}]
  udp_recv_channel: [ { mcast_join: '239.2.11.71', port: '8649', bind: '239.2.11.71' } ]
  tcp_accept_channel: [ port: '8649' ]
# crontab
cron::hourly:
  'cron_proxy':
    command: '/home/phedex/config/T2_CH_CSCS/PhEDEx/tools/cron/cron_proxy.sh'
    user: 'phedex'
    environment:
      - 'MAILTO=root'
      - 'PATH="/usr/bin:/bin:/usr/local/sbin"'
cron::daily:
  'cron_stats':
    command: '/home/phedex/config/T2_CH_CSCS/PhEDEx/tools/cron/cron_stats.sh'
    user: 'phedex'
    environment:
      - 'MAILTO=root'
      - 'PATH="/usr/bin:/bin:/usr/local/sbin"'
# Phedex SITECONF
# git clone ssh://git@gitlab.cern.ch:7999/SITECONF/T2_CH_CSCS
## Change in hash and use puppet git
profile_wlcg_cms_vobox::phedex::siteconf:
  'T2_CH_CSCS':
    path: '/home/phedex/config/T2_CH_CSCS'
    source: 'ssh://git@gitlab.cern.ch:7999/SITECONF/T2_CH_CSCS'
    owner: 'phedex'
    group: 'phedex'
    #environment: ["HOME=/home/phedex"]
    branch: 'master'
    update: true
  'wlcg_cms_vobox':
    path: '/localgit/profile_wlcg_cms_vobox'
    source: 'ssh://git@git.cscs.ch/puppet_profiles/profile_wlcg_cms_vobox.git'
    branch: 'dev1'
    update: true
  'dmwm_PHEDEX':
    path: '/localgit/dmwm_PHEDEX'
    source: 'https://github.com/dmwm/PHEDEX.git'
    branch: 'master'
    update: true
profile_cscs_base::extra_mounts:
  '/users':
    ensure: 'mounted'
    device: 'nas.lcg.cscs.ch:/ifs/LCG/shared/phoenix4/users'
    atboot: true
    fstype: 'nfs'
    options: 'rw,bg,proto=tcp,rsize=32768,wsize=32768,soft,intr,nfsvers=3'
  '/pnfs':
    ensure: 'mounted'
    device: 'storage02.lcg.cscs.ch:/pnfs'
    atboot: true
    fstype: 'nfs'
    options: 'ro,intr,noac,hard,proto=tcp,nfsvers=3'
profile_cscs_base::ssh_host_dsa_key: >
  ENC[PKCS7,.....]
profile_cscs_base::ssh_host_dsa_key_pub: >
  ENC[PKCS7,.....]
profile_cscs_base::ssh_host_key: >
  ENC[PKCS7,.....]
profile_cscs_base::ssh_host_key_pub: >
  ENC[PKCS7,.....]
profile_cscs_base::ssh_host_rsa_key: >
  ENC[PKCS7,.....]
profile_cscs_base::ssh_host_rsa_key_pub: >
  ENC[PKCS7,.....]
profile_wlcg_base::grid_hostcert: >
  ENC[PKCS7,.....]
profile_wlcg_base::grid_hostkey: >
  ENC[PKCS7,.....]
profile_wlcg_cms_vobox::phedex::id_rsa: >
  ENC[PKCS7,.....]
profile_wlcg_cms_vobox::phedex::id_rsa_pub: >
  ENC[PKCS7,.....]
Puppet module profile_wlcg_cms_vobox
This module configures:
- Firewall
- CVMFS
- Frontier Squid
- Xrootd
- Phedex
Puppet runs automatically every 30 minutes and checks that these services are running:
- cmsd
- xrootd
- frontier-squid (managed by the included frontier::squid module)
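Between Puppet runs the same three services can be checked by hand. The helper below is a sketch, not part of the Puppet module: `check_services` and the `STATUS_CMD` override are names invented here, and the default assumes the SysV `service NAME status` wrapper used elsewhere on this page.

```shell
#!/bin/bash
# Report the state of each given service; return non-zero if any is down.
# STATUS_CMD (optional) names the command called as: STATUS_CMD NAME status
check_services() {
    local status_cmd="${STATUS_CMD:-service}" rc=0 svc
    for svc in "$@"; do
        if "$status_cmd" "$svc" status >/dev/null 2>&1; then
            echo "OK      $svc"
        else
            echo "FAILED  $svc"
            rc=1
        fi
    done
    return $rc
}
```

On the VO box this would be called as `check_services cmsd xrootd frontier-squid`.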
Run the installation
Currently we still use Foreman to run the initial OS setup:
hammer host create --name "cms02.lcg.cscs.ch" --hostgroup-id 13 --environment "dev1" --puppet-ca-proxy-id 1 --puppet-proxy-id 1 --puppetclass-ids 697 --operatingsystem-id 9 --medium "Scientific Linux" --partition-table-id 7 --build yes --mac "00:10:3e:66:00:63" --ip=10.10.66.63 --domain-id 4
To reinstall the machine, you just have to set:
hammer host update --name cms02.lcg.cscs.ch --build yes
Pre Dino Conciatore's Puppet recipes ( before Dec 2016 )
UMD3
Make sure the UMD3 Yum repos are set up:
yum install http://repository.egi.eu/sw/production/umd/3/sl6/x86_64/updates/umd-release-3.0.1-1.el6.noarch.rpm
Installation
create /etc/squid/squidconf
export FRONTIER_USER=dbfrontier
export FRONTIER_GROUP=dbfrontier
run the installation as described at https://twiki.cern.ch/twiki/bin/view/Frontier/InstallSquid
rpm -Uvh http://frontier.cern.ch/dist/rpms/RPMS/noarch/frontier-release-1.0-1.noarch.rpm
yum install frontier-squid
chkconfig frontier-squid on
create folders on special partition
mkdir /home/dbfrontier/cache
mkdir /home/dbfrontier/log
chown dbfrontier:dbfrontier /home/dbfrontier/cache/
chown dbfrontier:dbfrontier /home/dbfrontier/log/
Configuration
edit /etc/squid/customize.sh
#!/bin/bash
awk --file `dirname $0`/customhelps.awk --source '{
setoption("acl NET_LOCAL src", "10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 148.187.0.0/16")
setoption("cache_mem", "128 MB")
setoptionparameter("cache_dir", 3, "15000")
setoption("cache_log", "/home/dbfrontier/log/cache.log")
setoption("coredump_dir", "/home/dbfrontier/cache/")
setoptionparameter("cache_dir", 2, "/home/dbfrontier/cache/")
setoptionparameter("access_log", 1, "/home/dbfrontier/log/access.log")
setoption("logfile_rotate", "1")
print
}'
start the service and then move the log files
service frontier-squid start
rmdir /var/cache/squid
rmdir /var/log/squid
ln -s /home/dbfrontier/cache /var/cache/squid
ln -s /home/dbfrontier/log /var/log/squid
create /etc/sysconfig/frontier-squid
export LARGE_ACCESS_LOG=500000000
restart the service
service frontier-squid reload
Allow SNMP monitoring of squid service
edit /etc/sysconfig/iptables, add:
-A INPUT -s 128.142.0.0/16 -p udp --dport 3401 -j ACCEPT
-A INPUT -s 188.185.0.0/17 -p udp --dport 3401 -j ACCEPT
service iptables reload
Test squid proxy
Connect, for instance, to ui.lcg.cscs.ch and run the commands below (after executing them, look into cms01:/home/dbfrontier/log/access.log and cms02:/var/log/squid/access.log):
wget http://frontier.cern.ch/dist/fnget.py
chmod +x fnget.py
./fnget.py --url=http://cmsfrontier.cern.ch:8000/FrontierProd/Frontier --sql="select 1 from dual"
export http_proxy=http://cms01.lcg.cscs.ch:3128
./fnget.py --url=http://cmsfrontier.cern.ch:8000/FrontierProd/Frontier --sql="select 1 from dual"
export http_proxy=http://cms02.lcg.cscs.ch:3128
./fnget.py --url=http://cmsfrontier.cern.ch:8000/FrontierProd/Frontier --sql="select 1 from dual"
Expected output:
Using Frontier URL: http://cmsfrontier.cern.ch:8000/FrontierProd/Frontier
Query: select 1 from dual
Decode results: True
Refresh cache: False
Frontier Request:
http://cmsfrontier.cern.ch:8000/FrontierProd/Frontier/type=frontier_request:1:DEFAULT&encoding=BLOBzip&p1=eNorTs1JTS5RMFRIK8rPVUgpTcwBAD0rBmw_
Query started: 02/29/16 22:32:47 CET
Query ended: 02/29/16 22:32:47 CET
Query time: 0.00365591049194 [seconds]
Query result:
eF5jY2BgYDRkA5JsfqG+Tq5B7GxgEXYAGs0CVA==
Fields:
1 NUMBER
Records:
1
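To script this test, the fnget.py output can be checked for the expected record. A minimal sketch, assuming the output layout shown above (a `Records:` line followed by the value); `frontier_ok` is a name invented here.

```shell
#!/bin/bash
# Succeed if fnget.py output on stdin reports the single record "1",
# as expected for the "select 1 from dual" query.
frontier_ok() {
    awk '
        found && NF   { val = $1; exit }   # first non-empty line after "Records:"
        /^ *Records:/ { found = 1 }
        END           { exit !(val == "1") }
    '
}
```

For example: `./fnget.py --url=http://cmsfrontier.cern.ch:8000/FrontierProd/Frontier --sql="select 1 from dual" | frontier_ok && echo "proxy OK"`.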
Xrootd
Installation
Install Xrootd on SL6 for dCache according to
rpm -Uhv http://repo.grid.iu.edu/osg-el6-release-latest.rpm
install packages and copy host certificate/key
yum install --disablerepo='*' --enablerepo=osg-contrib,osg-testing cms-xrootd-dcache
yum install xrootd-cmstfc
yum
mount cvmfs and make sure that the file /cvmfs/cms.cern.ch/SITECONF/local/PhEDEx/storage.xml is available; that's the cvmfs version of https://gitlab.cern.ch/SITECONF/T2_CH_CSCS/raw/master/PhEDEx/storage.xml
chkconfig xrootd on
chkconfig cmsd on
service xrootd start
service cmsd start
Configuration
Redirector Setup (from June 2013, dCache 2.10)
Detailed reporting requires a plugin on the dCache side
Verify that /pnfs is properly mounted both at:
- runtime:
$ df -h /pnfs
Filesystem                   Size  Used Avail Use% Mounted on
storage02.lcg.cscs.ch:/pnfs  1.0E  1.7P 1023P   1% /pnfs
- boot time:
$ grep pnfs /etc/fstab
storage02.lcg.cscs.ch:/pnfs /pnfs nfs rw,intr,noac,hard,proto=tcp,nfsvers=3 0 0
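Both checks can be scripted with one helper that looks for the mount point in an fstab-style file. This is a sketch and `has_mount_entry` is a hypothetical name. Pointing it at /proc/mounts gives the runtime view, since that file uses the same column layout.

```shell
#!/bin/bash
# Succeed if MOUNTPOINT appears as the second field of a non-comment line.
# Usage: has_mount_entry MOUNTPOINT [FILE]   (FILE defaults to /etc/fstab)
has_mount_entry() {
    local mountpoint="$1" file="${2:-/etc/fstab}"
    awk -v mp="$mountpoint" '
        $1 !~ /^#/ && $2 == mp { found = 1 }
        END { exit !found }
    ' "$file"
}
```

For example: `has_mount_entry /pnfs && has_mount_entry /pnfs /proc/mounts || echo "/pnfs is not set up properly"`.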
edit /etc/xrootd/xrootd-clustered.cfg
xrd.port 1094
all.role server
all.sitename T2_CH_CSCS
all.manager any xrootd-cms.infn.it+ 1213
oss.localroot /pnfs/lcg.cscs.ch/cms/trivcat/
xrootd.redirect storage01.lcg.cscs.ch:1095 /
all.export / nostage
cms.allow host *
xrootd.trace emsg login stall redirect
ofs.trace none
xrd.trace conn
cms.trace all
oss.namelib /usr/lib64/libXrdCmsTfc.so file:/cvmfs/cms.cern.ch/SITECONF/local/PhEDEx/storage.xml?protocol=direct
xrootd.seclib /usr/lib64/libXrdSec.so
xrootd.fslib /usr/lib64/libXrdOfs.so
all.adminpath /var/run/xrootd
all.pidpath /var/run/xrootd
cms.delay startup 10
cms.fxhold 60s
xrd.report xrootd.t2.ucsd.edu:9931 every 60s all sync
xrootd.monitor all auth flush io 60s ident 5m mbuff 8k rbuff 4k rnums 3 window 10s dest files io info user redir xrootd.t2.ucsd.edu:9930
corresponding dCache door configuration
[xrootd-CMS-Domain2]
[xrootd-CMS-Domain2/xrootd]
loginBroker=srm-LoginBroker
useGPlazmaAuthorizationModule=true
xrootdRootPath=/pnfs/lcg.cscs.ch/cms/trivcat/
xrootdAllowedReadPaths=/pnfs/lcg.cscs.ch/cms/trivcat/
xrootdAuthNPlugin=gsi
xrootdPort=1095
xrootdMoverTimeout=28800000
xrootdThreads=400
Allow Xrootd requests
edit /etc/sysconfig/iptables, add:
-A INPUT -p tcp -m tcp --dport 1094 -m state --state NEW -j ACCEPT
Test Xrootd file access
From a UI machine, for instance from PSI or LXPLUS :
xrdcp --debug 2 root://cms01.lcg.cscs.ch//store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root /dev/null -f
xrdcp --debug 2 root://cms02.lcg.cscs.ch//store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root /dev/null -f
xrdcp --debug 2 root://xrootd-cms.infn.it//store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root /dev/null -f
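The three commands differ only in the host, so they can be generated in a loop. A sketch with a made-up helper name (`xrootd_test_cmds`); it prints the commands rather than running them, because a valid grid proxy is needed for the actual copy.

```shell
#!/bin/bash
# Print one xrdcp test command per host for the given logical file name.
# Usage: xrootd_test_cmds LFN HOST...   (LFN starts with /store/...)
xrootd_test_cmds() {
    local lfn="$1" host
    shift
    for host in "$@"; do
        echo "xrdcp --debug 2 root://$host/$lfn /dev/null -f"
    done
}
```

For example, `xrootd_test_cmds /store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root cms01.lcg.cscs.ch cms02.lcg.cscs.ch xrootd-cms.infn.it` reproduces the commands above.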
Read
https://twiki.cern.ch/twiki/bin/view/CMSPublic/PhedexAdminDocsInstallation#PhEDEx_Agent_Installation
yum install zsh perl-ExtUtils-Embed libXmu libXpm tcl tk compat-libstdc++-33 git perl-XML-LibXML
adduser phedex
yum install httpd
mkdir /var/www/html/phedexlog
chown phedex:phedex /var/www/html/phedexlog/
su - phedex
mkdir -p state log sw gridcert config
chmod 700 gridcert
export sw=$PWD/sw
myarch=slc6_amd64_gcc461
wget -O $sw/bootstrap.sh http://cmsrep.cern.ch/cmssw/comp/bootstrap.sh
sh -x $sw/bootstrap.sh setup -path $sw -arch $myarch -repository comp 2>&1|tee $sw/bootstrap_$myarch.log
source $sw/$myarch/external/apt/*/etc/profile.d/init.sh
apt-get update
apt-cache search PHEDEX|grep PHEDEX
version=4.1.3-comp3
apt-get install cms+PHEDEX+$version
unlink PHEDEX
ln -s /home/phedex/sw/$myarch/cms/PHEDEX/$version/ PHEDEX
reset environment (zlib required for git is missing in slc6_amd64_gcc461)
cd config
# pre gitlab era
# git clone https://dmeister@git.cern.ch/reps/siteconf
git clone https://dconciat@gitlab.cern.ch/SITECONF/T2_CH_CSCS.git
# git clone https://mgila@gitlab.cern.ch/SITECONF/T2_CH_CSCS.git
# git clone https://jpata@gitlab.cern.ch/SITECONF/T2_CH_CSCS.git
# git clone https://dpetrusi@gitlab.cern.ch/SITECONF/T2_CH_CSCS.git
cd ..
run ./config/siteconf/T2_CH_CSCS/PhEDEx/FixHostnames.sh to fix hostnames in the PhEDEx config
get helper scripts from SVN
mkdir svn-sandbox
cd svn-sandbox
svn co https://svn.cscs.ch/LCG/VO-specific/cms/phedex
cd ..
cp -R svn-sandbox/phedex/init.d ./
cd init.d
rm -Rf .svn
ln -s phedex_Prod phedex_Debug
ln -s phedex_Prod phedex_Dev
and copy host certificates
mkdir .globus
chmod 700 .globus
cp /etc/grid-security/hostcert.pem .globus/usercert.pem
cp /etc/grid-security/hostkey.pem .globus/userkey.pem
chown -R phedex:phedex .globus
chmod 600 .globus/*
and setup proxy and DB logins
touch config/DBParam.CSCS
chmod 600 config/DBParam.CSCS
# write secret config to file
Cron scripts/cron_restart.sh
#!/bin/bash
HOST=$(hostname)
HOST=${HOST%%\.*}
/home/phedex/init.d/phedex_Debug download-$HOST stop
/home/phedex/init.d/phedex_Debug download-$HOST start
/home/phedex/init.d/phedex_Prod download-$HOST stop
/home/phedex/init.d/phedex_Prod download-$HOST start
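The cron scripts derive the agent name from the short host name with `${HOST%%\.*}`, which removes everything from the first dot onwards. The sketch below isolates that step; `short_host` is a name invented here for illustration.

```shell
#!/bin/bash
# Print the host name without its domain part.
# Usage: short_host [FQDN]   (defaults to the output of hostname)
short_host() {
    local fqdn="${1:-$(hostname)}"
    echo "${fqdn%%.*}"     # longest suffix starting at a dot is removed
}
```

On cms02.lcg.cscs.ch this yields `cms02`, so the agent handled by the script is `download-cms02`.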
Cron scripts/cron_stats.sh
#!/bin/bash
HOST=$(hostname)
HOST=${HOST%%\.*}
SUMMARYFILE=/var/www/html/phedexlog/statistics.$(date +m%d-M).txt
#source /etc/profile.d/grid-env.sh
source /home/phedex/PHEDEX/etc/profile.d/init.sh
echo -e generated on `date` "\n------------------------" > $SUMMARYFILE
echo "Prod:" >> $SUMMARYFILE
/home/phedex/init.d/phedex_Prod status >> $SUMMARYFILE
echo "Debug:" >> $SUMMARYFILE
/home/phedex/init.d/phedex_Debug status >> $SUMMARYFILE
/home/phedex/PHEDEX/Utilities/InspectPhedexLog -c 300 -es "-12 hours" /home/phedex/log/Prod/download-$HOST /home/phedex/log/Debug/download-$HOST >> $SUMMARYFILE 2>/dev/null
Cron scripts/cron_proxy.sh
#!/bin/bash
HOST=$(hostname)
HOST=${HOST%%\.*}
unset X509_USER_PROXY
voms-proxy-init
myproxy-get-delegation -s myproxy.cern.ch -v -l cscs_phedex_${HOST}_dm_2014 -a /home/phedex/gridcert/proxy.cert -o /home/phedex/gridcert/proxy.cert
export X509_USER_PROXY=/home/phedex/gridcert/proxy.cert
voms-proxy-init -noregen -voms cms
Cron scripts/cron_spacemon.sh
#!/bin/bash
cd /lhome/phedex/spacemon
unset PERL5LIB
source /lhome/phedex/PHEDEX/etc/profile.d/init.sh
export PHEDEX_ROOT=/lhome/phedex/spacemon/PHEDEX
export PERL5LIB=$PHEDEX_ROOT/perl_lib:$PERL5LIB
export PATH=$PHEDEX_ROOT/Utilities:$PHEDEX_ROOT/Utilities/testSpace:$PATH
export X509_USER_PROXY=/lhome/phedex/gridcert/proxy.cert
DUMPDATE=$(date '+%Y-%m-%d')
wget --quiet http://ganglia.lcg.cscs.ch/dcache/dcache-cms-dump-$DUMPDATE.xml.bz2 -O dumps/dcache-cms-dump-$DUMPDATE.xml.bz2
if [ $? -gt 0 ]; then
echo "no dCache dump available for $DUMPDATE; abort..."
exit 1
fi
spacecount dcache --dump dumps/dcache-cms-dump-$DUMPDATE.xml.bz2 --node T2_CH_CSCS
if [ $? -gt 0 ]; then
echo "uploading dump for $DUMPDATE failed; keeping the file..."
exit 1
fi
rm dumps/dcache-cms-dump-$DUMPDATE.xml.bz2
Cron scripts/cron_clean.sh
CMS dirs in /pnfs to be regularly cleaned up by a set of crons
#!/bin/bash
# 2 weeks: store/unmerged, store/temp, store/backfill/1, store/backfill/2
find /pnfs/lcg.cscs.ch/cms/trivcat/store/unmerged/ -mindepth 1 -mtime +14 -delete
find /pnfs/lcg.cscs.ch/cms/trivcat/store/temp/ -mindepth 1 -mtime +14 -delete
find /pnfs/lcg.cscs.ch/cms/trivcat/store/backfill/1/ -mindepth 1 -mtime +14 -delete
find /pnfs/lcg.cscs.ch/cms/trivcat/store/backfill/2/ -mindepth 1 -mtime +14 -delete
# 1 week: store/temp/user
find /pnfs/lcg.cscs.ch/cms/trivcat/store/temp/user/ -mindepth 1 -mtime +7 -delete
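Before changing retention periods or adding a directory, it may help to list what would be removed. A dry-run sketch using the same `find` tests with `-print` instead of `-delete`; `list_stale` is a hypothetical helper, not part of the cron script.

```shell
#!/bin/bash
# List (instead of delete) entries below DIR last modified more than DAYS days ago.
# Usage: list_stale DIR DAYS
list_stale() {
    local dir="$1" days="$2"
    find "$dir" -mindepth 1 -mtime +"$days" -print
}
```

For example, `list_stale /pnfs/lcg.cscs.ch/cms/trivcat/store/unmerged/ 14 | wc -l` shows how many entries the next cleanup run would delete there.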
Cron final setups
chmod +x scripts/*.sh
run config/siteconf/T2_CH_CSCS/PhEDEx/CreateLogrotConf.pl
crontab -e
05 0 * * * /usr/sbin/logrotate -s /home/phedex/state/logrotate.state /home/phedex/config/logrotate.conf
13 5,17 * * * /home/phedex/scripts/cron_restart.sh
*/15 * * * * /home/phedex/scripts/cron_stats.sh
0 */4 * * * /home/phedex/scripts/cron_proxy.sh
0 7 * * * /lhome/phedex/scripts/cron_spacemon.sh
and for root:
crontab -l
30 2 * * * /lhome/phedex/scripts/cron_clean.sh
X509 Proxy into MyProxy at CERN
set up the MyProxy credential from a UI machine
voms-proxy-init -voms cms
myproxy-init -s myproxy.cern.ch -l cscs_phedex_cms0X_dm_2014 -x -R "/DC=com/DC=quovadisglobal/DC=grid/DC=switch/DC=hosts/C=CH/ST=Zuerich/L=Zuerich/O=ETH Zuerich/CN=cms0X.lcg.cscs.ch" -c 6400
and copy the initial proxy to gridcert/proxy.cert
PhEDEx locally generated stats
Make a pair of SSH RSA keys:
[phedex@cms02 ~]$ ssh-keygen -C "To push the cms02 phedex stats into mysql@ganglia:/var/www/html/ganglia/phedex/" -t rsa
Make a phedex cron job that pushes the stats files:
[phedex@cms02 ~]$ ll /var/www/html/ganglia/phedex/
total 588
-rw-r--r-- 1 phedex phedex 15043 Feb 1 23:45 statistics.DONEm01-HELPM.txt
-rw-r--r-- 1 phedex phedex 17526 Feb 2 23:45 statistics.DONEm02-HELPM.txt
-rw-r--r-- 1 phedex phedex 14815 Feb 3 23:45 statistics.DONEm03-HELPM.txt
...
into mysql@ganglia.lcg.cscs.ch:/var/www/html/ganglia/phedex/
Check that you can browse them:
http://ganglia.lcg.cscs.ch/ganglia/phedex/