CMS Box
| Real machine name | cms02.lcg.cscs.ch |
Firewall requirements
| port | open to | reason |
| 3128/tcp | WN | access from WNs to FroNtier squid proxy |
| 3401/udp | 128.142.0.0/16, 188.185.0.0/17 | central SNMP monitoring of FroNtier service |
| 1094/tcp | * | access to Xrootd redirector that forwards requests to the native dCache Xrootd door |
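The firewall requirements above can be written out as iptables rules. The following is only a sketch: the helper names `emit_accept_rule` and `emit_firewall_rules` and the `WN_NET` variable are hypothetical, and the worker-node network shown is a documentation placeholder, not the real CSCS range. On the Puppet-managed host the firewall is configured by the Puppet module, not by hand.

```shell
#!/bin/bash
# Print an iptables ACCEPT rule for one row of the firewall table.
# Usage: emit_accept_rule <port> <proto> [<source-cidr>]
# A missing or '*' source means "open to everyone".
emit_accept_rule() {
    local port="$1" proto="$2" src="${3:-*}"
    if [ "$src" = "*" ]; then
        echo "-A INPUT -p $proto --dport $port -j ACCEPT"
    else
        echo "-A INPUT -s $src -p $proto --dport $port -j ACCEPT"
    fi
}

# Placeholder worker-node network (assumption); set WN_NET to the real range.
WN_NET="${WN_NET:-192.0.2.0/24}"

# Emit one rule per table row (the SNMP row has two source networks).
emit_firewall_rules() {
    emit_accept_rule 3128 tcp "$WN_NET"
    emit_accept_rule 3401 udp 128.142.0.0/16
    emit_accept_rule 3401 udp 188.185.0.0/17
    emit_accept_rule 1094 tcp '*'
}
```

Calling `emit_firewall_rules` prints four lines that could be reviewed before being added to /etc/sysconfig/iptables.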
For further, but older, information see CmsVObox.
Setup
3rd party tasks
CMS GitLab SITECONF
Be aware of https://twiki.cern.ch/twiki/bin/view/CMSPublic/SiteConfInGitlab and always keep both https://gitlab.cern.ch/SITECONF/T2_CH_CSCS/ and https://gitlab.cern.ch/SITECONF/T2_CH_CSCS_HPC up to date. To test changes, clone the repos into a directory under /tmp or /opt, create a new branch, and try your changes there, because Puppet will constantly check out the master branch!
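The scratch-clone workflow described above might look like the sketch below. The function name `siteconf_scratch_clone` and the branch name in the usage note are hypothetical; only `git clone` and `git checkout -b` are used.

```shell
#!/bin/bash
# Clone a SITECONF repo into a scratch directory and switch to a new branch,
# so the work is done away from the copy that Puppet keeps on master.
# Usage: siteconf_scratch_clone <repo-url> <scratch-dir> <branch-name>
siteconf_scratch_clone() {
    local url="$1" dir="$2" branch="$3"
    git clone --quiet "$url" "$dir" || return 1
    git -C "$dir" checkout --quiet -b "$branch"
}
```

For example: `siteconf_scratch_clone https://gitlab.cern.ch/SITECONF/T2_CH_CSCS.git /tmp/T2_CH_CSCS my-change`.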
Monitoring the current installation
LCGTier2/CMSMonitoring
Dino Conciatore's Puppet recipes ( after Dec 2016 )
Full Puppet installation
To simplify the CMS VO box installation here at CSCS, we prepared a Puppet recipe that fully installs and configures the CMS VO box.
VM
cms02.lcg.cscs.ch is a VM hosted on the CSCS VMware cluster.
For any virtual hardware modification, hard restart, etc., just ping us; we have full access to the admin infrastructure.
Hiera config
We use Hiera to keep the Puppet code more dynamic and to look up some key variables.
---
role: role_wlcg_cms_vobox
environment: dev1
cluster: phoenix4
profile_cscs_base::network::interfaces_hash:
  eth0:
    enable_dhcp: true
    hwaddr:
    mtu: '1500'
  eth1:
    ipaddress: 148.187.66.63
    netmask: 255.255.252.0
    mtu: '1500'
    gateway: 148.187.64.2
profile_cscs_base::network::hostname: 'cms02.lcg.cscs.ch'
profile_monitoring::ganglia::gmond_cluster_options:
  cluster_name: 'PHOENIX-services'
  udp_send_channel: [{'bind_hostname' : 'yes', 'port' : '8693', 'host' : 'ganglia.lcg.cscs.ch'}]
  udp_recv_channel: [ { mcast_join: '239.2.11.71', port: '8649', bind: '239.2.11.71' } ]
  tcp_accept_channel: [ port: '8649' ]
# crontab
cron::hourly:
  'cron_proxy':
    command: '/home/phedex/config/T2_CH_CSCS/PhEDEx/tools/cron/cron_proxy.sh'
    user: 'phedex'
    environment:
      - 'MAILTO=root'
      - 'PATH="/usr/bin:/bin:/usr/local/sbin"'
cron::daily:
  'cron_stats':
    command: '/home/phedex/config/T2_CH_CSCS/PhEDEx/tools/cron/cron_stats.sh'
    user: 'phedex'
    environment:
      - 'MAILTO=root'
      - 'PATH="/usr/bin:/bin:/usr/local/sbin"'
# Phedex SITECONF
profile_wlcg_cms_vobox::phedex::myproxy_user: 'cscs_cms02_phedex_xxxxx_user_2017'
# git clone ssh://git@gitlab.cern.ch:7999/SITECONF/T2_CH_CSCS
## Change in hash and use puppet git
profile_wlcg_cms_vobox::phedex::siteconf:
  'T2_CH_CSCS':
    path: '/home/phedex/config/T2_CH_CSCS'
    source: 'ssh://git@gitlab.cern.ch:7999/SITECONF/T2_CH_CSCS'
    owner: 'phedex'
    group: 'phedex'
    #environment: ["HOME=/home/phedex"]
    branch: 'master'
    update: true
  'wlcg_cms_vobox':
    path: '/localgit/profile_wlcg_cms_vobox'
    source: 'ssh://git@git.cscs.ch/puppet_profiles/profile_wlcg_cms_vobox.git'
    branch: 'dev1'
    update: true
  'dmwm_PHEDEX':
    path: '/localgit/dmwm_PHEDEX'
    source: 'https://github.com/dmwm/PHEDEX.git'
    branch: 'master'
    update: true
profile_cscs_base::extra_mounts:
  '/users':
    ensure: 'mounted'
    device: 'nas.lcg.cscs.ch:/ifs/LCG/shared/phoenix4/users'
    atboot: true
    fstype: 'nfs'
    options: 'rw,bg,proto=tcp,rsize=32768,wsize=32768,soft,intr,nfsvers=3'
  '/pnfs':
    ensure: 'mounted'
    device: 'storage02.lcg.cscs.ch:/pnfs'
    atboot: true
    fstype: 'nfs'
    options: 'ro,intr,noac,hard,proto=tcp,nfsvers=3'
profile_cscs_base::ssh_host_dsa_key: >
  ENC[PKCS7,.....]
profile_cscs_base::ssh_host_dsa_key_pub: >
  ENC[PKCS7,.....]
profile_cscs_base::ssh_host_key: >
  ENC[PKCS7,.....]
profile_cscs_base::ssh_host_key_pub: >
  ENC[PKCS7,.....]
profile_cscs_base::ssh_host_rsa_key: >
  ENC[PKCS7,.....]
profile_cscs_base::ssh_host_rsa_key_pub: >
  ENC[PKCS7,.....]
profile_wlcg_base::grid_hostcert: >
  ENC[PKCS7,.....]
profile_wlcg_base::grid_hostkey: >
  ENC[PKCS7,.....]
profile_wlcg_cms_vobox::phedex::id_rsa: >
  ENC[PKCS7,.....]
profile_wlcg_cms_vobox::phedex::id_rsa_pub: >
  ENC[PKCS7,.....]
Puppet module profile_wlcg_cms_vobox
This module configures:
- Firewall
- CVMFS
- Frontier Squid
- Xrootd
- Phedex
Puppet runs automatically every 30 minutes and checks whether these services are running:
- cmsd
- xrootd
- frontier-squid (managed by the included frontier::squid module)
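Between Puppet runs, the same three services can be checked by hand. The sketch below is a rough manual equivalent, not the check Puppet itself performs; the function name `check_vobox_services` and the `STATUS_CMD` override are hypothetical, and the default relies on the SysV `service` command used elsewhere on this page.

```shell
#!/bin/bash
# Services listed above as being watched by Puppet.
CMS_VOBOX_SERVICES="cmsd xrootd frontier-squid"

# Print the state of each service; return non-zero if any is not running.
# STATUS_CMD can be overridden (for instance in tests); default: service.
check_vobox_services() {
    local status_cmd="${STATUS_CMD:-service}" svc rc=0
    for svc in $CMS_VOBOX_SERVICES; do
        if "$status_cmd" "$svc" status >/dev/null 2>&1; then
            echo "$svc: running"
        else
            echo "$svc: NOT running"
            rc=1
        fi
    done
    return $rc
}
```

Run `check_vobox_services` as root on cms02; a non-zero exit status means at least one service needs attention.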
Run the installation
Currently we still use foreman to run the initial OS setup:
hammer host create --name "cms02.lcg.cscs.ch" --hostgroup-id 13 --environment "dev1" --puppet-ca-proxy-id 1 --puppet-proxy-id 1 --puppetclass-ids 697 --operatingsystem-id 9 --medium "Scientific Linux" --partition-table-id 7 --build yes --mac "00:10:3e:66:00:63" --ip=10.10.66.63 --domain-id 4
If you want to reinstall the machine, you just have to set:
hammer host update --name cms02.lcg.cscs.ch --build yes
Phedex certificate
To update an expired certificate, just edit profile_wlcg_cms_vobox::phedex::proxy_cert in the cms02 Hiera file.
The myproxy user (if you have to re-init the myproxy) is located in the script:
https://gitlab.cern.ch/SITECONF/T2_CH_CSCS/edit/master/PhEDEx/tools/cron/cron_proxy.sh
Puppet will automatically pull the repo every 30 min.
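To notice an upcoming expiry before it causes failures, a certificate file can be tested with `openssl x509 -checkend`. This is a minimal sketch; the function name `cert_expires_within` is hypothetical, and the path of the deployed certificate is not stated on this page, so it is passed as an argument.

```shell
#!/bin/bash
# Return 0 (true) if the PEM certificate expires within <seconds>.
# Usage: cert_expires_within <pem-file> <seconds>
# Note: a missing or unreadable file is also reported as "expiring".
cert_expires_within() {
    local pem="$1" seconds="$2"
    # "openssl x509 -checkend N" exits 0 when the certificate is still
    # valid N seconds from now, so the result is inverted here.
    ! openssl x509 -in "$pem" -noout -checkend "$seconds" >/dev/null 2>&1
}
```

For example, `cert_expires_within /path/to/cert.pem $((7*86400)) && echo "renew soon"` warns one week ahead (the path is a placeholder).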
Pre Dino Conciatore's Puppet recipes ( before Dec 2016 )
UMD3
Make sure the UMD3 Yum repos are set up:
yum install http://repository.egi.eu/sw/production/umd/3/sl6/x86_64/updates/umd-release-3.0.1-1.el6.noarch.rpm
Installation
create /etc/squid/squidconf
export FRONTIER_USER=dbfrontier
export FRONTIER_GROUP=dbfrontier
run the installation as described at https://twiki.cern.ch/twiki/bin/view/Frontier/InstallSquid
rpm -Uvh http://frontier.cern.ch/dist/rpms/RPMS/noarch/frontier-release-1.0-1.noarch.rpm
yum install frontier-squid
chkconfig frontier-squid on
create folders on special partition
mkdir /home/dbfrontier/cache
mkdir /home/dbfrontier/log
chown dbfrontier:dbfrontier /home/dbfrontier/cache/
chown dbfrontier:dbfrontier /home/dbfrontier/log/
Configuration
edit /etc/squid/customize.sh
#!/bin/bash
awk --file `dirname $0`/customhelps.awk --source '{
setoption("acl NET_LOCAL src", "10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 148.187.0.0/16")
setoption("cache_mem", "128 MB")
setoptionparameter("cache_dir", 3, "15000")
setoption("cache_log", "/home/dbfrontier/log/cache.log")
setoption("coredump_dir", "/home/dbfrontier/cache/")
setoptionparameter("cache_dir", 2, "/home/dbfrontier/cache/")
setoptionparameter("access_log", 1, "/home/dbfrontier/log/access.log")
setoption("logfile_rotate", "1")
print
}'
start the service and then move the log files
service frontier-squid start
rmdir /var/cache/squid
rmdir /var/log/squid
ln -s /home/dbfrontier/cache /var/cache/squid
ln -s /home/dbfrontier/log /var/log/squid
create /etc/sysconfig/frontier-squid
export LARGE_ACCESS_LOG=500000000
restart the service
service frontier-squid reload
Allow SNMP monitoring of squid service
edit /etc/sysconfig/iptables and add:
-A INPUT -s 128.142.0.0/16 -p udp --dport 3401 -j ACCEPT
-A INPUT -s 188.185.0.0/17 -p udp --dport 3401 -j ACCEPT
service iptables reload
Test squid proxy
Connect, for instance, to ui.lcg.cscs.ch (look into cms01:/home/dbfrontier/log/access.log and cms02:/var/log/squid/access.log after executing):
wget http://frontier.cern.ch/dist/fnget.py
chmod +x fnget.py
./fnget.py --url=http://cmsfrontier.cern.ch:8000/FrontierProd/Frontier --sql="select 1 from dual"
export http_proxy=http://cms01.lcg.cscs.ch:3128
./fnget.py --url=http://cmsfrontier.cern.ch:8000/FrontierProd/Frontier --sql="select 1 from dual"
export http_proxy=http://cms02.lcg.cscs.ch:3128
./fnget.py --url=http://cmsfrontier.cern.ch:8000/FrontierProd/Frontier --sql="select 1 from dual"
Expected Output :
Using Frontier URL: http://cmsfrontier.cern.ch:8000/FrontierProd/Frontier
Query: select 1 from dual
Decode results: True
Refresh cache: False
Frontier Request:
http://cmsfrontier.cern.ch:8000/FrontierProd/Frontier/type=frontier_request:1:DEFAULT&encoding=BLOBzip&p1=eNorTs1JTS5RMFRIK8rPVUgpTcwBAD0rBmw_
Query started: 02/29/16 22:32:47 CET
Query ended: 02/29/16 22:32:47 CET
Query time: 0.00365591049194 [seconds]
Query result:
eF5jY2BgYDRkA5JsfqG+Tq5B7GxgEXYAGs0CVA==
Fields:
1 NUMBER
Records:
1
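The per-proxy test above can be wrapped in a loop so that every squid is queried in the same way. This is a sketch only: the function name `test_squid_proxies` and the `FNGET` override are hypothetical, and it assumes fnget.py has already been downloaded as shown above.

```shell
#!/bin/bash
# Frontier server URL used in the manual test above.
FRONTIER_URL="http://cmsfrontier.cern.ch:8000/FrontierProd/Frontier"

# Run the same Frontier query through each squid host given as argument.
# FNGET is the command to run (default: ./fnget.py in the current directory).
# Returns non-zero if the query fails through at least one proxy.
test_squid_proxies() {
    local fnget="${FNGET:-./fnget.py}" host rc=0
    for host in "$@"; do
        echo "== via $host =="
        if ! http_proxy="http://$host:3128" "$fnget" \
                --url="$FRONTIER_URL" --sql="select 1 from dual"; then
            echo "query through $host FAILED"
            rc=1
        fi
    done
    return $rc
}
```

Usage: `test_squid_proxies cms01.lcg.cscs.ch cms02.lcg.cscs.ch`, then compare each output with the expected output above and check the squid access logs.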
Xrootd
Installation
Install Xrootd on SL6 for dCache according to
rpm -Uhv http://repo.grid.iu.edu/osg-el6-release-latest.rpm
install packages and copy host certificate/key
yum install --disablerepo='*' --enablerepo=osg-contrib,osg-testing cms-xrootd-dcache
yum install xrootd-cmstfc
yum
mount cvmfs and make sure that the file /cvmfs/cms.cern.ch/SITECONF/local/PhEDEx/storage.xml is available; that's the cvmfs version of https://gitlab.cern.ch/SITECONF/T2_CH_CSCS/raw/master/PhEDEx/storage.xml
chkconfig xrootd on
chkconfig cmsd on
service xrootd start
service cmsd start
Configuration
Redirector Setup (from June 2013, dCache 2.10)
Detailed reporting requires a plugin on the dCache side
Verify that /pnfs is properly mounted both at:
- runtime:
$ df -h /pnfs
Filesystem                   Size  Used Avail Use% Mounted on
storage02.lcg.cscs.ch:/pnfs  1.0E  1.7P 1023P   1% /pnfs
- boot time:
$ grep pnfs /etc/fstab
storage02.lcg.cscs.ch:/pnfs /pnfs nfs rw,intr,noac,hard,proto=tcp,nfsvers=3 0 0
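The runtime and boot-time checks above can be combined in one small script. This is a sketch under the assumption that /proc/mounts reflects the runtime state; the function names `mount_listed` and `check_pnfs` are hypothetical.

```shell
#!/bin/bash
# Return 0 if <mountpoint> appears as a mount target in a mounts table.
# The table defaults to /proc/mounts (runtime); /etc/fstab (boot time) has
# the same layout, with the mount point in the second field.
mount_listed() {
    local mountpoint="$1" table="${2:-/proc/mounts}"
    awk -v mp="$mountpoint" \
        '$1 !~ /^#/ && $2 == mp { found = 1 } END { exit !found }' "$table"
}

# Check /pnfs both at runtime and in the boot-time configuration.
check_pnfs() {
    mount_listed /pnfs /proc/mounts || { echo "/pnfs is not mounted"; return 1; }
    mount_listed /pnfs /etc/fstab || { echo "/pnfs is missing from /etc/fstab"; return 1; }
    echo "/pnfs OK"
}
```

`check_pnfs` prints a single status line and returns non-zero when either check fails.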
edit /etc/xrootd/xrootd-clustered.cfg
xrd.port 1094
all.role server
all.sitename T2_CH_CSCS
all.manager any xrootd-cms.infn.it+ 1213
oss.localroot /pnfs/lcg.cscs.ch/cms/trivcat/
xrootd.redirect storage01.lcg.cscs.ch:1095 /
all.export / nostage
cms.allow host *
xrootd.trace emsg login stall redirect
ofs.trace none
xrd.trace conn
cms.trace all
oss.namelib /usr/lib64/libXrdCmsTfc.so file:/cvmfs/cms.cern.ch/SITECONF/local/PhEDEx/storage.xml?protocol=direct
xrootd.seclib /usr/lib64/libXrdSec.so
xrootd.fslib /usr/lib64/libXrdOfs.so
all.adminpath /var/run/xrootd
all.pidpath /var/run/xrootd
cms.delay startup 10
cms.fxhold 60s
xrd.report xrootd.t2.ucsd.edu:9931 every 60s all sync
xrootd.monitor all auth flush io 60s ident 5m mbuff 8k rbuff 4k rnums 3 window 10s dest files io info user redir xrootd.t2.ucsd.edu:9930
corresponding dCache door configuration
[xrootd-CMS-Domain2]
[xrootd-CMS-Domain2/xrootd]
loginBroker=srm-LoginBroker
useGPlazmaAuthorizationModule=true
xrootdRootPath=/pnfs/lcg.cscs.ch/cms/trivcat/
xrootdAllowedReadPaths=/pnfs/lcg.cscs.ch/cms/trivcat/
xrootdAuthNPlugin=gsi
xrootdPort=1095
xrootdMoverTimeout=28800000
xrootdThreads=400
Allow Xrootd requests
edit /etc/sysconfig/iptables and add:
-A INPUT -p tcp -m tcp --dport 1094 -m state --state NEW -j ACCEPT
Test Xrootd file access
From a UI machine, for instance from PSI or LXPLUS :
xrdcp --debug 2 root://cms01.lcg.cscs.ch//store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root /dev/null -f
xrdcp --debug 2 root://cms02.lcg.cscs.ch//store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root /dev/null -f
xrdcp --debug 2 root://xrootd-cms.infn.it//store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root /dev/null -f
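The three transfers above differ only in the host name, so they can be run in a loop that prints one result line per host. This is a sketch; the function name `test_xrootd_hosts` and the `XRDCP` override are hypothetical, and a valid grid proxy is assumed to be in place.

```shell
#!/bin/bash
# SAM test file used in the commands above.
SAM_FILE="/store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root"

# Try to read SAM_FILE through each host and report OK or FAILED.
# XRDCP is the copy command (default: xrdcp). Output of the copy itself is
# discarded; rerun by hand with "--debug 2" to investigate a failure.
test_xrootd_hosts() {
    local xrdcp="${XRDCP:-xrdcp}" host rc=0
    for host in "$@"; do
        if "$xrdcp" -f "root://$host/$SAM_FILE" /dev/null >/dev/null 2>&1; then
            echo "$host: OK"
        else
            echo "$host: FAILED"
            rc=1
        fi
    done
    return $rc
}
```

Usage: `test_xrootd_hosts cms01.lcg.cscs.ch cms02.lcg.cscs.ch xrootd-cms.infn.it`.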
Read https://twiki.cern.ch/twiki/bin/view/CMSPublic/PhedexAdminDocsInstallation#PhEDEx_Agent_Installation
yum install zsh perl-ExtUtils-Embed libXmu libXpm tcl tk compat-libstdc++-33 git perl-XML-LibXML
adduser phedex
yum install httpd
mkdir /var/www/html/phedexlog
chown phedex:phedex /var/www/html/phedexlog/
su - phedex
mkdir -p state log sw gridcert config
chmod 700 gridcert
export sw=$PWD/sw
myarch=slc6_amd64_gcc461
wget -O $sw/bootstrap.sh http://cmsrep.cern.ch/cmssw/comp/bootstrap.sh
sh -x $sw/bootstrap.sh setup -path $sw -arch $myarch -repository comp 2>&1|tee $sw/bootstrap_$myarch.log
source $sw/$myarch/external/apt/*/etc/profile.d/init.sh
apt-get update
apt-cache search PHEDEX|grep PHEDEX
version=4.1.3-comp3
apt-get install cms+PHEDEX+$version
unlink PHEDEX
ln -s /home/phedex/sw/$myarch/cms/PHEDEX/$version/ PHEDEX
reset environment (zlib required for git is missing in slc6_amd64_gcc461)
cd config
# pre gitlab era
# git clone https://dmeister@git.cern.ch/reps/siteconf
git clone https://dconciat@gitlab.cern.ch/SITECONF/T2_CH_CSCS.git
# git clone https://mgila@gitlab.cern.ch/SITECONF/T2_CH_CSCS.git
# git clone https://jpata@gitlab.cern.ch/SITECONF/T2_CH_CSCS.git
# git clone https://dpetrusi@gitlab.cern.ch/SITECONF/T2_CH_CSCS.git
cd ..
run ./config/siteconf/T2_CH_CSCS/PhEDEx/FixHostnames.sh to fix hostnames in PhEDEx config
get helper scripts from SVN
mkdir svn-sandbox
cd svn-sandbox
svn co https://svn.cscs.ch/LCG/VO-specific/cms/phedex
cd ..
cp -R svn-sandbox/phedex/init.d ./
cd init.d
rm -Rf .svn
ln -s phedex_Prod phedex_Debug
ln -s phedex_Prod phedex_Dev
and copy host certificates
mkdir .globus
chmod 700 .globus
cp /etc/grid-security/hostcert.pem .globus/usercert.pem
cp /etc/grid-security/hostkey.pem .globus/userkey.pem
chown -R phedex:phedex .globus
chmod 600 .globus/*
and setup proxy and DB logins
touch config/DBParam.CSCS
chmod 600 config/DBParam.CSCS
# write secret config to file
Cron scripts/cron_restart.sh
#!/bin/bash
HOST=$(hostname)
HOST=${HOST%%\.*}
/home/phedex/init.d/phedex_Debug download-$HOST stop
/home/phedex/init.d/phedex_Debug download-$HOST start
/home/phedex/init.d/phedex_Prod download-$HOST stop
/home/phedex/init.d/phedex_Prod download-$HOST start
Cron scripts/cron_stats.sh
#!/bin/bash
HOST=$(hostname)
HOST=${HOST%%\.*}
SUMMARYFILE=/var/www/html/phedexlog/statistics.$(date +%Y%m%d-%H%M).txt
#source /etc/profile.d/grid-env.sh
source /home/phedex/PHEDEX/etc/profile.d/init.sh
echo -e generated on `date` "\n------------------------" > $SUMMARYFILE
echo "Prod:" >> $SUMMARYFILE
/home/phedex/init.d/phedex_Prod status >> $SUMMARYFILE
echo "Debug:" >> $SUMMARYFILE
/home/phedex/init.d/phedex_Debug status >> $SUMMARYFILE
/home/phedex/PHEDEX/Utilities/InspectPhedexLog -c 300 -es "-12 hours" /home/phedex/log/Prod/download-$HOST /home/phedex/log/Debug/download-$HOST >> $SUMMARYFILE 2>/dev/null
Cron scripts/cron_proxy.sh
#!/bin/bash
HOST=$(hostname)
HOST=${HOST%%\.*}
unset X509_USER_PROXY
voms-proxy-init
myproxy-get-delegation -s myproxy.cern.ch -v -l cscs_phedex_${HOST}_dm_2014 -a /home/phedex/gridcert/proxy.cert -o /home/phedex/gridcert/proxy.cert
export X509_USER_PROXY=/home/phedex/gridcert/proxy.cert
voms-proxy-init -noregen -voms cms
Cron scripts/cron_spacemon.sh
#!/bin/bash
cd /lhome/phedex/spacemon
unset PERL5LIB
source /lhome/phedex/PHEDEX/etc/profile.d/init.sh
export PHEDEX_ROOT=/lhome/phedex/spacemon/PHEDEX
export PERL5LIB=$PHEDEX_ROOT/perl_lib:$PERL5LIB
export PATH=$PHEDEX_ROOT/Utilities:$PHEDEX_ROOT/Utilities/testSpace:$PATH
export X509_USER_PROXY=/lhome/phedex/gridcert/proxy.cert
DUMPDATE=$(date '+%Y-%m-%d')
wget --quiet http://ganglia.lcg.cscs.ch/dcache/dcache-cms-dump-$DUMPDATE.xml.bz2 -O dumps/dcache-cms-dump-$DUMPDATE.xml.bz2
if [ $? -gt 0 ]; then
echo "no dCache dump available for $DUMPDATE; abort..."
exit 1
fi
spacecount dcache --dump dumps/dcache-cms-dump-$DUMPDATE.xml.bz2 --node T2_CH_CSCS
if [ $? -gt 0 ]; then
echo "uploading dump for $DUMPDATE failed; keeping the file..."
exit 1
fi
rm dumps/dcache-cms-dump-$DUMPDATE.xml.bz2
Cron scripts/cron_clean.sh
CMS dirs in /pnfs to be regularly cleaned up by a set of crons
#!/bin/bash
# 2 weeks: store/unmerged, store/temp, store/backfill/1, store/backfill/2
find /pnfs/lcg.cscs.ch/cms/trivcat/store/unmerged/ -mindepth 1 -mtime +14 -delete
find /pnfs/lcg.cscs.ch/cms/trivcat/store/temp/ -mindepth 1 -mtime +14 -delete
find /pnfs/lcg.cscs.ch/cms/trivcat/store/backfill/1/ -mindepth 1 -mtime +14 -delete
find /pnfs/lcg.cscs.ch/cms/trivcat/store/backfill/2/ -mindepth 1 -mtime +14 -delete
# 1 week: store/temp/user
find /pnfs/lcg.cscs.ch/cms/trivcat/store/temp/user/ -mindepth 1 -mtime +7 -delete
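Before enabling or changing a cleanup rule, it can help to list what would be removed. The sketch below uses the same `find` predicate as the cron script but prints instead of deleting; the function name `preview_cleanup` is hypothetical.

```shell
#!/bin/bash
# List what a cleanup pass would delete, without deleting anything.
# Usage: preview_cleanup <directory> <days>
preview_cleanup() {
    local dir="$1" days="$2"
    # Same predicate as the cron script, with -print in place of -delete.
    find "$dir" -mindepth 1 -mtime +"$days" -print
}
```

For example, `preview_cleanup /pnfs/lcg.cscs.ch/cms/trivcat/store/unmerged/ 14 | wc -l` counts the entries that the two-week rule would remove.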
Cron final setups
chmod +x scripts/*.sh
run config/siteconf/T2_CH_CSCS/PhEDEx/CreateLogrotConf.pl
crontab -e
05 0 * * * /usr/sbin/logrotate -s /home/phedex/state/logrotate.state /home/phedex/config/logrotate.conf
13 5,17 * * * /home/phedex/scripts/cron_restart.sh
*/15 * * * * /home/phedex/scripts/cron_stats.sh
0 */4 * * * /home/phedex/scripts/cron_proxy.sh
0 7 * * * /lhome/phedex/scripts/cron_spacemon.sh
and for root:
crontab -l
30 2 * * * /lhome/phedex/scripts/cron_clean.sh
X509 Proxy into MyProxy at CERN
setup myproxy service from a UI machine
voms-proxy-init -voms cms
myproxy-init -s myproxy.cern.ch -l cscs_phedex_cms0X_dm_2014 -x -R "/DC=com/DC=quovadisglobal/DC=grid/DC=switch/DC=hosts/C=CH/ST=Zuerich/L=Zuerich/O=ETH Zuerich/CN=cms0X.lcg.cscs.ch" -c 6400
and copy the initial proxy to gridcert/proxy.cert
PhEDEx locally generated stats
Make an SSH RSA key pair:
[phedex@cms02 ~]$ ssh-keygen -C "To push the cms02 phedex stats into mysql@ganglia:/var/www/html/ganglia/phedex/" -t rsa
Make a phedex cron job that pushes:
[phedex@cms02 ~]$ ll /var/www/html/ganglia/phedex/
total 588
-rw-r--r-- 1 phedex phedex 15043 Feb 1 23:45 statistics.%Y%m01-%H%M.txt
-rw-r--r-- 1 phedex phedex 17526 Feb 2 23:45 statistics.%Y%m02-%H%M.txt
-rw-r--r-- 1 phedex phedex 14815 Feb 3 23:45 statistics.%Y%m03-%H%M.txt
...
into
mysql@ganglia.lcg.cscs.ch:/var/www/html/ganglia/phedex/
Check if you can browse them :
http://ganglia.lcg.cscs.ch/ganglia/phedex/