CMS Box

Real machine name cms02.lcg.cscs.ch

Firewall requirements

port      open to                         reason
3128/tcp  WN                              access from WNs to FroNtier squid proxy
1094/tcp  *                               access to Xrootd redirector that forwards requests to the native dCache Xrootd door
3401/udp  128.142.0.0/16, 188.185.0.0/17  central SNMP monitoring of FroNtier service
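
The rows of this table map onto iptables rules of the form used further down this page for the SNMP port. A minimal sketch of that mapping follows; the helper name fw_rule is hypothetical, and the WN source network is not stated on this page, so the source is left as an optional parameter.

```shell
#!/bin/bash
# Hypothetical helper: print one iptables-save style ACCEPT rule.
# Usage: fw_rule <proto> <port> [source-cidr]
fw_rule() {
    local proto=$1 port=$2 src=${3:-}
    if [ -n "$src" ]; then
        echo "-A INPUT -s $src -p $proto --dport $port -j ACCEPT"
    else
        # No source given: open the port to everybody ("*" in the table).
        echo "-A INPUT -p $proto --dport $port -j ACCEPT"
    fi
}

# Rules for the 1094/tcp and 3401/udp rows of the table:
fw_rule tcp 1094
fw_rule udp 3401 128.142.0.0/16
fw_rule udp 3401 188.185.0.0/17
```

The printed lines still have to be placed in /etc/sysconfig/iptables by hand, before the COMMIT line of the filter table.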

For further, but older, information see CmsVObox.



Setup

3rd party tasks

CMS GitLab SITECONF

Be aware of https://twiki.cern.ch/twiki/bin/view/CMSPublic/SiteConfInGitlab and always keep both https://gitlab.cern.ch/SITECONF/T2_CH_CSCS/ and https://gitlab.cern.ch/SITECONF/T2_CH_CSCS_HPC updated. Clone the repos into a /tmp or /opt directory, create a new branch and try your changes there, because Puppet constantly checks out the master branch!
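
The clone-and-branch workflow can be sketched as below. This is only an illustration: the function name siteconf_scratch_branch and the branch name in the usage comment are hypothetical, and the GIT variable is an assumption added so that the git binary can be replaced for a dry run.

```shell
#!/bin/bash
# Sketch: work on a SITECONF repo in a scratch clone under /tmp, on its own
# branch, so the checkout of master that Puppet maintains is left untouched.
GIT=${GIT:-git}   # assumption: overridable for dry runs

siteconf_scratch_branch() {
    local repo_url=$1 branch=$2 workdir
    workdir=$(mktemp -d /tmp/siteconf.XXXXXX) || return 1
    # Send git's own messages to stderr so stdout carries only the path.
    "$GIT" clone "$repo_url" "$workdir/repo" >&2 || return 1
    ( cd "$workdir/repo" && "$GIT" checkout -b "$branch" >&2 ) || return 1
    echo "$workdir/repo"
}

# Usage (branch name is an example):
#   cd "$(siteconf_scratch_branch https://gitlab.cern.ch/SITECONF/T2_CH_CSCS.git my-test-change)"
```

Merging the tested branch back into master is left out on purpose; that step depends on how the site reviews SITECONF changes.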

Monitoring the current installation

LCGTier2/CMSMonitoring

Dino Conciatore's Puppet recipes ( after Dec 2016 )

Full Puppet installation

To simplify the CMS VO box installation here at CSCS, we prepared a Puppet recipe that fully installs and configures it.

Hiera config

We use Hiera to keep the Puppet code more dynamic and to look up some key variables.

---
role: role_wlcg_cms_vobox
environment: dev1
cluster: phoenix4

profile_cscs_base::network::interfaces_hash:
  eth0:
    enable_dhcp: true
    hwaddr:
    mtu: '1500'
  eth1:
    ipaddress: 148.187.66.63
    netmask: 255.255.252.0
    mtu: '1500'
    gateway: 148.187.64.2

profile_cscs_base::network::hostname: 'cms02.lcg.cscs.ch'

profile_monitoring::ganglia::gmond_cluster_options:
  cluster_name: 'PHOENIX-services'
  udp_send_channel: [{'bind_hostname' : 'yes', 'port' : '8693', 'host' : 'ganglia.lcg.cscs.ch'}]
  udp_recv_channel:   [ { mcast_join: '239.2.11.71', port: '8649', bind: '239.2.11.71' } ]
  tcp_accept_channel: [ port: '8649' ]


# crontab
cron::hourly:
  'cron_proxy':
    command: '/home/phedex/config/T2_CH_CSCS/PhEDEx/tools/cron/cron_proxy.sh'
    user: 'phedex'
    environment:
      - 'MAILTO=root'
      - 'PATH="/usr/bin:/bin:/usr/local/sbin"'
cron::daily:
  'cron_stats':
    command: '/home/phedex/config/T2_CH_CSCS/PhEDEx/tools/cron/cron_stats.sh'
    user: 'phedex'
    environment:
      - 'MAILTO=root'
      - 'PATH="/usr/bin:/bin:/usr/local/sbin"'
# Phedex SITECONF
profile_wlcg_cms_vobox::phedex::myproxy_user: 'cscs_cms02_phedex_xxxxx_user_2017'
# git clone ssh://git@gitlab.cern.ch:7999/SITECONF/T2_CH_CSCS
## Change in hash and use puppet git
profile_wlcg_cms_vobox::phedex::siteconf:
  'T2_CH_CSCS':
    path: '/home/phedex/config/T2_CH_CSCS'
    source: 'ssh://git@gitlab.cern.ch:7999/SITECONF/T2_CH_CSCS'
    owner: 'phedex'
    group: 'phedex'
    #environment: ["HOME=/home/phedex"]
    branch: 'master'
    update: true
  'wlcg_cms_vobox':
    path: '/localgit/profile_wlcg_cms_vobox'
    source: 'ssh://git@git.cscs.ch/puppet_profiles/profile_wlcg_cms_vobox.git'
    branch: 'dev1'
    update: true
  'dmwm_PHEDEX':
    path: '/localgit/dmwm_PHEDEX'
    source: 'https://github.com/dmwm/PHEDEX.git'
    branch: 'master'
    update: true

profile_cscs_base::extra_mounts:
  '/users':
    ensure:   'mounted'
    device:   'nas.lcg.cscs.ch:/ifs/LCG/shared/phoenix4/users'
    atboot:   true
    fstype:   'nfs'
    options:  'rw,bg,proto=tcp,rsize=32768,wsize=32768,soft,intr,nfsvers=3'
  '/pnfs':
    ensure:   'mounted'
    device:   'storage02.lcg.cscs.ch:/pnfs'
    atboot:   true
    fstype:   'nfs'
    options:  'ro,intr,noac,hard,proto=tcp,nfsvers=3'

profile_cscs_base::ssh_host_dsa_key: >
      ENC[PKCS7,.....]
profile_cscs_base::ssh_host_dsa_key_pub: >
      ENC[PKCS7,.....]
profile_cscs_base::ssh_host_key: >
      ENC[PKCS7,.....]
profile_cscs_base::ssh_host_key_pub: >
      ENC[PKCS7,.....]
profile_cscs_base::ssh_host_rsa_key: >
      ENC[PKCS7,.....]
profile_cscs_base::ssh_host_rsa_key_pub: >
      ENC[PKCS7,.....]
profile_wlcg_base::grid_hostcert: >
      ENC[PKCS7,.....]
profile_wlcg_base::grid_hostkey: >
 ENC[PKCS7,.....]
profile_wlcg_cms_vobox::phedex::id_rsa: >
    ENC[PKCS7,.....]
profile_wlcg_cms_vobox::phedex::id_rsa_pub: >
    ENC[PKCS7,.....]
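
To see quickly which role, environment or cluster a node file declares without running Puppet, a plain text lookup is often enough. The sketch below is not a YAML parser: it only matches unindented "key: value" lines, and the function name hiera_top_key and the file name in the usage comment are hypothetical.

```shell
#!/bin/bash
# Print the value of a top-level scalar key from a Hiera YAML file.
# Only handles unindented "key: value" lines; nested hashes are ignored.
hiera_top_key() {
    local key=$1 file=$2
    awk -v k="$key" '
        index($0, k ": ") == 1 {            # line starts with "<key>: "
            print substr($0, length(k) + 3) # text after the colon and space
            exit
        }' "$file"
}

# Usage (file name is an example):
#   hiera_top_key environment cms02.lcg.cscs.ch.yaml
```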

Puppet module profile_wlcg_cms_vobox

This module configures:

  • Firewall
  • CVMFS
  • Frontier Squid
  • Xrootd
  • Phedex

Puppet runs automatically every 30 minutes and checks that the following services are running:

  • cmsd
  • xrootd
  • frontier-squid (managed by the included frontier::squid module)
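
The same check can be done by hand between Puppet runs. A minimal sketch, assuming the SL6-style "service NAME status" command, which returns a non-zero exit code when the service is not running; the function name check_services is hypothetical.

```shell
#!/bin/bash
# Report the state of each named service; return 1 if any is not running.
check_services() {
    local rc=0 svc
    for svc in "$@"; do
        if service "$svc" status >/dev/null 2>&1; then
            echo "$svc: running"
        else
            echo "$svc: NOT running"
            rc=1
        fi
    done
    return $rc
}

# Usage:
#   check_services cmsd xrootd frontier-squid
```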

Run the installation

Currently we still use Foreman to run the initial OS setup:

hammer host create --name "cms02.lcg.cscs.ch" --hostgroup-id 13 --environment "dev1" --puppet-ca-proxy-id 1 --puppet-proxy-id 1 --puppetclass-ids 697 --operatingsystem-id 9 --medium "Scientific Linux" --partition-table-id 7 --build yes --mac "00:10:3e:66:00:63" --ip=10.10.66.63 --domain-id 4

To reinstall the machine, you just have to set:

hammer host update --name cms02.lcg.cscs.ch --build yes

Phedex certificate

To update an expired certificate just edit profile_wlcg_cms_vobox::phedex::proxy_cert in the cms02 hiera file.

The myproxy user name (needed if you have to re-init the myproxy) is set in the script: https://gitlab.cern.ch/SITECONF/T2_CH_CSCS/edit/master/PhEDEx/tools/cron/cron_proxy.sh

Puppet will automatically pull the repo every 30 min.

Pre Dino Conciatore's Puppet recipes ( before Dec 2016 )

UMD3

Make sure the UMD3 Yum repositories are set up:

yum install http://repository.egi.eu/sw/production/umd/3/sl6/x86_64/updates/umd-release-3.0.1-1.el6.noarch.rpm

FroNtier

Installation

create /etc/squid/squidconf

export FRONTIER_USER=dbfrontier
export FRONTIER_GROUP=dbfrontier    

run installation as described at https://twiki.cern.ch/twiki/bin/view/Frontier/InstallSquid

rpm -Uvh http://frontier.cern.ch/dist/rpms/RPMS/noarch/frontier-release-1.0-1.noarch.rpm
yum install frontier-squid
chkconfig frontier-squid on

create folders on special partition

mkdir /home/dbfrontier/cache
mkdir /home/dbfrontier/log
chown dbfrontier:dbfrontier /home/dbfrontier/cache/
chown dbfrontier:dbfrontier /home/dbfrontier/log/

Configuration

edit /etc/squid/customize.sh

#!/bin/bash
awk --file `dirname $0`/customhelps.awk --source '{
setoption("acl NET_LOCAL src", "10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 148.187.0.0/16")
setoption("cache_mem", "128 MB")
setoptionparameter("cache_dir", 3, "15000")                                             
setoption("cache_log", "/home/dbfrontier/log/cache.log")
setoption("coredump_dir", "/home/dbfrontier/cache/")
setoptionparameter("cache_dir", 2, "/home/dbfrontier/cache/")
setoptionparameter("access_log", 1, "/home/dbfrontier/log/access.log")
setoption("logfile_rotate", "1")
print
}'

start the service and then move the log files

service frontier-squid start
rmdir /var/cache/squid
rmdir /var/log/squid
ln -s /home/dbfrontier/cache /var/cache/squid
ln -s /home/dbfrontier/log /var/log/squid

create /etc/sysconfig/frontier-squid

export LARGE_ACCESS_LOG=500000000

restart the service

service frontier-squid reload

Allow SNMP monitoring of squid service

edit /etc/sysconfig/iptables, add:

-A INPUT -s 128.142.0.0/16 -p udp --dport 3401 -j ACCEPT
-A INPUT -s 188.185.0.0/17 -p udp --dport 3401 -j ACCEPT

service iptables reload

Test squid proxy

Connect, for instance, to ui.lcg.cscs.ch and run the commands below; afterwards look into cms01:/home/dbfrontier/log/access.log and cms02:/var/log/squid/access.log.

wget http://frontier.cern.ch/dist/fnget.py
chmod +x fnget.py
./fnget.py --url=http://cmsfrontier.cern.ch:8000/FrontierProd/Frontier --sql="select 1 from dual"
export http_proxy=http://cms01.lcg.cscs.ch:3128
./fnget.py --url=http://cmsfrontier.cern.ch:8000/FrontierProd/Frontier --sql="select 1 from dual"
export http_proxy=http://cms02.lcg.cscs.ch:3128
./fnget.py --url=http://cmsfrontier.cern.ch:8000/FrontierProd/Frontier --sql="select 1 from dual"

Expected Output :

Using Frontier URL:  http://cmsfrontier.cern.ch:8000/FrontierProd/Frontier
Query:  select 1 from dual
Decode results:  True
Refresh cache:  False

Frontier Request:
http://cmsfrontier.cern.ch:8000/FrontierProd/Frontier/type=frontier_request:1:DEFAULT&encoding=BLOBzip&p1=eNorTs1JTS5RMFRIK8rPVUgpTcwBAD0rBmw_

Query started:  02/29/16 22:32:47 CET
Query ended:  02/29/16 22:32:47 CET
Query time: 0.00365591049194 [seconds]

Query result:

   eF5jY2BgYDRkA5JsfqG+Tq5B7GxgEXYAGs0CVA==

Fields: 
     1     NUMBER
Records:
     1
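
When the three fnget.py calls are scripted, the output can be checked for the final "Records:" section instead of being read by eye. The sketch below only assumes the output layout shown above; the function name fnget_has_records is hypothetical.

```shell
#!/bin/bash
# Read fnget.py output on stdin; succeed only if at least one non-empty
# line follows the "Records:" header.
fnget_has_records() {
    awk '
        seen && NF  { rows++ }
        /^Records:/ { seen = 1 }
        END         { exit (rows > 0 ? 0 : 1) }
    '
}

# Usage:
#   ./fnget.py --url=http://cmsfrontier.cern.ch:8000/FrontierProd/Frontier \
#       --sql="select 1 from dual" | fnget_has_records && echo OK
```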

Xrootd

Installation

Install Xrootd on SL6 for dCache according to

rpm -Uhv http://repo.grid.iu.edu/osg-el6-release-latest.rpm

install packages and copy host certificate/key

yum install --disablerepo='*' --enablerepo=osg-contrib,osg-testing cms-xrootd-dcache
yum install xrootd-cmstfc
yum 

mount CVMFS and make sure that the file /cvmfs/cms.cern.ch/SITECONF/local/PhEDEx/storage.xml is available; it is the CVMFS version of https://gitlab.cern.ch/SITECONF/T2_CH_CSCS/raw/master/PhEDEx/storage.xml

chkconfig xrootd on
chkconfig cmsd on
service xrootd start
service cmsd start

Configuration

Redirector Setup (from June 2013, dCache 2.10)

Detailed reporting requires a plugin on the dCache side

Verify that /pnfs is properly mounted both at:

  • runtime:
    $ df -h /pnfs
    Filesystem                   Size  Used Avail Use% Mounted on
    storage02.lcg.cscs.ch:/pnfs  1.0E  1.7P 1023P   1% /pnfs
  • boot time:
    $ grep pnfs /etc/fstab
    storage02.lcg.cscs.ch:/pnfs /pnfs nfs rw,intr,noac,hard,proto=tcp,nfsvers=3 0 0
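
Both checks can be scripted, since /proc/mounts and /etc/fstab both carry the mount point in their second column. A minimal sketch; the function name has_mount is hypothetical.

```shell
#!/bin/bash
# Succeed if <mountpoint> appears as the second field of a non-comment line
# in <table> (for example /proc/mounts or /etc/fstab).
has_mount() {
    local mountpoint=$1 table=$2
    awk -v mp="$mountpoint" '
        $1 !~ /^#/ && $2 == mp { found = 1 }
        END { exit !found }
    ' "$table"
}

# Usage:
#   has_mount /pnfs /proc/mounts && has_mount /pnfs /etc/fstab && echo "/pnfs OK"
```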

edit /etc/xrootd/xrootd-clustered.cfg

xrd.port 1094
all.role server
all.sitename T2_CH_CSCS
all.manager any xrootd-cms.infn.it+ 1213

oss.localroot /pnfs/lcg.cscs.ch/cms/trivcat/
xrootd.redirect storage01.lcg.cscs.ch:1095 /

all.export / nostage

cms.allow host *

xrootd.trace emsg login stall redirect
ofs.trace none
xrd.trace conn
cms.trace all

oss.namelib /usr/lib64/libXrdCmsTfc.so file:/cvmfs/cms.cern.ch/SITECONF/local/PhEDEx/storage.xml?protocol=direct

xrootd.seclib /usr/lib64/libXrdSec.so
xrootd.fslib /usr/lib64/libXrdOfs.so
all.adminpath /var/run/xrootd
all.pidpath /var/run/xrootd

cms.delay startup 10
cms.fxhold 60s

xrd.report xrootd.t2.ucsd.edu:9931 every 60s all sync
xrootd.monitor all auth flush io 60s ident 5m mbuff 8k rbuff 4k rnums 3 window 10s dest files io info user redir xrootd.t2.ucsd.edu:9930

corresponding dCache door configuration

[xrootd-CMS-Domain2]
[xrootd-CMS-Domain2/xrootd]
loginBroker=srm-LoginBroker
useGPlazmaAuthorizationModule=true
xrootdRootPath=/pnfs/lcg.cscs.ch/cms/trivcat/
xrootdAllowedReadPaths=/pnfs/lcg.cscs.ch/cms/trivcat/
xrootdAuthNPlugin=gsi
xrootdPort=1095
xrootdMoverTimeout=28800000
xrootdThreads=400

Allow Xrootd requests

edit /etc/sysconfig/iptables, add:

-A INPUT -p tcp -m tcp --dport 1094 -m state --state NEW -j ACCEPT

Test Xrootd file access

From a UI machine, for instance at PSI or on LXPLUS:

xrdcp --debug 2 root://cms01.lcg.cscs.ch//store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root /dev/null -f
xrdcp --debug 2 root://cms02.lcg.cscs.ch//store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root /dev/null -f
xrdcp --debug 2 root://xrootd-cms.infn.it//store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root /dev/null -f
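
The three commands differ only in the host, so they can be wrapped in a loop that reports one line per redirector. This is a sketch that reuses the xrdcp options and SAM file from above; the function name xrd_read_test is hypothetical, and the xrdcp debug output is discarded here, so rerun the plain command when a host fails.

```shell
#!/bin/bash
# SAM test file used in the commands above.
SAM_FILE=/store/mc/SAM/GenericTTbar/GEN-SIM-RECO/CMSSW_5_3_1_START53_V5-v1/0013/CE4D66EB-5AAE-E111-96D6-003048D37524.root

# Try to read SAM_FILE through each given host; return 1 if any read fails.
xrd_read_test() {
    local host rc=0
    for host in "$@"; do
        if xrdcp --debug 2 "root://${host}/${SAM_FILE}" /dev/null -f >/dev/null 2>&1; then
            echo "$host: OK"
        else
            echo "$host: FAILED"
            rc=1
        fi
    done
    return $rc
}

# Usage:
#   xrd_read_test cms01.lcg.cscs.ch cms02.lcg.cscs.ch xrootd-cms.infn.it
```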

PhEDEx on cms02

Read https://twiki.cern.ch/twiki/bin/view/CMSPublic/PhedexAdminDocsInstallation#PhEDEx_Agent_Installation

yum install zsh perl-ExtUtils-Embed libXmu libXpm tcl tk compat-libstdc++-33 git perl-XML-LibXML
adduser phedex
yum install httpd
mkdir /var/www/html/phedexlog
chown phedex:phedex /var/www/html/phedexlog/
su - phedex

mkdir -p state log sw gridcert config
chmod 700 gridcert
export sw=$PWD/sw
myarch=slc6_amd64_gcc461
wget -O $sw/bootstrap.sh http://cmsrep.cern.ch/cmssw/comp/bootstrap.sh
sh -x $sw/bootstrap.sh setup -path $sw -arch $myarch -repository comp 2>&1|tee $sw/bootstrap_$myarch.log
source $sw/$myarch/external/apt/*/etc/profile.d/init.sh
apt-get update
apt-cache search PHEDEX|grep PHEDEX
version=4.1.3-comp3
apt-get install cms+PHEDEX+$version
unlink PHEDEX
ln -s /home/phedex/sw/$myarch/cms/PHEDEX/$version/ PHEDEX

reset environment (zlib required for git is missing in slc6_amd64_gcc461)

cd config
# pre gitlab era
# git clone https://dmeister@git.cern.ch/reps/siteconf
git clone https://dconciat@gitlab.cern.ch/SITECONF/T2_CH_CSCS.git
# git clone https://mgila@gitlab.cern.ch/SITECONF/T2_CH_CSCS.git
# git clone https://jpata@gitlab.cern.ch/SITECONF/T2_CH_CSCS.git
# git clone https://dpetrusi@gitlab.cern.ch/SITECONF/T2_CH_CSCS.git
cd ..

run ./config/siteconf/T2_CH_CSCS/PhEDEx/FixHostnames.sh to fix hostnames in PhEDEX config

get helper scripts from SVN

mkdir svn-sandbox
cd svn-sandbox
svn co https://svn.cscs.ch/LCG/VO-specific/cms/phedex
cd ..
cp -R svn-sandbox/phedex/init.d ./
cd init.d
rm -Rf .svn
ln -s phedex_Prod phedex_Debug
ln -s phedex_Prod phedex_Dev

and copy host certificates

mkdir .globus
chmod 700 .globus
cp /etc/grid-security/hostcert.pem .globus/usercert.pem
cp /etc/grid-security/hostkey.pem .globus/userkey.pem
chown -R phedex:phedex .globus
chmod 600 .globus/*

and setup proxy and DB logins

touch config/DBParam.CSCS
chmod 600 config/DBParam.CSCS
# write secret config to file

Cron scripts/cron_restart.sh

#!/bin/bash

HOST=$(hostname)
HOST=${HOST%%\.*}

/home/phedex/init.d/phedex_Debug download-$HOST stop
/home/phedex/init.d/phedex_Debug download-$HOST start
/home/phedex/init.d/phedex_Prod download-$HOST stop
/home/phedex/init.d/phedex_Prod download-$HOST start
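
The cron scripts derive the agent name from the short host name. The expansion ${HOST%%\.*} removes the longest suffix that starts with a dot, that is, everything from the first dot onwards. A small illustration with a fixed host name in place of $(hostname):

```shell
#!/bin/bash
# Fixed FQDN used here instead of $(hostname) so the result is predictable.
fqdn="cms02.lcg.cscs.ch"
short=${fqdn%%\.*}          # strip from the first dot to the end
echo "download-$short"      # prints download-cms02
```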

Cron scripts/cron_stats.sh

#!/bin/bash

HOST=$(hostname)
HOST=${HOST%%\.*}

SUMMARYFILE=/var/www/html/phedexlog/statistics.$(date +%Y%m%d-%H%M).txt

#source /etc/profile.d/grid-env.sh
source /home/phedex/PHEDEX/etc/profile.d/init.sh
echo -e generated on `date` "\n------------------------" > $SUMMARYFILE
echo "Prod:" >> $SUMMARYFILE
/home/phedex/init.d/phedex_Prod status >> $SUMMARYFILE
echo "Debug:" >> $SUMMARYFILE
/home/phedex/init.d/phedex_Debug status >> $SUMMARYFILE
/home/phedex/PHEDEX/Utilities/InspectPhedexLog -c 300 -es "-12 hours" /home/phedex/log/Prod/download-$HOST /home/phedex/log/Debug/download-$HOST >> $SUMMARYFILE 2>/dev/null

Cron scripts/cron_proxy.sh

#!/bin/bash

HOST=$(hostname)
HOST=${HOST%%\.*}

unset X509_USER_PROXY
voms-proxy-init
myproxy-get-delegation -s myproxy.cern.ch -v -l cscs_phedex_${HOST}_dm_2014 -a /home/phedex/gridcert/proxy.cert -o /home/phedex/gridcert/proxy.cert
export X509_USER_PROXY=/home/phedex/gridcert/proxy.cert
voms-proxy-init -noregen -voms cms

Cron scripts/cron_spacemon.sh

#!/bin/bash

cd /lhome/phedex/spacemon

unset PERL5LIB
source /lhome/phedex/PHEDEX/etc/profile.d/init.sh
export PHEDEX_ROOT=/lhome/phedex/spacemon/PHEDEX
export PERL5LIB=$PHEDEX_ROOT/perl_lib:$PERL5LIB
export PATH=$PHEDEX_ROOT/Utilities:$PHEDEX_ROOT/Utilities/testSpace:$PATH

export X509_USER_PROXY=/lhome/phedex/gridcert/proxy.cert

DUMPDATE=$(date '+%Y-%m-%d')

wget --quiet http://ganglia.lcg.cscs.ch/dcache/dcache-cms-dump-$DUMPDATE.xml.bz2 -O dumps/dcache-cms-dump-$DUMPDATE.xml.bz2

if [ $? -gt 0 ]; then
    echo "no dCache dump available for $DUMPDATE; abort..."
    exit 1
fi

spacecount dcache --dump dumps/dcache-cms-dump-$DUMPDATE.xml.bz2 --node T2_CH_CSCS
if [ $? -gt 0 ]; then
    echo "uploading dump for $DUMPDATE failed; keeping the file..."
    exit 1
fi

rm dumps/dcache-cms-dump-$DUMPDATE.xml.bz2

Cron scripts/cron_clean.sh

CMS dirs in /pnfs to be regularly cleaned up by a set of crons

#!/bin/bash

# 2 weeks: store/unmerged, store/temp, store/backfill/1, store/backfill/2
find /pnfs/lcg.cscs.ch/cms/trivcat/store/unmerged/   -mindepth 1 -mtime +14 -delete       
find /pnfs/lcg.cscs.ch/cms/trivcat/store/temp/       -mindepth 1 -mtime +14 -delete 
find /pnfs/lcg.cscs.ch/cms/trivcat/store/backfill/1/ -mindepth 1 -mtime +14 -delete 
find /pnfs/lcg.cscs.ch/cms/trivcat/store/backfill/2/ -mindepth 1 -mtime +14 -delete 

# 1 week:  store/temp/user
find /pnfs/lcg.cscs.ch/cms/trivcat/store/temp/user/  -mindepth 1 -mtime +7  -delete 
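
Before changing a path or an age limit in this script, it can help to list what would be removed. The sketch below runs the same find expression with -print in place of -delete; the function name list_stale is hypothetical.

```shell
#!/bin/bash
# Dry run: list entries under <dir> older than <days> days without deleting.
list_stale() {
    local dir=$1 days=$2
    find "$dir" -mindepth 1 -mtime +"$days" -print
}

# Usage:
#   list_stale /pnfs/lcg.cscs.ch/cms/trivcat/store/temp/user/ 7
```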

Cron final setups

chmod +x scripts/*.sh

run config/siteconf/T2_CH_CSCS/PhEDEx/CreateLogrotConf.pl

crontab -e

05   0    * * *    /usr/sbin/logrotate -s /home/phedex/state/logrotate.state /home/phedex/config/logrotate.conf
13   5,17 * * *    /home/phedex/scripts/cron_restart.sh
*/15 *    * * *    /home/phedex/scripts/cron_stats.sh
0    */4  * * *    /home/phedex/scripts/cron_proxy.sh
0      7  * * *    /lhome/phedex/scripts/cron_spacemon.sh

and for root

crontab -l
30      2  * * *    /lhome/phedex/scripts/cron_clean.sh

X509 Proxy into MyProxy at CERN

set up the myproxy service from a UI machine

voms-proxy-init -voms cms
myproxy-init -s myproxy.cern.ch -l cscs_phedex_cms0X_dm_2014 -x -R "/DC=com/DC=quovadisglobal/DC=grid/DC=switch/DC=hosts/C=CH/ST=Zuerich/L=Zuerich/O=ETH Zuerich/CN=cms0X.lcg.cscs.ch" -c 6400

and copy initial proxy to gridcert/proxy.cert

PhEDEx DDM stats

PhEDEx locally generated stats

Make a pair of SSH RSA keys

[phedex@cms02 ~]$ ssh-keygen -C "To push the cms02 phedex stats into mysql@ganglia:/var/www/html/ganglia/phedex/" -t rsa 

Make a phedex cron job that pushes the statistics files:

[phedex@cms02 ~]$ ll /var/www/html/ganglia/phedex/
total 588
-rw-r--r-- 1 phedex phedex 15043 Feb  1 23:45 statistics.DONEm01-HELPM.txt
-rw-r--r-- 1 phedex phedex 17526 Feb  2 23:45 statistics.DONEm02-HELPM.txt
-rw-r--r-- 1 phedex phedex 14815 Feb  3 23:45 statistics.DONEm03-HELPM.txt
...

into mysql@ganglia.lcg.cscs.ch:/var/www/html/ganglia/phedex/

Check if you can browse them : http://ganglia.lcg.cscs.ch/ganglia/phedex/

Topic revision: r40 - 2017-02-09 - DinoConciatore
 