CMS Box
| Real machine name | cms02.lcg.cscs.ch |

Firewall requirements
| port | open to | reason |
| 3128/tcp | WN | access from WNs to FroNtier squid proxy |
| 3401/udp | 128.142.0.0/16, 188.185.0.0/17 | central SNMP monitoring of FroNtier service |
| 1094/tcp | * | access to Xrootd redirector that forwards requests to the native dCache Xrootd door |
For further, but older, information look at CmsVObox
Running the services
Installation of xrootd and cmsd from EPEL + compilation of cmstfc
- Documentation
- need a host cert for cms03
The OSG documentation suggests installing the base OSG repo RPM and then installing from there. Since this brings in a plethora of OSG repos that would potentially require prioritization within the CSCS context so as not to interfere with the existing installation, I decided to go via EPEL and finally built my own RPM for the plugin.
- Install xrootd
[dfeich@cms03 ~]$ sudo yum install xrootd-server
Installed:
xrootd-server.x86_64 1:5.1.1-1.el7
Dependency Installed:
expect.x86_64 0:5.45-14.el7_1 libmacaroons.x86_64 0:0.3.0-1.el7
tinyxml.x86_64 0:2.6.2-3.el7 xrootd-client-libs.x86_64 1:5.1.1-1.el7
xrootd-libs.x86_64 1:5.1.1-1.el7 xrootd-server-libs.x86_64 1:5.1.1-1.el7
- I had to create the /var/run/xrootd runtime directory used by both services, owned by the xrootd user
sudo mkdir -p /var/run/xrootd
sudo chown xrootd:xrootd /var/run/xrootd
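Note that /var/run is a tmpfs on EL7, so this directory will not survive a reboot. A minimal workaround sketch, assuming the stock xrootd RPM does not already ship such an entry (file name is my choice):
# hypothetical tmpfiles.d entry that recreates the runtime directory at boot
echo 'd /var/run/xrootd 0755 xrootd xrootd -' | sudo tee /etc/tmpfiles.d/xrootd.conf
sudo systemd-tmpfiles --create /etc/tmpfiles.d/xrootd.conf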
- compile xrootd-cmstfc from sources
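A minimal build sketch, assuming a CMake-based source tree from the OSG xrootd-cmstfc repository and build dependencies from EPEL (repository URL and package names are assumptions, adapt to the actual build system):
# hypothetical build of the name2name plugin; the resulting libXrdCmsTfc.so was then packaged into a local RPM
sudo yum install -y gcc-c++ cmake3 git rpm-build xrootd-devel xerces-c-devel
git clone https://github.com/opensciencegrid/xrootd-cmstfc.git
cd xrootd-cmstfc && mkdir build && cd build
cmake3 .. && make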
- Firewall requirements
- From the running services I see
sudo ss -rlp
tcp LISTEN 0 255 :::46398 :::* users:(("cmsd",pid=6426,fd=16))
tcp LISTEN 0 255 :::rootd :::* users:(("xrootd",pid=6257,fd=16))
I think that only xrootd itself needs inbound connectivity. cmsd's listening port seems not to be used from outside for these cases.
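A sketch of the corresponding firewall opening for the xrootd port from the table above, assuming firewalld is in use on this host (the older cms02 notes further below edit /etc/sysconfig/iptables directly instead):
# open 1094/tcp for inbound xrootd connections
sudo firewall-cmd --permanent --add-port=1094/tcp
sudo firewall-cmd --reload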
- Install the xrootd/cmsd config file /etc/xrootd/xrootd-clustered.cfg on cms03.lcg.cscs.ch. The most essential directives from that file are listed here:
all.manager any xrootd-cms.infn.it+ 1213
xrootd.redirect storage01.lcg.cscs.ch:1095 /
all.export / nostage
cms.allow host *
oss.namelib /usr/lib64/libXrdCmsTfc.so file:/etc/xrootd/storage.xml?protocol=direct
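A quick sanity check that the manager and the local dCache xrootd door referenced by these directives are reachable (plain TCP checks, not part of the original steps):
# TCP reachability checks using bash's /dev/tcp, no extra tools required
timeout 3 bash -c '</dev/tcp/xrootd-cms.infn.it/1213' && echo "manager reachable"
timeout 3 bash -c '</dev/tcp/storage01.lcg.cscs.ch/1095' && echo "dCache xrootd door reachable"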
- Install TFC file for CMS TFC plugin
- access to CVMFS is needed, so that the CMS TFC file can be found by the xrootd-cmstfc plugin
- both xrootd and cmsd read the plugin's TFC file (as seen in the log messages)
- DANGER: I started xrootd with the plugin while the TFC storage.xml could not be found. This resulted in xrootd exporting the root fs!
[feichtinger@t3ui01 ~]$ xrdfs cms03.lcg.cscs.ch:1094 ls -l /
-r-- 2018-07-11 11:45:17 0 /.autorelabel
dr-x 2021-04-23 07:02:11 36864 /bin
dr-x 2021-04-06 07:57:37 4096 /boot
dr-x 2021-04-06 07:57:06 3140 /dev
After making storage.xml available on another path, I correctly see the /store path
[feichtinger@t3ui01 ~]$ xrdfs cms03.lcg.cscs.ch:1094 ls -l /
dr-x 2021-02-08 18:47:46 512 /store
If I use these settings and I disable the namespace plugin
oss.localroot /pnfs/lcg.cscs.ch/cms/trivcat/
xrootd.redirect storage01.lcg.cscs.ch:1095 /
all.export / nostage
# oss.namelib /usr/lib64/libXrdCmsTfc.so file:/etc/xrootd/storage.xml?protocol=direct
and then try listing the filesystem:
[feichtinger@t3ui01 ~]$ xrdfs cms03.lcg.cscs.ch:1094 ls -l /
dr-x 2021-02-08 18:47:46 512 /store
This also still works when I comment out the oss.localroot directive.
- Install a systemd timer that copies the TFC from CVMFS to local storage, to avoid the security problem in case CVMFS is unavailable.
# /etc/systemd/system/xrootd-copy-cms-tfc-config.timer
[Unit]
Description=Regularly copies CMS TFC from CVMFS to /etc/xrootd/storage.xml
Requires=xrootd-copy-cms-tfc-config.service
[Timer]
Unit=xrootd-copy-cms-tfc-config.service
OnCalendar=*-*-* 09:34:00
[Install]
WantedBy=timers.target
and the xrootd-copy-cms-tfc-config.service file
# copy CMS TFC configuration from CVMFS location to /etc/xrootd
# Since xrootd exports all in case of TFC file being unavailable, we have to guard against
# the service starting when CVMFS is unavailable.
# Derek Feichtinger 2021-04-26
#
[Unit]
Description=Copy CMS TFC from CVMFS to /etc/xrootd/storage.xml
Wants=xrootd-copy-cms-tfc-config.timer
[Service]
Type=simple
User=root
ExecStart=/usr/bin/cp /cvmfs/cms.cern.ch/SITECONF/local/PhEDEx/storage.xml /etc/xrootd/storage.xml
[Install]
WantedBy=multi-user.target
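To activate the timer (standard systemd steps, not spelled out above):
sudo systemctl daemon-reload
sudo systemctl enable --now xrootd-copy-cms-tfc-config.timer
systemctl list-timers | grep xrootd-copy-cms-tfc-config   # check the next scheduled run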
- Start services
sudo systemctl start xrootd@clustered.service
sudo systemctl start cmsd@clustered.service
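To have both instances start automatically after a reboot (standard systemd enable):
sudo systemctl enable xrootd@clustered.service cmsd@clustered.service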
- Test
- direct read from cms03
[feichtinger@t3ui01 ~]$ xrdcp -f -d 2 root://cms03.lcg.cscs.ch//store/user/dfeichti/testfile-df3 file:////tmp/derek5
...
[2021-04-23 17:12:53.152202 +0200][Debug ][XRootD ] Redirect trace-back:
[2021-04-23 17:12:53.152202 +0200][Debug ][XRootD ] 0. Redirected from: root://cms03.lcg.cscs.ch:1094/ to: root://storage01.lcg.cscs.ch:1095/
[2021-04-23 17:12:53.152202 +0200][Debug ][XRootD ] 1. Redirected from: root://storage01.lcg.cscs.ch:1095/ to: root://se30.cscs.ch:33506/
...
[7.08kB/7.08kB][100%][==================================================][7.08kB/s]
- read via global CMS redirector (I shut down the old cmsd on cms02 before doing this)
[feichtinger@t3ui01 ~]$ xrdcp -f -d 2 root://cms-xrd-global.cern.ch//store/user/dfeichti/testfile-df3 file:////tmp/derek5
...
[2021-04-23 15:47:03.220384 +0200][Debug ][XRootD ] Redirect trace-back:
[2021-04-23 15:47:03.220384 +0200][Debug ][XRootD ] 0. Redirected from: root://cms-xrd-global.cern.ch:1094/ to: root://llrxrd-redir.in2p3.fr:1094/
[2021-04-23 15:47:03.220384 +0200][Debug ][XRootD ] 1. Redirected from: root://llrxrd-redir.in2p3.fr:1094/ to: root://cms03.lcg.cscs.ch:1094/
[2021-04-23 15:47:03.220384 +0200][Debug ][XRootD ] 2. Redirected from: root://cms03.lcg.cscs.ch:1094/ to: root://storage01.lcg.cscs.ch:1095/
[2021-04-23 15:47:03.220384 +0200][Debug ][XRootD ] 3. Redirected from: root://storage01.lcg.cscs.ch:1095/ to: root://se30.cscs.ch:33506/
...
[7.08kB/7.08kB][100%][==================================================][7.08kB/s]
...
TODO (probably adapt): Dino Conciatore's Puppet recipes (after Dec 2016)
Fully puppetized installation
To simplify the CMS VO box installation at CSCS, we prepared a Puppet recipe that fully installs and configures the CMS VO box.
VM
cms02.lcg.cscs.ch is a VM hosted on the CSCS VMware cluster.
For any virtual hardware modification, hard restart, etc., just ping us; we have full access to the admin infrastructure.
Hiera config
We use Hiera to keep the Puppet code more dynamic and to look up some key variables.
---
role: role_wlcg_cms_vobox
environment: dev1
cluster: phoenix4
profile_cscs_base::network::interfaces_hash:
  eth0:
    enable_dhcp: true
    hwaddr:
    mtu: '1500'
  eth1:
    ipaddress: 148.187.66.63
    netmask: 255.255.252.0
    mtu: '1500'
    gateway: 148.187.64.2
profile_cscs_base::network::hostname: 'cms02.lcg.cscs.ch'
profile_monitoring::ganglia::gmond_cluster_options:
  cluster_name: 'PHOENIX-services'
  udp_send_channel: [{'bind_hostname' : 'yes', 'port' : '8693', 'host' : 'ganglia.lcg.cscs.ch'}]
  udp_recv_channel: [ { mcast_join: '239.2.11.71', port: '8649', bind: '239.2.11.71' } ]
  tcp_accept_channel: [ port: '8649' ]
# crontab
cron::hourly:
  'cron_proxy':
    command: '/home/phedex/config/T2_CH_CSCS/PhEDEx/tools/cron/cron_proxy.sh'
    user: 'phedex'
    environment:
      - 'MAILTO=root'
      - 'PATH="/usr/bin:/bin:/usr/local/sbin"'
cron::daily:
  'cron_stats':
    command: '/home/phedex/config/T2_CH_CSCS/PhEDEx/tools/cron/cron_stats.sh'
    user: 'phedex'
    environment:
      - 'MAILTO=root'
      - 'PATH="/usr/bin:/bin:/usr/local/sbin"'
# Phedex SITECONF
profile_wlcg_cms_vobox::phedex::myproxy_user: 'cscs_cms02_phedex_xxxxx_user_2017'
# git clone ssh://git@gitlab.cern.ch:7999/SITECONF/T2_CH_CSCS
## Change in hash and use puppet git
profile_wlcg_cms_vobox::phedex::siteconf:
  'T2_CH_CSCS':
    path: '/home/phedex/config/T2_CH_CSCS'
    source: 'ssh://git@gitlab.cern.ch:7999/SITECONF/T2_CH_CSCS'
    owner: 'phedex'
    group: 'phedex'
    #environment: ["HOME=/home/phedex"]
    branch: 'master'
    update: true
  'wlcg_cms_vobox':
    path: '/localgit/profile_wlcg_cms_vobox'
    source: 'ssh://git@git.cscs.ch/puppet_profiles/profile_wlcg_cms_vobox.git'
    branch: 'dev1'
    update: true
  'dmwm_PHEDEX':
    path: '/localgit/dmwm_PHEDEX'
    source: 'https://github.com/dmwm/PHEDEX.git'
    branch: 'master'
    update: true
profile_cscs_base::extra_mounts:
  '/users':
    ensure: 'mounted'
    device: 'nas.lcg.cscs.ch:/ifs/LCG/shared/phoenix4/users'
    atboot: true
    fstype: 'nfs'
    options: 'rw,bg,proto=tcp,rsize=32768,wsize=32768,soft,intr,nfsvers=3'
  '/pnfs':
    ensure: 'mounted'
    device: 'storage02.lcg.cscs.ch:/pnfs'
    atboot: true
    fstype: 'nfs'
    options: 'ro,intr,noac,hard,proto=tcp,nfsvers=3'
profile_cscs_base::ssh_host_dsa_key: >
  ENC[PKCS7,.....]
profile_cscs_base::ssh_host_dsa_key_pub: >
  ENC[PKCS7,.....]
profile_cscs_base::ssh_host_key: >
  ENC[PKCS7,.....]
profile_cscs_base::ssh_host_key_pub: >
  ENC[PKCS7,.....]
profile_cscs_base::ssh_host_rsa_key: >
  ENC[PKCS7,.....]
profile_cscs_base::ssh_host_rsa_key_pub: >
  ENC[PKCS7,.....]
profile_wlcg_base::grid_hostcert: >
  ENC[PKCS7,.....]
profile_wlcg_base::grid_hostkey: >
  ENC[PKCS7,.....]
profile_wlcg_cms_vobox::phedex::id_rsa: >
  ENC[PKCS7,.....]
profile_wlcg_cms_vobox::phedex::id_rsa_pub: >
  ENC[PKCS7,.....]
Puppet module profile_wlcg_cms_vobox
This module configures:
- Firewall
- CVMFS
- Frontier Squid
- Xrootd
- Phedex
Puppet runs automatically every 30 minutes and checks that these services are running:
- cmsd
- xrootd
- frontier-squid (managed by the included frontier::squid module)
Run the installation
Currently we still use foreman to run the initial OS setup:
hammer host create --name "cms02.lcg.cscs.ch" --hostgroup-id 13 --environment "dev1" --puppet-ca-proxy-id 1 --puppet-proxy-id 1 --puppetclass-ids 697 --operatingsystem-id 9 --medium "Scientific Linux" --partition-table-id 7 --build yes --mac "00:10:3e:66:00:63" --ip=10.10.66.63 --domain-id 4
If you want to reinstall the machine, you just have to run:
hammer host update --name cms02.lcg.cscs.ch --build yes
Phedex certificate
To update an expired certificate just edit profile_wlcg_cms_vobox::phedex::proxy_cert in the cms02 hiera file.
The myproxy user (if you have to re-init the myproxy) is located in the script:
https://gitlab.cern.ch/SITECONF/T2_CH_CSCS/edit/master/PhEDEx/tools/cron/cron_proxy.sh
The current user is cscs_cms02_phedex_jpata_2017 and the certificate is located at /home/phedex/gridcert/x509_new. The certificate is not managed by Puppet because it needs to be updated via CERN myproxy.
Puppet will automatically pull the repo every 30 min.
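To trigger a Puppet run by hand instead of waiting for the 30 minute interval (standard Puppet agent usage):
sudo puppet agent --test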
Pre Dino Conciatore's Puppet recipes (before Dec 2016)
UMD3
Make sure the UMD3 Yum repo is set up:
yum install http://repository.egi.eu/sw/production/umd/3/sl6/x86_64/updates/umd-release-3.0.1-1.el6.noarch.rpm
THE INFORMATION FOR TESTING IS PARTLY OUTDATED - THEREFORE I ADDED A TODO
Installation
create /etc/squid/squidconf with:
export FRONTIER_USER=dbfrontier
export FRONTIER_GROUP=dbfrontier
run the installation as described at
https://twiki.cern.ch/twiki/bin/view/Frontier/InstallSquid
rpm -Uvh http://frontier.cern.ch/dist/rpms/RPMS/noarch/frontier-release-1.0-1.noarch.rpm
yum install frontier-squid
chkconfig frontier-squid on
create folders on the special partition
mkdir /home/dbfrontier/cache
mkdir /home/dbfrontier/log
chown dbfrontier:dbfrontier /home/dbfrontier/cache/
chown dbfrontier:dbfrontier /home/dbfrontier/log/
Configuration
edit /etc/squid/customize.sh
#!/bin/bash
awk --file `dirname $0`/customhelps.awk --source '{
setoption("acl NET_LOCAL src", "10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 148.187.0.0/16")
setoption("cache_mem", "128 MB")
setoptionparameter("cache_dir", 3, "15000")
setoption("cache_log", "/home/dbfrontier/log/cache.log")
setoption("coredump_dir", "/home/dbfrontier/cache/")
setoptionparameter("cache_dir", 2, "/home/dbfrontier/cache/")
setoptionparameter("access_log", 1, "/home/dbfrontier/log/access.log")
setoption("logfile_rotate", "1")
print
}'
start the service and then relocate the cache and log directories
service frontier-squid start
rmdir /var/cache/squid
rmdir /var/log/squid
ln -s /home/dbfrontier/cache /var/cache/squid
ln -s /home/dbfrontier/log /var/log/squid
create /etc/sysconfig/frontier-squid
export LARGE_ACCESS_LOG=500000000
reload the service
service frontier-squid reload
Allow SNMP monitoring of squid service
edit /etc/sysconfig/iptables and add:
-A INPUT -s 128.142.0.0/16 -p udp --dport 3401 -j ACCEPT
-A INPUT -s 188.185.0.0/17 -p udp --dport 3401 -j ACCEPT
service iptables reload
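To verify that the rules are loaded (simple check, assuming the iptables service is in use):
sudo iptables -L INPUT -n | grep 3401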
Test squid proxy
Log file locations:
- /home/squid/log/access.log
- /home/squid/log/cache.log
THIS POTENTIALLY NEEDS TO BE REWRITTEN
For instance, on ui.lcg.cscs.ch:
wget http://frontier.cern.ch/dist/fnget.py
chmod +x fnget.py
./fnget.py --url=http://cmsfrontier.cern.ch:8000/FrontierProd/Frontier --sql="select 1 from dual"
export http_proxy=http://cms01.lcg.cscs.ch:3128
./fnget.py --url=http://cmsfrontier.cern.ch:8000/FrontierProd/Frontier --sql="select 1 from dual"
export http_proxy=http://cms02.lcg.cscs.ch:3128
./fnget.py --url=http://cmsfrontier.cern.ch:8000/FrontierProd/Frontier --sql="select 1 from dual"
Expected Output :
Using Frontier URL: http://cmsfrontier.cern.ch:8000/FrontierProd/Frontier
Query: select 1 from dual
Decode results: True
Refresh cache: False
Frontier Request:
http://cmsfrontier.cern.ch:8000/FrontierProd/Frontier/type=frontier_request:1:DEFAULT&encoding=BLOBzip&p1=eNorTs1JTS5RMFRIK8rPVUgpTcwBAD0rBmw_
Query started: 02/29/16 22:32:47 CET
Query ended: 02/29/16 22:32:47 CET
Query time: 0.00365591049194 [seconds]
Query result:
eF5jY2BgYDRkA5JsfqG+Tq5B7GxgEXYAGs0CVA==
Fields:
1 NUMBER
Records:
1