CMS Box
| Real machine name | cms02.lcg.cscs.ch |

Firewall requirements
| port | open to | reason |
| 3128/tcp | WN | access from WNs to FroNtier squid proxy |
| 3401/udp | 128.142.0.0/16, 188.185.0.0/17 | central SNMP monitoring of FroNtier service |
| 1094/tcp | * | access to Xrootd redirector that forwards requests to the native dCache Xrootd door |
For further, but older, information look at CmsVObox
Running the services
Installation of xrootd and cmsd from EPEL + compilation of cmstfc
- Documentation
- need a host cert for cms03
The OSG documentation suggests installing the base OSG repo RPM and then installing from there. Since this brings in a plethora of OSG repos that would potentially require prioritization within the CSCS context so as not to interfere with the existing installation, I decided to go via EPEL and finally built my own RPM for the plugin.
- Install xrootd
[dfeich@cms03 ~]$ sudo yum install xrootd-server
Installed:
xrootd-server.x86_64 1:5.1.1-1.el7
Dependency Installed:
expect.x86_64 0:5.45-14.el7_1 libmacaroons.x86_64 0:0.3.0-1.el7
tinyxml.x86_64 0:2.6.2-3.el7 xrootd-client-libs.x86_64 1:5.1.1-1.el7
xrootd-libs.x86_64 1:5.1.1-1.el7 xrootd-server-libs.x86_64 1:5.1.1-1.el7
- I had to create the /var/run/xrootd runtime directory used by both services, owned by the xrootd user
sudo mkdir -p /var/run/xrootd
sudo chown xrootd:xrootd /var/run/xrootd
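Note that /var/run is a tmpfs on EL7, so this directory will not survive a reboot. A minimal workaround sketch, assuming the stock xrootd RPM does not already ship such an entry (file name is my choice):
# hypothetical tmpfiles.d entry that recreates the runtime directory at boot
echo 'd /var/run/xrootd 0755 xrootd xrootd -' | sudo tee /etc/tmpfiles.d/xrootd.conf
sudo systemd-tmpfiles --create /etc/tmpfiles.d/xrootd.conf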
- compile xrootd-cmstfc from sources
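A minimal build sketch, assuming a CMake-based source tree from the OSG xrootd-cmstfc repository and build dependencies from EPEL (repository URL and package names are assumptions, adapt to the actual build system):
# hypothetical build of the name2name plugin; the resulting libXrdCmsTfc.so was then packaged into a local RPM
sudo yum install -y gcc-c++ cmake3 git rpm-build xrootd-devel xerces-c-devel
git clone https://github.com/opensciencegrid/xrootd-cmstfc.git
cd xrootd-cmstfc && mkdir build && cd build
cmake3 .. && make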
- Firewall requirements
- From the running services I see
sudo ss -rlp
tcp LISTEN 0 255 :::46398 :::* users:(("cmsd",pid=6426,fd=16))
tcp LISTEN 0 255 :::rootd :::* users:(("xrootd",pid=6257,fd=16))
I think that only xrootd itself needs inbound connectivity. cmsd's listening port seems not to be used from outside for these cases.
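A sketch of the corresponding firewall opening for the xrootd port from the table above, assuming firewalld is in use on this host (the older cms02 notes further below edit /etc/sysconfig/iptables directly instead):
# open 1094/tcp for inbound xrootd connections
sudo firewall-cmd --permanent --add-port=1094/tcp
sudo firewall-cmd --reload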
- Install the xrootd/cmsd config file /etc/xrootd/xrootd-clustered.cfg on cms03.lcg.cscs.ch. The most essential directives from that file are listed here:
all.manager any xrootd-cms.infn.it+ 1213
xrootd.redirect storage01.lcg.cscs.ch:1095 /
all.export / nostage
cms.allow host *
oss.namelib /usr/lib64/libXrdCmsTfc.so file:/etc/xrootd/storage.xml?protocol=direct
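A quick sanity check that the manager and the local dCache xrootd door referenced by these directives are reachable (plain TCP checks, not part of the original steps):
# TCP reachability checks using bash's /dev/tcp, no extra tools required
timeout 3 bash -c '</dev/tcp/xrootd-cms.infn.it/1213' && echo "manager reachable"
timeout 3 bash -c '</dev/tcp/storage01.lcg.cscs.ch/1095' && echo "dCache xrootd door reachable"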
- Install TFC file for CMS TFC plugin
- access to CVMFS is needed, so that the CMS TFC file can be found by the xrootd-cmstfc plugin
- both xrootd and cmsd read the plugin's TFC file (as seen in the log messages)
- DANGER: I started xrootd with the plugin while the TFC storage.xml could not be found. This resulted in xrootd exporting the root fs!
[feichtinger@t3ui01 ~]$ xrdfs cms03.lcg.cscs.ch:1094 ls -l /
-r-- 2018-07-11 11:45:17 0 /.autorelabel
dr-x 2021-04-23 07:02:11 36864 /bin
dr-x 2021-04-06 07:57:37 4096 /boot
dr-x 2021-04-06 07:57:06 3140 /dev
After making storage.xml available on another path, I correctly see the /store path
[feichtinger@t3ui01 ~]$ xrdfs cms03.lcg.cscs.ch:1094 ls -l /
dr-x 2021-02-08 18:47:46 512 /store
If I use these settings and I disable the namespace plugin
oss.localroot /pnfs/lcg.cscs.ch/cms/trivcat/
xrootd.redirect storage01.lcg.cscs.ch:1095 /
all.export / nostage
# oss.namelib /usr/lib64/libXrdCmsTfc.so file:/etc/xrootd/storage.xml?protocol=direct
and then try listing the filesystem:
[feichtinger@t3ui01 ~]$ xrdfs cms03.lcg.cscs.ch:1094 ls -l /
dr-x 2021-02-08 18:47:46 512 /store
This also still works when I comment out the oss.localroot directive.
- Install a systemd timer that copies the TFC from CVMFS to local storage, to avoid the security problem in case CVMFS is unavailable.
# /etc/systemd/system/xrootd-copy-cms-tfc-config.timer
[Unit]
Description=Regularly copies CMS TFC from CVMFS to /etc/xrootd/storage.xml
Requires=xrootd-copy-cms-tfc-config.service
[Timer]
Unit=xrootd-copy-cms-tfc-config.service
OnCalendar=*-*-* 09:34:00
[Install]
WantedBy=timers.target
and the xrootd-copy-cms-tfc-config.service file
# copy CMS TFC configuration from CVMFS location to /etc/xrootd
# Since xrootd exports all in case of TFC file being unavailable, we have to guard against
# the service starting when CVMFS is unavailable.
# Derek Feichtinger 2021-04-26
#
[Unit]
Description=Copy CMS TFC from CVMFS to /etc/xrootd/storage.xml
Wants=xrootd-copy-cms-tfc-config.timer
[Service]
Type=simple
User=root
ExecStart=/usr/bin/cp /cvmfs/cms.cern.ch/SITECONF/local/PhEDEx/storage.xml /etc/xrootd/storage.xml
[Install]
WantedBy=multi-user.target
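To activate the timer (standard systemd steps, not spelled out above):
sudo systemctl daemon-reload
sudo systemctl enable --now xrootd-copy-cms-tfc-config.timer
systemctl list-timers | grep xrootd-copy-cms-tfc-config   # check the next scheduled run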
- Start services
sudo systemctl start xrootd@clustered.service
sudo systemctl start cmsd@clustered.service
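To have both instances start automatically after a reboot (standard systemd enable):
sudo systemctl enable xrootd@clustered.service cmsd@clustered.service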
- Test
- direct read from cms03
[feichtinger@t3ui01 ~]$ xrdcp -f -d 2 root://cms03.lcg.cscs.ch//store/user/dfeichti/testfile-df3 file:////tmp/derek5
...
[2021-04-23 17:12:53.152202 +0200][Debug ][XRootD ] Redirect trace-back:
[2021-04-23 17:12:53.152202 +0200][Debug ][XRootD ] 0. Redirected from: root://cms03.lcg.cscs.ch:1094/ to: root://storage01.lcg.cscs.ch:1095/
[2021-04-23 17:12:53.152202 +0200][Debug ][XRootD ] 1. Redirected from: root://storage01.lcg.cscs.ch:1095/ to: root://se30.cscs.ch:33506/
...
[7.08kB/7.08kB][100%][==================================================][7.08kB/s]
- read via global CMS redirector (I shut down the old cmsd on cms02 before doing this)
[feichtinger@t3ui01 ~]$ xrdcp -f -d 2 root://cms-xrd-global.cern.ch//store/user/dfeichti/testfile-df3 file:////tmp/derek5
...
[2021-04-23 15:47:03.220384 +0200][Debug ][XRootD ] Redirect trace-back:
[2021-04-23 15:47:03.220384 +0200][Debug ][XRootD ] 0. Redirected from: root://cms-xrd-global.cern.ch:1094/ to: root://llrxrd-redir.in2p3.fr:1094/
[2021-04-23 15:47:03.220384 +0200][Debug ][XRootD ] 1. Redirected from: root://llrxrd-redir.in2p3.fr:1094/ to: root://cms03.lcg.cscs.ch:1094/
[2021-04-23 15:47:03.220384 +0200][Debug ][XRootD ] 2. Redirected from: root://cms03.lcg.cscs.ch:1094/ to: root://storage01.lcg.cscs.ch:1095/
[2021-04-23 15:47:03.220384 +0200][Debug ][XRootD ] 3. Redirected from: root://storage01.lcg.cscs.ch:1095/ to: root://se30.cscs.ch:33506/
...
[7.08kB/7.08kB][100%][==================================================][7.08kB/s]
...
TODO (probably adapt): Dino Conciatore's Puppet recipes (after Dec 2016)
Fully puppetized installation
To simplify the CMS VO box installation at CSCS, we prepared a Puppet recipe that fully installs and configures the CMS VO box.
VM
cms02.lcg.cscs.ch is a VM hosted on the CSCS VMware cluster.
For any virtual hardware modification, hard restart, etc., just ping us; we have full access to the admin infrastructure.
Hiera config
We use Hiera to keep the Puppet code more dynamic and to look up some key variables.
---
role: role_wlcg_cms_vobox
environment: dev1
cluster: phoenix4
profile_cscs_base::network::interfaces_hash:
  eth0:
    enable_dhcp: true
    hwaddr:
    mtu: '1500'
  eth1:
    ipaddress: 148.187.66.63
    netmask: 255.255.252.0
    mtu: '1500'
    gateway: 148.187.64.2
profile_cscs_base::network::hostname: 'cms02.lcg.cscs.ch'
profile_monitoring::ganglia::gmond_cluster_options:
  cluster_name: 'PHOENIX-services'
  udp_send_channel: [{'bind_hostname' : 'yes', 'port' : '8693', 'host' : 'ganglia.lcg.cscs.ch'}]
  udp_recv_channel: [ { mcast_join: '239.2.11.71', port: '8649', bind: '239.2.11.71' } ]
  tcp_accept_channel: [ port: '8649' ]
# crontab
cron::hourly:
  'cron_proxy':
    command: '/home/phedex/config/T2_CH_CSCS/PhEDEx/tools/cron/cron_proxy.sh'
    user: 'phedex'
    environment:
      - 'MAILTO=root'
      - 'PATH="/usr/bin:/bin:/usr/local/sbin"'
cron::daily:
  'cron_stats':
    command: '/home/phedex/config/T2_CH_CSCS/PhEDEx/tools/cron/cron_stats.sh'
    user: 'phedex'
    environment:
      - 'MAILTO=root'
      - 'PATH="/usr/bin:/bin:/usr/local/sbin"'
# Phedex SITECONF
profile_wlcg_cms_vobox::phedex::myproxy_user: 'cscs_cms02_phedex_xxxxx_user_2017'
# git clone ssh://git@gitlab.cern.ch:7999/SITECONF/T2_CH_CSCS
## Change in hash and use puppet git
profile_wlcg_cms_vobox::phedex::siteconf:
  'T2_CH_CSCS':
    path: '/home/phedex/config/T2_CH_CSCS'
    source: 'ssh://git@gitlab.cern.ch:7999/SITECONF/T2_CH_CSCS'
    owner: 'phedex'
    group: 'phedex'
    #environment: ["HOME=/home/phedex"]
    branch: 'master'
    update: true
  'wlcg_cms_vobox':
    path: '/localgit/profile_wlcg_cms_vobox'
    source: 'ssh://git@git.cscs.ch/puppet_profiles/profile_wlcg_cms_vobox.git'
    branch: 'dev1'
    update: true
  'dmwm_PHEDEX':
    path: '/localgit/dmwm_PHEDEX'
    source: 'https://github.com/dmwm/PHEDEX.git'
    branch: 'master'
    update: true
profile_cscs_base::extra_mounts:
  '/users':
    ensure: 'mounted'
    device: 'nas.lcg.cscs.ch:/ifs/LCG/shared/phoenix4/users'
    atboot: true
    fstype: 'nfs'
    options: 'rw,bg,proto=tcp,rsize=32768,wsize=32768,soft,intr,nfsvers=3'
  '/pnfs':
    ensure: 'mounted'
    device: 'storage02.lcg.cscs.ch:/pnfs'
    atboot: true
    fstype: 'nfs'
    options: 'ro,intr,noac,hard,proto=tcp,nfsvers=3'
profile_cscs_base::ssh_host_dsa_key: >
  ENC[PKCS7,.....]
profile_cscs_base::ssh_host_dsa_key_pub: >
  ENC[PKCS7,.....]
profile_cscs_base::ssh_host_key: >
  ENC[PKCS7,.....]
profile_cscs_base::ssh_host_key_pub: >
  ENC[PKCS7,.....]
profile_cscs_base::ssh_host_rsa_key: >
  ENC[PKCS7,.....]
profile_cscs_base::ssh_host_rsa_key_pub: >
  ENC[PKCS7,.....]
profile_wlcg_base::grid_hostcert: >
  ENC[PKCS7,.....]
profile_wlcg_base::grid_hostkey: >
  ENC[PKCS7,.....]
profile_wlcg_cms_vobox::phedex::id_rsa: >
  ENC[PKCS7,.....]
profile_wlcg_cms_vobox::phedex::id_rsa_pub: >
  ENC[PKCS7,.....]
Puppet module profile_wlcg_cms_vobox
This module configures:
- Firewall
- CVMFS
- Frontier Squid
- Xrootd
- Phedex
Puppet runs automatically every 30 minutes and checks that these services are running:
- cmsd
- xrootd
- frontier-squid (managed by the included frontier::squid module)
Run the installation
Currently we still use foreman to run the initial OS setup:
hammer host create --name "cms02.lcg.cscs.ch" --hostgroup-id 13 --environment "dev1" --puppet-ca-proxy-id 1 --puppet-proxy-id 1 --puppetclass-ids 697 --operatingsystem-id 9 --medium "Scientific Linux" --partition-table-id 7 --build yes --mac "00:10:3e:66:00:63" --ip=10.10.66.63 --domain-id 4
If you want to reinstall the machine, you just have to run:
hammer host update --name cms02.lcg.cscs.ch --build yes
Phedex certificate
To update an expired certificate just edit profile_wlcg_cms_vobox::phedex::proxy_cert in the cms02 hiera file.
The myproxy user (if you have to re-init the myproxy) is located in the script:
https://gitlab.cern.ch/SITECONF/T2_CH_CSCS/edit/master/PhEDEx/tools/cron/cron_proxy.sh
The current user is cscs_cms02_phedex_jpata_2017 and the certificate is located at /home/phedex/gridcert/x509_new. The certificate is not managed by Puppet because it needs to be updated via CERN myproxy.
Puppet will automatically pull the repo every 30 min.
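To trigger a Puppet run by hand instead of waiting for the 30 minute interval (standard Puppet agent usage):
sudo puppet agent --test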
Pre Dino Conciatore's Puppet recipes (before Dec 2016)
UMD3
Make sure the UMD3 Yum repo is set up:
yum install http://repository.egi.eu/sw/production/umd/3/sl6/x86_64/updates/umd-release-3.0.1-1.el6.noarch.rpm
THE INFORMATION FOR TESTING IS PARTLY OUTDATED - THEREFORE I ADDED A TODO
Installation
create /etc/squid/squidconf with:
export FRONTIER_USER=dbfrontier
export FRONTIER_GROUP=dbfrontier
run the installation as described at
https://twiki.cern.ch/twiki/bin/view/Frontier/InstallSquid
rpm -Uvh http://frontier.cern.ch/dist/rpms/RPMS/noarch/frontier-release-1.0-1.noarch.rpm
yum install frontier-squid
chkconfig frontier-squid on
create folders on the special partition
mkdir /home/dbfrontier/cache
mkdir /home/dbfrontier/log
chown dbfrontier:dbfrontier /home/dbfrontier/cache/
chown dbfrontier:dbfrontier /home/dbfrontier/log/
Configuration
edit /etc/squid/customize.sh
#!/bin/bash
awk --file `dirname $0`/customhelps.awk --source '{
setoption("acl NET_LOCAL src", "10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 148.187.0.0/16")
setoption("cache_mem", "128 MB")
setoptionparameter("cache_dir", 3, "15000")
setoption("cache_log", "/home/dbfrontier/log/cache.log")
setoption("coredump_dir", "/home/dbfrontier/cache/")
setoptionparameter("cache_dir", 2, "/home/dbfrontier/cache/")
setoptionparameter("access_log", 1, "/home/dbfrontier/log/access.log")
setoption("logfile_rotate", "1")
print
}'
start the service and then relocate the cache and log directories
service frontier-squid start
rmdir /var/cache/squid
rmdir /var/log/squid
ln -s /home/dbfrontier/cache /var/cache/squid
ln -s /home/dbfrontier/log /var/log/squid
create /etc/sysconfig/frontier-squid
export LARGE_ACCESS_LOG=500000000
reload the service
service frontier-squid reload
Allow SNMP monitoring of squid service
edit /etc/sysconfig/iptables and add:
-A INPUT -s 128.142.0.0/16 -p udp --dport 3401 -j ACCEPT
-A INPUT -s 188.185.0.0/17 -p udp --dport 3401 -j ACCEPT
service iptables reload
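To verify that the rules are loaded (simple check, assuming the iptables service is in use):
sudo iptables -L INPUT -n | grep 3401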
Test squid proxy
Log file locations:
- /home/squid/log/access.log
- /home/squid/log/cache.log
THIS POTENTIALLY NEEDS TO BE REWRITTEN
For instance, on ui.lcg.cscs.ch:
wget http://frontier.cern.ch/dist/fnget.py
chmod +x fnget.py
./fnget.py --url=http://cmsfrontier.cern.ch:8000/FrontierProd/Frontier --sql="select 1 from dual"
export http_proxy=http://cms01.lcg.cscs.ch:3128
./fnget.py --url=http://cmsfrontier.cern.ch:8000/FrontierProd/Frontier --sql="select 1 from dual"
export http_proxy=http://cms02.lcg.cscs.ch:3128
./fnget.py --url=http://cmsfrontier.cern.ch:8000/FrontierProd/Frontier --sql="select 1 from dual"
Expected Output :
Using Frontier URL: http://cmsfrontier.cern.ch:8000/FrontierProd/Frontier
Query: select 1 from dual
Decode results: True
Refresh cache: False
Frontier Request:
http://cmsfrontier.cern.ch:8000/FrontierProd/Frontier/type=frontier_request:1:DEFAULT&encoding=BLOBzip&p1=eNorTs1JTS5RMFRIK8rPVUgpTcwBAD0rBmw_
Query started: 02/29/16 22:32:47 CET
Query ended: 02/29/16 22:32:47 CET
Query time: 0.00365591049194 [seconds]
Query result:
eF5jY2BgYDRkA5JsfqG+Tq5B7GxgEXYAGs0CVA==
Fields:
1 NUMBER
Records:
1