Swiss Grid Operations Meeting on 2013-04-04
Agenda
Status
- CSCS (reports Pablo):
- F2F meeting held @ CSCS: MeetingCHIPPCSCSFaceToFace20130321
- SRM security issue struck again; still no answer from dCache support. Upgrade in one month.
- CMS Argus authentication issues still under investigation
- Cream03 with EMI-2 and SL6 in production. Feedback?
- cms01 ready
- Preparation for Maintenance on Tuesday next week: SiteMaintenance20130409
- PSI (reports Fabio):
- SL6 kernels upgraded to 2.6.32-358.2.1, but 2 file servers are affected by a load reporting issue (xfsaild always in D-state).
- RDAC driver upgraded from 09.03.0C08.0535 to 09.03.0C05.0642 on these 2 file servers.
- SL5 kernels upgraded to 2.6.18-348.3.1.
- Installed the xrootd clients at version 3.3.1-1 via Yum on each UI and WN.
- PostgreSQL 9.2.2 upgraded to the latest 9.2.3-2.
- dCache migrated from 1.9.12-23 to 2.2.9-1; main changes (see the PSI files reported below):
- Replaced the ssh1 admin door with the ssh2 door.
- Adapted dCache_gmetric.py to manage the new ssh2 case.
- Replaced gPlazma1 with gPlazma2: the migration was not transparent and I opened a GGUS ticket, now solved.
- New pool selection algorithm = Weighted Available Space Selection (wass). It needs two changes: the wass pool type and the weight factors.
- Derek's tools are still compatible.
- News about the BDII: see /etc/dcache/dcache.conf; it does not work with the info-provider layout.
- When you start dCache you have to wait ~10s and then run /etc/init.d/dcache-server restart t3se01-Domain-utility; it's a bug they are working on.
- I can't get the /var/log/dcache/adminshell_history file to work.
- I can't get the gPlazma2 nsswitch plugin to work.
- Need to test secondary groups with gPlazma2; they were failing with gPlazma1.
- Enabled Xrootd at PSI: I can copy files and I dedicated an I/O queue in dCache, but I'm still missing some of the details, so it's not at production level yet. It's also not strictly required for a T3, so it has low priority.
- UNIBE (reports Gianfranco - won't be able to attend):
- 5 out of 6 DPM disk servers are now SLC6.3, emi-dpm_disk 1.8.6-1.el6 (EMI-2)
- New disk server with 120TB delivered, running acceptance tests, will deploy on Monday
- We'll have 502TB (after RAID6): 400TB ATLASDATADISK (350TB pledged), 80TB LOCALGROUPDISK, 22TB for ops and SMSCG VOs
- SLC5 and SLC6 kernel upgrade on interactive nodes/grid UI.
- Cluster not upgraded, don't want to risk yet another ARC bug (there is already one related to the new kernel, which should not affect us). Risk very low:
- all WNs in a private network
- no interactive submission allowed
- no users except grid pool accounts
- only atlas/ch, smscg VOs, Andrej Filipcic allowed via arc.conf and grid-mapfile
Next meeting date: 2nd of May
Attendees
- CSCS: Pablo
- CMS: Fabio, Daniel, Derek
- ATLAS:
- LHCb: Roland
- EGI:
Action items
PSI dCache 2.2 files
/var/lib/dcache/config/poolmanager.conf
#
# Created by PoolManager(PoolManager) at Fri Mar 01 16:05:25 CET 2013
#
#
# Submodule CostModule (cm) : diskCacheV111.poolManager.CostModuleV1
# $Revision: 16335 $
#
cm set debug off
cm set update on
cm set magic on
# note new wass type
pm create -type=wass default
pm set default -cpucostfactor=1.0 -spacecostfactor=1.0
pm set default -idle=0.0 -p2p=90.0% -alert=0.0 -halt=0.0 -fallback=0.0
pm set default -p2p-allowed=yes -p2p-oncost=yes -p2p-fortransfer=yes
pm set default -stage-allowed=no
pm set default -max-copies=500 -slope=0.0
#
# Setup of PoolManager (diskCacheV111.poolManager.PoolManagerV5) at Fri Mar 01 16:05:25 CET 2013
#
#
# Printed by diskCacheV111.poolManager.PoolSelectionUnitV2 at Fri Mar 01 16:05:25 CET 2013
#
#
psu set regex off
psu set allpoolsactive off
#
# The units ...
#
psu create unit -protocol */*
psu create unit -store dteam:dteam@osm
psu create unit -store ops:ops@osm
psu create unit -net 0.0.0.0/0.0.0.0
psu create unit -store cms:cms@osm
psu create unit -store *@*
#
# The unit Groups ...
#
psu create ugroup any-store
psu addto ugroup any-store *@*
psu create ugroup cms_unit_group
psu addto ugroup cms_unit_group cms:cms@osm
psu create ugroup dteam_unit_group
psu addto ugroup dteam_unit_group dteam:dteam@osm
psu create ugroup world-net
psu addto ugroup world-net 0.0.0.0/0.0.0.0
psu create ugroup any-protocol
psu addto ugroup any-protocol */*
psu create ugroup ops_unit_group
psu addto ugroup ops_unit_group ops:ops@osm
#
# The pools ...
# CMS POOLs
#psu create pool t3se01_cms
#psu create pool t3se01_cms_1
#psu create pool t3se01_ops
psu create pool t3fs01_cms
psu create pool t3fs02_cms
psu create pool t3fs03_cms
psu create pool t3fs04_cms
psu create pool t3fs04_cms_1
psu create pool t3fs07_cms
psu create pool t3fs07_cms_1
psu create pool t3fs08_cms
psu create pool t3fs08_cms_1
psu create pool t3fs09_cms
psu create pool t3fs09_cms_1
psu create pool t3fs10_cms
psu create pool t3fs10_cms_1
psu create pool t3fs11_cms
psu create pool t3fs11_cms_1
psu create pool t3fs13_cms
psu create pool t3fs13_cms_1
psu create pool t3fs13_cms_2
psu create pool t3fs13_cms_3
psu create pool t3fs13_cms_4
psu create pool t3fs13_cms_5
psu create pool t3fs13_cms_6
psu create pool t3fs14_cms
psu create pool t3fs14_cms_1
psu create pool t3fs14_cms_2
psu create pool t3fs14_cms_3
psu create pool t3fs14_cms_4
psu create pool t3fs14_cms_5
psu create pool t3fs14_cms_6
# OPS POOLs
psu create pool t3fs01_ops
psu create pool t3fs02_ops
psu create pool t3fs03_ops
psu create pool t3fs04_ops
psu create pool t3fs07_ops
psu create pool t3fs08_ops
psu create pool t3fs09_ops
psu create pool t3fs10_ops
psu create pool t3fs11_ops
psu create pool t3fs13_ops
psu create pool t3fs13_ops_1
psu create pool t3fs13_ops_2
psu create pool t3fs13_ops_3
psu create pool t3fs13_ops_4
psu create pool t3fs13_ops_5
psu create pool t3fs14_ops
psu create pool t3fs14_ops_1
psu create pool t3fs14_ops_2
psu create pool t3fs14_ops_3
psu create pool t3fs14_ops_4
psu create pool t3fs14_ops_5
#
# The pool groups ...
#
psu create pgroup dteam
#psu addto pgroup dteam t3se01_ops
psu addto pgroup dteam t3fs01_ops
psu addto pgroup dteam t3fs02_ops
psu addto pgroup dteam t3fs03_ops
psu addto pgroup dteam t3fs04_ops
psu addto pgroup dteam t3fs07_ops
psu addto pgroup dteam t3fs08_ops
psu addto pgroup dteam t3fs09_ops
psu addto pgroup dteam t3fs10_ops
psu addto pgroup dteam t3fs11_ops
psu addto pgroup dteam t3fs13_ops
psu addto pgroup dteam t3fs13_ops_1
psu addto pgroup dteam t3fs13_ops_2
psu addto pgroup dteam t3fs13_ops_3
psu addto pgroup dteam t3fs13_ops_4
psu addto pgroup dteam t3fs13_ops_5
psu addto pgroup dteam t3fs14_ops
psu addto pgroup dteam t3fs14_ops_1
psu addto pgroup dteam t3fs14_ops_2
psu addto pgroup dteam t3fs14_ops_3
psu addto pgroup dteam t3fs14_ops_4
psu addto pgroup dteam t3fs14_ops_5
#
psu create pgroup cms
#psu addto pgroup cms t3se01_cms
#psu addto pgroup cms t3se01_cms_1
psu addto pgroup cms t3fs01_cms
psu addto pgroup cms t3fs02_cms
psu addto pgroup cms t3fs03_cms
psu addto pgroup cms t3fs04_cms
psu addto pgroup cms t3fs04_cms_1
psu addto pgroup cms t3fs07_cms
psu addto pgroup cms t3fs07_cms_1
psu addto pgroup cms t3fs08_cms
psu addto pgroup cms t3fs08_cms_1
psu addto pgroup cms t3fs09_cms
psu addto pgroup cms t3fs09_cms_1
psu addto pgroup cms t3fs10_cms
psu addto pgroup cms t3fs10_cms_1
psu addto pgroup cms t3fs11_cms
psu addto pgroup cms t3fs11_cms_1
psu addto pgroup cms t3fs13_cms
psu addto pgroup cms t3fs13_cms_1
psu addto pgroup cms t3fs13_cms_2
psu addto pgroup cms t3fs13_cms_3
psu addto pgroup cms t3fs13_cms_4
psu addto pgroup cms t3fs13_cms_5
psu addto pgroup cms t3fs13_cms_6
psu addto pgroup cms t3fs14_cms
psu addto pgroup cms t3fs14_cms_1
psu addto pgroup cms t3fs14_cms_2
psu addto pgroup cms t3fs14_cms_3
psu addto pgroup cms t3fs14_cms_4
psu addto pgroup cms t3fs14_cms_5
psu addto pgroup cms t3fs14_cms_6
psu create pgroup ops
#psu addto pgroup ops t3se01_ops
psu addto pgroup ops t3fs01_ops
psu addto pgroup ops t3fs02_ops
psu addto pgroup ops t3fs03_ops
psu addto pgroup ops t3fs04_ops
psu addto pgroup ops t3fs07_ops
psu addto pgroup ops t3fs08_ops
psu addto pgroup ops t3fs09_ops
psu addto pgroup ops t3fs10_ops
psu addto pgroup ops t3fs11_ops
psu addto pgroup ops t3fs13_ops
psu addto pgroup ops t3fs13_ops_1
psu addto pgroup ops t3fs13_ops_2
psu addto pgroup ops t3fs13_ops_3
psu addto pgroup ops t3fs13_ops_4
psu addto pgroup ops t3fs13_ops_5
psu addto pgroup ops t3fs14_ops
psu addto pgroup ops t3fs14_ops_1
psu addto pgroup ops t3fs14_ops_2
psu addto pgroup ops t3fs14_ops_3
psu addto pgroup ops t3fs14_ops_4
psu addto pgroup ops t3fs14_ops_5
#
# The links ...
#
psu create link cms-link cms_unit_group world-net
psu set link cms-link -readpref=10 -writepref=10 -cachepref=0 -p2ppref=10
psu add link cms-link cms
psu create link ops-link ops_unit_group world-net
psu set link ops-link -readpref=10 -writepref=10 -cachepref=0 -p2ppref=10
psu add link ops-link ops
psu create link dteam-link dteam_unit_group world-net
psu set link dteam-link -readpref=10 -writepref=10 -cachepref=0 -p2ppref=10
psu add link dteam-link dteam
#
# The link Groups ...
#
psu create linkGroup cms-linkGroup
psu set linkGroup custodialAllowed cms-linkGroup false
psu set linkGroup replicaAllowed cms-linkGroup true
psu set linkGroup nearlineAllowed cms-linkGroup false
psu set linkGroup outputAllowed cms-linkGroup true
psu set linkGroup onlineAllowed cms-linkGroup true
psu addto linkGroup cms-linkGroup cms-link
psu create linkGroup dteam-linkGroup
psu set linkGroup custodialAllowed dteam-linkGroup false
psu set linkGroup replicaAllowed dteam-linkGroup true
psu set linkGroup nearlineAllowed dteam-linkGroup false
psu set linkGroup outputAllowed dteam-linkGroup true
psu set linkGroup onlineAllowed dteam-linkGroup true
psu addto linkGroup dteam-linkGroup dteam-link
psu create linkGroup ops-linkGroup
psu set linkGroup custodialAllowed ops-linkGroup false
psu set linkGroup replicaAllowed ops-linkGroup true
psu set linkGroup nearlineAllowed ops-linkGroup false
psu set linkGroup outputAllowed ops-linkGroup true
psu set linkGroup onlineAllowed ops-linkGroup true
psu addto linkGroup ops-linkGroup ops-link
#
# Submodule [rc] : class diskCacheV111.poolManager.RequestContainerV5
#
rc onerror suspend
rc set max retries 3
rc set retry 900
rc set warning path billing
rc set poolpingtimer 600
rc set max restore unlimited
rc set sameHostCopy besteffort
rc set sameHostRetry notchecked
rc set max threads 2147483647
#
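To illustrate the wass partition configured above (`pm create -type=wass default` plus the `-spacecostfactor` weight), the idea is that write pools are picked at random with probability weighted by their available space. The sketch below only illustrates the principle and is not dCache's actual implementation; the pool names and free-space figures are made up.

```python
import random

def pick_pool(pools, space_cost_factor=1.0):
    """Choose a pool at random, weighted by its available space
    raised to the space cost factor (illustration only)."""
    weights = [free ** space_cost_factor for _, free in pools]
    total = sum(weights)
    r = random.uniform(0, total)
    for (name, _), w in zip(pools, weights):
        r -= w
        if r <= 0:
            return name
    return pools[-1][0]  # guard against floating-point rounding

# Hypothetical pools: (name, free space in GB)
pools = [("t3fs01_cms", 800), ("t3fs02_cms", 200)]
counts = {name: 0 for name, _ in pools}
for _ in range(10000):
    counts[pick_pool(pools)] += 1
# The pool with 4x the free space is selected roughly 4x as often.
```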
/etc/dcache/LinkGroupAuthorization.conf
# Puppet Managed File
LinkGroup cms-linkGroup
/cms
LinkGroup ops-linkGroup
/ops/NGI/Germany*
/ops/NGI/Switzerland*
LinkGroup dteam-linkGroup
/dteam
/dteam/Role=NULL/Capability=NULL
/etc/dcache/dcache.conf
# Puppet Managed File
dcache.layout=${host.name}
dcache.namespace=chimera
#chimera.db.user = postgres
#chimera.db.url = jdbc:postgresql://localhost/chimera?prepareThreshold=3
chimera.db.user = chimera
chimera.db.url = jdbc:postgresql://t3dcachedb03.psi.ch/chimera?prepareThreshold=3
# The following is taken from the old dCacheSetup file.
# Some configuration parameters may no longer apply.
# Dedicated user for dcache, not anymore root
dcache.user=dcache
dcache.paths.billing=/var/log/dcache
# To check permissions not just in the dir where we are but also in the upper dirs
pnfsVerifyAllLookups=true
dcache.java.memory.heap=1024m
dcache.java.memory.direct=1024m
#pool.dcap.port=0
net.inetaddr.lifetime=1800
net.wan.port.min=20000
net.wan.port.max=25000
net.lan.port.min=33115
net.lan.port.max=33145
broker.host=t3se01.psi.ch
###### POOL VARs ######
# poolIoQueue is defined in share/defaults/pool.properties and used by share/services/pool.batch ; the queue 'regular' is always created by default
poolIoQueue=wan,xrootd
waitForFiles=${path}/setup
lfs=precious
tags=hostname=${host.name}
#######################
metaDataRepository=org.dcache.pool.repository.meta.db.BerkeleyDBMetaDataRepository
useGPlazmaAuthorizationModule=false
useGPlazmaAuthorizationCell=true
# gsidcapIoQueue is defined in share/defaults/dcap.properties and used by share/services/gsidcap.batch
#gsidcapIoQueue=default
# dcapIoQueue is defined in share/defaults/dcap.properties and used by share/services/dcap.batch
#dcapIoQueue=default
##performanceMarkerPeriod=10
# gsiftpIoQueue is used by share/services/gridftp.batch
gsiftpIoQueue=wan
xrootdIoQueue=xrootd
###### SRM VARs ######
# remoteGsiftpIoQueue is defined in share/defaults/srm.properties and used by share/services/srm.batch
remoteGsiftpIoQueue=wan
srmDatabaseHost=t3dcachedb03.psi.ch
srmDbName=dcache
srmDbUser=srmdcache
srmDbPassword=
srmSpaceManagerEnabled=yes
# ---- Log to database
#
#
# If set to true, the transfer services log transfers to the srm
# database.
srmDbLogEnabled=true
# ---- Enables SRM request transition history logging
#
# Enables logging of transition history of SRM request in the
# database. The request transitions can be examined through the
# command line interface or through the the srmWatch monitoring tool.
#
# Enabling this feature increases the size and load of the database.
srmRequestHistoryDatabaseEnabled=true
######################
ftpPort=${portBase}126
kerberosFtpPort=${portBase}127
companionDatabaseHost=t3dcachedb03.psi.ch
spaceManagerDatabaseHost=t3dcachedb03.psi.ch
pinManagerDbHost=t3dcachedb03.psi.ch
defaultPnfsServer=t3dcachedb03.psi.ch
SpaceManagerReserveSpaceForNonSRMTransfers=true
SpaceManagerLinkGroupAuthorizationFileName=/etc/dcache/LinkGroupAuthorization.conf
dcache.log.dir=/var/log/dcache
billingToDb=yes
billingDbHost=t3dcachedb03.psi.ch
billingDbUser=srmdcache
billingDbPass=
billingDbName=billing
billingMaxInsertsBeforeCommit=10000
billingMaxTimeBeforeCommitInSecs=5
# info service properties
info-provider.site-unique-id=T3_CH_PSI
info-provider.se-unique-id=t3se01.psi.ch
info-provider.se-name=SRM endpoint for T3_CH_PSI
info-provider.glue-se-status=Production
info-provider.dcache-quality-level=production
info-provider.dcache-architecture=multidisk
/etc/dcache/layouts/t3se.conf
# Puppet Managed File
[${host.name}-Domain-dcap]
[${host.name}-Domain-dcap/dcap]
[${host.name}-Domain-gridftp]
[${host.name}-Domain-gridftp/gridftp]
[${host.name}-Domain-gsidcap]
[${host.name}-Domain-gsidcap/gsidcap]
[${host.name}-Domain-srm]
[${host.name}-Domain-srm/srm]
[${host.name}-Domain-srm/spacemanager]
[${host.name}-Domain-srm/transfermanagers]
[${host.name}-Domain-httpd]
[${host.name}-Domain-httpd/httpd]
[${host.name}-Domain-httpd/statistics]
[${host.name}-Domain-httpd/billing]
[${host.name}-Domain-httpd/srm-loginbroker]
[${host.name}-Domain-utility]
[${host.name}-Domain-utility/gsi-pam]
[${host.name}-Domain-utility/pinmanager]
[${host.name}-Domain-dir]
[${host.name}-Domain-dir/dir]
[${host.name}-Domain-info]
[${host.name}-Domain-info/info]
# to use WebDav you need to run on the Linux client
# yum install --enablerepo=dag,epel davfs2
# mount.davfs http://t3se03:2880/pnfs/psi.ch/cms/ /pnfs/psi.ch/cms
#
#[${host.name}-Domain-webdav]
#[${host.name}-Domain-webdav/webdav]
#webdav.redirect.on-read=false
#webdav.redirect.on-write=false
#webdavRootPath=/
#webdavAllowedPaths=/pnfs/psi.ch/cms
#webdavAnonymousAccess=READONLY
#webdavReadOnly=true
#webdavIoQueue=webdav
#webdavProtocol=http
[dCacheDomain]
[dCacheDomain/poolmanager]
poolmanager.cache-hit-messages.enabled=true
[dCacheDomain/broadcast]
[dCacheDomain/loginbroker]
[dCacheDomain/topo]
[${host.name}-Domain-xrootd]
[${host.name}-Domain-xrootd/xrootd]
useGPlazmaAuthorizationCell=false
useGPlazmaAuthorizationModule=true
poolmanager=${spacemanager}
xrootdAuthNPlugin=gsi
xrootdAllowedWritePaths=
xrootdMoverTimeout=28800000
# Unauthenticated
# xrootdPlugins=gplazma:none,authz:cms-tfc
# Authenticated according to gplazma
xrootdPlugins=gplazma:gsi,authz:cms-tfc
# # Change this according to your location:
xrootd.cms.tfc.path=/etc/xrootd/storage.xml
# # Must be coherent with your TFC in storage.xml:
xrootd.cms.tfc.protocol=root
[${host.name}-Domain-CMS]
[${host.name}-Domain-CMS/xrootd]
loginBroker=srm-LoginBroker
xrootdRootPath=/pnfs/psi.ch/cms/trivcat
xrootdPort=1096
xrootdMoverTimeout=28800000
/etc/dcache/layouts/t3dcachedb.conf
# Puppet Managed File
[${host.name}-Domain-gPlazma]
[${host.name}-Domain-gPlazma/gplazma]
gplazma.vorolemap.file=/etc/grid-security/grid-vorolemap
[${host.name}-Domain-namespace]
[${host.name}-Domain-namespace/pnfsmanager]
[${host.name}-Domain-namespace/cleaner]
[${host.name}-Domain-namespace/acl]
[${host.name}-Domain-adminDoor]
[${host.name}-Domain-adminDoor/admin]
sshVersion=ssh2
admin.ssh2AdminPort=22224
adminHistoryFile=/var/log/dcache/adminshell_history
[${host.name}-Domain-nfs]
dcache.user=root
[${host.name}-Domain-nfs/nfsv3]
## webadmin needs info Domain
#[${host.name}-Domain-webadmin]
#[${host.name}-Domain-webadmin/webadmin]
#webadminHttpsPort=8082
#webadminHttpPort=8081
#webadminDCacheInstanceName=${host.name}
##webadminAuthenticated=true
#webadminAdminGid=1000
#webadminAuthenticated=false
gPlazma2 /etc/dcache/gplazma.conf
auth optional x509
auth optional voms
map requisite vorolemap
map requisite authzdb
session requisite authzdb
vomsdir needed by gPlazma2
# find /etc/grid-security/vomsdir/
/etc/grid-security/vomsdir/
/etc/grid-security/vomsdir/ops -> symbolic link to cms
/etc/grid-security/vomsdir/cms
/etc/grid-security/vomsdir/cms/voms.fnal.gov.lsc
/etc/grid-security/vomsdir/cms/voms.cern.ch.lsc
/etc/grid-security/vomsdir/cms/lcg-voms.cern.ch.lsc
/etc/grid-security/vomsdir/dteam
/etc/grid-security/vomsdir/dteam/voms2.hellasgrid.gr.lsc
/etc/grid-security/vomsdir/dteam/voms.hellasgrid.gr.lsc
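For context, each .lsc file listed above holds two lines: the subject DN of the VOMS server host certificate, followed by the DN of its issuing CA. A sketch of what voms.cern.ch.lsc typically looks like (the DNs shown are illustrative and must match the actual server certificate):

```
/DC=ch/DC=cern/OU=computers/CN=voms.cern.ch
/DC=ch/DC=cern/CN=CERN Trusted Certification Authority
```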
Topic revision: r11 - 2013-04-05 - GianfrancoSciacca