
Swiss Grid Operations Meeting on 2013-04-04

Agenda

Status

  • CSCS (reports Pablo):
    • F2F meeting held @ CSCS: MeetingCHIPPCSCSFaceToFace20130321
    • SRM security issue struck again; still no answer from dCache support. Upgrade planned in one month.
    • CMS Argus authentication issues still under investigation
    • Cream03 with EMI-2 and SL6 in production. Feedback?
    • cms01 ready
    • Preparation for Maintenance on Tuesday next week: SiteMaintenance20130409

  • PSI (reports Fabio):
    • SL6 kernels upgraded to 2.6.32-358.2.1, but 2 file servers are affected by a load-reporting issue: xfsaild is always in D-state.
    • RDAC driver upgraded from 09.03.0C08.0535 to 09.03.0C05.0642 on these 2 file servers.
    • SL5 kernels upgraded to 2.6.18-348.3.1.
    • xrootd clients installed via yum at version 3.3.1-1 on each UI and WN.
    • PostgreSQL upgraded from 9.2.2 to 9.2.3-2, the latest at the time (the latest is now 9.2.4).
    • dCache migrated from 1.9.12-23 to 2.2.9-1; the major news (see the PSI files reported below):
      • PostgreSQL max number of connections raised from 200 to 300; you can use this Nagios check to be alerted about them: /usr/bin/check_postgres.pl --action=backends
      • Replaced ssh1 admin door with the ssh2 door.
      • Adapted dCache_gmetric.py to handle the new ssh2 case.
      • Replaced gPlazma1 with gPlazma2: the migration was not transparent and I opened a GGUS ticket, now solved.
      • New pool selection algorithm: Weighted Available Space Selection (WASS). Requires two changes: the wass type and the weights.
      • Derek's tools are still compatible.
      • News about BDII, see /etc/dcache/dcache.conf; it does not work with the infoprovider layout.
      • When you start dCache you have to wait ~10s and then run /etc/init.d/dcache-server restart t3se01-Domain-utility. It's a bug they are working on.
      • I can't get the /var/log/dcache/adminshell_history file to work.
      • I can't get the gPlazma2 nsswitch plugin to work.
      • Need to test secondary groups with gPlazma2; they fail with gPlazma1.
      • Raised each cell to the info level; nice for following what was happening before a crash/error, see the logback.xml file.
      • Enabled xrootd at PSI: I can copy files and I dedicated an I/O queue in dCache, but I'm still missing some details, so it's not at production level yet. It's also not strictly required for a T3, so it has low priority.
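The Weighted Available Space Selection mentioned in the PSI report can be illustrated with a toy sketch: pools are drawn at random with probability proportional to their available space, so fuller pools receive proportionally fewer writes. Pool names and free-space figures below are made up, and this is only an illustration of the idea, not dCache's actual implementation.

```python
import random

# Hypothetical pools with their available space in GB (made-up numbers).
POOLS = {
    "t3fs01_cms": 12_000,
    "t3fs02_cms": 6_000,
    "t3fs03_cms": 2_000,
}

def pick_pool(pools, rng=random):
    """Pick a pool with probability proportional to its free space."""
    names = list(pools)
    weights = [pools[n] for n in names]  # weight = available space
    return rng.choices(names, weights=weights, k=1)[0]

random.seed(0)
counts = {name: 0 for name in POOLS}
for _ in range(10_000):
    counts[pick_pool(POOLS)] += 1
# The 12 TB pool should be chosen roughly twice as often as the 6 TB one.
print(counts)
```

The real selection in dCache also folds in CPU cost and the configured cost factors (see the `pm set default -cpucostfactor=... -spacecostfactor=...` lines in poolmanager.conf below); this sketch keeps only the space term.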

  • UNIBE (reports Gianfranco - won't be able to attend):
    • 5 out of 6 DPM disk servers are now SLC6.3, emi-dpm_disk 1.8.6-1.el6 (EMI-2)
    • New disk server with 120TB delivered, running acceptance tests, will deploy on Monday
    • We'll have 502TB (after RAID6): 400TB ATLASDATADISK (350TB pledged), 80TB LOCALGROUPDISK, 22TB for ops, SMSCG VOs
    • SLC5 and SLC6 kernel upgrade on interactive nodes/grid UI.
    • Cluster not upgraded; we don't want to risk yet another ARC bug (there is already one related to the new kernel, though it should not affect us). The risk is very low:
      • all WNs in a private network
      • no interactive submission allowed
      • no users except grid pool accounts
      • only atlas/ch, smscg VOs, Andrej Filipcic allowed via arc.conf and grid-mapfile

Next meeting date: 2nd of May

Attendants

  • CSCS: Pablo
  • CMS: Fabio, Daniel, Derek
  • ATLAS:
  • LHCb: Roland
  • EGI:

Action items

  • Item1

PSI dCache 2.2 files

/etc/dcache/logback.xml

# grep -n of /etc/dcache/logback.xml, lines 177-179 (the XML tags were stripped by the wiki rendering):
/etc/dcache/logback.xml:177:      stdout
/etc/dcache/logback.xml:178:      root
/etc/dcache/logback.xml:179:      info   <---------- log level raised to info
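Since the wiki stripped the tags, the surviving words suggest a root-logger block along these lines; this is a sketch in standard logback syntax, not a verbatim copy of dCache's file:

```xml
<root level="info">              <!-- level raised so each cell logs at info -->
  <appender-ref ref="stdout"/>   <!-- route root-logger output to the stdout appender -->
</root>
```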

/var/lib/dcache/config/poolmanager.conf

#
# Created by PoolManager(PoolManager) at Fri Mar 01 16:05:25 CET 2013
#
#
# Submodule CostModule (cm) : diskCacheV111.poolManager.CostModuleV1
# $Revision: 16335 $ 
#

cm set debug off
cm set update on
cm set magic on
# note new wass type
pm create -type=wass default
pm set default  -cpucostfactor=1.0 -spacecostfactor=1.0
pm set default  -idle=0.0 -p2p=0.4 -alert=0.0 -halt=0.0 -fallback=0.0
pm set default  -p2p-allowed=yes -p2p-oncost=yes -p2p-fortransfer=yes
pm set default  -stage-allowed=no
pm set default  -max-copies=500 -slope=0.0
#
# Setup of PoolManager (diskCacheV111.poolManager.PoolManagerV5) at Fri Mar 01 16:05:25 CET 2013
#
#
# Printed by diskCacheV111.poolManager.PoolSelectionUnitV2 at Fri Mar 01 16:05:25 CET 2013
#
#
psu set regex off
psu set allpoolsactive off
#
# The units ...
#
psu create unit -protocol */*
psu create unit -store  dteam:dteam@osm
psu create unit -store  ops:ops@osm
psu create unit -net    0.0.0.0/0.0.0.0
psu create unit -store  cms:cms@osm
psu create unit -store  *@*
#
# The unit Groups ...
#
psu create ugroup any-store
psu addto  ugroup any-store *@*
psu create ugroup cms_unit_group
psu addto  ugroup cms_unit_group cms:cms@osm
psu create ugroup dteam_unit_group
psu addto  ugroup dteam_unit_group dteam:dteam@osm
psu create ugroup world-net
psu addto  ugroup world-net 0.0.0.0/0.0.0.0
psu create ugroup any-protocol
psu addto  ugroup any-protocol */*
psu create ugroup ops_unit_group
psu addto  ugroup ops_unit_group ops:ops@osm
#
# The pools ...
# CMS POOLs
#psu create pool t3se01_cms
#psu create pool t3se01_cms_1
#psu create pool t3se01_ops
psu create pool t3fs01_cms
psu create pool t3fs02_cms
psu create pool t3fs03_cms
psu create pool t3fs04_cms
psu create pool t3fs04_cms_1
psu create pool t3fs07_cms
psu create pool t3fs07_cms_1
psu create pool t3fs08_cms
psu create pool t3fs08_cms_1
psu create pool t3fs09_cms
psu create pool t3fs09_cms_1
psu create pool t3fs10_cms
psu create pool t3fs10_cms_1
psu create pool t3fs11_cms
psu create pool t3fs11_cms_1
psu create pool t3fs13_cms
psu create pool t3fs13_cms_1
psu create pool t3fs13_cms_2
psu create pool t3fs13_cms_3
psu create pool t3fs13_cms_4
psu create pool t3fs13_cms_5
psu create pool t3fs13_cms_6
psu create pool t3fs14_cms
psu create pool t3fs14_cms_1
psu create pool t3fs14_cms_2
psu create pool t3fs14_cms_3
psu create pool t3fs14_cms_4
psu create pool t3fs14_cms_5
psu create pool t3fs14_cms_6
# OPS POOLs
psu create pool t3fs01_ops
psu create pool t3fs02_ops
psu create pool t3fs03_ops
psu create pool t3fs04_ops
psu create pool t3fs07_ops
psu create pool t3fs08_ops
psu create pool t3fs09_ops
psu create pool t3fs10_ops
psu create pool t3fs11_ops
psu create pool t3fs13_ops
psu create pool t3fs13_ops_1
psu create pool t3fs13_ops_2
psu create pool t3fs13_ops_3
psu create pool t3fs13_ops_4
psu create pool t3fs13_ops_5
psu create pool t3fs14_ops
psu create pool t3fs14_ops_1
psu create pool t3fs14_ops_2
psu create pool t3fs14_ops_3
psu create pool t3fs14_ops_4
psu create pool t3fs14_ops_5


#
# The pool groups ...
#
psu create pgroup dteam
#psu addto pgroup dteam t3se01_ops
psu addto pgroup dteam t3fs01_ops
psu addto pgroup dteam t3fs02_ops
psu addto pgroup dteam t3fs03_ops
psu addto pgroup dteam t3fs04_ops
psu addto pgroup dteam t3fs07_ops
psu addto pgroup dteam t3fs08_ops
psu addto pgroup dteam t3fs09_ops
psu addto pgroup dteam t3fs10_ops
psu addto pgroup dteam t3fs11_ops
psu addto pgroup dteam t3fs13_ops
psu addto pgroup dteam t3fs13_ops_1
psu addto pgroup dteam t3fs13_ops_2
psu addto pgroup dteam t3fs13_ops_3
psu addto pgroup dteam t3fs13_ops_4
psu addto pgroup dteam t3fs13_ops_5
psu addto pgroup dteam t3fs14_ops
psu addto pgroup dteam t3fs14_ops_1
psu addto pgroup dteam t3fs14_ops_2
psu addto pgroup dteam t3fs14_ops_3
psu addto pgroup dteam t3fs14_ops_4
psu addto pgroup dteam t3fs14_ops_5
#
psu create pgroup cms
#psu addto pgroup cms t3se01_cms
#psu addto pgroup cms t3se01_cms_1
psu addto pgroup cms t3fs01_cms
psu addto pgroup cms t3fs02_cms
psu addto pgroup cms t3fs03_cms
psu addto pgroup cms t3fs04_cms
psu addto pgroup cms t3fs04_cms_1
psu addto pgroup cms t3fs07_cms
psu addto pgroup cms t3fs07_cms_1
psu addto pgroup cms t3fs08_cms
psu addto pgroup cms t3fs08_cms_1
psu addto pgroup cms t3fs09_cms
psu addto pgroup cms t3fs09_cms_1
psu addto pgroup cms t3fs10_cms
psu addto pgroup cms t3fs10_cms_1
psu addto pgroup cms t3fs11_cms
psu addto pgroup cms t3fs11_cms_1
psu addto pgroup cms t3fs13_cms
psu addto pgroup cms t3fs13_cms_1
psu addto pgroup cms t3fs13_cms_2
psu addto pgroup cms t3fs13_cms_3
psu addto pgroup cms t3fs13_cms_4
psu addto pgroup cms t3fs13_cms_5
psu addto pgroup cms t3fs13_cms_6
psu addto pgroup cms t3fs14_cms
psu addto pgroup cms t3fs14_cms_1
psu addto pgroup cms t3fs14_cms_2
psu addto pgroup cms t3fs14_cms_3
psu addto pgroup cms t3fs14_cms_4
psu addto pgroup cms t3fs14_cms_5
psu addto pgroup cms t3fs14_cms_6

psu create pgroup ops
#psu addto pgroup ops t3se01_ops
psu addto pgroup ops t3fs01_ops
psu addto pgroup ops t3fs02_ops
psu addto pgroup ops t3fs03_ops
psu addto pgroup ops t3fs04_ops
psu addto pgroup ops t3fs07_ops
psu addto pgroup ops t3fs08_ops
psu addto pgroup ops t3fs09_ops
psu addto pgroup ops t3fs10_ops
psu addto pgroup ops t3fs11_ops
psu addto pgroup ops t3fs13_ops
psu addto pgroup ops t3fs13_ops_1
psu addto pgroup ops t3fs13_ops_2
psu addto pgroup ops t3fs13_ops_3
psu addto pgroup ops t3fs13_ops_4
psu addto pgroup ops t3fs13_ops_5
psu addto pgroup ops t3fs14_ops
psu addto pgroup ops t3fs14_ops_1
psu addto pgroup ops t3fs14_ops_2
psu addto pgroup ops t3fs14_ops_3
psu addto pgroup ops t3fs14_ops_4
psu addto pgroup ops t3fs14_ops_5
#
# The links ...
#
psu create link cms-link cms_unit_group world-net
psu set link cms-link -readpref=10 -writepref=10 -cachepref=0 -p2ppref=10
psu add link cms-link cms

psu create link ops-link ops_unit_group world-net
psu set link ops-link -readpref=10 -writepref=10 -cachepref=0 -p2ppref=10
psu add link ops-link ops

psu create link dteam-link dteam_unit_group world-net
psu set link dteam-link -readpref=10 -writepref=10 -cachepref=0 -p2ppref=10
psu add link dteam-link dteam
#
# The link Groups ...
#
psu create linkGroup cms-linkGroup
psu set linkGroup custodialAllowed cms-linkGroup false
psu set linkGroup replicaAllowed cms-linkGroup true
psu set linkGroup nearlineAllowed cms-linkGroup false
psu set linkGroup outputAllowed cms-linkGroup true
psu set linkGroup onlineAllowed cms-linkGroup true
psu addto linkGroup cms-linkGroup cms-link

psu create linkGroup dteam-linkGroup
psu set linkGroup custodialAllowed dteam-linkGroup false
psu set linkGroup replicaAllowed dteam-linkGroup true
psu set linkGroup nearlineAllowed dteam-linkGroup false
psu set linkGroup outputAllowed dteam-linkGroup true
psu set linkGroup onlineAllowed dteam-linkGroup true
psu addto linkGroup dteam-linkGroup dteam-link

psu create linkGroup ops-linkGroup
psu set linkGroup custodialAllowed ops-linkGroup false
psu set linkGroup replicaAllowed ops-linkGroup true
psu set linkGroup nearlineAllowed ops-linkGroup false
psu set linkGroup outputAllowed ops-linkGroup true
psu set linkGroup onlineAllowed ops-linkGroup true
psu addto linkGroup ops-linkGroup ops-link
#
# Submodule [rc] : class diskCacheV111.poolManager.RequestContainerV5
#
rc onerror suspend
rc set max retries 3
rc set retry 900
rc set warning path billing
rc set poolpingtimer 600
rc set max restore unlimited
rc set sameHostCopy besteffort
rc set sameHostRetry notchecked
rc set max threads 2147483647
#

/etc/dcache/LinkGroupAuthorization.conf (note the * wildcards)

# Puppet Managed File
LinkGroup cms-linkGroup
/cms

LinkGroup ops-linkGroup
/ops/NGI/Germany*
/ops/NGI/Switzerland*

LinkGroup dteam-linkGroup
/dteam
/dteam/Role=NULL/Capability=NULL

/etc/dcache/dcache.conf

# cat /etc/dcache/dcache.conf 
# Puppet Managed File 

dcache.layout=${host.name}
dcache.namespace=chimera
#chimera.db.user = postgres
#chimera.db.url = jdbc:postgresql://localhost/chimera?prepareThreshold=3
chimera.db.user = chimera
chimera.db.url = jdbc:postgresql://t3dcachedb03.psi.ch/chimera?prepareThreshold=3
# The following is taken from the old dCacheSetup file.
# Some configuration parameters may no longer apply.

# Dedicated user for dcache, not anymore root
dcache.user=dcache
dcache.paths.billing=/var/log/dcache

# To check permissions not just in the dir where we are but also in the upper dirs
pnfsVerifyAllLookups=true

dcache.java.memory.heap=1024m
dcache.java.memory.direct=1024m
#pool.dcap.port=0
net.inetaddr.lifetime=1800
net.wan.port.min=20000
net.wan.port.max=25000
net.lan.port.min=33115
net.lan.port.max=33145


broker.host=t3se01.psi.ch

###### POOL VARs ######
# poolIoQueue is defined in share/defaults/pool.properties and used by share/services/pool.batch ; the queue 'regular' is always created by default 
poolIoQueue=wan,xrootd
waitForFiles=${path}/setup
lfs=precious
tags=hostname=${host.name}
#######################

metaDataRepository=org.dcache.pool.repository.meta.db.BerkeleyDBMetaDataRepository
useGPlazmaAuthorizationModule=false
useGPlazmaAuthorizationCell=true

# gsidcapIoQueue is defined in share/defaults/dcap.properties and used by share/services/gsidcap.batch
#gsidcapIoQueue=default
# dcapIoQueue is defined in share/defaults/dcap.properties and used by share/services/dcap.batch
#dcapIoQueue=default
##performanceMarkerPeriod=10
# gsiftpIoQueue is used by share/services/gridftp.batch
gsiftpIoQueue=wan

xrootdIoQueue=xrootd

###### SRM VARs ######
# remoteGsiftpIoQueue is defined in share/defaults/srm.properties and used by share/services/srm.batch
remoteGsiftpIoQueue=wan
srmDatabaseHost=t3dcachedb03.psi.ch
srmDbName=dcache
srmDbUser=srmdcache
srmDbPassword=
srmSpaceManagerEnabled=yes

# ---- Log to database
#
# 
#    If set to true, the transfer services log transfers to the srm
#    database.
srmDbLogEnabled=true
# ---- Enables SRM request transition history logging
# 
#  Enables logging of transition history of SRM request in the
#  database. The request transitions can be examined through the
#  command line interface or through the srmWatch monitoring tool.
# 
#  Enabling this feature increases the size and load of the database.
srmRequestHistoryDatabaseEnabled=true
######################


ftpPort=${portBase}126
kerberosFtpPort=${portBase}127

companionDatabaseHost=t3dcachedb03.psi.ch
spaceManagerDatabaseHost=t3dcachedb03.psi.ch
pinManagerDbHost=t3dcachedb03.psi.ch
defaultPnfsServer=t3dcachedb03.psi.ch
SpaceManagerReserveSpaceForNonSRMTransfers=true
SpaceManagerLinkGroupAuthorizationFileName=/etc/dcache/LinkGroupAuthorization.conf
dcache.log.dir=/var/log/dcache

billingToDb=yes
billingDbHost=t3dcachedb03.psi.ch
billingDbUser=srmdcache
billingDbPass=
billingDbName=billing
billingMaxInsertsBeforeCommit=10000
billingMaxTimeBeforeCommitInSecs=5

# info service properties
info-provider.site-unique-id=T3_CH_PSI
info-provider.se-unique-id=t3se01.psi.ch
info-provider.se-name=SRM endpoint for T3_CH_PSI
info-provider.glue-se-status=Production
info-provider.dcache-quality-level=production
info-provider.dcache-architecture=multidisk

/etc/dcache/layouts/t3se.conf

# Puppet Managed File 

[${host.name}-Domain-dcap]
[${host.name}-Domain-dcap/dcap]

[${host.name}-Domain-gridftp]
[${host.name}-Domain-gridftp/gridftp]

[${host.name}-Domain-gsidcap]
[${host.name}-Domain-gsidcap/gsidcap]

[${host.name}-Domain-srm]
[${host.name}-Domain-srm/srm]
[${host.name}-Domain-srm/spacemanager]
[${host.name}-Domain-srm/transfermanagers]

[${host.name}-Domain-httpd]
[${host.name}-Domain-httpd/httpd]
[${host.name}-Domain-httpd/statistics]
[${host.name}-Domain-httpd/billing]
[${host.name}-Domain-httpd/srm-loginbroker]

[${host.name}-Domain-utility]
[${host.name}-Domain-utility/gsi-pam]
[${host.name}-Domain-utility/pinmanager]

[${host.name}-Domain-dir]
[${host.name}-Domain-dir/dir]

[${host.name}-Domain-info]
[${host.name}-Domain-info/info]

# to use WebDav you need to run on the Linux client
# yum install --enablerepo=dag,epel davfs2
# mount.davfs http://t3se03:2880/pnfs/psi.ch/cms/     /pnfs/psi.ch/cms
#
#[${host.name}-Domain-webdav]
#[${host.name}-Domain-webdav/webdav]
#webdav.redirect.on-read=false
#webdav.redirect.on-write=false
#webdavRootPath=/
#webdavAllowedPaths=/pnfs/psi.ch/cms
#webdavAnonymousAccess=READONLY
#webdavReadOnly=true
#webdavIoQueue=webdav
#webdavProtocol=http

[dCacheDomain]
[dCacheDomain/poolmanager]
poolmanager.cache-hit-messages.enabled=true
[dCacheDomain/broadcast]
[dCacheDomain/loginbroker]
[dCacheDomain/topo]

[${host.name}-Domain-xrootd]
[${host.name}-Domain-xrootd/xrootd]
useGPlazmaAuthorizationCell=false
useGPlazmaAuthorizationModule=true
poolmanager=${spacemanager}
xrootdAuthNPlugin=gsi
xrootdAllowedWritePaths=
xrootdMoverTimeout=28800000
# Unauthenticated
# xrootdPlugins=gplazma:none,authz:cms-tfc
# Authenticated according to gplazma
xrootdPlugins=gplazma:gsi,authz:cms-tfc
# # Change this according to your location:
xrootd.cms.tfc.path=/etc/xrootd/storage.xml
# # Must be coherent with your TFC in storage.xml:
xrootd.cms.tfc.protocol=root

[${host.name}-Domain-CMS]
[${host.name}-Domain-CMS/xrootd]
loginBroker=srm-LoginBroker
xrootdRootPath=/pnfs/psi.ch/cms/trivcat
xrootdPort=1096
xrootdMoverTimeout=28800000

/etc/dcache/layouts/t3dcachedb.conf

# Puppet Managed File 

[${host.name}-Domain-gPlazma]
[${host.name}-Domain-gPlazma/gplazma]
gplazma.vorolemap.file=/etc/grid-security/grid-vorolemap

[${host.name}-Domain-namespace]
[${host.name}-Domain-namespace/pnfsmanager]
[${host.name}-Domain-namespace/cleaner]
[${host.name}-Domain-namespace/acl]

[${host.name}-Domain-adminDoor]
[${host.name}-Domain-adminDoor/admin]
sshVersion=ssh2
admin.ssh2AdminPort=22224
adminHistoryFile=/var/log/dcache/adminshell_history

[${host.name}-Domain-nfs]
dcache.user=root
[${host.name}-Domain-nfs/nfsv3]

## webadmin needs info Domain
#[${host.name}-Domain-webadmin]
#[${host.name}-Domain-webadmin/webadmin]
#webadminHttpsPort=8082
#webadminHttpPort=8081
#webadminDCacheInstanceName=${host.name}
##webadminAuthenticated=true
#webadminAdminGid=1000
#webadminAuthenticated=false

gPlazma2 /etc/dcache/gplazma.conf

auth     optional   x509 
auth     optional   voms 
map      requisite  vorolemap 
map      requisite  authzdb 
session  requisite  authzdb

vomsdir needed by gPlazma2

# find /etc/grid-security/vomsdir/
/etc/grid-security/vomsdir/
/etc/grid-security/vomsdir/ops -> symbolic link to cms
/etc/grid-security/vomsdir/cms
/etc/grid-security/vomsdir/cms/voms.fnal.gov.lsc
/etc/grid-security/vomsdir/cms/voms.cern.ch.lsc
/etc/grid-security/vomsdir/cms/lcg-voms.cern.ch.lsc
/etc/grid-security/vomsdir/dteam
/etc/grid-security/vomsdir/dteam/voms2.hellasgrid.gr.lsc
/etc/grid-security/vomsdir/dteam/voms.hellasgrid.gr.lsc
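Each .lsc file holds the certificate DN chain of the corresponding VOMS server, one DN per line: the server's subject DN first, then its issuing CA's DN. As an illustration, a voms.cern.ch.lsc of that era typically looked like the following (the DNs shown are the commonly used CERN ones, but verify against your own installation):

```
/DC=ch/DC=cern/OU=computers/CN=voms.cern.ch
/DC=ch/DC=cern/CN=CERN Trusted Certification Authority
```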

/etc/xrootd/xrootd-clustered.cfg

# grep 64 /etc/xrootd/xrootd-clustered.cfg
oss.namelib /usr/lib64/libXrdCmsTfc.so file:/etc/xrootd/storage.xml?protocol=direct
xrootd.seclib /usr/lib64/libXrdSec.so
xrootd.fslib /usr/lib64/libXrdOfs.so
Topic revision: r13 - 2013-04-10 - FabioMartinelli
 