KeyWords:
SysAdmin
Issues of our 1.9.3-3 dcache installation
gPlazma problems
Note: Filed as dcache support request #5119 on 2009-09-24.
I am trying to debug an issue of failing PhEDEx CMS exports. They all fail with authentication problems, as indicated by the errors from SRM/FTS. My first suspicion is that the failing users are not using real VOMS proxies. I therefore generated a plain grid proxy (without VOMS extensions) for myself and started to test.
To take a closer look at the problem, I run SRM requests against our dCache while following the gPlazma log and stracing the gPlazma process.
From the UI:
srmls -l srm://storage01.lcg.cscs.ch:8443/srm/managerv2?SFN=/pnfs/lcg.cscs.ch/cms
Every such request leaves the following lines in /var/log/d-cache/gPlazma-storage02Domain.log:
Exception thrown by gplazma.authz.plugins.vorolemap.VORoleMapAuthzPlugin: java.lang.NullPointerException
22 Sep 2009 16:47:42 (gPlazma) [v2:srmLs:71788691 SRM-storage01] caught exception:
Tracing some system calls of the cell and its sibling threads with strace:
strace -e trace=open,stat64 -fp 4749
Process 4749 attached with 84 threads - interrupt to quit
Every invocation of srmls with the non-VOMS proxy produced the following strace output (sometimes additional files were read, but there was always a SIGSEGV):
[pid 6925] stat64("/opt/d-cache/etc/dcachesrm-gplazma.policy", {st_mode=S_IFREG|0444, st_size=3859, ...}) = 0
[pid 6925] stat64("/etc/grid-security/grid-vorolemap", {st_mode=S_IFREG|0444, st_size=42556, ...}) = 0
[pid 6925] --- SIGSEGV (Segmentation fault) @ 0 (0) ---
Note that the files are only stat'ed and not actually read, probably because dCache only checks whether they have changed (when I touch a file, I can see an open on the next run).
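The stat-without-open pattern seen in the trace can be illustrated with a minimal Python sketch. This is not dCache code; it only shows the general technique that the trace suggests, under the assumption that the process compares modification times. The class name `CachedFile` and the placeholder file content are hypothetical.

```python
import os
import tempfile


class CachedFile:
    """Re-read a file only when its modification time has changed."""

    def __init__(self, path):
        self.path = path
        self.mtime_ns = None   # mtime seen at the last real read
        self.content = None
        self.reads = 0         # how often the file was actually opened

    def get(self):
        # Always stat the file (this is the stat64 call visible in strace).
        mtime_ns = os.stat(self.path).st_mtime_ns
        if mtime_ns != self.mtime_ns:
            # Only open and read when the file looks modified.
            with open(self.path) as handle:
                self.content = handle.read()
            self.mtime_ns = mtime_ns
            self.reads += 1
        return self.content


def demo():
    """Return the number of real reads before and after a simulated touch."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "grid-vorolemap")
        with open(path, "w") as handle:
            handle.write("placeholder mapping\n")  # hypothetical content

        cached = CachedFile(path)
        cached.get()
        cached.get()  # second call only stats the file
        reads_before_touch = cached.reads

        # Simulate "touch" by moving the mtime forward by one second.
        st = os.stat(path)
        os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns + 1_000_000_000))
        cached.get()  # mtime changed, so the file is opened again
        return reads_before_touch, cached.reads


print(demo())
```

With such a scheme, an strace filtered on open and stat64 would show only stat64 calls for unchanged files, which matches what was observed here.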
I am surprised to see segmentation faults in a Java VM. That should not be possible, right? (But I'm just a C/C++ guy.)
I see log lines like the ones above quite often in the gPlazma log, which at least hints that a number of non-VOMS proxies are trying to access our site. However, I would like to get a log line naming the responsible DN instead of a SIGSEGV.
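To get a rough idea of how often this happens, the gPlazma log can be scanned for the two line types quoted above. The following Python sketch assumes that all relevant entries look like those two sample lines; the function name `summarize` is hypothetical. Since the DN does not appear in these lines, the sketch can only count events and list the request tags, not identify the users.

```python
import re
from collections import Counter

# Marker taken from the exception line quoted from the gPlazma log.
NPE_MARKER = "VORoleMapAuthzPlugin: java.lang.NullPointerException"

# Matches lines such as:
# 22 Sep 2009 16:47:42 (gPlazma) [v2:srmLs:71788691 SRM-storage01] caught exception:
CAUGHT_RE = re.compile(
    r"^(\d{1,2} \w{3} \d{4} \d{2}:\d{2}:\d{2}) \(gPlazma\) "
    r"\[([^\]]+)\] caught exception:"
)


def summarize(lines):
    """Count NPE lines and group 'caught exception' lines by hour."""
    npe_count = 0
    per_hour = Counter()
    requests = []
    for line in lines:
        if NPE_MARKER in line:
            npe_count += 1
        match = CAUGHT_RE.match(line.strip())
        if match:
            timestamp, request = match.groups()
            per_hour[timestamp[:-6]] += 1  # strip ":MM:SS", keep the hour
            requests.append(request)
    return npe_count, per_hour, requests


# The two log lines from this page, used as sample input.
SAMPLE = [
    "Exception thrown by gplazma.authz.plugins.vorolemap."
    "VORoleMapAuthzPlugin: java.lang.NullPointerException",
    "22 Sep 2009 16:47:42 (gPlazma) [v2:srmLs:71788691 SRM-storage01] "
    "caught exception:",
]

npe_count, per_hour, requests = summarize(SAMPLE)
print("NullPointerExceptions from VORoleMapAuthzPlugin:", npe_count)
for hour, count in sorted(per_hour.items()):
    print(hour, count)
```

For a real run, the `SAMPLE` list would be replaced by the lines of an open log file, for example `summarize(open(path))` with `path` pointing at the gPlazma domain log.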
dCache gPlazma config
Excerpt from our dcachesrm-gplazma.policy file:
# Switches
xacml-vo-mapping="OFF"
saml-vo-mapping="OFF"
# we want no more kpwd mapping at CSCS. Note that this may cause problems with
# users relying on non-voms proxies
kpwd="OFF"
grid-mapfile="OFF"
gplazmalite-vorole-mapping="ON"
# Priorities
xacml-vo-mapping-priority="5"
saml-vo-mapping-priority="1"
kpwd-priority="3"
grid-mapfile-priority="4"
gplazmalite-vorole-mapping-priority="2"
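To make the effect of these switches explicit, here is a small Python sketch that parses the excerpt and lists the plugins that are switched ON. The helper names `parse_policy` and `enabled_plugins` are hypothetical, and the ordering by ascending priority number is an assumption about how gPlazma interprets the priorities, not something confirmed on this page.

```python
import re

# The policy excerpt from this page, copied verbatim.
POLICY_EXCERPT = '''
# Switches
xacml-vo-mapping="OFF"
saml-vo-mapping="OFF"
kpwd="OFF"
grid-mapfile="OFF"
gplazmalite-vorole-mapping="ON"
# Priorities
xacml-vo-mapping-priority="5"
saml-vo-mapping-priority="1"
kpwd-priority="3"
grid-mapfile-priority="4"
gplazmalite-vorole-mapping-priority="2"
'''

LINE_RE = re.compile(r'^([\w-]+)="([^"]*)"$')


def parse_policy(text):
    """Return a dict of key -> value for all key="value" lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = LINE_RE.match(line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings


def enabled_plugins(settings):
    """List plugins that are ON, sorted by priority number.

    ASSUMPTION: a lower priority number means the plugin is tried first.
    """
    plugins = []
    for key, value in settings.items():
        if key.endswith("-priority"):
            continue
        if value.upper() == "ON":
            priority = int(settings.get(key + "-priority", "0"))
            plugins.append((priority, key))
    return [name for _, name in sorted(plugins)]


settings = parse_policy(POLICY_EXCERPT)
print(enabled_plugins(settings))
```

For the excerpt above, only gplazmalite-vorole-mapping is ON, so there is no other plugin to fall back to when a request arrives with a proxy that carries no VOMS attributes.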
-- DerekFeichtinger - 2009-09-22