Symptoms
Summary: SRM returns a TURL without a domain extension
Occurrences
At what times did this problem occur (used to estimate frequency):
Observations
SRM read requests from external sources keep hanging and eventually time out. The debug output shows that the returned TURL contains a gridFTP door host name without a domain extension:
srmcp -2 -debug srm://t3se01.psi.ch:8443/srm/managerv2?SFN=//pnfs/psi.ch/cms/testing/derek file:////tmp/derek501 | grep "gsiftp://"
Storage Resource Manager (SRM) Client version 2.0.9
Copyright (c) 2002-2008 Fermi National Accelerator Laboratory
copying CopyJob, source = gsiftp://t3fs01:2811//pnfs/psi.ch/cms/testing/derek destination = file:////tmp/derek501
This happens consistently for certain fileservers, but not for all of them.
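To see quickly which doors hand out short names, the debug output can be filtered for the host part of each gsiftp TURL. The following is only a minimal sketch: `turl_hosts` is a hypothetical helper name, not part of the SRM client, and it assumes the TURL appears in the output in the same `gsiftp://host:port/...` form as shown above.

```shell
# Read srmcp debug output on stdin, extract the host of each gsiftp TURL,
# and flag hosts that contain no dot (i.e. no domain extension).
# turl_hosts is a hypothetical helper, not an SRM client command.
turl_hosts() {
    sed -n 's|.*gsiftp://\([^:/]*\).*|\1|p' | sort -u |
    while read -r host; do
        case "$host" in
            *.*) echo "ok $host" ;;      # host name carries a domain part
            *)   echo "SHORT $host" ;;   # bare host name, as in the symptom
        esac
    done
}
```

Piping the output of the srmcp command above into `turl_hosts` instead of `grep` would report `SHORT t3fs01` for the affected fileserver.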
The /etc/hosts file on an affected fileserver has an entry with only the local (short) name. Unfortunately, dCache only performs a local name resolution, so it ends up with this short name:
192.33.123.41 t3fs01 # Added by DHCP
Solution or Workaround
Either comment out the /etc/hosts entry, or modify it so that the FQDN comes first, before the short name.
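The second option can be sketched as a small filter that prints a corrected copy of the hosts file. This is an illustration under assumptions, not a tested procedure for these servers: `hosts_with_fqdn` is a hypothetical helper, the domain must be supplied by the caller, only IPv4 non-loopback entries are touched, and the result goes to stdout so it can be reviewed before anything is replaced.

```shell
# Print a copy of a hosts file in which every IPv4, non-loopback entry whose
# first name has no dot gets "<name>.<domain>" inserted in front of it.
# hosts_with_fqdn is a hypothetical helper; the domain is an input assumption.
hosts_with_fqdn() {
    domain="${1:?usage: hosts_with_fqdn DOMAIN [HOSTS_FILE]}"
    hosts_file="${2:-/etc/hosts}"
    awk -v domain="$domain" '
    {
        line = $0; comment = ""
        i = index(line, "#")                 # keep a trailing comment intact
        if (i > 0) { comment = " " substr(line, i); line = substr(line, 1, i - 1) }
        n = split(line, f, " ")              # f[1] = address, f[2..n] = names
        if (n >= 2 && f[1] !~ /^127\./ && f[1] !~ /:/ && f[2] !~ /\./) {
            out = f[1] " " f[2] "." domain   # leading FQDN
            for (k = 2; k <= n; k++) out = out " " f[k]
            print out comment
        } else {
            print $0                         # leave all other lines unchanged
        }
    }' "$hosts_file"
}
```

For example, `hosts_with_fqdn example.org /etc/hosts` would turn an entry `192.33.123.41 t3fs01 # Added by DHCP` into `192.33.123.41 t3fs01.example.org t3fs01 # Added by DHCP`; here `example.org` is a placeholder for the site's real domain. Note that DHCP may rewrite the entry again afterwards.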
Still to be investigated: how to make DHCP write the correct entry automatically.
Monitoring for this condition
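One possible check, given here only as a sketch, is to scan /etc/hosts on each fileserver for entries whose first name has no domain part. `check_hosts_fqdn` is a hypothetical helper name; the check assumes that loopback and IPv6 entries can be ignored and that a name without a dot is always a problem.

```shell
# Report hosts-file entries whose first (canonical) name contains no dot.
# Prints "address name" for each offending entry and returns non-zero if any
# were found, so it can be used from cron or a monitoring probe.
# check_hosts_fqdn is a hypothetical helper name.
check_hosts_fqdn() {
    hosts_file="${1:-/etc/hosts}"
    awk '
        { sub(/#.*/, "") }                    # strip comments
        NF < 2 { next }                       # skip blank or incomplete lines
        $1 ~ /^127\./ || $1 ~ /:/ { next }    # ignore loopback and IPv6 entries
        $2 !~ /\./ { print $1, $2; bad = 1 }  # first name lacks a domain part
        END { exit bad }
    ' "$hosts_file"
}
```

Run without arguments it reads /etc/hosts; on an affected fileserver it would print `192.33.123.41 t3fs01` and return a non-zero status.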
-- DerekFeichtinger - 11 May 2009