Random Failures of the data management lcg-utils

As seen on LCG-ROLLOUT:

The thing is that since LFC-based tests became the official rm tests in the
SFTs, our site (BIFI) has been experiencing random failures in the lcg-utils
commands (lcg-cr, lcg-cp, lcg-rep, lcg-del), which report the classic "Invalid
argument" error message.

After days of exhaustive debugging, we found the problem to be in the
LFC-related environment variables (LCG_CATALOG_TYPE, LFC_HOST, LFC_HOME).
According to the CSH test of the SFTs, those vars were supposed to be always
correctly set, and this was confirmed after studying the SFT sources (a
mixture of perl and shell scripts). However, for reasons and under
circumstances we have not been able to identify, those vars are sometimes not
properly set, and consequently the commands fail.

The recipe we applied to solve the problem is quite straightforward: create on
all WNs a script under /etc/profile.d/ which sets those vars for users mapped
to the dteamsgm account (the one used for the execution of official SFTs).

Vincenzo has created such a script, called sft.sh, and placed it under /etc/profile.d/ on every WN:

   #!/bin/bash
   # Sourced at login; sets the LFC variables only for the SFT account.
   if [ "$(whoami)" = "dteamsgm" ]
   then
      export LCG_CATALOG_TYPE=lfc
      export LFC_HOST=lfc-dteam.cern.ch
      export LFC_HOME=/grid/dteam/SFT
   fi
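
To check on a WN whether the three variables actually reach a job's environment, something along the following lines could be run as the dteamsgm user before calling the lcg-utils commands. This is only a sketch: the function name check_lfc_env is made up for illustration and is not part of the SFTs or of lcg-utils. It only tests that the variables are non-empty, not that their values are correct.

```shell
# Hypothetical helper (not part of the SFTs or lcg-utils): report which
# LFC-related variables are unset or empty in the current environment.
# The body runs in a subshell, so its working variables do not leak out.
check_lfc_env() (
    missing=0
    for var in LCG_CATALOG_TYPE LFC_HOST LFC_HOME; do
        # POSIX-portable indirect lookup of the variable named in $var
        eval "val=\${$var:-}"
        if [ -z "$val" ]; then
            echo "missing: $var" >&2
            missing=1
        fi
    done
    # Exit status 0 if all three are set, 1 otherwise
    exit "$missing"
)
```

For example, `check_lfc_env || echo "LFC environment incomplete on $(hostname)"` prints a warning only when at least one of the variables is missing.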

This fixes the problem and now the datamgmt tests light up green. -- PeterKunszt - 03 Apr 2006

Topic revision: r1 - 2006-04-03 - PeterKunszt