Symptoms
Summary: Cannot move ISB, Problem to detect the lifetime of the proxy
Occurrences
At what times did this problem occur (used to estimate frequency):
2011-08-05 | After wiping Scratch FS
Observations
This happens when the CreamCE thinks there is a delegated proxy in the Cream Sandbox, but the proxy has been deleted from there for whatever reason.
You can detect it because jobs fail immediately after starting (exit_status=1) and Cream reports this error in *glite-ce-cream.log*:
05 Aug 2011 16:45:49,917 INFO org.glite.ce.creamapi.jobmanagement.cmdexecutor.AbstractJobExecutor (AbstractJobExecutor.java:2163) -
(Worker Thread 49) JOB CREAM459165977 STATUS CHANGED: RUNNING => DONE-FAILED
[failureReason=Cannot move ISB (retry_copy ${globus_transfer_cmd} gsiftp://rb1.cyf-kr.edu.pl:2811/var/glite/SandboxDir/x2/
https_3a_2f_2flb.reef.man.poznan.pl_3a9000_2fx2r6TFgNA_5f7pIJOPpag2Sg/input/h1mcLauncher_perl_ftt.sh
file:///lustre/scratch/home/egee/honeprd/home_cre01_459165977/CREAM459165977/h1mcLauncher_perl_ftt.sh):
Problem to detect the lifetime of the proxy] [localUser=honeprd] [gridJobId=https://lb.reef.man.poznan.pl:9000/x2r6TFgNA_7pIJOPpag2Sg]
[workerNode=wn122.lcg.cscs.ch] [delegationId=13041121772E840574rb12Ecyf2Dkr2Eedu2Epl]
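The detection described above can be sketched as a small shell function that counts the matching failures in one or more log files. This is only a sketch: `count_proxy_failures` is a hypothetical helper name, and it assumes that each status change is written to the log as a single line (the excerpt above is wrapped for display).

```shell
#!/bin/sh
# count_proxy_failures is a hypothetical helper, not part of CREAM.
# Usage: count_proxy_failures FILE...
# Prints the number of log lines that report a DONE-FAILED job together
# with the "lifetime of the proxy" failure reason.
count_proxy_failures() {
    # cat merges all files into one stream so that grep -c prints a single total
    cat "$@" 2>/dev/null \
        | grep 'DONE-FAILED' \
        | grep -c 'Problem to detect the lifetime of the proxy'
}
```

Run from the CreamCE log directory, it could be called as `count_proxy_failures glite-ce-cream.log glite-ce-cream.log.?`. A result above zero suggests that stale delegation entries are present.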
Solution or Workaround
This can be solved by removing the entry in the CreamCE database that says there is an existing delegated proxy. Run this script in the CreamCE log dir:
grep lifetime glite-ce-cream.log glite-ce-cream.log.? | grep DONE-FAILED | grep -Eo "delegationId=[^]]*" | \
sed 's/^delegationId=//' | sort -u | while read i; do echo "delete from t_credential where dlg_id='$i';"; done
It prints the SQL statements you need to execute in the Cream DB:
mysql -u cream -p delegationdb
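The two steps above can also be combined so that the generated statements are written to a file and reviewed before they reach the database. The following is a sketch under assumptions: `gen_delete_sql` is a hypothetical helper name, the output file path is arbitrary, and each failure is assumed to occupy a single log line.

```shell
#!/bin/sh
# gen_delete_sql is a hypothetical helper, not part of CREAM.
# It reads CREAM log lines on stdin and prints one DELETE statement
# per distinct delegation ID found in proxy-lifetime failures.
gen_delete_sql() {
    grep 'lifetime' \
        | grep 'DONE-FAILED' \
        | grep -Eo 'delegationId=[^]]*' \
        | sed 's/^delegationId=//' \
        | sort -u \
        | while read -r id; do
              printf "delete from t_credential where dlg_id='%s';\n" "$id"
          done
}
```

For example, `cat glite-ce-cream.log glite-ce-cream.log.? | gen_delete_sql > /tmp/delete_proxies.sql` writes the statements to a file. After checking its contents, the file can be fed to the client with `mysql -u cream -p delegationdb < /tmp/delete_proxies.sql`.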
Deleting delegated proxies on bad files
Another possibility is to detect bad proxy files on the file system and delete the corresponding entries in the database. This may be useful:
for i in `find /lustre/scratch/CREAM_CE/cream02 -maxdepth 3 -type d`; \
do ls -la $i/proxy; done 2> /dev/null | awk '/\?/ {print $7}' | while read i; do echo "delete from t_credential where dlg_id='$i';"; done
Monitoring for this condition
--
PabloFernandez - 2011-08-05