<!-- keep this as a security measure:
   * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.LCGAdminGroup
   * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.LCGAdminGroup
#uncomment this if you want the page only be viewable by the internal people
#   * Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.LCGAdminGroup
-->

%TOC%

%ICON{arrowleft}% Go to [[CMSSiteLogXX][previous page]] / [[CMSSiteLogXX][next page]] of CMS site log %M%

---+ 08. 10. 2010 PhEDEx agents dying at irregular intervals

Since last week's security update of the kernel and SW packages, our PhEDEx agents have been unstable. In the prod instance the download agent has died twice and the block-verify agent once. The services start up again without any problem. The failures only happened in the prod instance, which sees high loads.

Today I finally found some useful lines in the download agent log. They show the agent failing to fork a child process with "Cannot allocate memory":

<pre>
2010-10-07 18:07:47: FileDownload[6929]: copy job job.1286209863.137 assigned to link T1_FR_CCIN2P3_Buffer -> T2_CH_CSCS with 20 tasks and p=0.281 and W=0.020 and 476 tasks in queue
2010-10-07 18:07:47: FileDownload[6929]: balancing transfers on 1 links
2010-10-07 18:07:47: FileDownload[6929]: backend busy: maximum link pending files for T1_FR_CCIN2P3_Buffer -> T2_CH_CSCS (5) reached
2010-10-07 18:07:47: FileDownload[6929]: link T1_FR_CCIN2P3_Buffer -> T2_CH_CSCS is busy at the moment, not allocating transfers
2010-10-07 18:08:02: FileDownload[6929]: balancing transfers on 1 links
2010-10-07 18:08:02: FileDownload[6929]: backend busy: maximum link pending files for T1_FR_CCIN2P3_Buffer -> T2_CH_CSCS (5) reached
2010-10-07 18:08:02: FileDownload[6929]: link T1_FR_CCIN2P3_Buffer -> T2_CH_CSCS is busy at the moment, not allocating transfers
Use of uninitialized value in hash element at /home/phedex/sw/slc5_amd64_gcc434/cms/PHEDEX/PHEDEX_3_3_1/perl_lib/PHEDEX/Core/JobManager.pm line 135.
Use of uninitialized value in hash element at /home/phedex/sw/slc5_amd64_gcc434/cms/PHEDEX/PHEDEX_3_3_1/perl_lib/PHEDEX/Core/JobManager.pm line 136.
6929: !!! Child process PID:22305 reaped:
6929: !!! Child process PID:22228 reaped:
6929: !!! Child process PID:22186 reaped:
6929: !!! Child process PID:22258 reaped:
6929: !!! Child process PID:22162 reaped:
6929: !!! Child process PID:22138 reaped:
6929: !!! Child process PID:22085 reaped:
6929: !!! Child process PID:22207 reaped:
6929: !!! Child process PID:22325 reaped:
6929: !!! Your program may not be using sig_child() to reap processes.
6929: !!! In extreme cases, your program can force a system reboot
6929: !!! if this resource leakage is not corrected.
couldn't fork: Cannot allocate memory at /home/phedex/sw/slc5_amd64_gcc434/external/p5-poe-component-child/1.39-cmp2/lib/site_perl/5.8.8/POE/Component/Child.pm line 181
</pre>

-- Main.DerekFeichtinger - 2010-10-08
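
To check whether other agent logs contain the same failure signature, something like the following Python sketch could be used. It is only an illustration, not part of PhEDEx: the pattern names and the =scan_log= helper are made up here, and the patterns are copied from the log lines quoted in this entry, so they may need adjusting for other agents or PhEDEx versions.

```python
import re
from collections import Counter

# Signatures copied from the download agent log excerpt in this entry.
# The dictionary keys are hypothetical labels chosen for this sketch.
PATTERNS = {
    "fork_failed": re.compile(r"couldn't fork: Cannot allocate memory"),
    "child_reaped": re.compile(r"!!! Child process PID:(\d+) reaped"),
    "uninitialized": re.compile(r"Use of uninitialized value"),
}


def scan_log(lines):
    """Count how many lines match each failure signature."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# A few lines taken verbatim from the log excerpt, used as sample input.
SAMPLE = [
    "Use of uninitialized value in hash element at "
    "/home/phedex/sw/slc5_amd64_gcc434/cms/PHEDEX/PHEDEX_3_3_1/perl_lib/PHEDEX/Core/JobManager.pm line 135.",
    "6929: !!! Child process PID:22305 reaped:",
    "6929: !!! Child process PID:22228 reaped:",
    "couldn't fork: Cannot allocate memory at "
    "/home/phedex/sw/slc5_amd64_gcc434/external/p5-poe-component-child/1.39-cmp2/lib/site_perl/5.8.8/POE/Component/Child.pm line 181",
]

if __name__ == "__main__":
    for name, count in sorted(scan_log(SAMPLE).items()):
        print(f"{name}: {count}")
```

To scan a real log, pass an open file object to =scan_log= instead of =SAMPLE=; a non-zero =fork_failed= count would indicate that the agent hit the same out-of-memory condition when forking.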