<!-- keep this as a security measure:
   * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.LCGAdminGroup
   * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.LCGAdminGroup
#uncomment this if you want the page only be viewable by the internal people
#* Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.LCGAdminGroup
-->

KeyWords: HostCe01, [[Maui]]

---+ MAUI blocked for 15 minutes from 11:51 to 12:06

MAUI blocked for about 15 minutes; by the time I noticed and started to investigate, it had resumed spontaneously and was scheduling jobs again.

I suspected that this was another instance of the issue described in IssueMauiBlocks and in [[https://twiki.cscs.ch/twiki/bin/view/LCGTier2/OldPhoenixBlog#Upgrade_to_gLite_3_1_update_27_D an Old Phoenix Blog post]]. However, all the suggested workarounds were in place: =nscd= was running fine, and the suggested timeout in the PBS config was still there.

The node =ce01= is overloaded due to some intense gridftp/grid-job-monitor activity; this suggests that 15-minute pauses are MAUI's "physiological" response to timeouts, and that we probably should not care much.

We could, however, schedule Nagios checks to verify that the PBS and MAUI logs are being updated: if they are not modified every minute or so, something is probably going wrong.

Attached is a quick Perl script to find these 15-minute pauses in MAUI logs.

-- Main.RiccardoMurri - 25 Feb 2009

---++ Readers' comments

%COMMENT{type="below"}%
---++ Topic attachments

| *I* | *Attachment* | *History* | *Action* | *Size* | *Date* | *Who* | *Comment* |
| txt | find_15min_pauses_in_maui_log.pl.txt | r1 | manage | 0.4 K | 2009-02-25 - 12:41 | RiccardoMurri | Script to find "gaps" in MAUI logs. |
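
The attached script is not reproduced on this page. As a rough sketch of the same idea (not the attached Perl code), the Python below scans a log for consecutive timestamps that are further apart than a threshold, and adds a small helper for the log-freshness check suggested in the post. It assumes that each MAUI log line starts with a =MM/DD HH:MM:SS= timestamp; the names =find_gaps= and =is_stale=, the 10-minute threshold, and the 120-second freshness limit are all illustrative assumptions, not values taken from MAUI, PBS, or Nagios.

```python
import os
import re
import time
from datetime import datetime, timedelta

# ASSUMPTION: log lines begin with "MM/DD HH:MM:SS"; adjust if your logs differ.
STAMP = re.compile(r"^(\d{2}/\d{2} \d{2}:\d{2}:\d{2})")


def find_gaps(lines, threshold=timedelta(minutes=10), year=2000):
    """Yield (previous, current) datetimes that are more than `threshold` apart.

    The assumed timestamps carry no year, so a placeholder `year` is used
    (2000 is a leap year, so 02/29 parses). Gaps that span a year boundary
    are not detected by this sketch.
    """
    previous = None
    for line in lines:
        match = STAMP.match(line)
        if not match:
            continue  # skip lines without a leading timestamp
        current = datetime.strptime(
            f"{year}/{match.group(1)}", "%Y/%m/%d %H:%M:%S"
        )
        if previous is not None and current - previous > threshold:
            yield previous, current
        previous = current


def is_stale(path, max_age_seconds=120, now=None):
    """Return True if `path` was last modified more than `max_age_seconds` ago."""
    now = time.time() if now is None else now
    return now - os.path.getmtime(path) > max_age_seconds
```

Passing an open log file to =find_gaps= yields one pair of timestamps per pause. A Nagios plugin built around =is_stale= would report through the standard plugin exit codes (0 for OK, 2 for CRITICAL); the log paths to watch depend on the local installation.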
Topic revision: r2 - 2009-02-27 - RiccardoMurri
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.