<!-- keep this as a security measure:
#uncomment if the subject should only be modifiable by the listed groups
   * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.CMSAdminGroup
   * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.CMSAdminGroup
#uncomment this if you want the page only be viewable by the listed groups
#   * Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.CMSAdminGroup
-->

%TOC%

%ICON{arrowleft}% Go to [[CMSTier3Log62][previous page]] / [[CMSTier3Log64][next page]] of Tier3 site log %M%

---+ 19. 03. 2014 t3fs05 unresponsive

---++ Symptoms

The =t3fs05= fileserver hosting =/swshare= became unresponsive during the night. As a result, various Nagios checks and nearly all interactive operations failed, because some folders in =/swshare= are in the default =$PATH=.

---++ Solution

The host appeared to be running (the power supply reported live), but connecting through the IPMI console was not possible.

<pre>
-> show /SYS/PS0/PWROK

 /SYS/PS0/PWROK
    Targets:

    Properties:
        type = Power Supply
        class = Discrete Sensor
        value = State Asserted

    Commands:
        cd
        show
</pre>

A power reset through IPMI made the server boot again:

<pre>
# ipmitool -I lanplus -H rmfs05 -U root -f /root/private/ipmi-pw chassis power reset
</pre>

During the boot sequence the serial console printed the following lines:

<pre>
SunOS Release 5.10 Version Generic_141445-09 64-bit
Copyright 1983-2009 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
Hostname: t3fs05.psi.ch
Reading ZFS config: done.
Mounting ZFS filesystems: (8/8)
Mar 19 07:39:47 svc.startd[7]: network/cswsmartd:default failed repeatedly: transitioned to maintenance (see 'svcs -xv' for details)
Mar 19 07:39:47 svc.startd[7]: failed to abandon contract 66: Permission denied

t3fs05.psi.ch console login: Mar 19 07:40:09 t3fs05.psi.ch xntpd[650]: getnetnum: "dmztime1.psi.ch" invalid host number, line ignored
Mar 19 07:40:09 t3fs05.psi.ch xntpd[650]: getnetnum: "dmztime2.psi.ch" invalid host number, line ignored
</pre>

---++ Lessons Learned

   * The hardware of =t3fs05= and =t3fs06= is getting old; we may see more failures.
   * Some Nagios checks (even on other hosts) depend on =t3fs05=, again because of the =$PATH= environment variable.

-- Main.DanielMeister - 2014-03-20
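
The =$PATH= dependency described in the lessons learned could be checked on any host with a small script. The following is only a minimal sketch, not a procedure used at the site: the function name =path_entries_under= is hypothetical, and it assumes that everything below =/swshare= is served by =t3fs05=.

```python
import posixpath


def path_entries_under(path_value, prefix="/swshare"):
    """Return the entries of a colon-separated PATH string that are
    located at or below `prefix` (hypothetical helper, for illustration)."""
    prefix = posixpath.normpath(prefix)
    hits = []
    for entry in path_value.split(":"):
        if not entry:
            continue  # an empty entry means the current directory; skip it
        normalized = posixpath.normpath(entry)
        # Match the prefix itself or a real subdirectory of it, so that
        # a sibling such as /swshareX is not reported by mistake.
        if normalized == prefix or normalized.startswith(prefix + "/"):
            hits.append(entry)
    return hits


if __name__ == "__main__":
    import os

    # Print every PATH entry that would hang if /swshare were unavailable.
    for entry in path_entries_under(os.environ.get("PATH", "")):
        print(entry)
```

Run on a worker or monitoring node, a non-empty output would suggest that shells and checks on that host may stall when the fileserver is down.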
Topic revision: r1 - 2014-03-20 - DanielMeister