Summary: warning LEDs on machines are on


At what times did this problem occur (used to estimate frequency):


t3fs05 fileserver (SUN X4500)

An earlier occurrence is mentioned as a side issue in IssueDcachePoolHangs.

On 2010-01-06 the rear LED (on the front panel) was lit yellow and the warning LED was blinking. Listing the LED status via IPMI showed no problem:

ipmitool -I lanplus -H rmfs05 -U root -f /root/private/ipmi-pw sdr list generic

bp.alert.led     | Generic @20:2C.2  | ok
bp.locate.led    | Generic @20:2C.1  | ok
bp.power.led     | Generic @20:2C.0  | ok
fp.alert.led     | Generic @20:18.2  | ok
fp.locate.led    | Generic @20:18.1  | ok
fp.power.led     | Generic @20:18.0  | ok
sys.rear_svc.led | Generic @20:18.3  | ok
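
For scripted checks, a listing like the one above can be parsed into name/status pairs. The following is a minimal sketch, assuming the three-column `name | Generic @address | status` layout shown here; the function names are hypothetical. Note that in this incident every entry read "ok" while the physical LEDs were on, so such a check alone would not have detected the problem.

```python
# Sample text copied from the ipmitool output above.
SAMPLE = """\
bp.alert.led     | Generic @20:2C.2  | ok
bp.locate.led    | Generic @20:2C.1  | ok
bp.power.led     | Generic @20:2C.0  | ok
fp.alert.led     | Generic @20:18.2  | ok
fp.locate.led    | Generic @20:18.1  | ok
fp.power.led     | Generic @20:18.0  | ok
sys.rear_svc.led | Generic @20:18.3  | ok
"""


def parse_sdr_generic(text):
    """Return (name, status) pairs from 'name | Generic @addr | status' lines."""
    entries = []
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        # Skip blank lines and anything that is not a three-column record.
        if len(fields) != 3 or not fields[0]:
            continue
        entries.append((fields[0], fields[2]))
    return entries


def not_ok(entries):
    """Return the names of all entries whose status is not 'ok'."""
    return [name for name, status in entries if status.lower() != "ok"]


entries = parse_sdr_generic(SAMPLE)
print(not_ok(entries))
```
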

Using the web front end, I see differing information.

The event logs show:

2401 | 12/11/2009 | 13:08:49 | System Firmware Progress | System boot initiated | Asserted
2501 | 12/31/2009 | 14:42:51 | Voltage io.v_-12v | Lower Non-critical going low  | Reading -13.08 < Threshold -13.01 Volts
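
Since the LED status reported via IPMI was not informative, threshold events in the event log are another thing to look at. The following is a minimal sketch, assuming the pipe-separated six-column layout of the two entries above and a detail field of the form `Reading X < Threshold Y Unit`; the function names and the regular expression are hypothetical and not part of any tool. Whether the -12 V event is related to the LED state is not established here.

```python
import re

# Sample text copied from the event log entries above.
SAMPLE = """\
2401 | 12/11/2009 | 13:08:49 | System Firmware Progress | System boot initiated | Asserted
2501 | 12/31/2009 | 14:42:51 | Voltage io.v_-12v | Lower Non-critical going low  | Reading -13.08 < Threshold -13.01 Volts
"""

# Matches e.g. "Reading -13.08 < Threshold -13.01 Volts" (assumed format).
READING_RE = re.compile(
    r"Reading\s+(-?\d+(?:\.\d+)?)\s*[<>]\s*Threshold\s+(-?\d+(?:\.\d+)?)\s+(\w+)"
)


def parse_sel_line(line):
    """Parse one event log line into a dict, or return None if it does not fit."""
    fields = [f.strip() for f in line.split("|")]
    if len(fields) < 6:
        return None
    record = {
        "id": fields[0],
        "date": fields[1],
        "time": fields[2],
        "sensor": fields[3],
        "event": fields[4],
        "detail": fields[5],
    }
    match = READING_RE.search(fields[5])
    if match:
        record["reading"] = float(match.group(1))
        record["threshold"] = float(match.group(2))
        record["unit"] = match.group(3)
    return record


def threshold_events(text):
    """Return only the records that carry a reading/threshold pair."""
    records = (parse_sel_line(line) for line in text.splitlines())
    return [r for r in records if r and "reading" in r]


events = threshold_events(SAMPLE)
for e in events:
    print(e["date"], e["time"], e["sensor"], e["reading"], e["threshold"], e["unit"])
```
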

-> version
SP firmware 1.1.8
SP firmware build number: 19341
SP firmware date: Fri May 25 14:31:22 PDT 2007
SP filesystem version: 0.1.14

This ILOM still allows login through the sunservice account. Inspecting the embedded Linux did not reveal anything like the problem seen on the X4150s with old ILOM firmware, where a process on the embedded Linux had a memory leak that caused the kernel to kill other processes, leading to various problems.

One sad fact: the IPMI information on the LEDs does not match the actual state of the LEDs.

Solution or Workaround

If there does not seem to be a real error condition and the ILOM seems to be at fault, then a reset of the ILOM service processor will bring the LEDs to a sane state:

reset /SP

Monitoring for this condition

Since the IPMI output does not reflect this condition, the easiest check is to look directly at the machines in the compute center. Using the web front end is too inefficient.

-- DerekFeichtinger - 2010-01-06

-- DerekFeichtinger - 29 Aug 2008

Affected Service: various
Symptom summary: warning LEDs on machines are on
Reason Understood: yes
Solution Exists: workaround
Obsolete: no
Topic revision: r1 - 2010-01-06 - DerekFeichtinger
