<!-- keep this as a security measure:
#uncomment if the subject should only be modifiable by the listed groups
   * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.CMSAdminGroup
   * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.CMSAdminGroup
#uncomment this if you want the page only be viewable by the listed groups
#   * Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.CMSAdminGroup
-->

%TOC%

%ICON{arrowleft}% Go to [[CMSTier3LogXX][previous page]] / [[CMSTier3LogXX][next page]] of Tier3 site log %M%

---+ 10. 03. 2014 Lost Sensor n.60 on each SUN Thor fileserver

Today the same error appeared at the same time on all the servers =t3fs[07-11]=. After 1h of investigation I could reproduce the error, but I could not understand its cause. <br />
I'll disable the Nagios check of sensor n. 60 on the servers =t3fs[07-11]=. According to the output below, sensor n. 60 is =PCIE0/F20C/PRSNT=, an Entity Presence sensor rather than a temperature sensor, and it reports 'Device Removed/Device Absent'.

---++ /opt/nagios/check_ipmi_sensor Nagios invocation

<pre>
[root@t3admin01 ~]# /opt/nagios/check_ipmi_sensor -vvv -f /opt/nagios/check_ipmi_sensor.user.pwd.privilege -H rmfs07 -O '--interpret-oem-data --ignore-not-available-sensors --non-abbreviated-units --record-ids=34,35,38,45,48,49,%RED%60%ENDCOLOR%'
------------- begin of debug output (-vvv is set): ------------
script was executed with the following parameters:
/opt/nagios/check_ipmi_sensor -vvv -f /opt/nagios/check_ipmi_sensor.user.pwd.privilege -H rmfs07 -O --interpret-oem-data --ignore-not-available-sensors --non-abbreviated-units --record-ids=34,35,38,45,48,49,60
check_ipmi_sensor version:
3.1 2012-05-24
FreeIPMI version:
ipmi-sensors - 1.3.4
FreeIPMI was executed with the following parameters:
%BLUE%/usr/sbin/ipmi-sensors -h rmfs07 --config-file /opt/nagios/check_ipmi_sensor.user.pwd.privilege --interpret-oem-data --ignore-not-available-sensors --non-abbreviated-units --record-ids=34,35,38,45,48,49,60 --quiet-cache --sdr-cache-recreate --interpret-oem-data --output-sensor-state --ignore-not-available-sensors%ENDCOLOR%
FreeIPMI return code: 0
output of FreeIPMI:
ID | Name | Type | State | Reading | Units | Event
34 | PROC/FRONT_T_AMB | Temperature | Nominal | 33.00 | degrees C | 'OK'
35 | PROC/REAR_T_AMB | Temperature | Nominal | 31.00 | degrees C | 'OK'
38 | P0/T_CORE | Temperature | Nominal | 17.00 | degrees C | 'OK'
45 | P1/T_CORE | Temperature | Nominal | 19.00 | degrees C | 'OK'
48 | IO/REAR_T_AMB | Temperature | Nominal | 54.00 | degrees C | 'OK'
49 | IO/FRONT_T_AMB | Temperature | Nominal | 45.00 | degrees C | 'OK'
%RED%60 | PCIE0/F20C/PRSNT | Entity Presence | Critical | N/A | N/A | 'Device Removed/Device Absent'%ENDCOLOR%
--------------------- end of debug output ---------------------
IPMI Status: Critical [PCIE0/F20C/PRSNT = Critical ('Device Removed/Device Absent')] | 'PROC/FRONT_T_AMB'=33.00 'PROC/REAR_T_AMB'=31.00 'P0/T_CORE'=17.00 'P1/T_CORE'=19.00 'IO/REAR_T_AMB'=54.00 'IO/FRONT_T_AMB'=45.00
PROC/FRONT_T_AMB = 33.00 (Status: Nominal)
PROC/REAR_T_AMB = 31.00 (Status: Nominal)
P0/T_CORE = 17.00 (Status: Nominal)
P1/T_CORE = 19.00 (Status: Nominal)
IO/REAR_T_AMB = 54.00 (Status: Nominal)
IO/FRONT_T_AMB = 45.00 (Status: Nominal)
%RED%PCIE0/F20C/PRSNT = 'Device Removed/Device Absent' (Status: Critical)%ENDCOLOR%
</pre>

---++ /usr/sbin/ipmi-sensors direct invocation

<pre>
[root@t3admin01 ~]# for i in 07 08 09 10 11 ; do echo rmfs$i ; /usr/sbin/ipmi-sensors -h rmfs$i --config-file /opt/nagios/check_ipmi_sensor.user.pwd.privilege --interpret-oem-data --ignore-not-available-sensors --non-abbreviated-units --record-ids=34,35,38,45,48,49,%RED%60%ENDCOLOR% --sdr-cache-recreate --interpret-oem-data --output-sensor-state --ignore-not-available-sensors ; done
%BLUE%rmfs07%ENDCOLOR%
Caching SDR repository information: /root/.freeipmi/sdr-cache/sdr-cache-t3admin01.rmfs07
Caching SDR record 396 of 396 (current record ID 396)
ID | Name | Type | State | Reading | Units | Event
34 | PROC/FRONT_T_AMB | Temperature | Nominal | 33.00 | degrees C | 'OK'
35 | PROC/REAR_T_AMB | Temperature | Nominal | 31.00 | degrees C | 'OK'
38 | P0/T_CORE | Temperature | Nominal | 17.00 | degrees C | 'OK'
45 | P1/T_CORE | Temperature | Nominal | 19.00 | degrees C | 'OK'
48 | IO/REAR_T_AMB | Temperature | Nominal | 54.00 | degrees C | 'OK'
49 | IO/FRONT_T_AMB | Temperature | Nominal | 45.00 | degrees C | 'OK'
%RED%60 | PCIE0/F20C/PRSNT | Entity Presence | Critical | N/A | N/A | 'Device Removed/Device Absent'%ENDCOLOR%
%BLUE%rmfs08%ENDCOLOR%
Caching SDR repository information: /root/.freeipmi/sdr-cache/sdr-cache-t3admin01.rmfs08
Caching SDR record 396 of 396 (current record ID 396)
ID | Name | Type | State | Reading | Units | Event
34 | PROC/FRONT_T_AMB | Temperature | Nominal | 33.00 | degrees C | 'OK'
35 | PROC/REAR_T_AMB | Temperature | Nominal | 32.00 | degrees C | 'OK'
38 | P0/T_CORE | Temperature | Nominal | 17.00 | degrees C | 'OK'
45 | P1/T_CORE | Temperature | Nominal | 19.00 | degrees C | 'OK'
48 | IO/REAR_T_AMB | Temperature | Nominal | 54.00 | degrees C | 'OK'
49 | IO/FRONT_T_AMB | Temperature | Nominal | 46.00 | degrees C | 'OK'
%RED%60 | PCIE0/F20C/PRSNT | Entity Presence | Critical | N/A | N/A | 'Device Removed/Device Absent'%ENDCOLOR%
%BLUE%rmfs09%ENDCOLOR%
Caching SDR repository information: /root/.freeipmi/sdr-cache/sdr-cache-t3admin01.rmfs09
Caching SDR record 396 of 396 (current record ID 396)
ID | Name | Type | State | Reading | Units | Event
34 | PROC/FRONT_T_AMB | Temperature | Nominal | 34.00 | degrees C | 'OK'
35 | PROC/REAR_T_AMB | Temperature | Nominal | 33.00 | degrees C | 'OK'
38 | P0/T_CORE | Temperature | Nominal | 17.00 | degrees C | 'OK'
45 | P1/T_CORE | Temperature | Nominal | 23.00 | degrees C | 'OK'
48 | IO/REAR_T_AMB | Temperature | Nominal | 54.00 | degrees C | 'OK'
49 | IO/FRONT_T_AMB | Temperature | Nominal | 47.00 | degrees C | 'OK'
%RED%60 | PCIE0/F20C/PRSNT | Entity Presence | Critical | N/A | N/A | 'Device Removed/Device Absent'%ENDCOLOR%
%BLUE%rmfs10%ENDCOLOR%
Caching SDR repository information: /root/.freeipmi/sdr-cache/sdr-cache-t3admin01.rmfs10
Caching SDR record 396 of 396 (current record ID 396)
ID | Name | Type | State | Reading | Units | Event
34 | PROC/FRONT_T_AMB | Temperature | Nominal | 36.00 | degrees C | 'OK'
35 | PROC/REAR_T_AMB | Temperature | Nominal | 37.00 | degrees C | 'OK'
38 | P0/T_CORE | Temperature | Nominal | 21.00 | degrees C | 'OK'
45 | P1/T_CORE | Temperature | Nominal | 21.00 | degrees C | 'OK'
48 | IO/REAR_T_AMB | Temperature | Nominal | 58.00 | degrees C | 'OK'
49 | IO/FRONT_T_AMB | Temperature | Nominal | 49.00 | degrees C | 'OK'
%RED%60 | PCIE0/F20C/PRSNT | Entity Presence | Critical | N/A | N/A | 'Device Removed/Device Absent'%ENDCOLOR%
%BLUE%rmfs11%ENDCOLOR%
Caching SDR repository information: /root/.freeipmi/sdr-cache/sdr-cache-t3admin01.rmfs11
Caching SDR record 396 of 396 (current record ID 396)
ID | Name | Type | State | Reading | Units | Event
34 | PROC/FRONT_T_AMB | Temperature | Nominal | 33.00 | degrees C | 'OK'
35 | PROC/REAR_T_AMB | Temperature | Nominal | 33.00 | degrees C | 'OK'
38 | P0/T_CORE | Temperature | Nominal | 18.00 | degrees C | 'OK'
45 | P1/T_CORE | Temperature | Nominal | 19.00 | degrees C | 'OK'
48 | IO/REAR_T_AMB | Temperature | Nominal | 55.00 | degrees C | 'OK'
49 | IO/FRONT_T_AMB | Temperature | Nominal | 48.00 | degrees C | 'OK'
%RED%60 | PCIE0/F20C/PRSNT | Entity Presence | Critical | N/A | N/A | 'Device Removed/Device Absent'%ENDCOLOR%
[root@t3admin01 ~]#
</pre>

----------------

%ICON{arrowleft}% Go to [[CMSTier3LogXX][previous page]] / [[CMSTier3LogXX][next page]] of Tier3 site log %M%
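
To spot which records are in a non-Nominal state without reading each table by eye, the =ipmi-sensors= output shown in this entry could be filtered with a small helper. The following is only a sketch: the function name =non_nominal_sensors= is hypothetical (it is not part of FreeIPMI or of check_ipmi_sensor), and it assumes the pipe-separated table layout seen above, with the State in the fourth column.

```shell
#!/bin/sh
# Hypothetical helper (not part of FreeIPMI): reads an ipmi-sensors table
# on stdin and prints "ID Name State" for every row whose State column
# is not "Nominal". Assumes the 7-column, pipe-separated layout above.
non_nominal_sensors() {
    awk -F'|' '
        NF >= 7 {
            # Trim blanks around the first four columns (ID, Name, Type, State).
            for (i = 1; i <= 4; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i)
            # Skip the header row and any line whose ID is not numeric.
            if ($1 !~ /^[0-9]+$/) next
            if ($4 != "Nominal") print $1, $2, $4
        }'
}

# Possible usage, reusing the command line from this entry:
#   /usr/sbin/ipmi-sensors -h rmfs07 \
#       --config-file /opt/nagios/check_ipmi_sensor.user.pwd.privilege \
#       --record-ids=34,35,38,45,48,49,60 --output-sensor-state \
#       | non_nominal_sensors
```

The "Caching SDR ..." progress lines contain no pipe characters, so the filter ignores them. Disabling the check of record 60 would presumably amount to passing =--record-ids=34,35,38,45,48,49= in the Nagios command definition, but where that definition lives on =t3admin01= is not shown in this entry.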
Topic revision: r2 - 2014-03-10 - FabioMartinelli