<!-- keep this as a security measure:
#uncomment if the subject should only be modifiable by the listed groups
   * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.CMSAdminGroup
   * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.CMSAdminGroup
#uncomment this if you want the page only be viewable by the listed groups
#   * Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.CMSAdminGroup
-->

%TOC%

%ICON{arrowleft}% Go to [[CMSTier3LogXX][previous page]] / [[CMSTier3LogXX][next page]] of Tier3 site log %M%

---+ 10. 03. 2014 Lost Sensor n.60 on each SUN Thor fileserver

Today the same error appeared in parallel on every SUN Thor fileserver. After 1h of investigation I could reproduce the error but could not determine its cause. <br />I'll disable the Nagios check of sensor n. 60 on the servers =t3fs[07-11]=. Note that FreeIPMI reports record 60 as =PCIE0/F20C/PRSNT=, an Entity Presence sensor rather than a temperature sensor.

---++ /opt/nagios/check_ipmi_sensor Nagios invocation

<pre>
[root@t3admin01 ~]# /opt/nagios/check_ipmi_sensor -vvv -f /opt/nagios/check_ipmi_sensor.user.pwd.privilege -H rmfs07 -O '--interpret-oem-data --ignore-not-available-sensors --non-abbreviated-units --record-ids=34,35,38,45,48,49,%RED%60%ENDCOLOR%'
------------- begin of debug output (-vvv is set): ------------
  script was executed with the following parameters:
    /opt/nagios/check_ipmi_sensor -vvv -f /opt/nagios/check_ipmi_sensor.user.pwd.privilege -H rmfs07 -O --interpret-oem-data --ignore-not-available-sensors --non-abbreviated-units --record-ids=34,35,38,45,48,49,60
  check_ipmi_sensor version:
    3.1 2012-05-24
  FreeIPMI version:
    ipmi-sensors - 1.3.4
  FreeIPMI was executed with the following parameters:
    %BLUE%/usr/sbin/ipmi-sensors -h rmfs07 --config-file /opt/nagios/check_ipmi_sensor.user.pwd.privilege --interpret-oem-data --ignore-not-available-sensors --non-abbreviated-units --record-ids=34,35,38,45,48,49,60 --quiet-cache --sdr-cache-recreate --interpret-oem-data --output-sensor-state --ignore-not-available-sensors%ENDCOLOR%
  FreeIPMI return code: 0
  output of FreeIPMI:
ID | Name | Type | State | Reading | Units | Event
34 | PROC/FRONT_T_AMB | Temperature | Nominal | 33.00 | degrees C | 'OK'
35 | PROC/REAR_T_AMB | Temperature | Nominal | 31.00 | degrees C | 'OK'
38 | P0/T_CORE | Temperature | Nominal | 17.00 | degrees C | 'OK'
45 | P1/T_CORE | Temperature | Nominal | 19.00 | degrees C | 'OK'
48 | IO/REAR_T_AMB | Temperature | Nominal | 54.00 | degrees C | 'OK'
49 | IO/FRONT_T_AMB | Temperature | Nominal | 45.00 | degrees C | 'OK'
%RED%60 | PCIE0/F20C/PRSNT | Entity Presence | Critical | N/A | N/A | 'Device Removed/Device Absent'%ENDCOLOR%
--------------------- end of debug output ---------------------
IPMI Status: Critical [PCIE0/F20C/PRSNT = Critical ('Device Removed/Device Absent')] | 'PROC/FRONT_T_AMB'=33.00 'PROC/REAR_T_AMB'=31.00 'P0/T_CORE'=17.00 'P1/T_CORE'=19.00 'IO/REAR_T_AMB'=54.00 'IO/FRONT_T_AMB'=45.00
PROC/FRONT_T_AMB = 33.00 (Status: Nominal)
PROC/REAR_T_AMB = 31.00 (Status: Nominal)
P0/T_CORE = 17.00 (Status: Nominal)
P1/T_CORE = 19.00 (Status: Nominal)
IO/REAR_T_AMB = 54.00 (Status: Nominal)
IO/FRONT_T_AMB = 45.00 (Status: Nominal)
%RED%PCIE0/F20C/PRSNT = 'Device Removed/Device Absent' (Status: Critical)%ENDCOLOR%
</pre>

---++ /usr/sbin/ipmi-sensors direct invocation

<pre>
[root@t3admin01 ~]# for i in 07 08 09 10 11 ; do echo rmfs$i ; /usr/sbin/ipmi-sensors -h rmfs$i --config-file /opt/nagios/check_ipmi_sensor.user.pwd.privilege --interpret-oem-data --ignore-not-available-sensors --non-abbreviated-units --record-ids=34,35,38,45,48,49,%RED%60%ENDCOLOR% --sdr-cache-recreate --interpret-oem-data --output-sensor-state --ignore-not-available-sensors ; done
%BLUE%rmfs07%ENDCOLOR%
Caching SDR repository information: /root/.freeipmi/sdr-cache/sdr-cache-t3admin01.rmfs07
Caching SDR record 396 of 396 (current record ID 396)
ID | Name | Type | State | Reading | Units | Event
34 | PROC/FRONT_T_AMB | Temperature | Nominal | 33.00 | degrees C | 'OK'
35 | PROC/REAR_T_AMB | Temperature | Nominal | 31.00 | degrees C | 'OK'
38 | P0/T_CORE | Temperature | Nominal | 17.00 | degrees C | 'OK'
45 | P1/T_CORE | Temperature | Nominal | 19.00 | degrees C | 'OK'
48 | IO/REAR_T_AMB | Temperature | Nominal | 54.00 | degrees C | 'OK'
49 | IO/FRONT_T_AMB | Temperature | Nominal | 45.00 | degrees C | 'OK'
%RED%60 | PCIE0/F20C/PRSNT | Entity Presence | Critical | N/A | N/A | 'Device Removed/Device Absent'%ENDCOLOR%
%BLUE%rmfs08%ENDCOLOR%
Caching SDR repository information: /root/.freeipmi/sdr-cache/sdr-cache-t3admin01.rmfs08
Caching SDR record 396 of 396 (current record ID 396)
ID | Name | Type | State | Reading | Units | Event
34 | PROC/FRONT_T_AMB | Temperature | Nominal | 33.00 | degrees C | 'OK'
35 | PROC/REAR_T_AMB | Temperature | Nominal | 32.00 | degrees C | 'OK'
38 | P0/T_CORE | Temperature | Nominal | 17.00 | degrees C | 'OK'
45 | P1/T_CORE | Temperature | Nominal | 19.00 | degrees C | 'OK'
48 | IO/REAR_T_AMB | Temperature | Nominal | 54.00 | degrees C | 'OK'
49 | IO/FRONT_T_AMB | Temperature | Nominal | 46.00 | degrees C | 'OK'
%RED%60 | PCIE0/F20C/PRSNT | Entity Presence | Critical | N/A | N/A | 'Device Removed/Device Absent'%ENDCOLOR%
%BLUE%rmfs09%ENDCOLOR%
Caching SDR repository information: /root/.freeipmi/sdr-cache/sdr-cache-t3admin01.rmfs09
Caching SDR record 396 of 396 (current record ID 396)
ID | Name | Type | State | Reading | Units | Event
34 | PROC/FRONT_T_AMB | Temperature | Nominal | 34.00 | degrees C | 'OK'
35 | PROC/REAR_T_AMB | Temperature | Nominal | 33.00 | degrees C | 'OK'
38 | P0/T_CORE | Temperature | Nominal | 17.00 | degrees C | 'OK'
45 | P1/T_CORE | Temperature | Nominal | 23.00 | degrees C | 'OK'
48 | IO/REAR_T_AMB | Temperature | Nominal | 54.00 | degrees C | 'OK'
49 | IO/FRONT_T_AMB | Temperature | Nominal | 47.00 | degrees C | 'OK'
%RED%60 | PCIE0/F20C/PRSNT | Entity Presence | Critical | N/A | N/A | 'Device Removed/Device Absent'%ENDCOLOR%
%BLUE%rmfs10%ENDCOLOR%
Caching SDR repository information: /root/.freeipmi/sdr-cache/sdr-cache-t3admin01.rmfs10
Caching SDR record 396 of 396 (current record ID 396)
ID | Name | Type | State | Reading | Units | Event
34 | PROC/FRONT_T_AMB | Temperature | Nominal | 36.00 | degrees C | 'OK'
35 | PROC/REAR_T_AMB | Temperature | Nominal | 37.00 | degrees C | 'OK'
38 | P0/T_CORE | Temperature | Nominal | 21.00 | degrees C | 'OK'
45 | P1/T_CORE | Temperature | Nominal | 21.00 | degrees C | 'OK'
48 | IO/REAR_T_AMB | Temperature | Nominal | 58.00 | degrees C | 'OK'
49 | IO/FRONT_T_AMB | Temperature | Nominal | 49.00 | degrees C | 'OK'
%RED%60 | PCIE0/F20C/PRSNT | Entity Presence | Critical | N/A | N/A | 'Device Removed/Device Absent'%ENDCOLOR%
%BLUE%rmfs11%ENDCOLOR%
Caching SDR repository information: /root/.freeipmi/sdr-cache/sdr-cache-t3admin01.rmfs11
Caching SDR record 396 of 396 (current record ID 396)
ID | Name | Type | State | Reading | Units | Event
34 | PROC/FRONT_T_AMB | Temperature | Nominal | 33.00 | degrees C | 'OK'
35 | PROC/REAR_T_AMB | Temperature | Nominal | 33.00 | degrees C | 'OK'
38 | P0/T_CORE | Temperature | Nominal | 18.00 | degrees C | 'OK'
45 | P1/T_CORE | Temperature | Nominal | 19.00 | degrees C | 'OK'
48 | IO/REAR_T_AMB | Temperature | Nominal | 55.00 | degrees C | 'OK'
49 | IO/FRONT_T_AMB | Temperature | Nominal | 48.00 | degrees C | 'OK'
%RED%60 | PCIE0/F20C/PRSNT | Entity Presence | Critical | N/A | N/A | 'Device Removed/Device Absent'%ENDCOLOR%
[root@t3admin01 ~]#
</pre>

----------------

%ICON{arrowleft}% Go to [[CMSTier3LogXX][previous page]] / [[CMSTier3LogXX][next page]] of Tier3 site log %M%
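
When checking several hosts at once, it can help to print only the records that are not in the Nominal state. The following is a minimal sketch, assuming the pipe-separated =ipmi-sensors= table layout shown in this entry (ID, Name, Type, State, Reading, Units, Event). The function name =non_nominal_sensors= is hypothetical and is not part of FreeIPMI or of the Nagios plugin.

```shell
#!/bin/sh
# Hypothetical helper: read ipmi-sensors table output on stdin and print
# "ID NAME STATE" for every sensor record whose State is not "Nominal".
# Assumes the column order seen in this log entry:
#   ID | Name | Type | State | Reading | Units | Event
non_nominal_sensors() {
    awk -F'|' '
        # Strip leading and trailing blanks from one field.
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        # Only rows with a numeric record ID are sensor records; this skips
        # the header row and the "Caching SDR ..." progress lines.
        NF >= 7 && trim($1) ~ /^[0-9]+$/ {
            state = trim($4)
            if (state != "Nominal") print trim($1), trim($2), state
        }'
}

# Example usage with the invocation from this entry (not executed here):
#   /usr/sbin/ipmi-sensors -h rmfs07 \
#       --config-file /opt/nagios/check_ipmi_sensor.user.pwd.privilege \
#       --record-ids=34,35,38,45,48,49,60 --output-sensor-state \
#     | non_nominal_sensors
```

Applied to the rmfs07 output above, this would list only record 60 (=PCIE0/F20C/PRSNT=), which makes it quick to confirm that the same single record is Critical on every host.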
Topic revision: r2 - 2014-03-10 - FabioMartinelli