<!-- keep this as a security measure:
#uncomment if the subject should only be modifiable by the listed groups
   * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.CMSAdminGroup
   * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.CMSAdminGroup
#uncomment this if you want the page only be viewable by the listed groups
#   * Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.CMSAdminGroup
-->

%TOC%

%ICON{arrowleft}% Go to [[CMSTier3LogXX][previous page]] / [[CMSTier3LogXX][next page]] of Tier3 site log %M%

---+ 10. 03. 2014 Lost Sensor n.60 on each SUN Thor fileserver

For some reason, sensor record 60 (=PCIE0/F20C/PRSNT=, entity presence) went =Critical= today on all five Thor fileservers in parallel. The Nagios check against rmfs07 reports:

<pre>
[root@t3admin01 ~]# /opt/nagios/check_ipmi_sensor -vvv -f /opt/nagios/check_ipmi_sensor.user.pwd.privilege -H rmfs07 -O '--interpret-oem-data --ignore-not-available-sensors --non-abbreviated-units --record-ids=34,35,38,45,48,49,%RED%60%ENDCOLOR%'
------------- begin of debug output (-vvv is set): ------------
script was executed with the following parameters:
/opt/nagios/check_ipmi_sensor -vvv -f /opt/nagios/check_ipmi_sensor.user.pwd.privilege -H rmfs07 -O --interpret-oem-data --ignore-not-available-sensors --non-abbreviated-units --record-ids=34,35,38,45,48,49,60
check_ipmi_sensor version:
3.1 2012-05-24
FreeIPMI version:
ipmi-sensors - 1.3.4
FreeIPMI was executed with the following parameters:
%BLUE%/usr/sbin/ipmi-sensors -h rmfs07 --config-file /opt/nagios/check_ipmi_sensor.user.pwd.privilege --interpret-oem-data --ignore-not-available-sensors --non-abbreviated-units --record-ids=34,35,38,45,48,49,60 --quiet-cache --sdr-cache-recreate --interpret-oem-data --output-sensor-state --ignore-not-available-sensors%ENDCOLOR%
FreeIPMI return code: 0
output of FreeIPMI:
ID | Name | Type | State | Reading | Units | Event
34 | PROC/FRONT_T_AMB | Temperature | Nominal | 33.00 | degrees C | 'OK'
35 | PROC/REAR_T_AMB | Temperature | Nominal | 31.00 | degrees C | 'OK'
38 | P0/T_CORE | Temperature | Nominal | 17.00 | degrees C | 'OK'
45 | P1/T_CORE | Temperature | Nominal | 19.00 | degrees C | 'OK'
48 | IO/REAR_T_AMB | Temperature | Nominal | 54.00 | degrees C | 'OK'
49 | IO/FRONT_T_AMB | Temperature | Nominal | 45.00 | degrees C | 'OK'
60 | PCIE0/F20C/PRSNT | Entity Presence | Critical | N/A | N/A | 'Device Removed/Device Absent'
--------------------- end of debug output ---------------------
IPMI Status: Critical [PCIE0/F20C/PRSNT = Critical ('Device Removed/Device Absent')] | 'PROC/FRONT_T_AMB'=33.00 'PROC/REAR_T_AMB'=31.00 'P0/T_CORE'=17.00 'P1/T_CORE'=19.00 'IO/REAR_T_AMB'=54.00 'IO/FRONT_T_AMB'=45.00
PROC/FRONT_T_AMB = 33.00 (Status: Nominal)
PROC/REAR_T_AMB = 31.00 (Status: Nominal)
P0/T_CORE = 17.00 (Status: Nominal)
P1/T_CORE = 19.00 (Status: Nominal)
IO/REAR_T_AMB = 54.00 (Status: Nominal)
IO/FRONT_T_AMB = 45.00 (Status: Nominal)
%RED%PCIE0/F20C/PRSNT = 'Device Removed/Device Absent' (Status: Critical)%ENDCOLOR%
</pre>

---+ /usr/sbin/ipmi-sensors invocation

Calling =ipmi-sensors= directly against rmfs07 to rmfs11 shows the same =Critical= state for record 60 on every host:

<pre>
[root@t3admin01 ~]# for i in 07 08 09 10 11 ; do echo rmfs$i ; /usr/sbin/ipmi-sensors -h rmfs$i --config-file /opt/nagios/check_ipmi_sensor.user.pwd.privilege --interpret-oem-data --ignore-not-available-sensors --non-abbreviated-units --record-ids=34,35,38,45,48,49,%RED%60%ENDCOLOR% --sdr-cache-recreate --interpret-oem-data --output-sensor-state --ignore-not-available-sensors ; done
rmfs07
Caching SDR repository information: /root/.freeipmi/sdr-cache/sdr-cache-t3admin01.rmfs07
Caching SDR record 396 of 396 (current record ID 396)
ID | Name | Type | State | Reading | Units | Event
34 | PROC/FRONT_T_AMB | Temperature | Nominal | 33.00 | degrees C | 'OK'
35 | PROC/REAR_T_AMB | Temperature | Nominal | 31.00 | degrees C | 'OK'
38 | P0/T_CORE | Temperature | Nominal | 17.00 | degrees C | 'OK'
45 | P1/T_CORE | Temperature | Nominal | 19.00 | degrees C | 'OK'
48 | IO/REAR_T_AMB | Temperature | Nominal | 54.00 | degrees C | 'OK'
49 | IO/FRONT_T_AMB | Temperature | Nominal | 45.00 | degrees C | 'OK'
60 | PCIE0/F20C/PRSNT | Entity Presence | Critical | N/A | N/A | 'Device Removed/Device Absent'
rmfs08
Caching SDR repository information: /root/.freeipmi/sdr-cache/sdr-cache-t3admin01.rmfs08
Caching SDR record 396 of 396 (current record ID 396)
ID | Name | Type | State | Reading | Units | Event
34 | PROC/FRONT_T_AMB | Temperature | Nominal | 33.00 | degrees C | 'OK'
35 | PROC/REAR_T_AMB | Temperature | Nominal | 32.00 | degrees C | 'OK'
38 | P0/T_CORE | Temperature | Nominal | 17.00 | degrees C | 'OK'
45 | P1/T_CORE | Temperature | Nominal | 19.00 | degrees C | 'OK'
48 | IO/REAR_T_AMB | Temperature | Nominal | 54.00 | degrees C | 'OK'
49 | IO/FRONT_T_AMB | Temperature | Nominal | 46.00 | degrees C | 'OK'
60 | PCIE0/F20C/PRSNT | Entity Presence | Critical | N/A | N/A | 'Device Removed/Device Absent'
rmfs09
Caching SDR repository information: /root/.freeipmi/sdr-cache/sdr-cache-t3admin01.rmfs09
Caching SDR record 396 of 396 (current record ID 396)
ID | Name | Type | State | Reading | Units | Event
34 | PROC/FRONT_T_AMB | Temperature | Nominal | 34.00 | degrees C | 'OK'
35 | PROC/REAR_T_AMB | Temperature | Nominal | 33.00 | degrees C | 'OK'
38 | P0/T_CORE | Temperature | Nominal | 17.00 | degrees C | 'OK'
45 | P1/T_CORE | Temperature | Nominal | 23.00 | degrees C | 'OK'
48 | IO/REAR_T_AMB | Temperature | Nominal | 54.00 | degrees C | 'OK'
49 | IO/FRONT_T_AMB | Temperature | Nominal | 47.00 | degrees C | 'OK'
60 | PCIE0/F20C/PRSNT | Entity Presence | Critical | N/A | N/A | 'Device Removed/Device Absent'
rmfs10
Caching SDR repository information: /root/.freeipmi/sdr-cache/sdr-cache-t3admin01.rmfs10
Caching SDR record 396 of 396 (current record ID 396)
ID | Name | Type | State | Reading | Units | Event
34 | PROC/FRONT_T_AMB | Temperature | Nominal | 36.00 | degrees C | 'OK'
35 | PROC/REAR_T_AMB | Temperature | Nominal | 37.00 | degrees C | 'OK'
38 | P0/T_CORE | Temperature | Nominal | 21.00 | degrees C | 'OK'
45 | P1/T_CORE | Temperature | Nominal | 21.00 | degrees C | 'OK'
48 | IO/REAR_T_AMB | Temperature | Nominal | 58.00 | degrees C | 'OK'
49 | IO/FRONT_T_AMB | Temperature | Nominal | 49.00 | degrees C | 'OK'
60 | PCIE0/F20C/PRSNT | Entity Presence | Critical | N/A | N/A | 'Device Removed/Device Absent'
rmfs11
Caching SDR repository information: /root/.freeipmi/sdr-cache/sdr-cache-t3admin01.rmfs11
Caching SDR record 396 of 396 (current record ID 396)
ID | Name | Type | State | Reading | Units | Event
34 | PROC/FRONT_T_AMB | Temperature | Nominal | 33.00 | degrees C | 'OK'
35 | PROC/REAR_T_AMB | Temperature | Nominal | 33.00 | degrees C | 'OK'
38 | P0/T_CORE | Temperature | Nominal | 18.00 | degrees C | 'OK'
45 | P1/T_CORE | Temperature | Nominal | 19.00 | degrees C | 'OK'
48 | IO/REAR_T_AMB | Temperature | Nominal | 55.00 | degrees C | 'OK'
49 | IO/FRONT_T_AMB | Temperature | Nominal | 48.00 | degrees C | 'OK'
60 | PCIE0/F20C/PRSNT | Entity Presence | Critical | N/A | N/A | 'Device Removed/Device Absent'
[root@t3admin01 ~]#
</pre>

-- Main.FabioMartinelli - 2014-03-10

----------------

%ICON{arrowleft}% Go to [[CMSTier3LogXX][previous page]] / [[CMSTier3LogXX][next page]] of Tier3 site log %M%
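The per-host loop above prints every requested record, so the one failing sensor is easy to miss among the nominal temperature rows. The following is only a sketch of a possible filter, not something used on the cluster: the function name =non_nominal_sensors= is hypothetical, and it assumes the pipe-separated table layout shown in the output above (record ID in the first column, state in the fourth).

```shell
# Hypothetical helper: print only the sensor rows whose State column is
# not "Nominal". Reads ipmi-sensors table output on stdin.
non_nominal_sensors() {
    awk -F'|' '
        # Data rows start with a numeric record ID; this skips the header
        # row and the "Caching SDR ..." progress lines.
        $1 ~ /^[ \t]*[0-9]+[ \t]*$/ {
            state = $4
            gsub(/^[ \t]+|[ \t]+$/, "", state)   # trim column padding
            if (state != "Nominal") print
        }
    '
}
```

To use it, append =| non_nominal_sensors= to the =ipmi-sensors= command inside the =for= loop; each host name is then followed only by the rows that need attention, or by nothing if all requested sensors are nominal.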
Topic revision: r1 - 2014-03-10 - FabioMartinelli
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.