<!-- keep this as a security measure:
#uncomment if the subject should only be modifiable by the listed groups
#   * Set ALLOWTOPICCHANGE = Main.TWikiAdminGroup,Main.CMSAdminGroup
#   * Set ALLOWTOPICRENAME = Main.TWikiAdminGroup,Main.CMSAdminGroup
#uncomment this if you want the page to be viewable only by the listed groups
#   * Set ALLOWTOPICVIEW = Main.TWikiAdminGroup,Main.CMSAdminGroup
-->

---+!! %TOPIC%
%TOC%

---++ Symptoms
Summary: %FORMFIELD{"Symptom summary"}%

---++ Occurrences
At what times did this problem occur (used to estimate frequency):
<pre>
[root@t3service01 ~]# grep Detected /var/log/remote-archive/2012/ -R | grep --color Hardware
</pre>

---++ Observations
<!--
#collect here the information which may help to better understand the state of the system or services, e.g.
#log excerpts, strace output, etc.
#this also may help to identify the problem if similar conditions arise again
-->
<pre>
Since we changed the network switches, most of the t3ui* nodes seem to report this error:

*******************
e1000e 0000:04:00.0: eth0: Detected Hardware Unit Hang:
  TDH                  <59>
  TDT                  <5a>
  next_to_use          <5a>
  next_to_clean        <58>
buffer_info[next_to_clean]:
  time_stamp           <1feb94e36>
  next_to_watch        <59>
  jiffies              <1feb952e4>
  next_to_watch.status <0>
MAC Status             <2080783>
PHY Status             <792d>
PHY 1000BASE-T Status  <7800>
PHY Extended Status    <3000>
PCI Status             <10>
*******************

I'm trying to understand what that means.

You can check the number of occurrences with:

[root@t3service01 ~]# grep Detected /var/log/remote-archive/2012/ -R | grep --color Hardware

The earliest occurrence is 2012/03/14, which matches the first day of our network rearrangement.
</pre>

---++ Solution or Workaround

---++ Monitoring for this condition
<!--
#how can this condition be recognized automatically, if at all?
-->
<pre>
[root@t3service01 ~]# grep Detected /var/log/remote-archive/2012/ -R | grep --color Hardware
</pre>

-- Main.FabioMartinelli - 2012-05-02
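The grep used above for monitoring only lists the matching lines. To estimate frequency per node, the matches could be tallied by host. The following is a minimal sketch, not a tested procedure for this site: the function name =count_hangs= is hypothetical, and it assumes the archived logs use the classic syslog layout (month, day, time, hostname, then the message), so that the hostname is the fourth whitespace-separated field. If the files under =/var/log/remote-archive/= are laid out differently, the field number has to be adjusted.

```shell
# count_hangs: hypothetical helper, reads syslog lines on stdin and prints
# "<host> <count>" for every host that logged an e1000e hardware unit hang.
# Assumption: classic syslog layout, e.g.
#   Mar 14 10:00:01 t3ui01 kernel: e1000e 0000:04:00.0: eth0: Detected Hardware Unit Hang:
# so the hostname is field 4.
count_hangs() {
    awk '/Detected Hardware Unit Hang/ { n[$4]++ }
         END { for (h in n) print h, n[h] }' | LC_ALL=C sort
}

# Possible usage on the log server (-h drops the file name prefix that
# grep -R would otherwise put in front of each line):
#   grep -Rh "Detected Hardware Unit Hang" /var/log/remote-archive/2012/ | count_hangs
```

A per-host count makes it easier to see whether all t3ui* nodes are affected equally or whether the hangs cluster on a few machines.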
IssueForm
| Affected Service | t3ui* |
| Symptom summary | e1000e 0000:04:00.0: eth0: Detected Hardware Unit Hang: |
| Reason Understood | no |
| Solution Exists | no |
| Obsolete | no |
Topic revision: r1 - 2012-05-02 - FabioMartinelli
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.