Link | Service | Summary | Understood? | Solution? |
---|---|---|---|---|
IssueT3NFS01halt | /shome | The entire T3 gets locked | no | no |
IssueNscdAliasedHostsNotCached | all services that depend on host resolution | Name resolution sometimes fails, causing applications to fail. nscd does not cache hosts that resolve to multiple IP addresses | yes | workaround |
IssueCvmfsFailsToMount | CVMFS | CVMFS fails to mount - Fuse: Failed to initialize root file catalog | no | yes |
IssueSolarisFileServerPerformance | dcache pool | Slow local read throughput to worker nodes. The raidz2-configured t3fs05 system delivers only 50% of the throughput of the raidz-configured t3fs01 system | no | no |
IssueX4500ControllerFailure | dcache pool | kernel panic. System standstill | yes | workaround |
IssueDcachePoolHangs | dcache pool | Connections time out. Pool shows 0 size. Pool cannot be shut down even with kill -9. OS update required | yes | yes |
IssueDcacheCorruptOSM | dcache reading | User cannot retrieve certain files that clearly exist on our fileservers and are correctly listed in the pnfs namespace | yes | yes |
IssueSrmWrongTurl | dCache SRM | SRM returns a TURL without a domain extension | yes | yes |
IssueDcacheSrmRead2 | dCache SRM read | SRM server returns wrong TURL to client | yes | yes |
IssueDcacheSrmRead | dcache SRM read | SRM client receives no data channel gridftp connection | yes | yes |
IssueDcacheTooManySQLConnections | dcache, srm | too many pgsql connections, error 53300 | yes | yes |
IssueDcacheRead | gsiftp | gridftp (and srmcp) file read hangs forever | no | workaround |
IssueZfsNfsExportAcls | NFS exported ZFS directories | cp -p fails with Operation not supported | yes | workaround |
IssueSenseKey | None | An IS5500 Error has been propagated up to Linux | no | no |
IssueNxSessionStartupFailure | NX | After the connection is established, NX fails to start up X | yes | workaround |
IssueWhiteScreenNXClient | NX servers running on our t3ui* | Terminal output can become invisible (white on white) in an NX session | yes | yes |
IssueSolarisPatchFail1 | Solaris System | Machine unbootable due to patch 137138-09 | yes | no |
IssueSRMcallsFailOrGetUnresponsive | SRM | SRM calls issued to t3se01.psi.ch fail or get unresponsive | yes | yes |
Issueeth0DetectedHardwareUnitHang | t3ui* | e1000e 0000:04:00.0: eth0: Detected Hardware Unit Hang: | no | no |
IssueT3WN51reboots | t3wn51 | t3wn51 reboots | yes | yes |
IssueNewWNsKernelPanic | The whole OS | The OOM killer is invoked, but sometimes that is not enough to keep the OS alive | yes | yes |
IssueLEDproblems | various | warning LEDs on machines are on | yes | workaround |