Node Type: dCache (t3fs13, t3fs14)

Firewall requirements

local port         open to            reason
2811/tcp           *                  gridftp control connection
22125/tcp          192.33.123.0/24    unauthenticated dcap (read only)
22128/tcp          192.33.123.0/24    gsidcap (GSI authenticated dcap)
20000-25000/tcp    *                  Globus port range for gridftp/xrootd data streams
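
As a rough sketch, the port list above could be turned into iptables rules as follows. The use of iptables, the INPUT chain and the function names emit_rule and print_dcache_rules are assumptions made for illustration; this page does not say how the firewall is actually implemented. The script only prints the rules, it does not apply them.

```shell
#!/bin/sh
# Sketch only: print (do not apply) iptables rules for the port list above.
# iptables, the INPUT chain and the function names are assumptions.

# emit_rule PORTS SOURCE: PORTS is "N" or "N-M", SOURCE is a CIDR or "*"
emit_rule() {
    # iptables writes port ranges as first:last
    ports=$(printf '%s' "$1" | sed 's/-/:/')
    if [ "$2" = "*" ]; then
        echo "iptables -A INPUT -p tcp --dport $ports -j ACCEPT"
    else
        echo "iptables -A INPUT -p tcp -s $2 --dport $ports -j ACCEPT"
    fi
}

print_dcache_rules() {
    emit_rule 2811        '*'               # gridftp control connection
    emit_rule 22125       192.33.123.0/24   # unauthenticated dcap (read only)
    emit_rule 22128       192.33.123.0/24   # gsidcap
    emit_rule 20000-25000 '*'               # gridftp/xrootd data streams
}

print_dcache_rules
```

Printing the rules first makes it possible to review them before anything is loaded into the running firewall.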


Regular Maintenance work

Once in production, the configuration of these servers is essentially frozen, so it is enough to keep an eye on the Nagios pages for t3fs13 and t3fs14. There you will also find the pnp4nagios graphs of the XFS filesystem usage.

Emergency Measures

HP warranties

  • HP Support WebSite to open a call
  • SN t3fs13: CZ31513SBR ; Product number 583914-B21
  • SN t3fs14: CZ31513SBT ; Product number 583914-B21
  • Product description: HP ProLiant DL380 G7 Server
  • Date of warranty check : 2011-12-16
  • Entitlement type: Base Warranty
  • Start date: 2011-12-16
  • Title: Wty: HP HW Maintenance Onsite Support
  • Status: Active
  • Start date: Dec 16, 2011
  • End date: 31-07-2018
  • Service level: Standard Parts Logistics
  • Deliverables: Onsite Support Parts and Material provided Hardware Problem Diagnosis
  • Title: Wty: HP Support for Initial Setup
  • Status: Active
  • Start date: Dec 16, 2011
  • End date: Apr 13, 2012
  • Service level: Unlimited Named Callers
  • Deliverables: Initial Setup Assistance

Checking failures by Nagios

Nagios will notice a SW or HW failure; if it is a HW failure, open a case on the HP Support WebSite.

These CLI tools show the status of HP components:

  • hpasmcli: status of the HP hardware
  • hpacucli: specific to the HP RAID controllers

/usr/local/nagios/libexec/check_hpasm

/usr/local/nagios/libexec/check_hpasm --perfdata=short --timeout=20 --servertype proliant -vvv
calling /sbin/hpasmcli

skipping temperature #16       I/O_ZONE              -         70C/158F 
skipping temperature #17       I/O_ZONE              -         70C/158F 
skipping temperature #18       I/O_ZONE              -         70C/158F 
skipping temperature #28       I/O_ZONE              -         70C/158F 
HP::Proliant::Component::DiskSubsystem::Da::CLI controllers und platten zusammenführen
has 0 controllers
has 0 accelerators
has 0 physical_drives
has 0 logical_drives
has 0 spare_drives
HP::Proliant::Component::DiskSubsystem::Sas::CLI controllers und platten zusammenführen
has 0 controllers
has 0 physical_drives
has 0 logical_drives
has 0 spare_drives
HP::Proliant::Component::DiskSubsystem::Scsi::CLI controllers und platten zusammenführen
has 0 controllers
has 0 physical_drives
has 0 logical_drives
has 0 spare_drives
HP::Proliant::Component::DiskSubsystem::Ide::CLI controllers und platten zusammenführen
has 0 controllers
has 0 physical_drives
has 0 logical_drives
has 0 spare_drives
HP::Proliant::Component::DiskSubsystem::Fca::CLI controllers und platten zusammenführen
has 0 host controllers
has 0 controllers
has 0 physical_drives
has 0 logical_drives
has 0 spare_drives
[CPU_0]
cpqSeCpuSlot: 1
cpqSeCpuUnitIndex: 0
cpqSeCpuName: Intel Xeon
cpqSeCpuStatus: ok
info: cpu 0 is ok

[CPU_1]
cpqSeCpuSlot: 2
cpqSeCpuUnitIndex: 1
cpqSeCpuName: Intel Xeon
cpqSeCpuStatus: ok
info: cpu 1 is ok

[PS_1]
cpqHeFltTolPowerSupplyBay: 1
cpqHeFltTolPowerSupplyChassis: 1
cpqHeFltTolPowerSupplyPresent: present
cpqHeFltTolPowerSupplyCondition: ok
cpqHeFltTolPowerSupplyRedundant: redundant
info: powersupply 1 is ok

[PS_2]
cpqHeFltTolPowerSupplyBay: 2
cpqHeFltTolPowerSupplyChassis: 1
cpqHeFltTolPowerSupplyPresent: present
cpqHeFltTolPowerSupplyCondition: ok
cpqHeFltTolPowerSupplyRedundant: redundant
info: powersupply 2 is ok

[FAN_1]
cpqHeFltTolFanChassis: 1
cpqHeFltTolFanIndex: 1
cpqHeFltTolFanLocale: system
cpqHeFltTolFanPresent: present
cpqHeFltTolFanType: other
cpqHeFltTolFanSpeed: normal
cpqHeFltTolFanRedundant: notRedundant
cpqHeFltTolFanRedundantPartner: 0
cpqHeFltTolFanCondition: ok
cpqHeFltTolFanHotPlug: hotPluggable
info: fan 1 is present, speed is normal, pctmax is 29%, location is system, redundance is notRedundant, partner is 0

[FAN_2]
cpqHeFltTolFanChassis: 1
cpqHeFltTolFanIndex: 2
cpqHeFltTolFanLocale: system
cpqHeFltTolFanPresent: present
cpqHeFltTolFanType: other
cpqHeFltTolFanSpeed: normal
cpqHeFltTolFanRedundant: notRedundant
cpqHeFltTolFanRedundantPartner: 0
cpqHeFltTolFanCondition: ok
cpqHeFltTolFanHotPlug: hotPluggable
info: fan 2 is present, speed is normal, pctmax is 29%, location is system, redundance is notRedundant, partner is 0

[FAN_3]
cpqHeFltTolFanChassis: 1
cpqHeFltTolFanIndex: 3
cpqHeFltTolFanLocale: system
cpqHeFltTolFanPresent: present
cpqHeFltTolFanType: other
cpqHeFltTolFanSpeed: normal
cpqHeFltTolFanRedundant: notRedundant
cpqHeFltTolFanRedundantPartner: 0
cpqHeFltTolFanCondition: ok
cpqHeFltTolFanHotPlug: hotPluggable
info: fan 3 is present, speed is normal, pctmax is 45%, location is system, redundance is notRedundant, partner is 0

[FAN_4]
cpqHeFltTolFanChassis: 1
cpqHeFltTolFanIndex: 4
cpqHeFltTolFanLocale: system
cpqHeFltTolFanPresent: present
cpqHeFltTolFanType: other
cpqHeFltTolFanSpeed: normal
cpqHeFltTolFanRedundant: notRedundant
cpqHeFltTolFanRedundantPartner: 0
cpqHeFltTolFanCondition: ok
cpqHeFltTolFanHotPlug: hotPluggable
info: fan 4 is present, speed is normal, pctmax is 53%, location is system, redundance is notRedundant, partner is 0

[FAN_5]
cpqHeFltTolFanChassis: 1
cpqHeFltTolFanIndex: 5
cpqHeFltTolFanLocale: system
cpqHeFltTolFanPresent: present
cpqHeFltTolFanType: other
cpqHeFltTolFanSpeed: normal
cpqHeFltTolFanRedundant: notRedundant
cpqHeFltTolFanRedundantPartner: 0
cpqHeFltTolFanCondition: ok
cpqHeFltTolFanHotPlug: hotPluggable
info: fan 5 is present, speed is normal, pctmax is 53%, location is system, redundance is notRedundant, partner is 0

[FAN_6]
cpqHeFltTolFanChassis: 1
cpqHeFltTolFanIndex: 6
cpqHeFltTolFanLocale: system
cpqHeFltTolFanPresent: present
cpqHeFltTolFanType: other
cpqHeFltTolFanSpeed: normal
cpqHeFltTolFanRedundant: notRedundant
cpqHeFltTolFanRedundantPartner: 0
cpqHeFltTolFanCondition: ok
cpqHeFltTolFanHotPlug: hotPluggable
info: fan 6 is present, speed is normal, pctmax is 13%, location is system, redundance is notRedundant, partner is 0

[TEMP_1]
cpqHeTemperatureChassis: 1
cpqHeTemperatureIndex: 1
cpqHeTemperatureLocale: ambient
cpqHeTemperatureCelsius: 18
cpqHeTemperatureThreshold: 41
cpqHeTemperatureCondition: unknown
info: 1 ambient temperature is 18C (41 max)

[TEMP_2]
cpqHeTemperatureChassis: 1
cpqHeTemperatureIndex: 2
cpqHeTemperatureLocale: cpu#1
cpqHeTemperatureCelsius: 40
cpqHeTemperatureThreshold: 82
cpqHeTemperatureCondition: unknown
info: 2 cpu#1 temperature is 40C (82 max)

[TEMP_3]
cpqHeTemperatureChassis: 1
cpqHeTemperatureIndex: 3
cpqHeTemperatureLocale: cpu#2
cpqHeTemperatureCelsius: 40
cpqHeTemperatureThreshold: 82
cpqHeTemperatureCondition: unknown
info: 3 cpu#2 temperature is 40C (82 max)

[TEMP_4]
cpqHeTemperatureChassis: 1
cpqHeTemperatureIndex: 4
cpqHeTemperatureLocale: memory_bd
cpqHeTemperatureCelsius: 28
cpqHeTemperatureThreshold: 87
cpqHeTemperatureCondition: unknown
info: 4 memory_bd temperature is 28C (87 max)

[TEMP_5]
cpqHeTemperatureChassis: 1
cpqHeTemperatureIndex: 5
cpqHeTemperatureLocale: memory_bd
cpqHeTemperatureCelsius: 30
cpqHeTemperatureThreshold: 87
cpqHeTemperatureCondition: unknown
info: 5 memory_bd temperature is 30C (87 max)

[TEMP_6]
cpqHeTemperatureChassis: 1
cpqHeTemperatureIndex: 6
cpqHeTemperatureLocale: memory_bd
cpqHeTemperatureCelsius: 28
cpqHeTemperatureThreshold: 87
cpqHeTemperatureCondition: unknown
info: 6 memory_bd temperature is 28C (87 max)

[TEMP_7]
cpqHeTemperatureChassis: 1
cpqHeTemperatureIndex: 7
cpqHeTemperatureLocale: memory_bd
cpqHeTemperatureCelsius: 32
cpqHeTemperatureThreshold: 87
cpqHeTemperatureCondition: unknown
info: 7 memory_bd temperature is 32C (87 max)

[TEMP_8]
cpqHeTemperatureChassis: 1
cpqHeTemperatureIndex: 8
cpqHeTemperatureLocale: power_supply_bay
cpqHeTemperatureCelsius: 34
cpqHeTemperatureThreshold: 90
cpqHeTemperatureCondition: unknown
info: 8 power_supply_bay temperature is 34C (90 max)

[TEMP_9]
cpqHeTemperatureChassis: 1
cpqHeTemperatureIndex: 9
cpqHeTemperatureLocale: power_supply_bay
cpqHeTemperatureCelsius: 29
cpqHeTemperatureThreshold: 65
cpqHeTemperatureCondition: unknown
info: 9 power_supply_bay temperature is 29C (65 max)

[TEMP_10]
cpqHeTemperatureChassis: 1
cpqHeTemperatureIndex: 10
cpqHeTemperatureLocale: system_bd
cpqHeTemperatureCelsius: 39
cpqHeTemperatureThreshold: 90
cpqHeTemperatureCondition: unknown
info: 10 system_bd temperature is 39C (90 max)

[TEMP_11]
cpqHeTemperatureChassis: 1
cpqHeTemperatureIndex: 11
cpqHeTemperatureLocale: system_bd
cpqHeTemperatureCelsius: 30
cpqHeTemperatureThreshold: 70
cpqHeTemperatureCondition: unknown
info: 11 system_bd temperature is 30C (70 max)

[TEMP_12]
cpqHeTemperatureChassis: 1
cpqHeTemperatureIndex: 12
cpqHeTemperatureLocale: system_bd
cpqHeTemperatureCelsius: 38
cpqHeTemperatureThreshold: 90
cpqHeTemperatureCondition: unknown
info: 12 system_bd temperature is 38C (90 max)

[TEMP_13]
cpqHeTemperatureChassis: 1
cpqHeTemperatureIndex: 13
cpqHeTemperatureLocale: i/o_zone
cpqHeTemperatureCelsius: 26
cpqHeTemperatureThreshold: 70
cpqHeTemperatureCondition: unknown
info: 13 i/o_zone temperature is 26C (70 max)

[TEMP_14]
cpqHeTemperatureChassis: 1
cpqHeTemperatureIndex: 14
cpqHeTemperatureLocale: i/o_zone
cpqHeTemperatureCelsius: 30
cpqHeTemperatureThreshold: 70
cpqHeTemperatureCondition: unknown
info: 14 i/o_zone temperature is 30C (70 max)

[TEMP_15]
cpqHeTemperatureChassis: 1
cpqHeTemperatureIndex: 15
cpqHeTemperatureLocale: i/o_zone
cpqHeTemperatureCelsius: 30
cpqHeTemperatureThreshold: 70
cpqHeTemperatureCondition: unknown
info: 15 i/o_zone temperature is 30C (70 max)

[TEMP_19]
cpqHeTemperatureChassis: 1
cpqHeTemperatureIndex: 19
cpqHeTemperatureLocale: system_bd
cpqHeTemperatureCelsius: 24
cpqHeTemperatureThreshold: 70
cpqHeTemperatureCondition: unknown
info: 19 system_bd temperature is 24C (70 max)

[TEMP_20]
cpqHeTemperatureChassis: 1
cpqHeTemperatureIndex: 20
cpqHeTemperatureLocale: system_bd
cpqHeTemperatureCelsius: 25
cpqHeTemperatureThreshold: 70
cpqHeTemperatureCondition: unknown
info: 20 system_bd temperature is 25C (70 max)

[TEMP_21]
cpqHeTemperatureChassis: 1
cpqHeTemperatureIndex: 21
cpqHeTemperatureLocale: system_bd
cpqHeTemperatureCelsius: 27
cpqHeTemperatureThreshold: 80
cpqHeTemperatureCondition: unknown
info: 21 system_bd temperature is 27C (80 max)

[TEMP_22]
cpqHeTemperatureChassis: 1
cpqHeTemperatureIndex: 22
cpqHeTemperatureLocale: system_bd
cpqHeTemperatureCelsius: 27
cpqHeTemperatureThreshold: 80
cpqHeTemperatureCondition: unknown
info: 22 system_bd temperature is 27C (80 max)

[TEMP_23]
cpqHeTemperatureChassis: 1
cpqHeTemperatureIndex: 23
cpqHeTemperatureLocale: system_bd
cpqHeTemperatureCelsius: 32
cpqHeTemperatureThreshold: 77
cpqHeTemperatureCondition: unknown
info: 23 system_bd temperature is 32C (77 max)

[TEMP_24]
cpqHeTemperatureChassis: 1
cpqHeTemperatureIndex: 24
cpqHeTemperatureLocale: system_bd
cpqHeTemperatureCelsius: 30
cpqHeTemperatureThreshold: 70
cpqHeTemperatureCondition: unknown
info: 24 system_bd temperature is 30C (70 max)

[TEMP_25]
cpqHeTemperatureChassis: 1
cpqHeTemperatureIndex: 25
cpqHeTemperatureLocale: system_bd
cpqHeTemperatureCelsius: 26
cpqHeTemperatureThreshold: 70
cpqHeTemperatureCondition: unknown
info: 25 system_bd temperature is 26C (70 max)

[TEMP_26]
cpqHeTemperatureChassis: 1
cpqHeTemperatureIndex: 26
cpqHeTemperatureLocale: system_bd
cpqHeTemperatureCelsius: 27
cpqHeTemperatureThreshold: 70
cpqHeTemperatureCondition: unknown
info: 26 system_bd temperature is 27C (70 max)

[TEMP_27]
cpqHeTemperatureChassis: 1
cpqHeTemperatureIndex: 27
cpqHeTemperatureLocale: i/o_zone
cpqHeTemperatureCelsius: 30
cpqHeTemperatureThreshold: 70
cpqHeTemperatureCondition: unknown
info: 27 i/o_zone temperature is 30C (70 max)

[TEMP_29]
cpqHeTemperatureChassis: 1
cpqHeTemperatureIndex: 29
cpqHeTemperatureLocale: scsi_backplane_zone
cpqHeTemperatureCelsius: 35
cpqHeTemperatureThreshold: 60
cpqHeTemperatureCondition: unknown
info: 29 scsi_backplane_zone temperature is 35C (60 max)

[TEMP_30]
cpqHeTemperatureChassis: 1
cpqHeTemperatureIndex: 30
cpqHeTemperatureLocale: system_bd
cpqHeTemperatureCelsius: 58
cpqHeTemperatureThreshold: 110
cpqHeTemperatureCondition: unknown
info: 30 system_bd temperature is 58C (110 max)

dimm module 1:3 (module 3 @ cartridge 1) is ok
dimm module 1:6 (module 6 @ cartridge 1) is ok
dimm module 1:9 (module 9 @ cartridge 1) is ok
dimm module 2:3 (module 3 @ cartridge 2) is ok
dimm module 2:6 (module 6 @ cartridge 2) is ok
dimm module 2:9 (module 9 @ cartridge 2) is ok
i dump the memory
car 01  mod 03  siz 8589934592  sta present       con ok          typ 
car 01  mod 06  siz 8589934592  sta present       con ok          typ 
car 01  mod 09  siz 8589934592  sta present       con ok          typ 
car 02  mod 03  siz 8589934592  sta present       con ok          typ 
car 02  mod 06  siz 8589934592  sta present       con ok          typ 
car 02  mod 09  siz 8589934592  sta present       con ok          typ 
[EVENT_5]
cpqHeEventLogEntryNumber: 5
cpqHeEventLogEntrySeverity: info
cpqHeEventLogEntryCount: 1
cpqHeEventLogInitialTime: Tue Mar 20 15:44:00 2012
cpqHeEventLogUpdateTime: Tue Mar 20 15:44:00 2012
cpqHeEventLogErrorDesc: IML Cleared (iLO 3 user:root).
info: Event: 5 Added: 1332254640 Class: (Maintenance Note) info IML Cleared (iLO 3 user:root).

OK - System: 'proliant dl380 g7', S/N: 'CZ31513SBR', ROM: 'P67 05/05/2011', hardware working fine, cpu_0=ok cpu_1=ok ps_1=ok ps_2=ok fan_1=29% fan_2=29% fan_3=45% fan_4=53% fan_5=53% fan_6=13% temp_1=18 temp_2=40 temp_3=40 temp_4=28 temp_5=30 temp_6=28 temp_7=32 temp_8=34 temp_9=29 temp_10=39 temp_11=30 temp_12=38 temp_13=26 temp_14=30 temp_15=30 temp_19=24 temp_20=25 temp_21=27 temp_22=27 temp_23=32 temp_24=30 temp_25=26 temp_26=27 temp_27=30 temp_29=35 temp_30=58
checking cpus
cpu 0 is ok
cpu 1 is ok
checking power supplies
powersupply 1 is ok
powersupply 2 is ok
checking fans
fan 1 is present, speed is normal, pctmax is 29%, location is system, redundance is notRedundant, partner is 0
fan 2 is present, speed is normal, pctmax is 29%, location is system, redundance is notRedundant, partner is 0
fan 3 is present, speed is normal, pctmax is 45%, location is system, redundance is notRedundant, partner is 0
fan 4 is present, speed is normal, pctmax is 53%, location is system, redundance is notRedundant, partner is 0
fan 5 is present, speed is normal, pctmax is 53%, location is system, redundance is notRedundant, partner is 0
fan 6 is present, speed is normal, pctmax is 13%, location is system, redundance is notRedundant, partner is 0
checking temperatures
1 ambient temperature is 18C (41 max)
2 cpu#1 temperature is 40C (82 max)
3 cpu#2 temperature is 40C (82 max)
4 memory_bd temperature is 28C (87 max)
5 memory_bd temperature is 30C (87 max)
6 memory_bd temperature is 28C (87 max)
7 memory_bd temperature is 32C (87 max)
8 power_supply_bay temperature is 34C (90 max)
9 power_supply_bay temperature is 29C (65 max)
10 system_bd temperature is 39C (90 max)
11 system_bd temperature is 30C (70 max)
12 system_bd temperature is 38C (90 max)
13 i/o_zone temperature is 26C (70 max)
14 i/o_zone temperature is 30C (70 max)
15 i/o_zone temperature is 30C (70 max)
19 system_bd temperature is 24C (70 max)
20 system_bd temperature is 25C (70 max)
21 system_bd temperature is 27C (80 max)
22 system_bd temperature is 27C (80 max)
23 system_bd temperature is 32C (77 max)
24 system_bd temperature is 30C (70 max)
25 system_bd temperature is 26C (70 max)
26 system_bd temperature is 27C (70 max)
27 i/o_zone temperature is 30C (70 max)
29 scsi_backplane_zone temperature is 35C (60 max)
30 system_bd temperature is 58C (110 max)
checking memory
dimm module 1:3 (module 3 @ cartridge 1) is ok
dimm module 1:6 (module 6 @ cartridge 1) is ok
dimm module 1:9 (module 9 @ cartridge 1) is ok
dimm module 2:3 (module 3 @ cartridge 2) is ok
dimm module 2:6 (module 6 @ cartridge 2) is ok
dimm module 2:9 (module 9 @ cartridge 2) is ok
checking disk subsystem
checking ASR
checking events
Event: 5 Added: 1332254640 Class: (Maintenance Note) info IML Cleared (iLO 3 user:root). | fan_1=29% fan_2=29% fan_3=45% fan_4=53% fan_5=53% fan_6=13% temp_1=18;41;41 temp_2=40;82;82 temp_3=40;82;82 temp_4=28;87;87 temp_5=30;87;87 temp_6=28;87;87 temp_7=32;87;87 temp_8=34;90;90 temp_9=29;65;65 temp_10=39;90;90 temp_11=30;70;70 temp_12=38;90;90 temp_13=26;70;70 temp_14=30;70;70 temp_15=30;70;70 temp_19=24;70;70 temp_20=25;70;70 temp_21=27;80;80 temp_22=27;80;80 temp_23=32;77;77 temp_24=30;70;70 temp_25=26;70;70 temp_26=27;70;70 temp_27=30;70;70 temp_29=35;60;60 temp_30=58;110;110
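
The perfdata at the end of the output above lists each temperature as value;warning;critical. As a minimal sketch, assuming that format, the following filter prints how far each sensor is from its limit. The function name temp_headroom is hypothetical and not part of the plugin.

```shell
#!/bin/sh
# Sketch only: read check_hpasm output on stdin and print, for every
# temp_N perfdata entry, "name value limit headroom" (all in Celsius).
# temp_headroom is a hypothetical helper, not part of check_hpasm.
temp_headroom() {
    # keep only the perfdata after the last '|', one token per line
    sed -n 's/.*| *//p' | tr ' ' '\n' |
    awk -F'[=;]' '/^temp_/ { printf "%s %d %d %d\n", $1, $2, $3, $3 - $2 }'
}
```

It could be used as `/usr/local/nagios/libexec/check_hpasm --perfdata=short --servertype proliant | temp_headroom`; the fan entries are ignored because they carry no thresholds.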

/usr/lib64/nagios/plugins/check_cciss-1.9

/usr/lib64/nagios/plugins/check_cciss-1.9 -s -v -p -d
### Check if "HP Smart Array" (/proc/driver/cciss/cciss) is present >>>\ncat: /proc/driver/cciss/cciss*: No such file or directory\n
### Check if "HP Smart Array" (/proc/scsi/scsi) is present >>>\nAttached devices: Host: scsi0 Channel: 03 Id: 00 Lun: 00 Vendor: HP Model: P410i Rev: 5.70 Type: RAID ANSI SCSI revision: 05 Host: scsi0 Channel: 00 Id: 00 Lun: 00 Vendor: HP Model: LOGICAL VOLUME Rev: 5.70 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi1 Channel: 00 Id: 00 Lun: 00 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi1 Channel: 00 Id: 00 Lun: 01 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi1 Channel: 00 Id: 00 Lun: 02 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi1 Channel: 00 Id: 00 Lun: 03 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi1 Channel: 00 Id: 00 Lun: 04 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi1 Channel: 00 Id: 00 Lun: 05 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi1 Channel: 00 Id: 00 Lun: 06 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi1 Channel: 00 Id: 00 Lun: 07 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi1 Channel: 00 Id: 00 Lun: 08 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi1 Channel: 00 Id: 00 Lun: 09 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi1 Channel: 00 Id: 00 Lun: 10 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi1 Channel: 00 Id: 00 Lun: 11 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi1 Channel: 00 Id: 00 Lun: 12 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi2 Channel: 00 Id: 00 Lun: 00 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI 
revision: 05 Host: scsi2 Channel: 00 Id: 00 Lun: 01 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi2 Channel: 00 Id: 00 Lun: 02 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi2 Channel: 00 Id: 00 Lun: 03 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi2 Channel: 00 Id: 00 Lun: 04 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi2 Channel: 00 Id: 00 Lun: 05 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi2 Channel: 00 Id: 00 Lun: 06 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi2 Channel: 00 Id: 00 Lun: 07 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi2 Channel: 00 Id: 00 Lun: 08 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi2 Channel: 00 Id: 00 Lun: 09 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi2 Channel: 00 Id: 00 Lun: 10 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi2 Channel: 00 Id: 00 Lun: 11 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi2 Channel: 00 Id: 00 Lun: 12 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi3 Channel: 00 Id: 00 Lun: 00 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi3 Channel: 00 Id: 00 Lun: 01 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi3 Channel: 00 Id: 00 Lun: 02 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi3 Channel: 00 Id: 00 Lun: 03 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi3 Channel: 00 Id: 00 Lun: 04 Vendor: LSI 
Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi3 Channel: 00 Id: 00 Lun: 05 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi3 Channel: 00 Id: 00 Lun: 06 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi3 Channel: 00 Id: 00 Lun: 07 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi3 Channel: 00 Id: 00 Lun: 08 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi3 Channel: 00 Id: 00 Lun: 09 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi3 Channel: 00 Id: 00 Lun: 10 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi3 Channel: 00 Id: 00 Lun: 11 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi3 Channel: 00 Id: 00 Lun: 12 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi4 Channel: 00 Id: 00 Lun: 00 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi4 Channel: 00 Id: 00 Lun: 01 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi4 Channel: 00 Id: 00 Lun: 02 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi4 Channel: 00 Id: 00 Lun: 03 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi4 Channel: 00 Id: 00 Lun: 04 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi4 Channel: 00 Id: 00 Lun: 05 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi4 Channel: 00 Id: 00 Lun: 06 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi4 Channel: 00 Id: 00 Lun: 07 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI 
revision: 05 Host: scsi4 Channel: 00 Id: 00 Lun: 08 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi4 Channel: 00 Id: 00 Lun: 09 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi4 Channel: 00 Id: 00 Lun: 10 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi4 Channel: 00 Id: 00 Lun: 11 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi4 Channel: 00 Id: 00 Lun: 12 Vendor: LSI Model: INF-01-00 Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi5 Channel: 00 Id: 00 Lun: 01 Vendor: LSI Model: VirtualDisk Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi5 Channel: 00 Id: 00 Lun: 02 Vendor: LSI Model: VirtualDisk Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi5 Channel: 00 Id: 00 Lun: 03 Vendor: LSI Model: VirtualDisk Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi5 Channel: 00 Id: 00 Lun: 04 Vendor: LSI Model: VirtualDisk Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi5 Channel: 00 Id: 00 Lun: 05 Vendor: LSI Model: VirtualDisk Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi5 Channel: 00 Id: 00 Lun: 06 Vendor: LSI Model: VirtualDisk Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi5 Channel: 00 Id: 00 Lun: 07 Vendor: LSI Model: VirtualDisk Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi5 Channel: 00 Id: 00 Lun: 08 Vendor: LSI Model: VirtualDisk Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi5 Channel: 00 Id: 00 Lun: 09 Vendor: LSI Model: VirtualDisk Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi5 Channel: 00 Id: 00 Lun: 10 Vendor: LSI Model: VirtualDisk Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi5 Channel: 00 Id: 00 Lun: 11 Vendor: LSI Model: VirtualDisk Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi5 Channel: 00 Id: 
00 Lun: 12 Vendor: LSI Model: VirtualDisk Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi5 Channel: 00 Id: 01 Lun: 01 Vendor: LSI Model: VirtualDisk Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi5 Channel: 00 Id: 01 Lun: 02 Vendor: LSI Model: VirtualDisk Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi5 Channel: 00 Id: 01 Lun: 03 Vendor: LSI Model: VirtualDisk Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi5 Channel: 00 Id: 01 Lun: 04 Vendor: LSI Model: VirtualDisk Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi5 Channel: 00 Id: 01 Lun: 05 Vendor: LSI Model: VirtualDisk Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi5 Channel: 00 Id: 01 Lun: 06 Vendor: LSI Model: VirtualDisk Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi5 Channel: 00 Id: 01 Lun: 07 Vendor: LSI Model: VirtualDisk Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi5 Channel: 00 Id: 01 Lun: 08 Vendor: LSI Model: VirtualDisk Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi5 Channel: 00 Id: 01 Lun: 09 Vendor: LSI Model: VirtualDisk Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi5 Channel: 00 Id: 01 Lun: 10 Vendor: LSI Model: VirtualDisk Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi5 Channel: 00 Id: 01 Lun: 11 Vendor: LSI Model: VirtualDisk Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05 Host: scsi5 Channel: 00 Id: 01 Lun: 12 Vendor: LSI Model: VirtualDisk Rev: 0786 Type: Direct-Access ANSI SCSI revision: 05\n
### Check if "HP Array Utility CLI" is present >>>\n/usr/sbin/hpacucli\n
### Check if "HP Controller" work correctly >>>\n Smart Array P410i in Slot 0 (Embedded) Controller Status: OK Cache Status: OK Battery/Capacitor Status: OK\n
### Get "Slot" & exclude slot not needed >>>\n0\n
### Get "logicaldrive" for slot >>>\n Smart Array P410i in Slot 0 (Embedded) array A logicaldrive 1 (279.4 GB, RAID 1, OK)\n
### Get "physicaldrive" for slot >>>\n physicaldrive 1I:1:1 (port 1I:box 1:bay 1, SAS, 300 GB, OK) physicaldrive 1I:1:2 (port 1I:box 1:bay 2, SAS, 300 GB, OK)\n
### Get "Chassis" & exclude chassis not needed >>>\n\n
### Check STATUS >>>
RAID OK:  Smart Array P410i in Slot 0 (Embedded) array A logicaldrive 1 (279.4 GB, RAID 1, OK) physicaldrive 1I:1:1 (port 1I:box 1:bay 1, SAS, 300 GB, OK) physicaldrive 1I:1:2 (port 1I:box 1:bay 2, SAS, 300 GB, OK) [Controller Status: OK Cache Status: OK Battery/Capacitor Status: OK]

10Gbit/s failure - OLD, 10Gbit/s was on fibre at that time

We actually hit this case in 2014.
If the 10Gbit/s card stops working:
  • Unplug the fibre (nowadays the link is 10GbE copper).
  • Unplug the transceiver, wait 20s and plug it back in; in 2014 this was enough to fix the problem.
  • If that doesn't work, try a clean server stop, wait 60s, then restart the server.
  • If the 10Gbit/s port is really broken, use the other 10Gbit/s port by moving the transceiver and the server IP from eth0 to eth1.
  • If the whole 10Gbit/s board is broken rather than a single port, you could connect 4 cables from the 4 onboard Gigabit ports to the switch, create a Linux bonding device with mode=6 and move the server IP onto it. This is a major change, though; it is better to open a call on the HP Support WebSite and simply wait for them.
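
For the last fallback in the list above, a minimal sketch of SL6-style ifcfg files for a mode=6 (balance-alb) bond might look as follows. The helper write_bond_config, the device name bond0, miimon=100, the NIC names and the IP address are all assumptions; nothing here was taken from the real servers, and the files are written to a directory of your choice so they can be reviewed first.

```shell
#!/bin/sh
# Sketch only: write SL6-style ifcfg files for a mode=6 (balance-alb) bond.
# write_bond_config, bond0, miimon=100 and all arguments are assumptions.

# write_bond_config DIR IPADDR NETMASK NIC...
write_bond_config() {
    dir=$1 ip=$2 mask=$3
    shift 3
    # The bond device carries the server IP
    cat > "$dir/ifcfg-bond0" <<EOF
DEVICE=bond0
ONBOOT=yes
BOOTPROTO=none
IPADDR=$ip
NETMASK=$mask
BONDING_OPTS="mode=6 miimon=100"
EOF
    # Each onboard Gigabit port becomes a slave of bond0
    for nic in "$@"; do
        cat > "$dir/ifcfg-$nic" <<EOF
DEVICE=$nic
ONBOOT=yes
BOOTPROTO=none
MASTER=bond0
SLAVE=yes
EOF
    done
}
```

For example, `write_bond_config /tmp/bond-test 192.0.2.10 255.255.255.0 eth2 eth3 eth4 eth5` writes five files into an existing scratch directory (192.0.2.10 and eth2 to eth5 are placeholders); only after checking them would they be copied to /etc/sysconfig/network-scripts.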

Installation

Puppet coordinates: Fabio uses the aliases below; the Puppet recipes are in /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/manifests
alias ROOT='. /afs/cern.ch/sw/lcg/external/gcc/4.8/x86_64-slc6/setup.sh && . /afs/cern.ch/sw/lcg/app/releases/ROOT/5.34.26/x86_64-slc6-gcc48-opt/root/bin/thisroot.sh'
alias cscsela='ssh -AX fmartine@ela.cscs.ch'
alias cscslogin='ssh -AX fmartine@login.lcg.cscs.ch'
alias cscspub='ssh -AX fmartinelli@pub.lcg.cscs.ch'
alias dcache='ssh -2 -l admin -p 22224 t3dcachedb.psi.ch'
alias dcache04='ssh -2 -l admin -p 22224 t3dcachedb04.psi.ch'
alias gempty='git commit --allow-empty-message -m '\'''\'''
alias kscustom54='cd /afs/psi.ch/software/linux/dist/scientific/54/custom'
alias kscustom57='cd /afs/psi.ch/software/linux/dist/scientific/57/custom'
alias kscustom60='cd /afs/psi.ch/software/linux/dist/scientific/60/custom'
alias kscustom64='cd /afs/psi.ch/software/linux/dist/scientific/64/custom'
alias kscustom66='cd /afs/psi.ch/software/linux/dist/scientific/66/x86_64/custom'
alias ksdir='cd /afs/psi.ch/software/linux/kickstart/configs'
alias ksprepostdir='cd /afs/psi.ch/software/linux/dist/scientific/60/kickstart/bin'
alias l.='ls -d .* --color=auto'
alias ll='ls -l --color=auto'
alias ls='ls --color=tty'
alias mc='. /usr/libexec/mc/mc-wrapper.sh'
alias pdir='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/'
alias pdirf='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/FabioDevelopment/'
alias pdirmanifests='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/manifests/'
alias pdirredhat='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/modules/Tier3/files/RedHat'
alias pdirsolaris='cd /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/modules/Tier3/files/Solaris/5.10'
alias vi='vim'
alias which='alias | /usr/bin/which --tty-only --read-alias --show-dot --show-tilde'
alias yumdir5='cd /afs/psi.ch/software/linux/dist/scientific/57/scripts'
alias yumdir6='cd /afs/psi.ch/software/linux/dist/scientific/6/scripts'
alias yumdir7='cd /afs/psi.ch/software/linux/dist/scientificlinux/7x/x86_64/Tier3/all'
alias yumdir7old='cd /afs/psi.ch/software/linux/dist/scientific/70.PLEASE_DO_NOT_USE_AND_DO_NOT_RENAME/scripts'

dCache nowadays runs as the user dcache and no longer as root, so you might run into a permission denied error.
The 2 HP ProLiant DL380 G7 servers t3fs[13,14] are the dCache 10Gbit/s gateways to the 6+6 XFS 22TB filesystems offered by our IS5500 360TB storage. Before studying the SW details it is worth getting an overview of the HW; see the HP website and the HP bulletin about these servers. The servers are protected by HW RAID 1.

HW Raid controller

Each server has 8 external 2.5" hot-plug disk slots managed by a Smart Array P410i RAID controller (PCIe x8, 1GB RAM).

The following is an example of a RAID controller FW update performed on 27 Dec 2012:

[root@t3fs14 CP017907.xmlhpsetupLdrImage.binlo100flashlo100.sh]# ./hpsetup 

HP Enclosure ROM  Flash.
Flash Engine Version: 2.06.10
Copyright (c) 2006-2009 Hewlett-Packard Development Company L.P.

Device [P410i]:  FW Ver [ Current:5.14 | Apply:5.70 ?]Flash this device? [NO, yes, quit] yes
Preparing to flash devices on the array controller...
Requesting flash - this could take up to 15 minutes...
Flash complete.
The array flash operation succeeded.
Device [P410i]:  FW Ver [ Current:5.14 | Apply:5.70 ?]Flash this device? [NO, yes, quit] yes
Preparing to flash devices on the array controller...
Requesting flash - this could take up to 15 minutes...
Flash complete.
The array flash operation succeeded.
You need to reboot the server to activate the new FW.

OS installation

The servers are installed as SL6 64bit by Kickstart + Puppet; see the Puppet file SL6_dcache_fs215_fs13fs14.pp in /afs/psi.ch/service/linux/puppet/var/puppet/environments/DerekDevelopment/manifests
Be aware that LVM is installed on top of the HW RAID 1:
[root@t3fs14 ~]# mount  | grep  ext4
/dev/sda2 on / type ext4 (rw)
/dev/sda1 on /boot type ext4 (rw)
/dev/mapper/vg_local_raid1-opt on /opt type ext4 (rw,nosuid,nodev,noatime,barrier)
/dev/mapper/vg_local_raid1-var on /var type ext4 (rw,noexec,nosuid,nodev,noatime,nobarrier)
/dev/mapper/vg_local_raid1-tmp on /tmp type ext4 (rw,noexec,nosuid,nodev,noatime,nobarrier)

qla2xxx driver

Qlogic drivers; see the LSI RDAC chapter below for the details:
[root@t3fs14 ~]# lsmod  | grep ql
qla2xxx               366369  50 
scsi_transport_fc      52241  1 qla2xxx

[root@t3fs14 ~]# find /sys | grep ql

LSI RDAC - Redundant Dual Active Controller

Each HP DL380 G7 server features 2 dual-port Qlogic 8Gbit/s FC HBAs of type QLogic Corp. ISP2532-based 8Gb Fibre Channel to PCI Express HBA (rev 02), for a total of 4 ports, as shown in the following picture; 2 ports are connected to the NetApp E5400 and 2 ports are connected to the SGI IS5500.
QlogicDualChannel8GbitFC.JPG
To let the XFS filesystems exploit the 2 distinct paths to the LUNs, Linux has to be configured to aggregate the paths into a single virtual path, to monitor whether one of these paths is down and, if so, to exclude it from the I/O. As far as I know there are 2 major tools on Red Hat to create the virtual path: the first is the RHEL6 Multipath Daemon and the second is the LSI RDAC driver. Because the LSI RDAC driver is the official tool provided by LSI, which is also the producer of the IS5500 RAID controllers, I've decided to use that one. I also understood that the RHEL6 multipath daemon in turn uses the LSI RDAC driver but offers a driver-independent configuration interface; since I don't mind being exposed to the LSI driver details, while I do care about using the latest stable version of the LSI driver, I've decided to avoid another software layer and to compile and use the LSI RDAC driver directly.

To install the LSI RDAC driver you need to install the kernel-devel RPM and create a special initrd such as /boot/mpp-2.6.32-358.2.1.el6.x86_64.img by compiling the SW inside /root/rdac/linuxrdac-09.03.0C00.0652; this also means that every kernel update requires a new initrd file. Furthermore, new versions of the LSI RDAC driver itself will be released by SGI or NetApp in the future, so during a kernel update it is worth checking whether a new LSI RDAC driver is available and exploiting the scheduled downtime to update the kernel and the LSI RDAC driver at the same time.
That said, the online documentation about this important topic is very poor! This is the best overview that I've found on this topic
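
Since each kernel needs its own RDAC initrd, a quick check before rebooting into a new kernel can save a failed boot. The following is only a minimal sketch: the helper names mpp_initrd_path and check_mpp_initrd are hypothetical, and the naming scheme <boot dir>/mpp-<kernel version>.img is assumed from the example file name above.

```shell
#!/bin/sh
# Sketch only: the helper names are hypothetical and the initrd naming
# scheme <boot dir>/mpp-<kernel version>.img is assumed from the example
# /boot/mpp-2.6.32-358.2.1.el6.x86_64.img.

# mpp_initrd_path KERNEL_VERSION [BOOT_DIR]
# Prints the path where the RDAC initrd for that kernel is expected.
mpp_initrd_path() {
    printf '%s/mpp-%s.img\n' "${2:-/boot}" "$1"
}

# check_mpp_initrd KERNEL_VERSION [BOOT_DIR]
# Prints "OK <path>" and returns 0 if the initrd exists,
# prints "MISSING <path>" and returns 1 otherwise.
check_mpp_initrd() {
    path=$(mpp_initrd_path "$1" "${2:-/boot}")
    if [ -f "$path" ]; then
        echo "OK $path"
    else
        echo "MISSING $path"
        return 1
    fi
}
```

For the running kernel the check would be called as check_mpp_initrd "$(uname -r)"; for a freshly installed kernel pass its version string instead.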

mpp drivers

[root@t3fs14 ~]# lsmod  | grep mpp
mppVhba               138253  12 
mppUpper              156950  1 mppVhba

[root@t3fs14 ~]# find /sys | grep mpp
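
The two lsmod checks above can be combined into one step. This is a small sketch, not part of the installed tooling: missing_modules is a hypothetical helper that reads lsmod output on stdin and prints every requested module that is not listed.

```shell
#!/bin/sh
# Sketch only: missing_modules is a hypothetical helper name.
# Usage: lsmod | missing_modules MODULE...
# Prints each MODULE that does not appear in the first column of the
# lsmod output read from stdin; prints nothing when all are loaded.
missing_modules() {
    lsmod_out=$(cat)
    for mod in "$@"; do
        if ! printf '%s\n' "$lsmod_out" |
            awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }'; then
            echo "$mod"
        fi
    done
}
```

On these servers it could be run as lsmod | missing_modules qla2xxx mppUpper mppVhba; an empty output means that the Qlogic and RDAC modules shown above are all loaded.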

LUN numbers to SCSI disk labels correspondence

To easily understand which sd* disk maps to which NetApp LUN:
[root@t3fs13 ~]# /opt/mpp/lsvdev 
	Array Name      Lun    sd device
	-------------------------------------
	T3_CMS_SGI_STORAGE 1     -> /dev/sdb
	T3_CMS_SGI_STORAGE 2     -> /dev/sdc
	T3_CMS_SGI_STORAGE 3     -> /dev/sdd
	T3_CMS_SGI_STORAGE 4     -> /dev/sde
	T3_CMS_SGI_STORAGE 5     -> /dev/sdf
	T3_CMS_SGI_STORAGE 6     -> /dev/sdg
	T3_CMS_SGI_STORAGE 7     -> /dev/sdh
	T3_CMS_SGI_STORAGE 8     -> /dev/sdi
	T3_CMS_SGI_STORAGE 9     -> /dev/sdj
	T3_CMS_SGI_STORAGE 10    -> /dev/sdk
	T3_CMS_SGI_STORAGE 11    -> /dev/sdl
	T3_CMS_SGI_STORAGE 12    -> /dev/sdm
	T3_CMS_E5460_01 1     -> /dev/sdn
	T3_CMS_E5460_01 2     -> /dev/sdo
	T3_CMS_E5460_01 3     -> /dev/sdp
	T3_CMS_E5460_01 4     -> /dev/sdq
	T3_CMS_E5460_01 5     -> /dev/sdr
	T3_CMS_E5460_01 6     -> /dev/sds
	T3_CMS_E5460_01 7     -> /dev/sdt
	T3_CMS_E5460_01 8     -> /dev/sdu
	T3_CMS_E5460_01 9     -> /dev/sdv
	T3_CMS_E5460_01 10    -> /dev/sdw
	T3_CMS_E5460_01 11    -> /dev/sdx
	T3_CMS_E5460_01 12    -> /dev/sdy
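
When only one mapping is needed, the listing above can be filtered instead of read by eye. The helpers below are a sketch under the assumption that lsvdev keeps the "ARRAY LUN -> DEVICE" layout shown above; the names lun_to_dev and dev_to_lun are hypothetical.

```shell
#!/bin/sh
# Sketch only: both helper names are hypothetical. They read the
# lsvdev listing on stdin and assume its "ARRAY LUN -> DEVICE" layout.

# lun_to_dev ARRAY LUN : prints the sd device of that LUN.
lun_to_dev() {
    awk -v array="$1" -v lun="$2" \
        '$1 == array && $2 == lun && $3 == "->" { print $4 }'
}

# dev_to_lun DEVICE : prints "ARRAY LUN" for that sd device.
dev_to_lun() {
    awk -v dev="$1" '$3 == "->" && $4 == dev { print $1, $2 }'
}
```

For example, /opt/mpp/lsvdev | lun_to_dev T3_CMS_SGI_STORAGE 3 would print /dev/sdd for the listing above, and /opt/mpp/lsvdev | dev_to_lun /dev/sdy would print T3_CMS_E5460_01 12.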

mppUtil utility

The mppUtil tool can be used to interact with the LSI RDAC drivers; run man mppUtil to read its manual. Some example outputs follow:

Connected NetApps

[root@t3fs13 ~]# mppUtil -a
Hostname    = t3fs13.psi.ch
Domainname  = (none)
Time        = GMT 11/28/2013 10:06:32 

---------------------------------------------------------------
Info of Array Module's seen by this Host. 
---------------------------------------------------------------
ID		WWN		   	 Type     Name         
---------------------------------------------------------------
 0	60080e50001f98f0000000004f20e355 FC	T3_CMS_SGI_STORAGE     
 1	60080e50001fe1500000000051f234a6 FC	T3_CMS_E5460_01     
---------------------------------------------------------------

Connected LUNs

Hostname    = t3fs13.psi.ch
Domainname  = (none)
Time        = GMT 11/28/2013 10:07:14 

MPP Information:
----------------
      ModuleName: T3_CMS_SGI_STORAGE                       SingleController: N
 VirtualTargetID: 0x000                                       ScanTriggered: N
     ObjectCount: 0x000                                          AVTEnabled: N
             WWN: 60080e50001f98f0000000004f20e355               RestoreCfg: N
    ModuleHandle: none                                        Page2CSubPage: Y
 FirmwareVersion: 7.86.33.xx                                 FailoverMethod: C
   ScanTaskState: 0x00000000
        LBPolicy: LeastQueueDepth
  ProtectionType: 0


Controller 'A' Status:
-----------------------
ControllerHandle: none                                    ControllerPresent: Y
    UTMLunExists: N                                                  Failed: N
   NumberOfPaths: 1                                          FailoverInProg: N
                                                                ServiceMode: N

    Path #1
    ---------
 DirectoryVertex: present                                           Present: Y
       PathState: OPTIMAL              
          PathId: 77010000 (hostId: 1, channelId: 0, targetId: 0)
  ProtCapability: 0


Controller 'B' Status:
-----------------------
ControllerHandle: none                                    ControllerPresent: Y
    UTMLunExists: N                                                  Failed: N
   NumberOfPaths: 1                                          FailoverInProg: N
                                                                ServiceMode: N

    Path #1
    ---------
 DirectoryVertex: present                                           Present: Y
       PathState: OPTIMAL              
          PathId: 77030000 (hostId: 3, channelId: 0, targetId: 0)
  ProtCapability: 0



Lun Information
---------------

    Lun #1 - WWN: 60080e50001f98f000000652523271b2
    ----------------
       LunObject: present                                 CurrentOwningPath: A
  RemoveEligible: N                                          BootOwningPath: A
   NotConfigured: N                                           PreferredPath: A
        DevState: OPTIMAL                                   ReportedPresent: Y
                                                            ReportedMissing: N
                                                      NeedsReservationCheck: N
                                                                  TASBitSet: Y
                                                                   NotReady: N
                                                                       Busy: N
                                                                  Quiescent: N
                                        VD_Ownership_Transfer_Attempt_Count: 0
                                                             ProtectionType: 0

    Controller 'A' Path
    --------------------
   NumLunObjects: 1                                         RoundRobinIndex: 0
         Path #1: LunPathDevice: present           
                       DevState: OPTIMAL              
                    RemoveState: 0x0  StartState: 0x1  PowerState: 0x0

    Controller 'B' Path
    --------------------
   NumLunObjects: 1                                         RoundRobinIndex: 0
         Path #1: LunPathDevice: present           
                       DevState: OPTIMAL              
                    RemoveState: 0x0  StartState: 0x1  PowerState: 0x0


    Lun #2 - WWN: 60080e50001f9920000001014f28bdc4
    ----------------
       LunObject: present                                 CurrentOwningPath: B
  RemoveEligible: N                                          BootOwningPath: B
   NotConfigured: N                                           PreferredPath: B
        DevState: OPTIMAL                                   ReportedPresent: Y
                                                            ReportedMissing: N
                                                      NeedsReservationCheck: N
                                                                  TASBitSet: Y
                                                                   NotReady: N
                                                                       Busy: N
                                                                  Quiescent: N
                                        VD_Ownership_Transfer_Attempt_Count: 0
                                                             ProtectionType: 0

    Controller 'A' Path
    --------------------
   NumLunObjects: 1                                         RoundRobinIndex: 0
         Path #1: LunPathDevice: present           
                       DevState: OPTIMAL              
                    RemoveState: 0x0  StartState: 0x1  PowerState: 0x0

    Controller 'B' Path
    --------------------
   NumLunObjects: 1                                         RoundRobinIndex: 0
         Path #1: LunPathDevice: present           
                       DevState: OPTIMAL              
                    RemoveState: 0x0  StartState: 0x1  PowerState: 0x0


    Lun #3 - WWN: 60080e50001f98f00000014f4f28bd6d
    ----------------
       LunObject: present                                 CurrentOwningPath: A
  RemoveEligible: N                                          BootOwningPath: A
   NotConfigured: N                                           PreferredPath: A
        DevState: OPTIMAL                                   ReportedPresent: Y
                                                            ReportedMissing: N
                                                      NeedsReservationCheck: N
                                                                  TASBitSet: Y
                                                                   NotReady: N
                                                                       Busy: N
                                                                  Quiescent: N
                                        VD_Ownership_Transfer_Attempt_Count: 0
                                                             ProtectionType: 0

    Controller 'A' Path
    --------------------
   NumLunObjects: 1                                         RoundRobinIndex: 0
         Path #1: LunPathDevice: present           
                       DevState: OPTIMAL              
                    RemoveState: 0x0  StartState: 0x1  PowerState: 0x0

    Controller 'B' Path
    --------------------
   NumLunObjects: 1                                         RoundRobinIndex: 0
         Path #1: LunPathDevice: present           
                       DevState: OPTIMAL              
                    RemoveState: 0x0  StartState: 0x1  PowerState: 0x0


    Lun #4 - WWN: 60080e50001f9920000001044f28bdd1
    ----------------
       LunObject: present                                 CurrentOwningPath: B
  RemoveEligible: N                                          BootOwningPath: B
   NotConfigured: N                                           PreferredPath: B
        DevState: OPTIMAL                                   ReportedPresent: Y
                                                            ReportedMissing: N
                                                      NeedsReservationCheck: N
                                                                  TASBitSet: Y
                                                                   NotReady: N
                                                                       Busy: N
                                                                  Quiescent: N
                                        VD_Ownership_Transfer_Attempt_Count: 0
                                                             ProtectionType: 0

    Controller 'A' Path
    --------------------
   NumLunObjects: 1                                         RoundRobinIndex: 0
         Path #1: LunPathDevice: present           
                       DevState: OPTIMAL              
                    RemoveState: 0x0  StartState: 0x1  PowerState: 0x0

    Controller 'B' Path
    --------------------
   NumLunObjects: 1                                         RoundRobinIndex: 0
         Path #1: LunPathDevice: present           
                       DevState: OPTIMAL              
                    RemoveState: 0x0  StartState: 0x1  PowerState: 0x0


    Lun #5 - WWN: 60080e50001f98f0000002264f30e909
    ----------------
       LunObject: present                                 CurrentOwningPath: A
  RemoveEligible: N                                          BootOwningPath: A
   NotConfigured: N                                           PreferredPath: A
        DevState: OPTIMAL                                   ReportedPresent: Y
                                                            ReportedMissing: N
                                                      NeedsReservationCheck: N
                                                                  TASBitSet: Y
                                                                   NotReady: N
                                                                       Busy: N
                                                                  Quiescent: N
                                        VD_Ownership_Transfer_Attempt_Count: 0
                                                             ProtectionType: 0

    Controller 'A' Path
    --------------------
   NumLunObjects: 1                                         RoundRobinIndex: 0
         Path #1: LunPathDevice: present           
                       DevState: OPTIMAL              
                    RemoveState: 0x0  StartState: 0x1  PowerState: 0x0

    Controller 'B' Path
    --------------------
   NumLunObjects: 1                                         RoundRobinIndex: 0
         Path #1: LunPathDevice: present           
                       DevState: OPTIMAL              
                    RemoveState: 0x0  StartState: 0x1  PowerState: 0x0


    Lun #6 - WWN: 60080e50001f9920000001074f28bddf
    ----------------
       LunObject: present                                 CurrentOwningPath: B
  RemoveEligible: N                                          BootOwningPath: B
   NotConfigured: N                                           PreferredPath: B
        DevState: OPTIMAL                                   ReportedPresent: Y
                                                            ReportedMissing: N
                                                      NeedsReservationCheck: N
                                                                  TASBitSet: Y
                                                                   NotReady: N
                                                                       Busy: N
                                                                  Quiescent: N
                                        VD_Ownership_Transfer_Attempt_Count: 0
                                                             ProtectionType: 0

    Controller 'A' Path
    --------------------
   NumLunObjects: 1                                         RoundRobinIndex: 0
         Path #1: LunPathDevice: present           
                       DevState: OPTIMAL              
                    RemoveState: 0x0  StartState: 0x1  PowerState: 0x0

    Controller 'B' Path
    --------------------
   NumLunObjects: 1                                         RoundRobinIndex: 0
         Path #1: LunPathDevice: present           
                       DevState: OPTIMAL              
                    RemoveState: 0x0  StartState: 0x1  PowerState: 0x0


    Lun #7 - WWN: 60080e50001f98f00000065052327180
    ----------------
       LunObject: present                                 CurrentOwningPath: A
  RemoveEligible: N                                          BootOwningPath: A
   NotConfigured: N                                           PreferredPath: A
        DevState: OPTIMAL                                   ReportedPresent: Y
                                                            ReportedMissing: N
                                                      NeedsReservationCheck: N
                                                                  TASBitSet: Y
                                                                   NotReady: N
                                                                       Busy: N
                                                                  Quiescent: N
                                        VD_Ownership_Transfer_Attempt_Count: 0
                                                             ProtectionType: 0

    Controller 'A' Path
    --------------------
   NumLunObjects: 1                                         RoundRobinIndex: 0
         Path #1: LunPathDevice: present           
                       DevState: OPTIMAL              
                    RemoveState: 0x0  StartState: 0x1  PowerState: 0x0

    Controller 'B' Path
    --------------------
   NumLunObjects: 1                                         RoundRobinIndex: 0
         Path #1: LunPathDevice: present           
                       DevState: OPTIMAL              
                    RemoveState: 0x0  StartState: 0x1  PowerState: 0x0


    Lun #8 - WWN: 60080e50001f9920000004d14f4c7cdd
    ----------------
       LunObject: present                                 CurrentOwningPath: B
  RemoveEligible: N                                          BootOwningPath: B
   NotConfigured: N                                           PreferredPath: B
        DevState: OPTIMAL                                   ReportedPresent: Y
                                                            ReportedMissing: N
                                                      NeedsReservationCheck: N
                                                                  TASBitSet: Y
                                                                   NotReady: N
                                                                       Busy: N
                                                                  Quiescent: N
                                        VD_Ownership_Transfer_Attempt_Count: 0
                                                             ProtectionType: 0

    Controller 'A' Path
    --------------------
   NumLunObjects: 1                                         RoundRobinIndex: 0
         Path #1: LunPathDevice: present           
                       DevState: OPTIMAL              
                    RemoveState: 0x0  StartState: 0x1  PowerState: 0x0

    Controller 'B' Path
    --------------------
   NumLunObjects: 1                                         RoundRobinIndex: 0
         Path #1: LunPathDevice: present           
                       DevState: OPTIMAL              
                    RemoveState: 0x0  StartState: 0x1  PowerState: 0x0


    Lun #9 - WWN: 60080e50001f98f0000005084f4c7cfc
    ----------------
       LunObject: present                                 CurrentOwningPath: A
  RemoveEligible: N                                          BootOwningPath: A
   NotConfigured: N                                           PreferredPath: A
        DevState: OPTIMAL                                   ReportedPresent: Y
                                                            ReportedMissing: N
                                                      NeedsReservationCheck: N
                                                                  TASBitSet: Y
                                                                   NotReady: N
                                                                       Busy: N
                                                                  Quiescent: N
                                        VD_Ownership_Transfer_Attempt_Count: 0
                                                             ProtectionType: 0

    Controller 'A' Path
    --------------------
   NumLunObjects: 1                                         RoundRobinIndex: 0
         Path #1: LunPathDevice: present           
                       DevState: OPTIMAL              
                    RemoveState: 0x0  StartState: 0x1  PowerState: 0x0

    Controller 'B' Path
    --------------------
   NumLunObjects: 1                                         RoundRobinIndex: 0
         Path #1: LunPathDevice: present           
                       DevState: OPTIMAL              
                    RemoveState: 0x0  StartState: 0x1  PowerState: 0x0


    Lun #10 - WWN: 60080e50001f9920000004d34f4c7d10
    ----------------
       LunObject: present                                 CurrentOwningPath: B
  RemoveEligible: N                                          BootOwningPath: B
   NotConfigured: N                                           PreferredPath: B
        DevState: OPTIMAL                                   ReportedPresent: Y
                                                            ReportedMissing: N
                                                      NeedsReservationCheck: N
                                                                  TASBitSet: Y
                                                                   NotReady: N
                                                                       Busy: N
                                                                  Quiescent: N
                                        VD_Ownership_Transfer_Attempt_Count: 0
                                                             ProtectionType: 0

    Controller 'A' Path
    --------------------
   NumLunObjects: 1                                         RoundRobinIndex: 0
         Path #1: LunPathDevice: present           
                       DevState: OPTIMAL              
                    RemoveState: 0x0  StartState: 0x1  PowerState: 0x0

    Controller 'B' Path
    --------------------
   NumLunObjects: 1                                         RoundRobinIndex: 0
         Path #1: LunPathDevice: present           
                       DevState: OPTIMAL              
                    RemoveState: 0x0  StartState: 0x1  PowerState: 0x0


    Lun #11 - WWN: 60080e50001f98f00000050a4f4c7d33
    ----------------
       LunObject: present                                 CurrentOwningPath: A
  RemoveEligible: N                                          BootOwningPath: A
   NotConfigured: N                                           PreferredPath: A
        DevState: OPTIMAL                                   ReportedPresent: Y
                                                            ReportedMissing: N
                                                      NeedsReservationCheck: N
                                                                  TASBitSet: Y
                                                                   NotReady: N
                                                                       Busy: N
                                                                  Quiescent: N
                                        VD_Ownership_Transfer_Attempt_Count: 0
                                                             ProtectionType: 0

    Controller 'A' Path
    --------------------
   NumLunObjects: 1                                         RoundRobinIndex: 0
         Path #1: LunPathDevice: present           
                       DevState: OPTIMAL              
                    RemoveState: 0x0  StartState: 0x1  PowerState: 0x0

    Controller 'B' Path
    --------------------
   NumLunObjects: 1                                         RoundRobinIndex: 0
         Path #1: LunPathDevice: present           
                       DevState: OPTIMAL              
                    RemoveState: 0x0  StartState: 0x1  PowerState: 0x0


    Lun #12 - WWN: 60080e50001f9920000004d54f4c7d5c
    ----------------
       LunObject: present                                 CurrentOwningPath: B
  RemoveEligible: N                                          BootOwningPath: B
   NotConfigured: N                                           PreferredPath: B
        DevState: OPTIMAL                                   ReportedPresent: Y
                                                            ReportedMissing: N
                                                      NeedsReservationCheck: N
                                                                  TASBitSet: Y
                                                                   NotReady: N
                                                                       Busy: N
                                                                  Quiescent: N
                                        VD_Ownership_Transfer_Attempt_Count: 0
                                                             ProtectionType: 0

    Controller 'A' Path
    --------------------
   NumLunObjects: 1                                         RoundRobinIndex: 0
         Path #1: LunPathDevice: present           
                       DevState: OPTIMAL              
                    RemoveState: 0x0  StartState: 0x1  PowerState: 0x0

    Controller 'B' Path
    --------------------
   NumLunObjects: 1                                         RoundRobinIndex: 0
         Path #1: LunPathDevice: present           
                       DevState: OPTIMAL              
                    RemoveState: 0x0  StartState: 0x1  PowerState: 0x0
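
A report of this length is easier to check with a filter. The sketch below reads such a report on stdin and prints only the LUNs whose CurrentOwningPath differs from their PreferredPath (which usually hints that a failover took place) or that show a DevState other than OPTIMAL. The helper name lun_path_report is hypothetical, and the parsing assumes the field labels exactly as they appear in the output above.

```shell
#!/bin/sh
# Sketch only: lun_path_report is a hypothetical helper name; the field
# labels are assumed to match the report layout shown above.
# Reads the report on stdin; prints nothing when every LUN is healthy.
lun_path_report() {
    awk '
    # Return the field that follows the label "key" on the current line.
    function field_after(key,    i) {
        for (i = 1; i < NF; i++)
            if ($i == key) return $(i + 1)
        return ""
    }
    # Print the findings collected for the LUN section just finished.
    function report() {
        if (lun == "") return
        if (cur != pref)
            print "Lun " lun ": owned by " cur ", preferred " pref
        if (bad)
            print "Lun " lun ": DevState not OPTIMAL"
    }
    # A header such as "Lun #3 - WWN: ..." starts a new LUN section.
    $1 == "Lun" && $2 ~ /^#[0-9]+$/ {
        report()
        lun = substr($2, 2); cur = ""; pref = ""; bad = 0
        next
    }
    lun != "" {
        v = field_after("CurrentOwningPath:"); if (v != "") cur = v
        v = field_after("PreferredPath:");     if (v != "") pref = v
        v = field_after("DevState:");          if (v != "" && v != "OPTIMAL") bad = 1
    }
    END { report() }
    '
}
```

The command that produced the report is not shown on this page, so a usage example has to assume a saved copy, e.g. lun_path_report < lun_report.txt, where lun_report.txt is a hypothetical file name. For the report above the filter prints nothing, because every LUN sits on its preferred controller and all states are OPTIMAL.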

What happens during a single FC cable/port failure?

Because of the 2 redundant paths to the SGI IS5500, E5460 and E2760, the XFS filesystems will stay mounted; dmesg will first report:
qla2xxx [0000:0b:00.0]-500b:1: LOOP DOWN detected (2 3 0 0).
rport-1:0-0: blocked FC remote port time out: removing target and saving binding
mpp 1:0:0:1: rejecting I/O to offline device
mpp 1:0:0:1: rejecting I/O to offline device
mpp 1:0:0:1: rejecting I/O to offline device
mpp 1:0:0:1: rejecting I/O to offline device
mpp 1:0:0:1: rejecting I/O to offline device
mpp 1:0:0:1: rejecting I/O to offline device
mpp 1:0:0:1: rejecting I/O to offline device
mpp 1:0:0:2: rejecting I/O to offline device
mpp 1:0:0:3: rejecting I/O to offline device
mpp 1:0:0:3: rejecting I/O to offline device
mpp 1:0:0:3: rejecting I/O to offline device
mpp 1:0:0:3: rejecting I/O to offline device
mpp 1:0:0:4: rejecting I/O to offline device
...
94 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:1 Selection Retry count exhausted
7 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0 Path Failed
496 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:1 No new path: fall to failover controller case. vcmnd SN 154546623 pdev H1:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
497 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:1 Failed controller to 1. retry. vcmnd SN 154546623 pdev H1:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
94 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:1 Selection Retry count exhausted
496 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:1 No new path: fall to failover controller case. vcmnd SN 154546609 pdev H1:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
497 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:1 Failed controller to 1. retry. vcmnd SN 154546609 pdev H1:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
94 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:1 Selection Retry count exhausted
496 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:1 No new path: fall to failover controller case. vcmnd SN 154546608 pdev H1:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
497 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:1 Failed controller to 1. retry. vcmnd SN 154546608 pdev H1:C0:T0:L1 0x00/0x00/0x00 0x00010000 mpp_status:6
94 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:3 Selection Retry count exhausted
496 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:3 No new path: fall to failover controller case. vcmnd SN 154546622 pdev H1:C0:T0:L3 0x00/0x00/0x00 0x00010000 mpp_status:6
497 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:3 Failed controller to 1. retry. vcmnd SN 154546622 pdev H1:C0:T0:L3 0x00/0x00/0x00 0x00010000 mpp_status:6
94 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:3 Selection Retry count exhausted
496 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:3 No new path: fall to failover controller case. vcmnd SN 154546599 pdev H1:C0:T0:L3 0x00/0x00/0x00 0x00010000 mpp_status:6
497 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:3 Failed controller to 1. retry. vcmnd SN 154546599 pdev H1:C0:T0:L3 0x00/0x00/0x00 0x00010000 mpp_status:6
94 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:3 Selection Retry count exhausted
496 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:3 No new path: fall to failover controller case. vcmnd SN 154546598 pdev H1:C0:T0:L3 0x00/0x00/0x00 0x00010000 mpp_status:6
497 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:3 Failed controller to 1. retry. vcmnd SN 154546598 pdev H1:C0:T0:L3 0x00/0x00/0x00 0x00010000 mpp_status:6
94 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:5 Selection Retry count exhausted
496 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:5 No new path: fall to failover controller case. vcmnd SN 154546621 pdev H1:C0:T0:L5 0x00/0x00/0x00 0x00010000 mpp_status:6
497 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:5 Failed controller to 1. retry. vcmnd SN 154546621 pdev H1:C0:T0:L5 0x00/0x00/0x00 0x00010000 mpp_status:6
94 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:5 Selection Retry count exhausted
496 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:5 No new path: fall to failover controller case. vcmnd SN 154546602 pdev H1:C0:T0:L5 0x00/0x00/0x00 0x00010000 mpp_status:6
497 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:5 Failed controller to 1. retry. vcmnd SN 154546602 pdev H1:C0:T0:L5 0x00/0x00/0x00 0x00010000 mpp_status:6
94 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:5 Selection Retry count exhausted
496 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:5 No new path: fall to failover controller case. vcmnd SN 154546601 pdev H1:C0:T0:L5 0x00/0x00/0x00 0x00010000 mpp_status:6
497 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:5 Failed controller to 1. retry. vcmnd SN 154546601 pdev H1:C0:T0:L5 0x00/0x00/0x00 0x00010000 mpp_status:6
94 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:7 Selection Retry count exhausted
496 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:7 No new path: fall to failover controller case. vcmnd SN 154546550 pdev H1:C0:T0:L7 0x00/0x00/0x00 0x00010000 mpp_status:6
497 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:7 Failed controller to 1. retry. vcmnd SN 154546550 pdev H1:C0:T0:L7 0x00/0x00/0x00 0x00010000 mpp_status:6
94 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:7 Selection Retry count exhausted
496 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:7 No new path: fall to failover controller case. vcmnd SN 154546549 pdev H1:C0:T0:L7 0x00/0x00/0x00 0x00010000 mpp_status:6
497 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:7 Failed controller to 1. retry. vcmnd SN 154546549 pdev H1:C0:T0:L7 0x00/0x00/0x00 0x00010000 mpp_status:6
94 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:7 Selection Retry count exhausted
496 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:7 No new path: fall to failover controller case. vcmnd SN 154546548 pdev H1:C0:T0:L7 0x00/0x00/0x00 0x00010000 mpp_status:6
497 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:7 Failed controller to 1. retry. vcmnd SN 154546548 pdev H1:C0:T0:L7 0x00/0x00/0x00 0x00010000 mpp_status:6
94 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:7 Selection Retry count exhausted
496 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:7 No new path: fall to failover controller case. vcmnd SN 154546547 pdev H1:C0:T0:L7 0x00/0x00/0x00 0x00010000 mpp_status:6
497 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:7 Failed controller to 1. retry. vcmnd SN 154546547 pdev H1:C0:T0:L7 0x00/0x00/0x00 0x00010000 mpp_status:6
94 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:9 Selection Retry count exhausted
496 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:9 No new path: fall to failover controller case. vcmnd SN 154546558 pdev H1:C0:T0:L9 0x00/0x00/0x00 0x00010000 mpp_status:6
497 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:9 Failed controller to 1. retry. vcmnd SN 154546558 pdev H1:C0:T0:L9 0x00/0x00/0x00 0x00010000 mpp_status:6
94 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:9 Selection Retry count exhausted
496 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:9 No new path: fall to failover controller case. vcmnd SN 154546557 pdev H1:C0:T0:L9 0x00/0x00/0x00 0x00010000 mpp_status:6
497 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:9 Failed controller to 1. retry. vcmnd SN 154546557 pdev H1:C0:T0:L9 0x00/0x00/0x00 0x00010000 mpp_status:6
94 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:9 Selection Retry count exhausted
496 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:9 No new path: fall to failover controller case. vcmnd SN 154546556 pdev H1:C0:T0:L9 0x00/0x00/0x00 0x00010000 mpp_status:6
497 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:9 Failed controller to 1. retry. vcmnd SN 154546556 pdev H1:C0:T0:L9 0x00/0x00/0x00 0x00010000 mpp_status:6
94 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:9 Selection Retry count exhausted
496 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:9 No new path: fall to failover controller case. vcmnd SN 154546555 pdev H1:C0:T0:L9 0x00/0x00/0x00 0x00010000 mpp_status:6
497 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:9 Failed controller to 1. retry. vcmnd SN 154546555 pdev H1:C0:T0:L9 0x00/0x00/0x00 0x00010000 mpp_status:6
94 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:11 Selection Retry count exhausted
496 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:11 No new path: fall to failover controller case. vcmnd SN 154546566 pdev H1:C0:T0:L11 0x00/0x00/0x00 0x00010000 mpp_status:6
497 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:11 Failed controller to 1. retry. vcmnd SN 154546566 pdev H1:C0:T0:L11 0x00/0x00/0x00 0x00010000 mpp_status:6
94 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:11 Selection Retry count exhausted
496 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:11 No new path: fall to failover controller case. vcmnd SN 154546565 pdev H1:C0:T0:L11 0x00/0x00/0x00 0x00010000 mpp_status:6
497 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:11 Failed controller to 1. retry. vcmnd SN 154546565 pdev H1:C0:T0:L11 0x00/0x00/0x00 0x00010000 mpp_status:6
94 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:11 Selection Retry count exhausted
496 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:11 No new path: fall to failover controller case. vcmnd SN 154546564 pdev H1:C0:T0:L11 0x00/0x00/0x00 0x00010000 mpp_status:6
497 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:11 Failed controller to 1. retry. vcmnd SN 154546564 pdev H1:C0:T0:L11 0x00/0x00/0x00 0x00010000 mpp_status:6
94 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:11 Selection Retry count exhausted
496 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:11 No new path: fall to failover controller case. vcmnd SN 154546563 pdev H1:C0:T0:L11 0x00/0x00/0x00 0x00010000 mpp_status:6
497 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:11 Failed controller to 1. retry. vcmnd SN 154546563 pdev H1:C0:T0:L11 0x00/0x00/0x00 0x00010000 mpp_status:6
10 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:1 Failover command issued
746 [RAIDarray.mpp]Device(0x12e66000) is already removed, cannot send Synchronous IO request
746 [RAIDarray.mpp]Device(0x11791800) is already removed, cannot send Synchronous IO request
746 [RAIDarray.mpp]Device(0x11791000) is already removed, cannot send Synchronous IO request
746 [RAIDarray.mpp]Device(0x11790800) is already removed, cannot send Synchronous IO request
746 [RAIDarray.mpp]Device(0x11790000) is already removed, cannot send Synchronous IO request
746 [RAIDarray.mpp]Device(0x117d9800) is already removed, cannot send Synchronous IO request
746 [RAIDarray.mpp]Device(0x117d9000) is already removed, cannot send Synchronous IO request
746 [RAIDarray.mpp]Device(0x1271a800) is already removed, cannot send Synchronous IO request
801 [RAIDarray.mpp]Failover succeeded to T3_CMS_SGI_STORAGE:1
Then, when you reconnect the FC cable:
qla2xxx [0000:0b:00.0]-500a:1: LOOP UP detected (8 Gbps).
scsi 1:0:0:0: Direct-Access     LSI      INF-01-00        0786 PQ: 1 ANSI: 5
736 [RAIDarray.mpp]Host 1 Target 0 Lun 0 Is a physical device but is an Unconfigured Device. 
scsi 1:0:0:1: Direct-Access     LSI      INF-01-00        0786 PQ: 0 ANSI: 5
scsi 1:0:0:2: Direct-Access     LSI      INF-01-00        0786 PQ: 0 ANSI: 5
scsi 1:0:0:3: Direct-Access     LSI      INF-01-00        0786 PQ: 0 ANSI: 5
scsi 1:0:0:4: Direct-Access     LSI      INF-01-00        0786 PQ: 0 ANSI: 5
scsi 1:0:0:5: Direct-Access     LSI      INF-01-00        0786 PQ: 0 ANSI: 5
scsi 1:0:0:6: Direct-Access     LSI      INF-01-00        0786 PQ: 0 ANSI: 5
scsi 1:0:0:7: Direct-Access     LSI      INF-01-00        0786 PQ: 0 ANSI: 5
scsi 1:0:0:8: Direct-Access     LSI      INF-01-00        0786 PQ: 0 ANSI: 5
scsi 1:0:0:9: Direct-Access     LSI      INF-01-00        0786 PQ: 0 ANSI: 5
scsi 1:0:0:10: Direct-Access     LSI      INF-01-00        0786 PQ: 0 ANSI: 5
scsi 1:0:0:11: Direct-Access     LSI      INF-01-00        0786 PQ: 0 ANSI: 5
scsi 1:0:0:12: Direct-Access     LSI      INF-01-00        0786 PQ: 0 ANSI: 5
dCache won't notice anything!
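
To check at a glance which LUNs were hit during such a cable pull, the RAIDarray.mpp messages can be summarised per LUN. The following is only a minimal sketch: it assumes the message layout shown in the excerpts above (where the last number after the array name matches the `L<n>` in the `pdev` field), and the function name `summarise_mpp` is made up for illustration. Where the kernel messages are read from is left to the reader.

```python
import re
from collections import Counter

# Matches lines such as
#   "496 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:5 No new path: ..."
# Only the last number of the triple is used; in the excerpts above it
# agrees with the L<n> of the "pdev H1:C0:T0:L<n>" field, i.e. the LUN.
MPP_LINE = re.compile(
    r"^(?P<code>\d+) \[RAIDarray\.mpp\](?P<array>\w+):\d+:\d+:(?P<lun>\d+) "
)


def summarise_mpp(lines):
    """Count per-LUN path messages and report whether a failover completed."""
    per_lun = Counter()
    failover_ok = False
    for line in lines:
        line = line.strip()
        if "Failover succeeded" in line:
            failover_ok = True
        match = MPP_LINE.match(line)
        if match:
            per_lun[int(match.group("lun"))] += 1
    return per_lun, failover_ok


if __name__ == "__main__":
    demo = [
        "94 [RAIDarray.mpp]T3_CMS_SGI_STORAGE:0:0:5 Selection Retry count exhausted",
        "801 [RAIDarray.mpp]Failover succeeded to T3_CMS_SGI_STORAGE:1",
    ]
    counts, ok = summarise_mpp(demo)
    print(dict(counts), ok)
```

A non-empty per-LUN count together with a missing "Failover succeeded" line would be the case worth investigating by hand.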

iLO3 internal configuration

HPProLiantDL380G7ILO3

iLO3 external configuration ( OS involved )

To query the HW status (fans, temperatures, ...) some RPMs must be installed; they are available in /afs/psi.ch/software/linux/dist/scientific/6/others/all/.
[root@t3fs13 ~]# rpm -qa | grep hp 
hp-health-9.25-1551.9.rhel6.x86_64
hpsmh-6.3.1-23.x86_64
hpacucli-9.30-15.0.x86_64

Services:
[root@t3fs13 ~]# chkconfig  --list | grep hp 
hp-asrd        	0:off	1:off	2:on	3:on	4:on	5:on	6:off
hp-health      	0:off	1:off	2:on	3:on	4:on	5:on	6:off
hpsmhd         	0:off	1:off	2:off	3:off	4:off	5:off	6:off
Running daemons:
[root@t3fs13 ~]# ps fax | grep -i hp
  568 ?        S      0:02  \_ [hpsa]
27596 pts/0    S+     0:00          \_ grep -i hp
 3465 ?        Ssl   10:55 hpasmlited -f /dev/hpilo
 3501 ?        Ss     0:00 /opt/hp/hp-health/bin/hp-asrd -p 1 -t 600
 3502 ?        S      1:30  \_ /opt/hp/hp-health/bin/hp-asrd -p 1 -t 600
When running, these HP daemons are connected to the iLO3:
[root@t3fs13 ~]# ll /dev/hpilo/
total 0
crw-rw---- 1 root root 246, 0 Sep 13 10:25 d0ccb0
crw-rw---- 1 root root 246, 1 Sep 13 10:25 d0ccb1
crw-rw---- 1 root root 246, 2 Sep 13 10:25 d0ccb2
crw-rw---- 1 root root 246, 3 Sep 13 10:25 d0ccb3
crw-rw---- 1 root root 246, 4 Sep 13 10:25 d0ccb4
crw-rw---- 1 root root 246, 5 Sep 13 10:25 d0ccb5
crw-rw---- 1 root root 246, 6 Sep 13 10:25 d0ccb6
crw-rw---- 1 root root 246, 7 Sep 13 10:25 d0ccb7

10Gbit/s HP NC552SFP 10Gb 2-port Ethernet Server Adapter

Each server has a dual-port 10Gbit/s Emulex card (HP Model Number 614203-B21) attached through a PCI Express x8 link, as highlighted in this lspci -vvv output:
08:00.0 Ethernet controller: ServerEngines Corp. Emulex OneConnect 10Gb NIC (be3) (rev 01)
...
		LnkSta:	Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-     <-----------------------
...
	Kernel driver in use: be2net
	Kernel modules: be2net
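
As a rough worked check of why the x8 link matters: at 5GT/s PCI Express uses 8b/10b encoding, so each lane carries 4 Gbit/s of payload per direction, and eight lanes give 32 Gbit/s, which is above the 20 Gbit/s the two 10Gbit/s ports can deliver together (PCIe packet overhead lowers the usable figure somewhat). The sketch below derives that number from an `LnkSta:` line; the function name is an illustrative assumption, and it deliberately refuses speeds above 5GT/s because those use a different encoding.

```python
import re


def link_bandwidth_gbit(lnksta_line):
    """Return payload Gbit/s per direction for a 2.5 or 5 GT/s 'LnkSta:' line."""
    match = re.search(r"Speed ([\d.]+)GT/s, Width x(\d+)", lnksta_line)
    if match is None:
        raise ValueError("no Speed/Width found in: %r" % lnksta_line)
    gt_per_s = float(match.group(1))
    lanes = int(match.group(2))
    if gt_per_s > 5:
        raise ValueError("8b/10b encoding only applies up to 5 GT/s")
    # 8b/10b: 8 payload bits for every 10 bits on the wire.
    return gt_per_s * lanes * 8 / 10


if __name__ == "__main__":
    line = "LnkSta:\tSpeed 5GT/s, Width x8, TrErr- Train- SlotClk+"
    print(link_bandwidth_gbit(line))  # 32.0
```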

Following the IBM paper on tuning 10Gbit/s network cards on Linux, we have stopped the irqbalance daemon and configured the following in /etc/sysctl.conf:
# cat /etc/sysctl.conf
# Puppet Managed File
# 
# Kernel sysctl configuration file for Red Hat Linux
#
# For binary values, 0 is disabled, 1 is enabled.  See sysctl(8) and
# sysctl.conf(5) for more details.

# Controls IP packet forwarding
net.ipv4.ip_forward = 0

# Controls source route verification
net.ipv4.conf.default.rp_filter = 1

# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0

# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0

# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1

# Controls the use of TCP syncookies
net.ipv4.tcp_syncookies = 1

# Disable netfilter on bridges.
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0

# by martinelli according to IBM Tuning 10Gb network cards on Linux 
# http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cts=1331221888369&ved=0CCcQFjAA&url=http%3A%2F%2Fkernel.org%2Fdoc%2Fols%2F2009%2Fols2009-pages-169-184.pdf&ei=edVYT9bRNPPc4QSugZHWDw&usg=AFQjCNGl95lznOgwBSFzpuZ3QlXohXM1Xw&sig2=ZlZTvy_XY4eTGHam_JflIw
net.core.rmem_max = 16777216
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_rmem = 4096 87380 3526656
net.ipv4.tcp_wmem = 4096 87380 3526656
net.ipv4.tcp_sack = 1   <------ DON'T SWITCH THIS TO 0 !!
net.core.netdev_max_backlog = 300000
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_fin_timeout = 20
net.ipv4.tcp_moderate_rcvbuf = 1
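
The 3526656-byte maximum in tcp_rmem and tcp_wmem limits how much data a single TCP stream can keep in flight. As a rough worked check, the sketch below parses such settings and computes the longest round-trip time over which one stream could still fill a 10Gbit/s link with that buffer. It ignores the part of the buffer the kernel keeps for its own bookkeeping, so the real limit is lower; the helper names are illustrative only.

```python
def parse_sysctl(text):
    """Parse 'key = value' lines, skipping comments and blank lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def max_rtt_ms(buffer_bytes, link_gbit):
    """Longest RTT in ms at which buffer_bytes in flight still fills the link."""
    return buffer_bytes * 8 / (link_gbit * 1e9) * 1e3


if __name__ == "__main__":
    conf = parse_sysctl(
        "net.ipv4.tcp_rmem = 4096 87380 3526656\n"
        "net.ipv4.tcp_wmem = 4096 87380 3526656\n"
    )
    # The third field of tcp_rmem is the per-socket maximum in bytes.
    rmem_max = int(conf["net.ipv4.tcp_rmem"].split()[2])
    print("%.2f ms" % max_rtt_ms(rmem_max, 10))
```

With these values the result is a little under 3 ms, which fits LAN and campus distances; transfers over longer paths would rely on several parallel streams to use the full link.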

dCache 2.10

Please look at the Puppet recipes (the leaf is SL6_dcache_fs210_fs13fs14.pp):
  • SL6_dcache_fs210_fs13fs14.pp
  • SL6_dcache_fs210.pp
  • SL6_dcache_.pp
  • tier3-baseclasses.pp

CSCS dCache page

LCGTier2/ServiceDcache: Fabio never uses it, but we list it here for reference.

Important files in a nutshell

find /etc/dcache/

/etc/dcache/dcache.conf  <-- main dCache conf, it should be the same on each node

/etc/dcache/logback.xml   <-- to tune the logging verbosity

/etc/logrotate.d/dcache

/etc/dcache/layouts
/etc/dcache/layouts/t3fs13.conf   <-- specific node conf

# dCache Logs
/var/log/dcache/t3fs13-Domain-dcap.log
/var/log/dcache/t3fs13-Domain-gridftp.log
/var/log/dcache/t3fs13-Domain-gsidcap.log
/var/log/dcache/t3fs13-Domain-gsiftp.log
/var/log/dcache/t3fs13-Domain-pool.log
/var/log/dcache/t3fs13-Domain-pool-t3fs13_cms_0.log
/var/log/dcache/t3fs13-Domain-pool-t3fs13_cms_10.log
/var/log/dcache/t3fs13-Domain-pool-t3fs13_cms_11.log
/var/log/dcache/t3fs13-Domain-pool-t3fs13_cms_1.log
/var/log/dcache/t3fs13-Domain-pool-t3fs13_cms_2.log
/var/log/dcache/t3fs13-Domain-pool-t3fs13_cms_3.log
/var/log/dcache/t3fs13-Domain-pool-t3fs13_cms_4.log
/var/log/dcache/t3fs13-Domain-pool-t3fs13_cms_5.log
/var/log/dcache/t3fs13-Domain-pool-t3fs13_cms_6.log
/var/log/dcache/t3fs13-Domain-pool-t3fs13_cms_7.log
/var/log/dcache/t3fs13-Domain-pool-t3fs13_cms_8.log
/var/log/dcache/t3fs13-Domain-pool-t3fs13_cms_9.log
/var/log/dcache/t3fs13-Domain-pool-t3fs13_cms.log
/var/log/dcache/t3fs13-Domain-pool-t3fs13_ops_10.log
/var/log/dcache/t3fs13-Domain-pool-t3fs13_ops_11.log
/var/log/dcache/t3fs13-Domain-pool-t3fs13_ops_1.log
/var/log/dcache/t3fs13-Domain-pool-t3fs13_ops_2.log
/var/log/dcache/t3fs13-Domain-pool-t3fs13_ops_3.log
/var/log/dcache/t3fs13-Domain-pool-t3fs13_ops_4.log
/var/log/dcache/t3fs13-Domain-pool-t3fs13_ops_5.log
/var/log/dcache/t3fs13-Domain-pool-t3fs13_ops_6.log
/var/log/dcache/t3fs13-Domain-pool-t3fs13_ops_7.log
/var/log/dcache/t3fs13-Domain-pool-t3fs13_ops_8.log
/var/log/dcache/t3fs13-Domain-pool-t3fs13_ops_9.log
/var/log/dcache/t3fs13-Domain-pool-t3fs13_ops.log

/etc/dcache/dcache.conf

The same as NodeTypedCacheStorageElement#etc_dcache_dcache_conf

/etc/dcache/layouts/t3fs13.conf

# Puppet Managed File 

[${host.name}-Domain-pool-t3fs13_cms]
# t3fs13_cms
[${host.name}-Domain-pool-t3fs13_cms/pool]
pool.name=t3fs13_cms
pool.path=/mnt/data06/t3fs13_cms/pool
pool.wait-for-files=${pool.path}/data

[${host.name}-Domain-pool-t3fs13_cms_1]
# t3fs13_cms_1
[${host.name}-Domain-pool-t3fs13_cms_1/pool]
pool.name=t3fs13_cms_1
pool.path=/mnt/data01/t3fs13_cms/pool
pool.wait-for-files=${pool.path}/data

[${host.name}-Domain-pool-t3fs13_cms_2]
# t3fs13_cms_2
[${host.name}-Domain-pool-t3fs13_cms_2/pool]
pool.name=t3fs13_cms_2
pool.path=/mnt/data02/t3fs13_cms/pool
pool.wait-for-files=${pool.path}/data

[${host.name}-Domain-pool-t3fs13_cms_3]
# t3fs13_cms_3
[${host.name}-Domain-pool-t3fs13_cms_3/pool]
pool.name=t3fs13_cms_3
pool.path=/mnt/data03/t3fs13_cms/pool
pool.wait-for-files=${pool.path}/data

[${host.name}-Domain-pool-t3fs13_cms_4]
# t3fs13_cms_4
[${host.name}-Domain-pool-t3fs13_cms_4/pool]
pool.name=t3fs13_cms_4
pool.path=/mnt/data04/t3fs13_cms/pool
pool.wait-for-files=${pool.path}/data

[${host.name}-Domain-pool-t3fs13_cms_5]
# t3fs13_cms_5
[${host.name}-Domain-pool-t3fs13_cms_5/pool]
pool.name=t3fs13_cms_5
pool.path=/mnt/data05/t3fs13_cms/pool
pool.wait-for-files=${pool.path}/data

[${host.name}-Domain-pool-t3fs13_cms_6]
# t3fs13_cms_6
[${host.name}-Domain-pool-t3fs13_cms_6/pool]
pool.name=t3fs13_cms_6
pool.path=/mnt/data07/t3fs13_cms/pool
pool.wait-for-files=${pool.path}/data

[${host.name}-Domain-pool-t3fs13_cms_7]
# t3fs13_cms_7
[${host.name}-Domain-pool-t3fs13_cms_7/pool]
pool.name=t3fs13_cms_7
pool.path=/mnt/data08/t3fs13_cms/pool
pool.wait-for-files=${pool.path}/data

[${host.name}-Domain-pool-t3fs13_cms_8]
# t3fs13_cms_8
[${host.name}-Domain-pool-t3fs13_cms_8/pool]
pool.name=t3fs13_cms_8
pool.path=/mnt/data09/t3fs13_cms/pool
pool.wait-for-files=${pool.path}/data

[${host.name}-Domain-pool-t3fs13_cms_9]
# t3fs13_cms_9
[${host.name}-Domain-pool-t3fs13_cms_9/pool]
pool.name=t3fs13_cms_9
pool.path=/mnt/data10/t3fs13_cms/pool
pool.wait-for-files=${pool.path}/data

[${host.name}-Domain-pool-t3fs13_cms_10]
# t3fs13_cms_10
[${host.name}-Domain-pool-t3fs13_cms_10/pool]
pool.name=t3fs13_cms_10
pool.path=/mnt/data11/t3fs13_cms/pool
pool.wait-for-files=${pool.path}/data

[${host.name}-Domain-pool-t3fs13_cms_11]
# t3fs13_cms_11
[${host.name}-Domain-pool-t3fs13_cms_11/pool]
pool.name=t3fs13_cms_11
pool.path=/mnt/data12/t3fs13_cms/pool
pool.wait-for-files=${pool.path}/data

[${host.name}-Domain-pool-t3fs13_cms_0]
# t3fs13_cms_0
[${host.name}-Domain-pool-t3fs13_cms_0/pool]
pool.name=t3fs13_cms_0
pool.path=/mnt/data00/t3fs13_cms/pool
pool.wait-for-files=${pool.path}/data

[${host.name}-Domain-pool-t3fs13_ops]
# t3fs13_ops
[${host.name}-Domain-pool-t3fs13_ops/pool]
pool.name=t3fs13_ops
pool.path=/mnt/data06/t3fs13_ops/pool
pool.wait-for-files=${pool.path}/data

[${host.name}-Domain-pool-t3fs13_ops_1]
# t3fs13_ops_1
[${host.name}-Domain-pool-t3fs13_ops_1/pool]
pool.name=t3fs13_ops_1
pool.path=/mnt/data01/t3fs13_ops/pool
pool.wait-for-files=${pool.path}/data

[${host.name}-Domain-pool-t3fs13_ops_2]
# t3fs13_ops_2
[${host.name}-Domain-pool-t3fs13_ops_2/pool]
pool.name=t3fs13_ops_2
pool.path=/mnt/data02/t3fs13_ops/pool
pool.wait-for-files=${pool.path}/data

[${host.name}-Domain-pool-t3fs13_ops_3]
# t3fs13_ops_3
[${host.name}-Domain-pool-t3fs13_ops_3/pool]
pool.name=t3fs13_ops_3
pool.path=/mnt/data03/t3fs13_ops/pool
pool.wait-for-files=${pool.path}/data

[${host.name}-Domain-pool-t3fs13_ops_4]
# t3fs13_ops_4
[${host.name}-Domain-pool-t3fs13_ops_4/pool]
pool.name=t3fs13_ops_4
pool.path=/mnt/data04/t3fs13_ops/pool
pool.wait-for-files=${pool.path}/data

[${host.name}-Domain-pool-t3fs13_ops_5]
# t3fs13_ops_5
[${host.name}-Domain-pool-t3fs13_ops_5/pool]
pool.name=t3fs13_ops_5
pool.path=/mnt/data05/t3fs13_ops/pool
pool.wait-for-files=${pool.path}/data

[${host.name}-Domain-pool-t3fs13_ops_6]
# t3fs13_ops_6
[${host.name}-Domain-pool-t3fs13_ops_6/pool]
pool.name=t3fs13_ops_6
pool.path=/mnt/data07/t3fs13_ops/pool
pool.wait-for-files=${pool.path}/data

[${host.name}-Domain-pool-t3fs13_ops_7]
# t3fs13_ops_7
[${host.name}-Domain-pool-t3fs13_ops_7/pool]
pool.name=t3fs13_ops_7
pool.path=/mnt/data08/t3fs13_ops/pool
pool.wait-for-files=${pool.path}/data

[${host.name}-Domain-pool-t3fs13_ops_8]
# t3fs13_ops_8
[${host.name}-Domain-pool-t3fs13_ops_8/pool]
pool.name=t3fs13_ops_8
pool.path=/mnt/data09/t3fs13_ops/pool
pool.wait-for-files=${pool.path}/data

[${host.name}-Domain-pool-t3fs13_ops_9]
# t3fs13_ops_9
[${host.name}-Domain-pool-t3fs13_ops_9/pool]
pool.name=t3fs13_ops_9
pool.path=/mnt/data10/t3fs13_ops/pool
pool.wait-for-files=${pool.path}/data

[${host.name}-Domain-pool-t3fs13_ops_10]
# t3fs13_ops_10
[${host.name}-Domain-pool-t3fs13_ops_10/pool]
pool.name=t3fs13_ops_10
pool.path=/mnt/data11/t3fs13_ops/pool
pool.wait-for-files=${pool.path}/data

[${host.name}-Domain-pool-t3fs13_ops_11]
# t3fs13_ops_11
[${host.name}-Domain-pool-t3fs13_ops_11/pool]
pool.name=t3fs13_ops_11
pool.path=/mnt/data12/t3fs13_ops/pool
pool.wait-for-files=${pool.path}/data

[${host.name}-Domain-dcap]
[${host.name}-Domain-dcap/dcap]
dcap.authz.anonymous-operations=READONLY

[${host.name}-Domain-gsidcap]
[${host.name}-Domain-gsidcap/dcap]
dcap.authn.protocol=gsi

[${host.name}-Domain-gsiftp]
[${host.name}-Domain-gsiftp/ftp]
ftp.authn.protocol=gsi
ftp.enable.overwrite=false
ftp.mover.queue=wan
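
Every pool stanza in this layout follows the same six-line pattern, with only the pool name and the mount point changing. Note that the mapping from pool name to mount point is not regular (for example t3fs13_cms lives on /mnt/data06 and t3fs13_cms_6 on /mnt/data07), so it has to be given explicitly. The real file is generated by Puppet; the function below is not that template, only a hypothetical sketch of the pattern for checking a stanza by hand.

```python
def pool_stanza(pool_name, mount_point, vo_dir):
    """Return the layout lines for one pool domain, following the pattern above."""
    domain = "${host.name}-Domain-pool-%s" % pool_name
    return "\n".join([
        "[%s]" % domain,
        "# %s" % pool_name,
        "[%s/pool]" % domain,
        "pool.name=%s" % pool_name,
        "pool.path=%s/%s/pool" % (mount_point, vo_dir),
        "pool.wait-for-files=${pool.path}/data",
    ])


if __name__ == "__main__":
    # Explicit (pool, mount point, VO directory) triples taken from the layout above.
    for name, mount, vo in [
        ("t3fs13_cms", "/mnt/data06", "t3fs13_cms"),
        ("t3fs13_ops_1", "/mnt/data01", "t3fs13_ops"),
    ]:
        print(pool_stanza(name, mount, vo))
        print()
```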

dCache gap tuning according to the type of pool

[root@t3fs13 ~]# grep gap /mnt/data01/t3fs13_cms/pool/setup 
set gap 4g
[root@t3fs13 ~]# grep gap /mnt/data01/t3fs13_ops/pool/setup 
set gap 10485760
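
The two gap values are written in different notations: 10485760 bytes is exactly 10 MiB, while `4g` uses a size suffix. To compare them on the same scale, the sketch below converts a `set gap` line to bytes. It assumes that the k/m/g suffixes are binary multipliers (powers of 1024); that interpretation is an assumption of this example, as is the function name.

```python
# Assumed binary multipliers for the size suffixes seen in pool setup files.
SUFFIXES = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def gap_bytes(setup_line):
    """Convert a 'set gap <value>' line from a pool setup file to bytes."""
    parts = setup_line.split()
    if len(parts) != 3 or parts[:2] != ["set", "gap"]:
        raise ValueError("not a 'set gap' line: %r" % setup_line)
    value = parts[2].lower()
    if value[-1] in SUFFIXES:
        return int(value[:-1]) * SUFFIXES[value[-1]]
    return int(value)


if __name__ == "__main__":
    for line in ("set gap 4g", "set gap 10485760"):
        print(line, "->", gap_bytes(line), "bytes")
```

Under that assumption the cms pools keep a gap of 4 GiB and the ops pools one of 10 MiB.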

dcache services

DOMAIN                SERVICE CELL            LOG                                       
t3fs13-Domain-pool    pool    t3fs13_cms      /var/log/dcache/t3fs13-Domain-pool.log    <-- note: Xrootd traffic is handled by the 'pool' services; there is no separate 'xrootd' service on the pool side
t3fs13-Domain-pool    pool    t3fs13_cms_1    /var/log/dcache/t3fs13-Domain-pool.log    
t3fs13-Domain-pool    pool    t3fs13_cms_2    /var/log/dcache/t3fs13-Domain-pool.log    
t3fs13-Domain-pool    pool    t3fs13_cms_3    /var/log/dcache/t3fs13-Domain-pool.log    
t3fs13-Domain-pool    pool    t3fs13_cms_4    /var/log/dcache/t3fs13-Domain-pool.log    
t3fs13-Domain-pool    pool    t3fs13_cms_5    /var/log/dcache/t3fs13-Domain-pool.log    
t3fs13-Domain-pool    pool    t3fs13_cms_6    /var/log/dcache/t3fs13-Domain-pool.log    
t3fs13-Domain-pool    pool    t3fs13_cms_7    /var/log/dcache/t3fs13-Domain-pool.log    
t3fs13-Domain-pool    pool    t3fs13_cms_8    /var/log/dcache/t3fs13-Domain-pool.log    
t3fs13-Domain-pool    pool    t3fs13_cms_9    /var/log/dcache/t3fs13-Domain-pool.log    
t3fs13-Domain-pool    pool    t3fs13_cms_10   /var/log/dcache/t3fs13-Domain-pool.log    
t3fs13-Domain-pool    pool    t3fs13_cms_11   /var/log/dcache/t3fs13-Domain-pool.log    
t3fs13-Domain-pool    pool    t3fs13_cms_0    /var/log/dcache/t3fs13-Domain-pool.log    
t3fs13-Domain-pool    pool    t3fs13_ops      /var/log/dcache/t3fs13-Domain-pool.log    
t3fs13-Domain-pool    pool    t3fs13_ops_1    /var/log/dcache/t3fs13-Domain-pool.log    
t3fs13-Domain-pool    pool    t3fs13_ops_2    /var/log/dcache/t3fs13-Domain-pool.log    
t3fs13-Domain-pool    pool    t3fs13_ops_3    /var/log/dcache/t3fs13-Domain-pool.log    
t3fs13-Domain-pool    pool    t3fs13_ops_4    /var/log/dcache/t3fs13-Domain-pool.log    
t3fs13-Domain-pool    pool    t3fs13_ops_5    /var/log/dcache/t3fs13-Domain-pool.log    
t3fs13-Domain-pool    pool    t3fs13_ops_6    /var/log/dcache/t3fs13-Domain-pool.log    
t3fs13-Domain-pool    pool    t3fs13_ops_7    /var/log/dcache/t3fs13-Domain-pool.log    
t3fs13-Domain-pool    pool    t3fs13_ops_8    /var/log/dcache/t3fs13-Domain-pool.log    
t3fs13-Domain-pool    pool    t3fs13_ops_9    /var/log/dcache/t3fs13-Domain-pool.log    
t3fs13-Domain-pool    pool    t3fs13_ops_10   /var/log/dcache/t3fs13-Domain-pool.log    
t3fs13-Domain-pool    pool    t3fs13_ops_11   /var/log/dcache/t3fs13-Domain-pool.log    
t3fs13-Domain-dcap    dcap    DCap-t3fs13     /var/log/dcache/t3fs13-Domain-dcap.log    
t3fs13-Domain-gridftp gridftp GFTP-t3fs13     /var/log/dcache/t3fs13-Domain-gridftp.log 
t3fs13-Domain-gsidcap gsidcap DCap-gsi-t3fs13 /var/log/dcache/t3fs13-Domain-gsidcap.log 

dcache pool ls

 
POOL          DOMAIN                           META SIZE   FREE  PATH                        
t3fs13_cms    t3fs13-Domain-pool-t3fs13_cms    db   22000G 1894G /mnt/data06/t3fs13_cms/pool 
t3fs13_cms_1  t3fs13-Domain-pool-t3fs13_cms_1  db   15830G 2916G /mnt/data01/t3fs13_cms/pool 
t3fs13_cms_2  t3fs13-Domain-pool-t3fs13_cms_2  db   22000G 2224G /mnt/data02/t3fs13_cms/pool 
t3fs13_cms_3  t3fs13-Domain-pool-t3fs13_cms_3  db   22000G 2868G /mnt/data03/t3fs13_cms/pool 
t3fs13_cms_4  t3fs13-Domain-pool-t3fs13_cms_4  db   22000G 2669G /mnt/data04/t3fs13_cms/pool 
t3fs13_cms_5  t3fs13-Domain-pool-t3fs13_cms_5  db   22000G 2868G /mnt/data05/t3fs13_cms/pool 
t3fs13_cms_6  t3fs13-Domain-pool-t3fs13_cms_6  db   22000G 2999G /mnt/data07/t3fs13_cms/pool 
t3fs13_cms_7  t3fs13-Domain-pool-t3fs13_cms_7  db   22000G 2815G /mnt/data08/t3fs13_cms/pool 
t3fs13_cms_8  t3fs13-Domain-pool-t3fs13_cms_8  db   22000G 2915G /mnt/data09/t3fs13_cms/pool 
t3fs13_cms_9  t3fs13-Domain-pool-t3fs13_cms_9  db   22000G 3147G /mnt/data10/t3fs13_cms/pool 
t3fs13_cms_10 t3fs13-Domain-pool-t3fs13_cms_10 db   22000G 3193G /mnt/data11/t3fs13_cms/pool 
t3fs13_cms_11 t3fs13-Domain-pool-t3fs13_cms_11 db   22000G 3014G /mnt/data12/t3fs13_cms/pool 
t3fs13_cms_0  t3fs13-Domain-pool-t3fs13_cms_0  db   170G   36G   /mnt/data00/t3fs13_cms/pool 
t3fs13_ops    t3fs13-Domain-pool-t3fs13_ops    db   0G     0G    /mnt/data06/t3fs13_ops/pool 
t3fs13_ops_1  t3fs13-Domain-pool-t3fs13_ops_1  db   0G     0G    /mnt/data01/t3fs13_ops/pool 
t3fs13_ops_2  t3fs13-Domain-pool-t3fs13_ops_2  db   0G     0G    /mnt/data02/t3fs13_ops/pool 
t3fs13_ops_3  t3fs13-Domain-pool-t3fs13_ops_3  db   0G     0G    /mnt/data03/t3fs13_ops/pool 
t3fs13_ops_4  t3fs13-Domain-pool-t3fs13_ops_4  db   0G     0G    /mnt/data04/t3fs13_ops/pool 
t3fs13_ops_5  t3fs13-Domain-pool-t3fs13_ops_5  db   0G     0G    /mnt/data05/t3fs13_ops/pool 
t3fs13_ops_6  t3fs13-Domain-pool-t3fs13_ops_6  db   0G     0G    /mnt/data07/t3fs13_ops/pool 
t3fs13_ops_7  t3fs13-Domain-pool-t3fs13_ops_7  db   0G     0G    /mnt/data08/t3fs13_ops/pool 
t3fs13_ops_8  t3fs13-Domain-pool-t3fs13_ops_8  db   0G     0G    /mnt/data09/t3fs13_ops/pool 
t3fs13_ops_9  t3fs13-Domain-pool-t3fs13_ops_9  db   0G     0G    /mnt/data10/t3fs13_ops/pool 
t3fs13_ops_10 t3fs13-Domain-pool-t3fs13_ops_10 db   0G     0G    /mnt/data11/t3fs13_ops/pool 
t3fs13_ops_11 t3fs13-Domain-pool-t3fs13_ops_11 db   0G     0G    /mnt/data12/t3fs13_ops/pool 
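
To get the total and free space per pool family out of a listing like the one above, the six whitespace-separated columns can be parsed directly. This is a minimal sketch that assumes the column layout shown here, with sizes always reported in whole `G` units; the function names are hypothetical.

```python
def parse_pool_ls(text):
    """Parse 'dcache pool ls' style output into a list of dicts."""
    pools = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header and anything with an unexpected shape.
        if len(fields) != 6 or fields[0] == "POOL":
            continue
        name, domain, meta, size, free, path = fields
        pools.append({
            "pool": name,
            "size_g": int(size.rstrip("G")),
            "free_g": int(free.rstrip("G")),
            "path": path,
        })
    return pools


def totals(pools, prefix):
    """Return (total size, total free) in G for pools whose name starts with prefix."""
    selected = [p for p in pools if p["pool"].startswith(prefix)]
    return (sum(p["size_g"] for p in selected),
            sum(p["free_g"] for p in selected))


if __name__ == "__main__":
    listing = """\
POOL          DOMAIN                           META SIZE   FREE  PATH
t3fs13_cms    t3fs13-Domain-pool-t3fs13_cms    db   22000G 1894G /mnt/data06/t3fs13_cms/pool
t3fs13_ops    t3fs13-Domain-pool-t3fs13_ops    db   0G     0G    /mnt/data06/t3fs13_ops/pool
"""
    print(totals(parse_pool_ls(listing), "t3fs13_cms"))
```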

netstat -tpln

Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       PID/Program name   
tcp        0      0 0.0.0.0:24849               0.0.0.0:*                   LISTEN      29977/java          
tcp        0      0 0.0.0.0:24466               0.0.0.0:*                   LISTEN      30119/java          
tcp        0      0 0.0.0.0:20371               0.0.0.0:*                   LISTEN      29835/java          
tcp        0      0 0.0.0.0:21427               0.0.0.0:*                   LISTEN      30500/java          
tcp        0      0 0.0.0.0:24627               0.0.0.0:*                   LISTEN      30350/java          
tcp        0      0 0.0.0.0:21685               0.0.0.0:*                   LISTEN      29614/java          
tcp        0      0 0.0.0.0:22965               0.0.0.0:*                   LISTEN      29906/java          
tcp        0      0 0.0.0.0:22                  0.0.0.0:*                   LISTEN      3102/sshd           
tcp        0      0 0.0.0.0:22007               0.0.0.0:*                   LISTEN      31242/java          
tcp        0      0 0.0.0.0:23255               0.0.0.0:*                   LISTEN      30985/java          
tcp        0      0 0.0.0.0:24344               0.0.0.0:*                   LISTEN      30119/java          
tcp        0      0 0.0.0.0:23512               0.0.0.0:*                   LISTEN      29835/java          
tcp        0      0 0.0.0.0:20504               0.0.0.0:*                   LISTEN      29480/java          
tcp        0      0 0.0.0.0:23960               0.0.0.0:*                   LISTEN      29764/java          
tcp        0      0 0.0.0.0:21465               0.0.0.0:*                   LISTEN      29480/java          
tcp        0      0 0.0.0.0:21561               0.0.0.0:*                   LISTEN      29906/java          
tcp        0      0 0.0.0.0:23706               0.0.0.0:*                   LISTEN      29977/java          
tcp        0      0 127.0.0.1:6010              0.0.0.0:*                   LISTEN      27201/0             
tcp        0      0 0.0.0.0:20634               0.0.0.0:*                   LISTEN      29547/java          
tcp        0      0 0.0.0.0:22842               0.0.0.0:*                   LISTEN      30738/java          
tcp        0      0 0.0.0.0:23707               0.0.0.0:*                   LISTEN      29906/java          
tcp        0      0 0.0.0.0:2811                0.0.0.0:*                   LISTEN      31610/java          
tcp        0      0 0.0.0.0:24895               0.0.0.0:*                   LISTEN      31062/java          
tcp        0      0 0.0.0.0:22463               0.0.0.0:*                   LISTEN      30265/java          
tcp        0      0 0.0.0.0:22560               0.0.0.0:*                   LISTEN      29764/java          
tcp        0      0 0.0.0.0:23904               0.0.0.0:*                   LISTEN      30640/java          
tcp        0      0 0.0.0.0:7937                0.0.0.0:*                   LISTEN      11625/nsrexecd      
tcp        0      0 0.0.0.0:22850               0.0.0.0:*                   LISTEN      29614/java          
tcp        0      0 0.0.0.0:7938                0.0.0.0:*                   LISTEN      11625/nsrexecd      
tcp        0      0 0.0.0.0:5666                0.0.0.0:*                   LISTEN      5880/nrpe           
tcp        0      0 0.0.0.0:23459               0.0.0.0:*                   LISTEN      29681/java          
tcp        0      0 0.0.0.0:7939                0.0.0.0:*                   LISTEN      11625/nsrexecd      
tcp        0      0 0.0.0.0:21188               0.0.0.0:*                   LISTEN      30190/java          
tcp        0      0 0.0.0.0:22948               0.0.0.0:*                   LISTEN      30894/java          
tcp        0      0 0.0.0.0:7940                0.0.0.0:*                   LISTEN      11625/nsrexecd      
tcp        0      0 0.0.0.0:21605               0.0.0.0:*                   LISTEN      30048/java          
tcp        0      0 0.0.0.0:709                 0.0.0.0:*                   LISTEN      3074/qlremote       
tcp        0      0 0.0.0.0:22918               0.0.0.0:*                   LISTEN      30048/java          
tcp        0      0 0.0.0.0:20102               0.0.0.0:*                   LISTEN      31148/java          
tcp        0      0 127.0.0.1:199               0.0.0.0:*                   LISTEN      3086/snmpd          
tcp        0      0 0.0.0.0:20361               0.0.0.0:*                   LISTEN      29764/java          
tcp        0      0 0.0.0.0:22601               0.0.0.0:*                   LISTEN      30265/java          
tcp        0      0 0.0.0.0:22025               0.0.0.0:*                   LISTEN      29681/java          
tcp        0      0 0.0.0.0:20906               0.0.0.0:*                   LISTEN      29413/java          
tcp        0      0 0.0.0.0:24107               0.0.0.0:*                   LISTEN      29413/java          
tcp        0      0 0.0.0.0:24619               0.0.0.0:*                   LISTEN      30575/java          
tcp        0      0 0.0.0.0:20397               0.0.0.0:*                   LISTEN      31331/java          
tcp        0      0 0.0.0.0:22125               0.0.0.0:*                   LISTEN      31423/java          
tcp        0      0 0.0.0.0:22605               0.0.0.0:*                   LISTEN      29547/java          
tcp        0      0 0.0.0.0:21230               0.0.0.0:*                   LISTEN      30425/java          
tcp        0      0 0.0.0.0:21519               0.0.0.0:*                   LISTEN      29977/java          
tcp        0      0 0.0.0.0:111                 0.0.0.0:*                   LISTEN      2832/rpcbind        
tcp        0      0 0.0.0.0:24240               0.0.0.0:*                   LISTEN      30190/java          
tcp        0      0 0.0.0.0:22128               0.0.0.0:*                   LISTEN      31512/java      
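
The listing above can be cross-checked against the firewall requirements at the top of this page: every dCache (java) listener should sit either on 2811/tcp or inside the 20000-25000/tcp range, which also contains 22125 and 22128. The following sketch does that check on `netstat -tpln` text; it assumes the seven-column layout shown above, and the function names are illustrative.

```python
def java_listen_ports(netstat_text):
    """Return the sorted TCP ports on which a java process is listening."""
    ports = []
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) < 7 or fields[0] != "tcp" or fields[5] != "LISTEN":
            continue
        if not fields[6].endswith("/java"):
            continue
        # The local address looks like 0.0.0.0:24849; keep the port part.
        ports.append(int(fields[3].rsplit(":", 1)[1]))
    return sorted(ports)


def outside_firewall(ports):
    """Return ports not covered by 2811/tcp or the 20000-25000/tcp range."""
    return [p for p in ports if p != 2811 and not 20000 <= p <= 25000]


if __name__ == "__main__":
    sample = (
        "tcp   0   0 0.0.0.0:24849   0.0.0.0:*   LISTEN   29977/java\n"
        "tcp   0   0 0.0.0.0:22      0.0.0.0:*   LISTEN   3102/sshd\n"
        "tcp   0   0 0.0.0.0:2811    0.0.0.0:*   LISTEN   31610/java\n"
    )
    ports = java_listen_ports(sample)
    print(ports, outside_firewall(ports))
```

An empty second list means all java listeners are covered by the documented firewall openings.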

netstat -upln

Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       PID/Program name   
udp        0      0 0.0.0.0:37301               0.0.0.0:*                               31062/java          
udp        0      0 0.0.0.0:38327               0.0.0.0:*                               29835/java          
udp        0      0 0.0.0.0:51384               0.0.0.0:*                               29977/java          
udp        0      0 0.0.0.0:51769               0.0.0.0:*                               31331/java          
udp        0      0 0.0.0.0:43833               0.0.0.0:*                               30190/java          
udp        0      0 0.0.0.0:56378               0.0.0.0:*                               29614/java          
udp        0      0 0.0.0.0:47035               0.0.0.0:*                               29977/java          
udp        0      0 0.0.0.0:42045               0.0.0.0:*                               30190/java          
udp        0      0 0.0.0.0:33981               0.0.0.0:*                               29681/java          
udp        0      0 0.0.0.0:36542               0.0.0.0:*                               30738/java          
udp        0      0 0.0.0.0:53952               0.0.0.0:*                               29977/java          
udp        0      0 0.0.0.0:57152               0.0.0.0:*                               29547/java          
udp        0      0 0.0.0.0:32961               0.0.0.0:*                               30265/java          
udp        0      0 0.0.0.0:36546               0.0.0.0:*                               30425/java          
udp        0      0 0.0.0.0:44611               0.0.0.0:*                               29413/java          
udp        0      0 0.0.0.0:707                 0.0.0.0:*                               3074/qlremote       
udp        0      0 0.0.0.0:68                  0.0.0.0:*                               2658/dhclient       
udp        0      0 0.0.0.0:52299               0.0.0.0:*                               29480/java          
udp        0      0 0.0.0.0:37069               0.0.0.0:*                               30350/java          
udp        0      0 0.0.0.0:34382               0.0.0.0:*                               31331/java          
udp        0      0 0.0.0.0:37074               0.0.0.0:*                               29480/java          
udp        0      0 0.0.0.0:36051               0.0.0.0:*                               30575/java          
udp        0      0 0.0.0.0:59348               0.0.0.0:*                               31062/java          
udp        0      0 0.0.0.0:52949               0.0.0.0:*                               30190/java          
udp        0      0 0.0.0.0:49109               0.0.0.0:*                               31512/java          
udp        0      0 0.0.0.0:39637               0.0.0.0:*                               30640/java          
udp        0      0 0.0.0.0:43353               0.0.0.0:*                               31242/java          
udp        0      0 0.0.0.0:7001                0.0.0.0:*                               -                   
udp        0      0 0.0.0.0:33114               0.0.0.0:*                               30738/java          
udp        0      0 0.0.0.0:33370               0.0.0.0:*                               29681/java          
udp        0      0 0.0.0.0:42970               0.0.0.0:*                               29906/java          
udp        0      0 0.0.0.0:55259               0.0.0.0:*                               31148/java          
udp        0      0 0.0.0.0:48988               0.0.0.0:*                               29764/java          
udp        0      0 0.0.0.0:41694               0.0.0.0:*                               31242/java          
udp        0      0 0.0.0.0:32993               0.0.0.0:*                               29681/java          
udp        0      0 0.0.0.0:35553               0.0.0.0:*                               30265/java          
udp        0      0 0.0.0.0:44131               0.0.0.0:*                               29764/java          
udp        0      0 0.0.0.0:45540               0.0.0.0:*                               31148/java          
udp        0      0 0.0.0.0:59108               0.0.0.0:*                               29906/java          
udp        0      0 0.0.0.0:40680               0.0.0.0:*                               30894/java          
udp        0      0 0.0.0.0:57192               0.0.0.0:*                               31062/java          
udp        0      0 0.0.0.0:5353                0.0.0.0:*                               2988/avahi-daemon:  
udp        0      0 0.0.0.0:55146               0.0.0.0:*                               30575/java          
udp        0      0 0.0.0.0:57707               0.0.0.0:*                               30350/java          
udp        0      0 0.0.0.0:51436               0.0.0.0:*                               30425/java          
udp        0      0 0.0.0.0:41581               0.0.0.0:*                               30985/java          
udp        0      0 0.0.0.0:51951               0.0.0.0:*                               29764/java          
udp        0      0 0.0.0.0:111                 0.0.0.0:*                               2832/rpcbind        
udp        0      0 0.0.0.0:39409               0.0.0.0:*                               30048/java          
udp        0      0 0.0.0.0:42353               0.0.0.0:*                               30048/java          
udp        0      0 0.0.0.0:34034               0.0.0.0:*                               30425/java          
udp        0      0 0.0.0.0:36210               0.0.0.0:*                               29413/java          
udp        0      0 0.0.0.0:58611               0.0.0.0:*                               30119/java          
udp        0      0 0.0.0.0:883                 0.0.0.0:*                               2832/rpcbind        
udp        0      0 0.0.0.0:43508               0.0.0.0:*                               30894/java          
udp        0      0 0.0.0.0:631                 0.0.0.0:*                               2724/portreserve    
udp        0      0 0.0.0.0:57978               0.0.0.0:*                               31423/java          
udp        0      0 0.0.0.0:37883               0.0.0.0:*                               30985/java          
udp        0      0 192.33.123.53:123           0.0.0.0:*                               3112/ntpd           
udp        0      0 127.0.0.1:123               0.0.0.0:*                               3112/ntpd           
udp        0      0 0.0.0.0:123                 0.0.0.0:*                               3112/ntpd           
udp        0      0 0.0.0.0:49918               0.0.0.0:*                               30048/java          
udp        0      0 0.0.0.0:37247               0.0.0.0:*                               29835/java          
udp        0      0 0.0.0.0:40064               0.0.0.0:*                               30500/java          
udp        0      0 0.0.0.0:42625               0.0.0.0:*                               2988/avahi-daemon:  
udp        0      0 0.0.0.0:38786               0.0.0.0:*                               30575/java          
udp        0      0 0.0.0.0:7938                0.0.0.0:*                               11625/nsrexecd      
udp        0      0 127.0.0.1:514               0.0.0.0:*                               2745/syslog-ng      
udp        0      0 0.0.0.0:54790               0.0.0.0:*                               30500/java          
udp        0      0 0.0.0.0:38535               0.0.0.0:*                               30894/java          
udp        0      0 0.0.0.0:50312               0.0.0.0:*                               29480/java          
udp        0      0 0.0.0.0:46345               0.0.0.0:*                               31331/java          
udp        0      0 0.0.0.0:43021               0.0.0.0:*                               29547/java          
udp        0      0 0.0.0.0:54286               0.0.0.0:*                               30119/java          
udp        0      0 0.0.0.0:56335               0.0.0.0:*                               29413/java          
udp        0      0 0.0.0.0:49937               0.0.0.0:*                               31242/java          
udp        0      0 0.0.0.0:51985               0.0.0.0:*                               30119/java          
udp        0      0 0.0.0.0:36241               0.0.0.0:*                               30640/java          
udp        0      0 0.0.0.0:58131               0.0.0.0:*                               29614/java          
udp        0      0 0.0.0.0:58903               0.0.0.0:*                               29906/java          
udp        0      0 0.0.0.0:55965               0.0.0.0:*                               31610/java          
udp        0      0 0.0.0.0:33440               0.0.0.0:*                               30738/java          
udp        0      0 0.0.0.0:52257               0.0.0.0:*                               30350/java          
udp        0      0 0.0.0.0:161                 0.0.0.0:*                               3086/snmpd          
udp        0      0 0.0.0.0:46500               0.0.0.0:*                               29614/java          
udp        0      0 0.0.0.0:45350               0.0.0.0:*                               29835/java          
udp        0      0 0.0.0.0:49321               0.0.0.0:*                               29547/java          
udp        0      0 0.0.0.0:59817               0.0.0.0:*                               30500/java          
udp        0      0 0.0.0.0:51241               0.0.0.0:*                               30640/java          
udp        0      0 0.0.0.0:33321               0.0.0.0:*                               30265/java          
udp        0      0 0.0.0.0:51882               0.0.0.0:*                               30985/java          
udp        0      0 0.0.0.0:34987               0.0.0.0:*                               31148/java         
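
The UDP listing above is long, and most of it consists of ephemeral ports opened by the dCache java processes. A minimal sketch for summarising such a listing per owning process, assuming the text has the six-column layout shown above (protocol, Recv-Q, Send-Q, local address, foreign address, PID/program). The function name `count_udp_sockets` and the `SAMPLE` text are illustrative only and not part of the original page:

```python
from collections import Counter


def count_udp_sockets(netstat_text):
    """Count UDP sockets per 'PID/program' column in netstat-style output."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local, foreign, pid/program.
        # UDP rows carry no State column, so the owner is the sixth field.
        if len(fields) < 6 or not fields[0].startswith("udp"):
            continue
        counts[fields[5]] += 1
    return counts


# A few rows copied from the listing above, used here as sample input.
SAMPLE = """\
udp        0      0 0.0.0.0:32961               0.0.0.0:*                               30265/java
udp        0      0 0.0.0.0:35553               0.0.0.0:*                               30265/java
udp        0      0 0.0.0.0:111                 0.0.0.0:*                               2832/rpcbind
udp        0      0 0.0.0.0:7001                0.0.0.0:*                               -
"""

if __name__ == "__main__":
    # Print owners sorted by number of open UDP sockets, busiest first.
    for owner, n in count_udp_sockets(SAMPLE).most_common():
        print(owner, n)
```

Rows whose owner is `-` are sockets for which netstat shows no owning process. Grouping by the PID/program column makes it easier to spot the few non-java services (rpcbind, ntpd, snmpd, nsrexecd) among the java entries.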

T3 Site Logs about these servers

Backups

Both t3fs13 and t3fs14 are backed up by the PSI Legato backup infrastructure. The protected partitions are:
  • /
  • /boot
  • /opt

  • 13201_div.pdf: Manual of - Smart Array Controller FBWC 1GB Flash Backed Cache 8 only to P410i

  • c03479393.pdf: Linux best practices using HP Service Pack for ProLiant (SPP) and Software Delivery Repository (SDR)
NodeTypeForm
  • Hostnames: t3fs[13,14] READ-WRITE !!
  • Services: dcache pool cells, gridftp, dcap, gsidcap
  • Hardware: HP ProLiant DL380 G7
  • Install Profile: fs13fs14
  • Guarantee/maintenance until: 31-07-2018
Topic attachments
  • 13201_div.pdf (PDF, r1, 910.4 K, 2012-10-16 09:08, FabioMartinelli): Manual of - Smart Array Controller FBWC 1GB Flash Backed Cache 8 only to P410i
  • QlogicDualChannel8GbitFC.JPG (JPEG, r1, 2047.6 K, 2012-03-22 13:20, FabioMartinelli): HP G7 DL380 Qlogic Dual Channel 8Gbit/s FC
  • c03479393.pdf (PDF, r1, 153.4 K, 2012-12-26 21:42, FabioMartinelli): Linux best practices using HP Service Pack for ProLiant (SPP) and Software Delivery Repository (SDR)
  • ols2009-pages-169-1842.pdf (PDF, r1, 191.5 K, 2012-03-22 15:59, FabioMartinelli): IBM Paper about 10Gbit/s links and Linux.
Topic revision: r35 - 2017-05-17 - NinaLoktionova