Failure Entry 1: NO_REDUNDANCY_DRIVE-Recovery Failure Type Code: 32

Storage array: PSI_CMS_T3_CE5400
Component reporting problem: Drive in slot 10
Status: Optimal
Location: Drive tray 0, Drawer 2
Component requiring service: Drive in slot 10
Service action (removal) allowed: No
Service action LED on component: Yes
Working channel: 0
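
The detail fields above use a plain "Name: value" layout, so they can be read into a dictionary when several failure entries need to be compared. The following is a minimal sketch; the function name parse_detail_fields is hypothetical and is not part of the storage management software.

```python
def parse_detail_fields(lines):
    """Read 'Name: value' lines into a dict (hypothetical helper)."""
    fields = {}
    for line in lines:
        # Split on the first colon only, so values may contain colons.
        name, sep, value = line.partition(":")
        if sep and name.strip():
            fields[name.strip()] = value.strip()
    return fields


# Detail lines copied from Failure Entry 1.
entry_1 = [
    "Storage array: PSI_CMS_T3_CE5400",
    "Component reporting problem: Drive in slot 10",
    "Status: Optimal",
    "Location: Drive tray 0, Drawer 2",
    "Component requiring service: Drive in slot 10",
    "Service action (removal) allowed: No",
    "Service action LED on component: Yes",
    "Working channel: 0",
]

details = parse_detail_fields(entry_1)
print(details["Component requiring service"], "| working channel", details["Working channel"])
```

The working channel and the tray location are the two values the recovery steps refer back to.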

Drive - Loss of Path Redundancy

What Caused the Problem?

A communication path with a drive has been lost. The Recovery Guru Details area provides specific information you will need as you follow the recovery steps.

Caution: Electrostatic discharge can damage sensitive components. Always use proper antistatic protection when handling components. Touching components without using a proper ground may damage the equipment.

Important Notes

Recovery Steps

1

Fix any other problems reported by the Recovery Guru before attempting to fix this problem.

2

If...

Then...

The affected tray listed in the Recovery Guru Details area contains both controllers and drives

Go to step 7.

The affected tray listed in the Recovery Guru Details area contains only drives

Go to step 3.

3

To determine the non-working channel, start at the drive port on the controller tray corresponding to the working channel (refer to the labels on the back of the controller tray if needed). Trace the cable from the working channel to the ESM canister in the affected drive tray reported in the details area.

Caution: Possible loss of data accessibility. Do not disconnect any cables on the working channel. Doing so may cause a possible loss of data accessibility.

4

Locate the other ESM canister in the affected drive tray (this is the canister on the non-working channel).

5

Replace the ESM canister on the non-working channel using the following steps:

a

Label the interface transceivers (GBICs or SFPs). The labels will help you correctly reconnect the cables to the new ESM canister.

While the cables are still connected, remove the interface transceivers from the ESM canister you are replacing.

b

Remove the ESM canister.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component is not failed. In this situation, it is acceptable to remove the ESM canister even though the Service Action Allowed is NO.

c

Set all switches on the new ESM canister to the same values as the old ESM canister.

d

Insert the new ESM canister into the drive tray.

e

Using the labels created in step a, reconnect the cables to the replaced canister. Wait 40 seconds, then go to step 6.

6

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 7.

The problem has not been fixed

Go to step 7.

7

You must replace the drive. Which procedure you use depends on the RAID level of the volume group associated with the affected drive. To determine the associated volume group, highlight the affected drive in the Physical View of the Array Management Window and select View >> Associated Elements. Next highlight the associated volume group in the Logical View of the Array Management Window.

If...

Then...

The volume group is RAID 0

Go to "Recovery Steps for Replacing a Drive in a RAID 0 Volume Group."

The volume group is RAID 1, 3, or 5

Go to "Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group."
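
The choice made in step 7 can be sketched as a small lookup. This is only an illustration of the table above, not part of the management software: the DrivePlan type and the function name are hypothetical, and the volume states and backup requirements are taken from the two procedures that follow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrivePlan:
    procedure: str                # title of the procedure to follow
    volume_state_after_fail: str  # state the volumes enter once the drive is failed
    backup_required: bool         # True if the procedure destroys volume data


RAID0_PLAN = DrivePlan(
    "Recovery Steps for Replacing a Drive in a RAID 0 Volume Group",
    "Failed",
    True,
)
RAID135_PLAN = DrivePlan(
    "Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group",
    "Degraded",
    False,  # a backup is still recommended, just not required
)


def drive_replacement_plan(raid_level):
    """Return the plan for a RAID level covered by this procedure."""
    if raid_level == 0:
        return RAID0_PLAN
    if raid_level in (1, 3, 5):
        return RAID135_PLAN
    raise ValueError(f"RAID level {raid_level} is not covered by this procedure")


print(drive_replacement_plan(5).procedure)
```

The practical difference is that RAID 0 has no redundancy, so failing the drive takes the volumes to Failed and the data must be restored from backup afterward.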

Recovery Steps for Replacing a Drive in a RAID 0 Volume Group

Use the following procedure if the affected volume group is RAID 0.

Fix any other problems reported by the Recovery Guru before continuing with this procedure. Note that all volumes in the Logical View of the Array Management Window must be Optimal.

1

Stop all I/O to the affected volumes.

2

Reseating the drive may clear up the path redundancy problem. Remove the drive and then re-insert it.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component is not failed. In this situation, it is acceptable to remove the drive even though the Service Action Allowed is NO.

3

Wait 40 seconds, and then click the Recheck button to rerun the Recovery Guru to ensure that the problem has been fixed.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 4.

The problem has not been fixed

Go to step 4.

4

Back up all data on the affected volumes. (Step 7 will destroy all data on the affected volumes.)

Note: To the operating system (OS), a failed volume is the same as a failed non-RAID drive. Refer to the OS documentation for requirements concerning failed drives and apply them where necessary.

5

If any of the affected volumes are also source or target volumes in a copy operation that is either Pending or In Progress, you must stop the copy operation before continuing.

Go to the Copy Manager by selecting Volume >> Copy >> Copy Manager, then highlight each copy pair that contains an affected volume and select Copy >> Stop.

6

If you have snapshot volumes associated with the affected volumes, these snapshot volumes will no longer be valid once you fail the drive in step 7.

If necessary, perform any operations on the snapshot volumes and then delete them.

7

Caution: Possible loss of data accessibility. Transitioning volumes to failed may cause the loss of accessibility to data on the volumes. Make sure that you back up all data on the affected volumes before starting this step.

Highlight the affected drive in the Physical View of the Array Management Window and select Advanced >> Recovery >> Fail Drive. The affected volumes become Failed.

8

Remove the failed drive (its fault indicator light should be on).

Note: Make sure the replacement drive has a capacity equal to or greater than the failed drive.

9

Wait 30 seconds, then insert the new drive. Its fault indicator light may be lit for a short time (one minute or less).

Note: Wait until the replaced drive is ready (its fault indicator light must be off) before attempting to initialize the volumes in step 10.

10

Highlight the volume group associated with the replaced drive in the Logical View of the Array Management Window and select Advanced >> Recovery >> Initialize >> Volume Group.

  • The volumes in the volume group are initialized, one at a time.
  • To monitor initialization progress for a volume, highlight the volume in the Logical View of the Array Management Window and select Volume >> Properties. Note that when the initialization is completed, the progress bar is no longer displayed.
  • When initialization is completed, all volumes in the volume group are Optimal.

Important: Make sure you save this procedure by selecting Save As. Once you fix the failure, you will not be able to access the information from Recovery Guru.

11

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed.

a

If desired, create any snapshot volumes that you deleted in step 6.

b

If desired, re-create any copies you stopped by highlighting the copy pairs in the Copy Manager and selecting Copy >> Re-Copy.

c

Add the affected volumes back to the operating system. You may need to reboot the system to see the re-initialized volumes.

Note: Do not start I/O to these volumes until you have restored data from backup.

d

Restore the data for the affected volumes from backup.

e

You are finished with this procedure.

The problem has not been fixed.

There is a problem with the controller. Go to "Recovery Steps for Replacing a Controller."

Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group

Use the following procedure if the affected volume group is RAID 1, 3, or 5.

1

You should stop all I/O to all volumes in the volume group associated with the affected drive to reduce the possibility of data loss. If another drive fails in this volume group while you are performing this procedure, you will lose data.

2

Reseating the drive may clear up the path redundancy problem. Remove the drive and then re-insert it.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component is not failed. In this situation, it is acceptable to remove the drive even though the Service Action Allowed is NO.

3

Wait 40 seconds, and then click the Recheck button to rerun the Recovery Guru to ensure that the problem has been fixed.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 4.

The problem has not been fixed

Go to step 4.

4

Although not required, you should back up all data on all volumes associated with the affected drive.

5

Highlight the affected drive in the Physical View of the Array Management Window and select Advanced >> Recovery >> Fail Drive. The associated volumes become Degraded.

6

Remove the failed drive (its fault indicator light should be on).

Note: Make sure the replacement drive has a capacity equal to or greater than the failed drive.

7

Wait 30 seconds, then insert the new drive. Its fault indicator light may be lit for a short time (one minute or less).

8

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed.

You are finished with this procedure.

The problem has not been fixed.

There is a problem with the controller. Go to "Recovery Steps for Replacing a Controller."

Recovery Steps for Replacing a Controller

Important: The controller replacement recovery steps should only be attempted after ALL other options have been exhausted.

Use the following procedure to replace a controller to resolve a loss of path redundancy condition.

If...

Then...

Your storage array has one controller

Go to "Replacing a Controller in a Single-Controller Storage Array."

Your storage array has two controllers

Go to "Replacing a Controller in a Dual-Controller Storage Array."

Replacing a Controller in a Single-Controller Storage Array

1

Ensure that your replacement controller matches the controller in the storage array. If you do not have a controller with the appropriate replacement part number, contact your technical support representative.

2

Stop all I/O to this storage array.

3

Turn off power to the affected tray.

4

Remove the affected controller. Refer to the Enterprise Management Window (EMW) to view which management method you are using to manage this storage array.

If...

Then...

You are using In-Band management for ALL hosts attached to this storage array

Go to step 5.

You are using Out-of-Band management for ANY host attached to this storage array

Before you insert a new controller canister into the storage array, you must update the DHCP/BOOTP server so that it will associate the new controller's hardware Ethernet (MAC) address with the DNS/network name and IP address previously assigned to the removed controller.

To update the DHCP/BOOTP server, find the entry associated with the removed controller and replace its Ethernet (MAC) address with the new controller's Ethernet (MAC) address. The controller's Ethernet (MAC) address is located on an Ethernet ID label on the controller canister in the form xx.xx.xx.xx.xx.xx.

When you are finished, go to step 5.
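
The label form xx.xx.xx.xx.xx.xx usually has to be retyped in whatever notation the DHCP/BOOTP server expects. The following is a minimal sketch of that substitution, assuming the server wants lowercase colon-separated addresses and that the reservations can be represented as a plain dictionary. The function names, the host name, and the sample addresses are placeholders, not values from this storage array.

```python
import re

_OCTET = re.compile(r"[0-9A-Fa-f]{2}")


def label_mac_to_colon_form(label):
    """Convert 'xx.xx.xx.xx.xx.xx' (label form) to 'xx:xx:xx:xx:xx:xx'."""
    parts = label.strip().split(".")
    if len(parts) != 6 or not all(_OCTET.fullmatch(p) for p in parts):
        raise ValueError(f"not in xx.xx.xx.xx.xx.xx form: {label!r}")
    return ":".join(p.lower() for p in parts)


def replace_controller_mac(reservations, old_label, new_label):
    """Return a copy of the reservations with the removed controller's MAC
    replaced by the new controller's MAC; name and IP address are kept."""
    old_mac = label_mac_to_colon_form(old_label)
    new_mac = label_mac_to_colon_form(new_label)
    updated = {}
    found = False
    for name, entry in reservations.items():
        if entry["mac"] == old_mac:
            entry = {**entry, "mac": new_mac}
            found = True
        updated[name] = entry
    if not found:
        raise KeyError(f"no reservation uses {old_mac}")
    return updated


# Placeholder data for illustration only.
reservations = {"array-ctrl-a": {"mac": "00:11:22:33:44:55", "ip": "192.0.2.10"}}
updated = replace_controller_mac(reservations, "00.11.22.33.44.55", "00.11.22.AA.BB.CC")
print(updated["array-ctrl-a"])
```

Only the hardware address changes; the network name and IP address stay with the controller slot, which is why the entry must be updated before the new canister is inserted.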

5

If...

Then...

The controller for this storage array is located in a tray containing both controllers and drives

Check to see if the new controller canister contains a battery.

  • If your model of storage array does not contain batteries, go to step 6.
  • If your model of storage array is supposed to contain batteries and...
    • there is not a battery installed in the new controller canister, then install the battery from the old canister, and go to step 6.
    • there is a battery installed in the new controller canister, then go to step 6.

The controller for this storage array is located in a tray containing only controllers

Go to step 6.

6

a

Make sure at least one minute has elapsed. Then, insert the new controller canister firmly in place.

b

Turn on power to the affected tray.

c

Note the controller slot (A or B) of the affected controller listed in the Recovery Guru Details area. Highlight this controller slot in the Physical View of the Array Management Window (AMW).

d

If...

Then...

The controller indicates that it is Online

Go to step e.

The controller indicates that it is Offline

Select Advanced >> Recovery >> Place Controller >> Online and then go to step e.

e

If...

Then...

The controller for this storage array is located in a tray containing both controllers and drives

Determine whether you need to reset the battery age.

  • If your model of storage array does not contain batteries, go to step 7.
  • If your model of storage array is supposed to contain batteries and...
    • you installed the battery from the old controller canister, then you do not need to reset the battery age. Go to step 7.
    • there was already a battery in the replacement controller canister, then you must reset the battery age using the following procedure:

      Select the Components button on the tray containing the controllers in the Physical View of the Array Management Window. Highlight the batteries option and select the Reset button associated with the new controller canister (A or B). Then, go to step 7.

The controller for this storage array is located in a tray containing only controllers

Go to step 7.

7

If you have volumes mapped to hosts that have Automatic Volume Transfer (AVT) disabled, it may be necessary to redistribute the volumes to their preferred controller. Use the following steps to determine the AVT status of the hosts connected to your storage array:

a

Open the Storage Array Profile by selecting the Storage Array >> View Profile menu option from the Array Management Window. Then, select the profile's Mappings tab.

b

Scroll to the NVSRAM Host Type Internal Definitions section.

If...

Then...
There are hosts mapped to the volumes on this storage array that have an AVT status of disabled

OR

There are hosts mapped to the volumes on this storage array that are not running a host-based, multi-path failover driver

It may be necessary to redistribute the volumes to their preferred controller. If the Array Management Window's Advanced >> Recovery >> Redistribute Volumes menu option is available, select the option.

Note: If you have a mix of hosts with AVT enabled and AVT disabled, all volumes will be immediately assigned back to their preferred path. However, until the host-based multi-path failover driver detects the valid preferred path (may take several minutes), the volumes mapped to the AVT-enabled hosts may get temporarily returned back to the non-preferred path.

If the menu option is not available (grayed out), the volumes are already associated with their preferred controllers and no action is needed.

Go to step 8.

There are NO hosts mapped to the volumes on this storage array with an AVT status of disabled

OR

All hosts mapped to volumes on this storage array are running a host-based, multi-path failover driver

No action is required.

If volumes need to be redistributed to their preferred controller, the host-based, multi-path failover driver will automatically initiate the transfer.

Note that detection of a restored preferred path by the multi-path failover driver can take several minutes.

Go to step 8.
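
The check in step 7 can be sketched as follows. This is an illustration only: the Host type and its field names are hypothetical, and the sketch assumes the first row of the table takes precedence when a set of hosts matches both rows, which the table does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    name: str
    avt_enabled: bool          # AVT status from the NVSRAM Host Type definitions
    has_failover_driver: bool  # host-based, multi-path failover driver present


def manual_redistribution_may_be_needed(hosts):
    """True if any mapped host has AVT disabled or lacks a failover driver.

    Assumption: this first-row condition wins when both rows would apply.
    """
    return any(
        (not h.avt_enabled) or (not h.has_failover_driver) for h in hosts
    )


# Placeholder hosts for illustration only.
hosts = [
    Host("host-a", avt_enabled=True, has_failover_driver=True),
    Host("host-b", avt_enabled=False, has_failover_driver=True),
]

if manual_redistribution_may_be_needed(hosts):
    print("Check Advanced >> Recovery >> Redistribute Volumes")
else:
    print("No action required; the failover driver initiates the transfer")
```

Even when the function returns True, the menu option may be grayed out, in which case the volumes are already on their preferred controllers.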

8

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area. If the failure appears again, contact your technical support representative.

Replacing a Controller in a Dual-Controller Storage Array

1

Determine which is the affected controller by locating the non-working channel. Refer to step 3 at the beginning of this recovery procedure for details on how to locate the non-working channel.

2

Place the affected controller offline.

a

Highlight the affected controller in the Physical View of the Array Management Window.

b

Select Advanced >> Recovery >> Place Controller >> Offline.

c

Select Yes in the Place Offline confirmation window.

d

Go to step 3.

3

Read all of the following steps before taking any action.

a

Click the Recheck button to rerun the Recovery Guru.

b

Select the Offline Controller problem that is being reported in the Summary area.

c

Complete the Recovery Steps for the Offline Controller problem to replace the controller.

4

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area. If the failure appears again, contact your technical support representative.


End of Failure Entry 1

Failure Entry 2: NO_REDUNDANCY_DRIVE-Recovery Failure Type Code: 32

Storage array: PSI_CMS_T3_CE5400
Component reporting problem: Drive in slot 6
Status: Optimal
Location: Drive tray 0, Drawer 2
Component requiring service: Drive in slot 6
Service action (removal) allowed: No
Service action LED on component: Yes
Working channel: 0

Drive - Loss of Path Redundancy

What Caused the Problem?

A communication path with a drive has been lost. The Recovery Guru Details area provides specific information you will need as you follow the recovery steps.

Caution: Electrostatic discharge can damage sensitive components. Always use proper antistatic protection when handling components. Touching components without using a proper ground may damage the equipment.

Important Notes

Recovery Steps

1

Fix any other problems reported by the Recovery Guru before attempting to fix this problem.

2

If...

Then...

The affected tray listed in the Recovery Guru Details area contains both controllers and drives

Go to step 7.

The affected tray listed in the Recovery Guru Details area contains only drives

Go to step 3.

3

To determine the non-working channel, start at the drive port on the controller tray corresponding to the working channel (refer to the labels on the back of the controller tray if needed). Trace the cable from the working channel to the ESM canister in the affected drive tray reported in the details area.

Caution: Possible loss of data accessibility. Do not disconnect any cables on the working channel. Doing so may cause a possible loss of data accessibility.

4

Locate the other ESM canister in the affected drive tray (this is the canister on the non-working channel).

5

Replace the ESM canister on the non-working channel using the following steps:

a

Label the interface transceivers (GBICs or SFPs). The labels will help you correctly reconnect the cables to the new ESM canister.

While the cables are still connected, remove the interface transceivers from the ESM canister you are replacing.

b

Remove the ESM canister.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component is not failed. In this situation, it is acceptable to remove the ESM canister even though the Service Action Allowed is NO.

c

Set all switches on the new ESM canister to the same values as the old ESM canister.

d

Insert the new ESM canister into the drive tray.

e

Using the labels created in step a, reconnect the cables to the replaced canister. Wait 40 seconds, then go to step 6.

6

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 7.

The problem has not been fixed

Go to step 7.

7

You must replace the drive. Which procedure you use depends on the RAID level of the volume group associated with the affected drive. To determine the associated volume group, highlight the affected drive in the Physical View of the Array Management Window and select View >> Associated Elements. Next highlight the associated volume group in the Logical View of the Array Management Window.

If...

Then...

The volume group is RAID 0

Go to "Recovery Steps for Replacing a Drive in a RAID 0 Volume Group."

The volume group is RAID 1, 3, or 5

Go to "Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group."

Recovery Steps for Replacing a Drive in a RAID 0 Volume Group

Use the following procedure if the affected volume group is RAID 0.

Fix any other problems reported by the Recovery Guru before continuing with this procedure. Note that all volumes in the Logical View of the Array Management Window must be Optimal.

1

Stop all I/O to the affected volumes.

2

Reseating the drive may clear up the path redundancy problem. Remove the drive and then re-insert it.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component is not failed. In this situation, it is acceptable to remove the drive even though the Service Action Allowed is NO.

3

Wait 40 seconds, and then click the Recheck button to rerun the Recovery Guru to ensure that the problem has been fixed.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 4.

The problem has not been fixed

Go to step 4.

4

Back up all data on the affected volumes. (Step 7 will destroy all data on the affected volumes.)

Note: To the operating system (OS), a failed volume is the same as a failed non-RAID drive. Refer to the OS documentation for requirements concerning failed drives and apply them where necessary.

5

If any of the affected volumes are also source or target volumes in a copy operation that is either Pending or In Progress, you must stop the copy operation before continuing.

Go to the Copy Manager by selecting Volume >> Copy >> Copy Manager, then highlight each copy pair that contains an affected volume and select Copy >> Stop.

6

If you have snapshot volumes associated with the affected volumes, these snapshot volumes will no longer be valid once you fail the drive in step 7.

If necessary, perform any operations on the snapshot volumes and then delete them.

7

Caution: Possible loss of data accessibility. Transitioning volumes to failed may cause the loss of accessibility to data on the volumes. Make sure that you back up all data on the affected volumes before starting this step.

Highlight the affected drive in the Physical View of the Array Management Window and select Advanced >> Recovery >> Fail Drive. The affected volumes become Failed.

8

Remove the failed drive (its fault indicator light should be on).

Note: Make sure the replacement drive has a capacity equal to or greater than the failed drive.

9

Wait 30 seconds, then insert the new drive. Its fault indicator light may be lit for a short time (one minute or less).

Note: Wait until the replaced drive is ready (its fault indicator light must be off) before attempting to initialize the volumes in step 10.

10

Highlight the volume group associated with the replaced drive in the Logical View of the Array Management Window and select Advanced >> Recovery >> Initialize >> Volume Group.

  • The volumes in the volume group are initialized, one at a time.
  • To monitor initialization progress for a volume, highlight the volume in the Logical View of the Array Management Window and select Volume >> Properties. Note that when the initialization is completed, the progress bar is no longer displayed.
  • When initialization is completed, all volumes in the volume group are Optimal.

Important: Make sure you save this procedure by selecting Save As. Once you fix the failure, you will not be able to access the information from Recovery Guru.

11

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed.

a

If desired, create any snapshot volumes that you deleted in step 6.

b

If desired, re-create any copies you stopped by highlighting the copy pairs in the Copy Manager and selecting Copy >> Re-Copy.

c

Add the affected volumes back to the operating system. You may need to reboot the system to see the re-initialized volumes.

Note: Do not start I/O to these volumes until you have restored data from backup.

d

Restore the data for the affected volumes from backup.

e

You are finished with this procedure.

The problem has not been fixed.

There is a problem with the controller. Go to "Recovery Steps for Replacing a Controller."

Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group

Use the following procedure if the affected volume group is RAID 1, 3, or 5.

1

You should stop all I/O to all volumes in the volume group associated with the affected drive to reduce the possibility of data loss. If another drive fails in this volume group while you are performing this procedure, you will lose data.

2

Reseating the drive may clear up the path redundancy problem. Remove the drive and then re-insert it.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component is not failed. In this situation, it is acceptable to remove the drive even though the Service Action Allowed is NO.

3

Wait 40 seconds, and then click the Recheck button to rerun the Recovery Guru to ensure that the problem has been fixed.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 4.

The problem has not been fixed

Go to step 4.

4

Although not required, you should back up all data on all volumes associated with the affected drive.

5

Highlight the affected drive in the Physical View of the Array Management Window and select Advanced >> Recovery >> Fail Drive. The associated volumes become Degraded.

6

Remove the failed drive (its fault indicator light should be on).

Note: Make sure the replacement drive has a capacity equal to or greater than the failed drive.

7

Wait 30 seconds, then insert the new drive. Its fault indicator light may be lit for a short time (one minute or less).

8

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed.

You are finished with this procedure.

The problem has not been fixed.

There is a problem with the controller. Go to "Recovery Steps for Replacing a Controller."

Recovery Steps for Replacing a Controller

Important: The controller replacement recovery steps should only be attempted after ALL other options have been exhausted.

Use the following procedure to replace a controller to resolve a loss of path redundancy condition.

If...

Then...

Your storage array has one controller

Go to "Replacing a Controller in a Single-Controller Storage Array."

Your storage array has two controllers

Go to "Replacing a Controller in a Dual-Controller Storage Array."

Replacing a Controller in a Single-Controller Storage Array

1

Ensure that your replacement controller matches the controller in the storage array. If you do not have a controller with the appropriate replacement part number, contact your technical support representative.

2

Stop all I/O to this storage array.

3

Turn off power to the affected tray.

4

Remove the affected controller. Refer to the Enterprise Management Window (EMW) to view which management method you are using to manage this storage array.

If...

Then...

You are using In-Band management for ALL hosts attached to this storage array

Go to step 5.

You are using Out-of-Band management for ANY host attached to this storage array

Before you insert a new controller canister into the storage array, you must update the DHCP/BOOTP server so that it will associate the new controller's hardware Ethernet (MAC) address with the DNS/network name and IP address previously assigned to the removed controller.

To update the DHCP/BOOTP server, find the entry associated with the removed controller and replace its Ethernet (MAC) address with the new controller's Ethernet (MAC) address. The controller's Ethernet (MAC) address is located on an Ethernet ID label on the controller canister in the form xx.xx.xx.xx.xx.xx.

When you are finished, go to step 5.

5

If...

Then...

The controller for this storage array is located in a tray containing both controllers and drives

Check to see if the new controller canister contains a battery.

  • If your model of storage array does not contain batteries, go to step 6.
  • If your model of storage array is supposed to contain batteries and...
    • there is not a battery installed in the new controller canister, then install the battery from the old canister, and go to step 6.
    • there is a battery installed in the new controller canister, then go to step 6.

The controller for this storage array is located in a tray containing only controllers

Go to step 6.

6

a

Make sure at least one minute has elapsed. Then, insert the new controller canister firmly in place.

b

Turn on power to the affected tray.

c

Note the controller slot (A or B) of the affected controller listed in the Recovery Guru Details area. Highlight this controller slot in the Physical View of the Array Management Window (AMW).

d

If...

Then...

The controller indicates that it is Online

Go to step e.

The controller indicates that it is Offline

Select Advanced >> Recovery >> Place Controller >> Online and then go to step e.

e

If...

Then...

The controller for this storage array is located in a tray containing both controllers and drives

Determine whether you need to reset the battery age.

  • If your model of storage array does not contain batteries, go to step 7.
  • If your model of storage array is supposed to contain batteries and...
    • you installed the battery from the old controller canister, then you do not need to reset the battery age. Go to step 7.
    • there was already a battery in the replacement controller canister, then you must reset the battery age using the following procedure:

      Select the Components button on the tray containing the controllers in the Physical View of the Array Management Window. Highlight the batteries option and select the Reset button associated with the new controller canister (A or B). Then, go to step 7.

The controller for this storage array is located in a tray containing only controllers

Go to step 7.

7

If you have volumes mapped to hosts that have Automatic Volume Transfer (AVT) disabled, it may be necessary to redistribute the volumes to their preferred controller. Use the following steps to determine the AVT status of the hosts connected to your storage array:

a

Open the Storage Array Profile by selecting the Storage Array >> View Profile menu option from the Array Management Window. Then, select the profile's Mappings tab.

b

Scroll to the NVSRAM Host Type Internal Definitions section.

If...

Then...
There are hosts mapped to the volumes on this storage array that have an AVT status of disabled

OR

There are hosts mapped to the volumes on this storage array that are not running a host-based, multi-path failover driver

It may be necessary to redistribute the volumes to their preferred controller. If the Array Management Window's Advanced >> Recovery >> Redistribute Volumes menu option is available, select the option.

Note: If you have a mix of hosts with AVT enabled and AVT disabled, all volumes will be immediately assigned back to their preferred path. However, until the host-based multi-path failover driver detects the valid preferred path (may take several minutes), the volumes mapped to the AVT-enabled hosts may get temporarily returned back to the non-preferred path.

If the menu option is not available (grayed out), the volumes are already associated with their preferred controllers and no action is needed.

Go to step 8.

There are NO hosts mapped to the volumes on this storage array with an AVT status of disabled

OR

All hosts mapped to volumes on this storage array are running a host-based, multi-path failover driver

No action is required.

If volumes need to be redistributed to their preferred controller, the host-based, multi-path failover driver will automatically initiate the transfer.

Note that detection of a restored preferred path by the multi-path failover driver can take several minutes.

Go to step 8.

8

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area. If the failure appears again, contact your technical support representative.

Replacing a Controller in a Dual-Controller Storage Array

1

Determine which is the affected controller by locating the non-working channel. Refer to step 3 at the beginning of this recovery procedure for details on how to locate the non-working channel.

2

Place the affected controller offline.

a

Highlight the affected controller in the Physical View of the Array Management Window.

b

Select Advanced >> Recovery >> Place Controller >> Offline.

c

Select Yes in the Place Offline confirmation window.

d

Go to step 3.

3

Read all of the following steps before taking any action.

a

Click the Recheck button to rerun the Recovery Guru.

b

Select the Offline Controller problem that is being reported in the Summary area.

c

Complete the Recovery Steps for the Offline Controller problem to replace the controller.

4

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area. If the failure appears again, contact your technical support representative.


End of Failure Entry 2

Failure Entry 3: NO_REDUNDANCY_DRIVE-Recovery Failure Type Code: 32

Storage array: PSI_CMS_T3_CE5400
Component reporting problem: Drive in slot 3
Status: Optimal
Location: Drive tray 0, Drawer 2
Component requiring service: Drive in slot 3
Service action (removal) allowed: No
Service action LED on component: Yes
Working channel: 0
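
The summary fields above are plain "name: value" lines, so they can be read into a dictionary for scripting or reporting. The following is a minimal sketch, not part of the Recovery Guru itself; the function name `parse_failure_entry` is hypothetical, and the sample text is copied from this entry.

```python
def parse_failure_entry(text):
    """Parse 'name: value' lines from a failure entry summary into a dict.

    Lines without a colon, or with nothing after the colon, are skipped.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


# Sample text copied from the summary block of this failure entry.
SAMPLE = """\
Storage array: PSI_CMS_T3_CE5400
Component reporting problem: Drive in slot 3
Status: Optimal
Location: Drive tray 0, Drawer 2
Component requiring service: Drive in slot 3
Service action (removal) allowed: No
Service action LED on component: Yes
Working channel: 0
"""

entry = parse_failure_entry(SAMPLE)
print(entry["Location"], "| working channel", entry["Working channel"])
```

The working channel value is the one needed in step 3 of the recovery steps, where the cable on the working channel is traced to find the other (non-working) channel.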

Drive - Loss of Path Redundancy

What Caused the Problem?

A communication path with a drive has been lost. The Recovery Guru Details area provides specific information you will need as you follow the recovery steps.

Caution: Electrostatic discharge can damage sensitive components. Always use proper antistatic protection when handling components. Touching components without proper grounding may damage the equipment.

Important Notes

Recovery Steps

1

Fix any other problems reported by the Recovery Guru before attempting to fix this problem.

2

If...

Then...

The affected tray listed in the Recovery Guru Details area contains both controllers and drives

Go to step 7.

The affected tray listed in the Recovery Guru Details area contains only drives

Go to step 3.

3

To determine the non-working channel, start at the drive port on the controller tray corresponding to the working channel (refer to the labels on the back of the controller tray if needed). Trace the cable from the working channel to the ESM canister in the affected drive tray reported in the details area.

Caution: Possible loss of data accessibility. Do not disconnect any cables on the working channel. Doing so may cause a loss of data accessibility.

4

Locate the other ESM canister in the affected drive tray (this is the canister on the non-working channel).

5

Replace the ESM canister on the non-working channel using the following steps:

a

Label the interface transceivers (GBICs or SFPs). The labels will help you correctly reconnect the cables to the new ESM canister.

While the cables are still connected, remove the interface transceivers from the ESM canister you are replacing.

b

Remove the ESM canister.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component has not failed. In this situation, it is acceptable to remove the ESM canister even though the Service Action Allowed status is NO.

c

Set all switches on the new ESM canister to the same values as the old ESM canister.

d

Insert the new ESM canister into the drive tray.

e

Using the labels created in step a, reconnect the cables to the replaced canister. Wait 40 seconds, then go to step 6.

6

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 7.

The problem has not been fixed

Go to step 7.

7

You must replace the drive. Which procedure you use depends on the RAID level of the volume group associated with the affected drive. To determine the associated volume group, highlight the affected drive in the Physical View of the Array Management Window and select View >> Associated Elements. Next highlight the associated volume group in the Logical View of the Array Management Window.

If...

Then...

The volume group is RAID 0

Go to "Recovery Steps for Replacing a Drive in a RAID 0 Volume Group."

The volume group is RAID 1, 3, or 5

Go to "Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group."
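
The choice in step 7 depends only on the RAID level of the associated volume group. A minimal sketch of that lookup follows; the function name is hypothetical, and RAID levels other than those named in the table are rejected rather than guessed at.

```python
RAID0_PROCEDURE = "Recovery Steps for Replacing a Drive in a RAID 0 Volume Group"
RAID135_PROCEDURE = (
    "Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group"
)


def drive_replacement_procedure(raid_level):
    """Return the procedure title that step 7 points to for a RAID level."""
    if raid_level == 0:
        return RAID0_PROCEDURE
    if raid_level in (1, 3, 5):
        return RAID135_PROCEDURE
    # The table in step 7 lists only RAID 0, 1, 3, and 5.
    raise ValueError(f"RAID level {raid_level} is not covered by step 7")


print(drive_replacement_procedure(5))
```

The distinction matters because the RAID 0 procedure destroys the data on the affected volumes, while the RAID 1, 3, or 5 procedure leaves the volumes Degraded but accessible.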

Recovery Steps for Replacing a Drive in a RAID 0 Volume Group

Use the following procedure if the affected volume group is RAID 0.

Fix any other problems reported by the Recovery Guru before continuing with this procedure. Note that all volumes in the Logical View of the Array Management Window must be Optimal.

1

Stop all I/O to the affected volumes.

2

Reseating the drive may clear up the path redundancy problem. Remove the drive and then re-insert it.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component has not failed. In this situation, it is acceptable to remove the drive even though the Service Action Allowed status is NO.

3

Wait 40 seconds, and then click the Recheck button to rerun the Recovery Guru to ensure that the problem has been fixed.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 4.

The problem has not been fixed

Go to step 4.

4

Back up all data on the affected volumes. (Step 7 will destroy all data on the affected volumes.)

Note: To the operating system (OS), a failed volume is the same as a failed non-RAID drive. Refer to the OS documentation for requirements concerning failed drives and apply them where necessary.

5

If any of the affected volumes are also source or target volumes in a copy operation that is either Pending or In Progress, you must stop the copy operation before continuing.

Go to the Copy Manager by selecting Volume >> Copy >> Copy Manager, then highlight each copy pair that contains an affected volume and select Copy >> Stop.

6

If you have snapshot volumes associated with the affected volumes, these snapshot volumes will no longer be valid once you fail the drive in step 7.

If necessary, perform any operations on the snapshot volumes and then delete them.

7

Caution: Possible loss of data accessibility. Transitioning volumes to failed may cause the loss of accessibility to data on the volumes. Make sure that you back up all data on the affected volumes before starting this step.

Highlight the affected drive in the Physical View of the Array Management Window and select Advanced >> Recovery >> Fail Drive. The affected volumes become Failed.

8

Remove the failed drive (its fault indicator light should be on).

Note: Make sure the replacement drive has a capacity equal to or greater than the failed drive.

9

Wait 30 seconds, then insert the new drive. Its fault indicator light may be lit for a short time (one minute or less).

Note: Wait until the replaced drive is ready (its fault indicator light must be off) before attempting to initialize the volumes in step 10.

10

Highlight the volume group associated with the replaced drive in the Logical View of the Array Management Window and select Advanced >> Recovery >> Initialize >> Volume Group.

  • The volumes in the volume group are initialized, one at a time.
  • To monitor initialization progress for a volume, highlight the volume in the Logical View of the Array Management Window and select Volume >> Properties. Note that when the initialization is completed, the progress bar is no longer displayed.
  • When initialization is completed, all volumes in the volume group are Optimal.

Important: Make sure you save this procedure by selecting Save As. Once you fix the failure, you will not be able to access the information from Recovery Guru.

11

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed.

a

If desired, create any snapshot volumes that you deleted in step 6.

b

If desired, re-create any copies you stopped by highlighting the copy pairs in the Copy Manager and selecting Copy >> Re-Copy.

c

Add the affected volumes back to the operating system. You may need to reboot the system to see the re-initialized volumes.

Note: Do not start I/O to these volumes until you have restored data from backup.

d

Restore the data for the affected volumes from backup.

e

You are finished with this procedure.

The problem has not been fixed.

There is a problem with the controller. Go to "Recovery Steps for Replacing a Controller."

Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group

Use the following procedure if the affected volume group is RAID 1, 3, or 5.

1

You should stop all I/O to all volumes in the volume group associated with the affected drive to reduce the possibility of data loss. If another drive fails in this volume group while you are performing this procedure, you will lose data.

2

Reseating the drive may clear up the path redundancy problem. Remove the drive and then re-insert it.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component has not failed. In this situation, it is acceptable to remove the drive even though the Service Action Allowed status is NO.

3

Wait 40 seconds, and then click the Recheck button to rerun the Recovery Guru to ensure that the problem has been fixed.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 4.

The problem has not been fixed

Go to step 4.

4

Although not required, you should back up all data on all volumes associated with the affected drive.

5

Highlight the affected drive in the Physical View of the Array Management Window and select Advanced >> Recovery >> Fail Drive. The associated volumes become Degraded.

6

Remove the failed drive (its fault indicator light should be on).

Note: Make sure the replacement drive has a capacity equal to or greater than the failed drive.

7

Wait 30 seconds, then insert the new drive. Its fault indicator light may be lit for a short time (one minute or less).

8

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed.

You are finished with this procedure.

The problem has not been fixed.

There is a problem with the controller. Go to "Recovery Steps for Replacing a Controller."

Recovery Steps for Replacing a Controller

Important: The controller replacement recovery steps should only be attempted after ALL other options have been exhausted.

Use the following procedure to replace a controller to resolve a loss of path redundancy condition.

If...

Then...

Your storage array has one controller

Go to "Replacing a Controller in a Single-Controller Storage Array."

Your storage array has two controllers

Go to "Replacing a Controller in a Dual-Controller Storage Array."

Replacing a Controller in a Single-Controller Storage Array

1

Ensure that your replacement controller matches the controller in the storage array. If you do not have a controller with the appropriate replacement part number, contact your technical support representative.

2

Stop all I/O to this storage array.

3

Turn off power to the affected tray.

4

Remove the affected controller. Refer to the Enterprise Management Window (EMW) to view which management method you are using to manage this storage array.

If...

Then...

You are using In-Band management for ALL hosts attached to this storage array

Go to step 5.

You are using Out-of-Band management for ANY host attached to this storage array

Before you insert a new controller canister into the storage array, you must update the DHCP/BOOTP server so that it will associate the new controller's hardware Ethernet (MAC) address with the DNS/network name and IP address previously assigned to the removed controller.

To update the DHCP/BOOTP server, find the entry associated with the removed controller and replace its Ethernet (MAC) address with the new controller's Ethernet (MAC) address. The controller's Ethernet (MAC) address is located on an Ethernet ID label on the controller canister in the form xx.xx.xx.xx.xx.xx.

When you are finished, go to step 5.
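
The label on the canister prints the MAC address with dots (xx.xx.xx.xx.xx.xx), while many DHCP servers expect colons. The sketch below converts the label form and builds a host entry in the ISC dhcpd host-declaration style. Which DHCP/BOOTP server is in use is not stated in this report, so the dhcpd format is an assumption, and the host name, MAC address, and IP address shown are hypothetical placeholders.

```python
import re


def label_mac_to_colon(label):
    """Convert a MAC address printed as xx.xx.xx.xx.xx.xx to xx:xx:xx:xx:xx:xx."""
    parts = label.strip().split(".")
    if len(parts) != 6 or not all(
        re.fullmatch(r"[0-9A-Fa-f]{2}", part) for part in parts
    ):
        raise ValueError(f"unexpected MAC label format: {label!r}")
    return ":".join(part.lower() for part in parts)


def dhcpd_host_entry(name, mac_label, ip_address):
    """Build a host declaration in ISC dhcpd style (assumed server type)."""
    mac = label_mac_to_colon(mac_label)
    return (
        f"host {name} {{\n"
        f"  hardware ethernet {mac};\n"
        f"  fixed-address {ip_address};\n"
        f"}}"
    )


# Hypothetical values: replace with the name and IP address previously
# assigned to the removed controller and the MAC label of the new canister.
print(dhcpd_host_entry("array-ctrl-a", "00.A0.B8.12.34.56", "192.0.2.10"))
```

Only the MAC address changes in the server entry; the name and IP address stay the same so that out-of-band management continues to reach the replacement controller.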

5

If...

Then...

The controller for this storage array is located in a tray containing both controllers and drives

Check to see if the new controller canister contains a battery.
  • If your model of storage array does not contain batteries, go to step 6.
  • If your model of storage array is supposed to contain batteries and...
    • there is not a battery installed in the new controller canister, then install the battery from the old canister, and go to step 6.
    • there is a battery installed in the new controller canister, then go to step 6.

The controller for this storage array is located in a tray containing only controllers

Go to step 6.

6

a

Make sure at least one minute has elapsed. Then, insert the new controller canister firmly in place.

b

Turn on power to the affected tray.

c

Note the controller slot (A or B) of the affected controller listed in the Recovery Guru Details area. Highlight this controller slot in the Physical View of the Array Management Window (AMW).

d

If...

Then...

The controller indicates that it is Online

Go to step e.

The controller indicates that it is Offline

Select Advanced >> Recovery >> Place Controller >> Online and then go to step e.

e

If...

Then...

The controller for this storage array is located in a tray containing both controllers and drives

Determine whether you need to reset the battery age.
  • If your model of storage array does not contain batteries, go to step 7.
  • If your model of storage array is supposed to contain batteries and...
    • you installed the battery from the old controller canister, then you do not need to reset the battery age. Go to step 7.
    • there was already a battery in the replacement controller canister, then you must reset the battery age using the following procedure:

      Select the Components button on the tray containing the controllers in the Physical View of the Array Management Window. Highlight the batteries option and select the Reset button associated with the new controller canister (A or B). Then, go to step 7.

The controller for this storage array is located in a tray containing only controllers

Go to step 7.
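
The battery-age decision in step e reduces to three yes/no questions. A minimal sketch, with a hypothetical function name and parameter names chosen for illustration:

```python
def battery_age_reset_needed(
    tray_has_controllers_and_drives,
    model_uses_batteries,
    battery_moved_from_old_canister,
):
    """Return True only in the one case where step e calls for a reset."""
    if not tray_has_controllers_and_drives:
        return False  # tray contains only controllers: go to step 7
    if not model_uses_batteries:
        return False  # model has no batteries: go to step 7
    # A battery carried over from the old canister keeps its recorded age.
    # A battery that came with the replacement canister needs the reset.
    return not battery_moved_from_old_canister


print(battery_age_reset_needed(True, True, False))
```

In every case the procedure continues at step 7; the function only says whether the Reset button has to be used first.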

7

If you have volumes mapped to hosts that have Automatic Volume Transfer (AVT) disabled, it may be necessary to redistribute the volumes to their preferred controller. Use the following steps to determine the AVT status of the hosts connected to your storage array:

a

Open the Storage Array Profile by selecting the Storage Array >> View Profile menu option from the Array Management Window. Then, select the profile's Mappings tab.

b

Scroll to the NVSRAM Host Type Internal Definitions section.

If...

Then...

There are hosts mapped to the volumes on this storage array that have an AVT status of disabled

OR

There are hosts mapped to the volumes on this storage array that are not running a host-based, multi-path failover driver

It may be necessary to redistribute the volumes to their preferred controller. If the Array Management Window's Advanced >> Recovery >> Redistribute Volumes menu option is available, select the option.

Note: If you have a mix of hosts with AVT enabled and AVT disabled, all volumes will be immediately assigned back to their preferred path. However, until the host-based multi-path failover driver detects the valid preferred path (may take several minutes), the volumes mapped to the AVT-enabled hosts may get temporarily returned back to the non-preferred path.

If the menu option is not available (grayed out), the volumes are already associated with their preferred controllers and no action is needed.

Go to step 8.

There are NO hosts mapped to the volumes on this storage array with an AVT status of disabled

OR

All hosts mapped to volumes on this storage array are running a host-based, multi-path failover driver

No action is required.

If volumes need to be redistributed to their preferred controller, the host-based, multi-path failover driver will automatically initiate the transfer.

Note that detection of a restored preferred path by the multi-path failover driver can take several minutes.

Go to step 8.

8

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area. If the failure appears again, contact your technical support representative.
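
The If/Then table in step 7 can be sketched as a check over the hosts listed in the profile's Mappings tab. The `Host` record and its field names are hypothetical; the values would be read by hand from the Storage Array Profile, since this report names no programmatic interface for them.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    avt_enabled: bool
    has_failover_driver: bool


def may_need_manual_redistribution(hosts):
    """True if any mapped host has AVT disabled or lacks a failover driver.

    In that case, step 7 says to use Redistribute Volumes if the menu
    option is available. Otherwise the failover driver initiates the
    transfer itself and no action is required.
    """
    return any(
        (not host.avt_enabled) or (not host.has_failover_driver)
        for host in hosts
    )


# Hypothetical host list for illustration only.
hosts = [
    Host("host-a", avt_enabled=True, has_failover_driver=True),
    Host("host-b", avt_enabled=False, has_failover_driver=True),
]
print(may_need_manual_redistribution(hosts))
```

A single host with AVT disabled is enough to make the check true, which matches the "mix of hosts" note in step 7.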

Replacing a Controller in a Dual-Controller Storage Array

1

Determine which is the affected controller by locating the non-working channel. Refer to step 3 at the beginning of this recovery procedure for details on how to locate the non-working channel.

2

Place the affected controller offline.

a

Highlight the affected controller in the Physical View of the Array Management Window.

b

Select Advanced >> Recovery >> Place Controller >> Offline.

c

Select Yes in the Place Offline confirmation window.

d

Go to step 3.

3

Read all of the following steps before taking any action.

a

Click the Recheck button to rerun the Recovery Guru.

b

Select the Offline Controller problem that is being reported in the Summary area.

c

Complete the Recovery Steps for the Offline Controller problem to replace the controller.

4

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area. If the failure appears again, contact your technical support representative.


End of Failure Entry 3

Failure Entry 4: NO_REDUNDANCY_DRIVE-Recovery Failure Type Code: 32

Storage array: PSI_CMS_T3_CE5400
Component reporting problem: Drive in slot 8
Status: Optimal
Location: Drive tray 0, Drawer 2
Component requiring service: Drive in slot 8
Service action (removal) allowed: No
Service action LED on component: Yes
Working channel: 0
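
Failure entries 1, 3, and 4 in this report list the same location and the same working channel. When several drives report the same loss of path redundancy, grouping the entries by those two fields makes the overlap easy to see. The sketch below uses only the values shown in those three entries; the function name is hypothetical, and the grouping is a reading aid, not a diagnosis.

```python
from collections import defaultdict

# Values copied from failure entries 1, 3, and 4 of this report.
ENTRIES = [
    {"entry": 1, "slot": 10, "location": "Drive tray 0, Drawer 2", "working_channel": 0},
    {"entry": 3, "slot": 3, "location": "Drive tray 0, Drawer 2", "working_channel": 0},
    {"entry": 4, "slot": 8, "location": "Drive tray 0, Drawer 2", "working_channel": 0},
]


def group_slots_by_path(entries):
    """Map (location, working channel) to the list of affected drive slots."""
    groups = defaultdict(list)
    for item in entries:
        groups[(item["location"], item["working_channel"])].append(item["slot"])
    return dict(groups)


for (location, channel), slots in group_slots_by_path(ENTRIES).items():
    print(f"{location}, working channel {channel}: slots {slots}")
```

The recovery steps still apply per entry, starting with step 1, which asks that other reported problems be fixed first.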

Drive - Loss of Path Redundancy

What Caused the Problem?

A communication path with a drive has been lost. The Recovery Guru Details area provides specific information you will need as you follow the recovery steps.

Caution: Electrostatic discharge can damage sensitive components. Always use proper antistatic protection when handling components. Touching components without proper grounding may damage the equipment.

Important Notes

Recovery Steps

1

Fix any other problems reported by the Recovery Guru before attempting to fix this problem.

2

If...

Then...

The affected tray listed in the Recovery Guru Details area contains both controllers and drives

Go to step 7.

The affected tray listed in the Recovery Guru Details area contains only drives

Go to step 3.

3

To determine the non-working channel, start at the drive port on the controller tray corresponding to the working channel (refer to the labels on the back of the controller tray if needed). Trace the cable from the working channel to the ESM canister in the affected drive tray reported in the details area.

Caution: Possible loss of data accessibility. Do not disconnect any cables on the working channel. Doing so may cause a loss of data accessibility.

4

Locate the other ESM canister in the affected drive tray (this is the canister on the non-working channel).

5

Replace the ESM canister on the non-working channel using the following steps:

a

Label the interface transceivers (GBICs or SFPs). The labels will help you correctly reconnect the cables to the new ESM canister.

While the cables are still connected, remove the interface transceivers from the ESM canister you are replacing.

b

Remove the ESM canister.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component has not failed. In this situation, it is acceptable to remove the ESM canister even though the Service Action Allowed status is NO.

c

Set all switches on the new ESM canister to the same values as the old ESM canister.

d

Insert the new ESM canister into the drive tray.

e

Using the labels created in step a, reconnect the cables to the replaced canister. Wait 40 seconds, then go to step 6.

6

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 7.

The problem has not been fixed

Go to step 7.
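
Steps 2 and 6 are the two branch points in the first part of this procedure. A minimal sketch of the routing follows; the function names are hypothetical, and `None` stands for "finished".

```python
def next_step_after_step_2(tray_has_controllers_and_drives):
    """Step 2: skip the ESM replacement when the tray also holds controllers."""
    return 7 if tray_has_controllers_and_drives else 3


def next_step_after_step_6(problem_fixed):
    """Step 6: stop if the Recheck is clean, otherwise replace the drive."""
    return None if problem_fixed else 7


# A drive-only tray where replacing the ESM canister did not help:
print(next_step_after_step_2(False), next_step_after_step_6(False))
```

Both branches converge on step 7, so the drive replacement is reached either directly or after the ESM canister has been ruled out.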

7

You must replace the drive. Which procedure you use depends on the RAID level of the volume group associated with the affected drive. To determine the associated volume group, highlight the affected drive in the Physical View of the Array Management Window and select View >> Associated Elements. Next highlight the associated volume group in the Logical View of the Array Management Window.

If...

Then...

The volume group is RAID 0

Go to "Recovery Steps for Replacing a Drive in a RAID 0 Volume Group."

The volume group is RAID 1, 3, or 5

Go to "Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group."

Recovery Steps for Replacing a Drive in a RAID 0 Volume Group

Use the following procedure if the affected volume group is RAID 0.

Fix any other problems reported by the Recovery Guru before continuing with this procedure. Note that all volumes in the Logical View of the Array Management Window must be Optimal.

1

Stop all I/O to the affected volumes.

2

Reseating the drive may clear up the path redundancy problem. Remove the drive and then re-insert it.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component has not failed. In this situation, it is acceptable to remove the drive even though the Service Action Allowed status is NO.

3

Wait 40 seconds, and then click the Recheck button to rerun the Recovery Guru to ensure that the problem has been fixed.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 4.

The problem has not been fixed

Go to step 4.

4

Back up all data on the affected volumes. (Step 7 will destroy all data on the affected volumes.)

Note: To the operating system (OS), a failed volume is the same as a failed non-RAID drive. Refer to the OS documentation for requirements concerning failed drives and apply them where necessary.

5

If any of the affected volumes are also source or target volumes in a copy operation that is either Pending or In Progress, you must stop the copy operation before continuing.

Go to the Copy Manager by selecting Volume >> Copy >> Copy Manager, then highlight each copy pair that contains an affected volume and select Copy >> Stop.

6

If you have snapshot volumes associated with the affected volumes, these snapshot volumes will no longer be valid once you fail the drive in step 7.

If necessary, perform any operations on the snapshot volumes and then delete them.

7

Caution: Possible loss of data accessibility. Transitioning volumes to failed may cause the loss of accessibility to data on the volumes. Make sure that you back up all data on the affected volumes before starting this step.

Highlight the affected drive in the Physical View of the Array Management Window and select Advanced >> Recovery >> Fail Drive. The affected volumes become Failed.

8

Remove the failed drive (its fault indicator light should be on).

Note: Make sure the replacement drive has a capacity equal to or greater than the failed drive.

9

Wait 30 seconds, then insert the new drive. Its fault indicator light may be lit for a short time (one minute or less).

Note: Wait until the replaced drive is ready (its fault indicator light must be off) before attempting to initialize the volumes in step 10.

10

Highlight the volume group associated with the replaced drive in the Logical View of the Array Management Window and select Advanced >> Recovery >> Initialize >> Volume Group.

  • The volumes in the volume group are initialized, one at a time.
  • To monitor initialization progress for a volume, highlight the volume in the Logical View of the Array Management Window and select Volume >> Properties. Note that when the initialization is completed, the progress bar is no longer displayed.
  • When initialization is completed, all volumes in the volume group are Optimal.

Important: Make sure you save this procedure by selecting Save As. Once you fix the failure, you will not be able to access the information from Recovery Guru.

11

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed.

a

If desired, create any snapshot volumes that you deleted in step 6.

b

If desired, re-create any copies you stopped by highlighting the copy pairs in the Copy Manager and selecting Copy >> Re-Copy.

c

Add the affected volumes back to the operating system. You may need to reboot the system to see the re-initialized volumes.

Note: Do not start I/O to these volumes until you have restored data from backup.

d

Restore the data for the affected volumes from backup.

e

You are finished with this procedure.

The problem has not been fixed.

There is a problem with the controller. Go to "Recovery Steps for Replacing a Controller."

Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group

Use the following procedure if the affected volume group is RAID 1, 3, or 5.

1

You should stop all I/O to all volumes in the volume group associated with the affected drive to reduce the possibility of data loss. If another drive fails in this volume group while you are performing this procedure, you will lose data.

2

Reseating the drive may clear up the path redundancy problem. Remove the drive and then re-insert it.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component has not failed. In this situation, it is acceptable to remove the drive even though the Service Action Allowed status is NO.

3

Wait 40 seconds, and then click the Recheck button to rerun the Recovery Guru to ensure that the problem has been fixed.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 4.

The problem has not been fixed

Go to step 4.

4

Although not required, you should back up all data on all volumes associated with the affected drive.

5

Highlight the affected drive in the Physical View of the Array Management Window and select Advanced >> Recovery >> Fail Drive. The associated volumes become Degraded.

6

Remove the failed drive (its fault indicator light should be on).

Note: Make sure the replacement drive has a capacity equal to or greater than the failed drive.

7

Wait 30 seconds, then insert the new drive. Its fault indicator light may be lit for a short time (one minute or less).

8

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed.

You are finished with this procedure.

The problem has not been fixed.

There is a problem with the controller. Go to "Recovery Steps for Replacing a Controller."

Recovery Steps for Replacing a Controller

Important: The controller replacement recovery steps should only be attempted after ALL other options have been exhausted.

Use the following procedure to replace a controller to resolve a loss of path redundancy condition.

If...

Then...

Your storage array has one controller

Go to "Replacing a Controller in a Single-Controller Storage Array."

Your storage array has two controllers

Go to "Replacing a Controller in a Dual-Controller Storage Array."

Replacing a Controller in a Single-Controller Storage Array

1

Ensure that your replacement controller matches the controller in the storage array. If you do not have a controller with the appropriate replacement part number, contact your technical support representative.

2

Stop all I/O to this storage array.

3

Turn off power to the affected tray.

4

Remove the affected controller. Refer to the Enterprise Management Window (EMW) to view which management method you are using to manage this storage array.

If...

Then...

You are using In-Band management for ALL hosts attached to this storage array

Go to step 5.

You are using Out-of-Band management for ANY host attached to this storage array

Before you insert a new controller canister into the storage array, you must update the DHCP/BOOTP server so that it will associate the new controller's hardware Ethernet (MAC) address with the DNS/network name and IP address previously assigned to the removed controller.

To update the DHCP/BOOTP server, find the entry associated with the removed controller and replace its Ethernet (MAC) address with the new controller's Ethernet (MAC) address. The controller's Ethernet (MAC) address is located on an Ethernet ID label on the controller canister in the form xx.xx.xx.xx.xx.xx.

When you are finished, go to step 5.

5

If...

Then...

The controller for this storage array is located in a tray containing both controllers and drives

Check to see if the new controller canister contains a battery.
  • If your model of storage array does not contain batteries, go to step 6.
  • If your model of storage array is supposed to contain batteries and...
    • there is not a battery installed in the new controller canister, then install the battery from the old canister, and go to step 6.
    • there is a battery installed in the new controller canister, then go to step 6.

The controller for this storage array is located in a tray containing only controllers

Go to step 6.

6

a

Make sure at least one minute has elapsed. Then, insert the new controller canister firmly in place.

b

Turn on power to the affected tray.

c

Note the controller slot (A or B) of the affected controller listed in the Recovery Guru Details area. Highlight this controller slot in the Physical View of the Array Management Window (AMW).

d

If...

Then...

The controller indicates that it is Online

Go to step e.

The controller indicates that it is Offline

Select Advanced >> Recovery >> Place Controller >> Online and then go to step e.

e

If...

Then...

The controller for this storage array is located in a tray containing both controllers and drives

Determine whether you need to reset the battery age.
  • If your model of storage array does not contain batteries, go to step 7.
  • If your model of storage array is supposed to contain batteries and...
    • you installed the battery from the old controller canister, then you do not need to reset the battery age. Go to step 7.
    • there was already a battery in the replacement controller canister, then you must reset the battery age using the following procedure:

      Select the Components button on the tray containing the controllers in the Physical View of the Array Management Window. Highlight the batteries option and select the Reset button associated with the new controller canister (A or B). Then, go to step 7.

The controller for this storage array is located in a tray containing only controllers

Go to step 7.

7

If you have volumes mapped to hosts that have Automatic Volume Transfer (AVT) disabled, it may be necessary to redistribute the volumes to their preferred controller. Use the following steps to determine the AVT status of the hosts connected to your storage array:

a

Open the Storage Array Profile by selecting the Storage Array >> View Profile menu option from the Array Management Window. Then, select the profile's Mappings tab.

b

Scroll to the NVSRAM Host Type Internal Definitions section.

If... Then...
There are hosts mapped to the volumes on this storage array that have an AVT status of disabled

OR

There are hosts mapped to the volumes on this storage array that are not running a host-based, multi-path failover driver

It may be necessary to redistribute the volumes to their preferred controller. If the Array Management Window's Advanced >> Recovery >> Redistribute Volumes menu option is available, select the option.

Note: If you have a mix of hosts with AVT enabled and AVT disabled, all volumes will be immediately assigned back to their preferred path. However, until the host-based multi-path failover driver detects the valid preferred path (which may take several minutes), the volumes mapped to the AVT-enabled hosts may temporarily return to the non-preferred path.

If the menu option is not available (grayed out), the volumes are already associated with their preferred controllers and no action is needed.

Go to step 8.

There are NO hosts mapped to the volumes on this storage array with an AVT status of disabled

OR

All hosts mapped to volumes on this storage array are running a host-based, multi-path failover driver

No action is required.

If volumes need to be redistributed to their preferred controller, the host-based, multi-path failover driver will automatically initiate the transfer.

Note that detection of a restored preferred path by the multi-path failover driver can take several minutes.

Go to step 8.
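The decision table in step 7 can be sketched in Python as follows. This is only an illustration, not part of the storage management software: the `Host` record and its fields are hypothetical, and the values would have to be read by hand from the profile's Mappings tab and from the hosts themselves.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Host:
    # Hypothetical record; fields are filled in manually from the array profile.
    name: str
    avt_enabled: bool          # AVT status shown for the host's type
    has_failover_driver: bool  # host runs a host-based, multi-path failover driver


def needs_manual_redistribution(hosts: Iterable[Host]) -> bool:
    """Return True if any mapped host has AVT disabled or lacks a failover driver."""
    return any(not h.avt_enabled or not h.has_failover_driver for h in hosts)


hosts = [Host("db01", True, True), Host("legacy01", False, False)]
if needs_manual_redistribution(hosts):
    print("Check Advanced >> Recovery >> Redistribute Volumes")
else:
    print("No action required")
```

A True result only means the Redistribute Volumes menu option should be checked; if the option is grayed out, the volumes are already on their preferred controllers.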

8

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area. If the failure appears again, contact your technical support representative.

Replacing a Controller in a Dual-Controller Storage Array

1

Determine which is the affected controller by locating the non-working channel. Refer to step 3 at the beginning of this recovery procedure for details on how to locate the non-working channel.

2

Place the affected controller offline.

a

Highlight the affected controller in the Physical View of the Array Management Window.

b

Select Advanced >> Recovery >> Place Controller >> Offline.

c

Select Yes in the Place Offline confirmation window.

d

Go to step 3.

3

Read all of the following steps before taking any action.

a

Click the Recheck button to rerun the Recovery Guru.

b

Select the Offline Controller problem that is being reported in the Summary area.

c

Complete the Recovery Steps for the Offline Controller problem to replace the controller.

4

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area. If the failure appears again, contact your technical support representative.


End of Failure Entry 4

Failure Entry 5: NO_REDUNDANCY_DRIVE-Recovery Failure Type Code: 32

Storage array: PSI_CMS_T3_CE5400
Component reporting problem: Drive in slot 7
Status: Optimal
Location: Drive tray 0, Drawer 2
Component requiring service: Drive in slot 7
Service action (removal) allowed: No
Service action LED on component: Yes
Working channel: 0

Drive - Loss of Path Redundancy

What Caused the Problem?

A communication path with a drive has been lost. The Recovery Guru Details area provides specific information you will need as you follow the recovery steps.

Caution: Electronic discharge can damage sensitive components. Always use proper antistatic protection when handling components. Touching components without using a proper ground may damage the equipment.

Important Notes

Recovery Steps

1

Fix any other problems reported by the Recovery Guru before attempting to fix this problem.

2

If...

Then...

The affected tray listed in the Recovery Guru Details area contains both controllers and drives

Go to step 7.

The affected tray listed in the Recovery Guru Details area contains only drives

Go to step 3.

3

To determine the non-working channel, start at the drive port on the controller tray corresponding to the working channel (refer to the labels on the back of the controller tray if needed). Trace the cable from the working channel to the ESM canister in the affected drive tray reported in the details area.

Caution: Possible loss of data accessibility. Do not disconnect any cables on the working channel. Doing so may cause a loss of data accessibility.

4

Locate the other ESM canister in the affected drive tray (this is the canister on the non-working channel).

5

Replace the ESM canister on the non-working channel using the following steps:

a

Label the interface transceivers (GBICs or SFPs). The labels will help you correctly reconnect the cables to the new ESM canister.

While the cables are still connected, remove the interface transceivers from the ESM canister you are replacing.

b

Remove the ESM canister.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component is not failed. In this situation, it is acceptable to remove the ESM canister even though the Service Action Allowed status is NO.

c

Set all switches on the new ESM canister to the same values as the old ESM canister.

d

Insert the new ESM canister into the drive tray.

e

Using the labels created in step a, reconnect the cables to the replaced canister. Wait 40 seconds, then go to step 6.

6

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 7.

The problem has not been fixed

Go to step 7.

7

You must replace the drive. Which procedure you use depends on the RAID level of the volume group associated with the affected drive. To determine the associated volume group, highlight the affected drive in the Physical View of the Array Management Window and select View >> Associated Elements. Next highlight the associated volume group in the Logical View of the Array Management Window.

If...

Then...

The volume group is RAID 0

Go to "Recovery Steps for Replacing a Drive in a RAID 0 Volume Group."

The volume group is RAID 1, 3, or 5

Go to "Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group."
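The RAID-level branch in step 7 can be written as a small lookup. The Python sketch below is illustrative only; the function name is hypothetical, and it covers just the RAID levels this procedure names.

```python
def drive_replacement_procedure(raid_level: int) -> str:
    """Map the volume group's RAID level to the procedure title used in this document."""
    if raid_level == 0:
        # RAID 0 has no redundancy: failing the drive fails the volumes.
        return "Recovery Steps for Replacing a Drive in a RAID 0 Volume Group"
    if raid_level in (1, 3, 5):
        # Redundant levels: failing the drive leaves the volumes Degraded.
        return "Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group"
    raise ValueError(f"RAID level {raid_level} is not covered by this procedure")


print(drive_replacement_procedure(5))
```

Raising an error for any other level is a deliberate choice: the procedure gives no guidance for levels outside 0, 1, 3, and 5, so the sketch does not guess.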

Recovery Steps for Replacing a Drive in a RAID 0 Volume Group

Use the following procedure if the affected volume group is RAID 0.

Fix any other problems reported by the Recovery Guru before continuing with this procedure. Note that all volumes in the Logical View of the Array Management Window must be Optimal.

1

Stop all I/O to the affected volumes.

2

Reseating the drive may clear up the path redundancy problem. Remove the drive and then re-insert it.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component is not failed. In this situation, it is acceptable to remove the drive even though the Service Action Allowed status is NO.

3

Wait 40 seconds, and then click the Recheck button to rerun the Recovery Guru to ensure that the problem has been fixed.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 4.

The problem has not been fixed

Go to step 4.

4

Back up all data on the affected volumes. (Step 7 will destroy all data on the affected volumes.)

Note: To the operating system (OS), a failed volume is the same as a failed non-RAID drive. Refer to the OS documentation for requirements concerning failed drives and apply them where necessary.

5

If any of the affected volumes are also source or target volumes in a copy operation that is either Pending or In Progress, you must stop the copy operation before continuing.

Go to the Copy Manager by selecting Volume >> Copy >> Copy Manager, then highlight each copy pair that contains an affected volume and select Copy >> Stop.

6

If you have snapshot volumes associated with the affected volumes, these snapshot volumes will no longer be valid once you fail the drive in step 7.

If necessary, perform any operations on the snapshot volumes and then delete them.

7

Caution: Possible loss of data accessibility. Transitioning volumes to failed may cause the loss of accessibility to data on the volumes. Make sure that you back up all data on the affected volumes before starting this step.

Highlight the affected drive in the Physical View of the Array Management Window and select Advanced >> Recovery >> Fail Drive. The affected volumes become Failed.

8

Remove the failed drive (its fault indicator light should be on).

Note: Make sure the replacement drive has a capacity equal to or greater than the failed drive.

9

Wait 30 seconds, then insert the new drive. Its fault indicator light may be lit for a short time (one minute or less).

Note: Wait until the replaced drive is ready (its fault indicator light must be off) before attempting to initialize the volumes in step 10.

10

Highlight the volume group associated with the replaced drive in the Logical View of the Array Management Window and select Advanced >> Recovery >> Initialize >> Volume Group.

  • The volumes in the volume group are initialized, one at a time.
  • To monitor initialization progress for a volume, highlight the volume in the Logical View of the Array Management Window and select Volume >> Properties. Note that when the initialization is completed, the progress bar is no longer displayed.
  • When initialization is completed, all volumes in the volume group are Optimal.

Important: Make sure you save this procedure by selecting Save As. Once you fix the failure, you will not be able to access the information from Recovery Guru.

11

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed.

a

If desired, create any snapshot volumes that you deleted in step 6.

b

If desired, re-create any copies you stopped by highlighting the copy pairs in the Copy Manager and selecting Copy >> Re-Copy.

c

Add the affected volumes back to the operating system. You may need to reboot the system to see the re-initialized volumes.

Note: Do not start I/O to these volumes until you have restored data from backup.

d

Restore the data for the affected volumes from backup.

e

You are finished with this procedure.

The problem has not been fixed.

There is a problem with the controller. Go to "Recovery Steps for Replacing a Controller."

Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group

Use the following procedure if the affected volume group is RAID 1, 3, or 5.

1

You should stop all I/O to all volumes in the volume group associated with the affected drive to reduce the possibility of data loss. If another drive fails in this volume group while you are performing this procedure, you will lose data.

2

Reseating the drive may clear up the path redundancy problem. Remove the drive and then re-insert it.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component is not failed. In this situation, it is acceptable to remove the drive even though the Service Action Allowed status is NO.

3

Wait 40 seconds, and then click the Recheck button to rerun the Recovery Guru to ensure that the problem has been fixed.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 4.

The problem has not been fixed

Go to step 4.

4

Although not required, you should back up all data on all volumes associated with the affected drive.

5

Highlight the affected drive in the Physical View of the Array Management Window and select Advanced >> Recovery >> Fail Drive. The associated volumes become Degraded.

6

Remove the failed drive (its fault indicator light should be on).

Note: Make sure the replacement drive has a capacity equal to or greater than the failed drive.

7

Wait 30 seconds, then insert the new drive. Its fault indicator light may be lit for a short time (one minute or less).

8

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed.

You are finished with this procedure.

The problem has not been fixed.

There is a problem with the controller. Go to "Recovery Steps for Replacing a Controller."

Recovery Steps for Replacing a Controller

Important: The controller replacement recovery steps should only be attempted after ALL other options have been exhausted.

Use the following procedure to replace a controller to resolve a loss of path redundancy condition.

If...

Then...

Your storage array has one controller

Go to "Replacing a Controller in a Single-Controller Storage Array."

Your storage array has two controllers

Go to "Replacing a Controller in a Dual-Controller Storage Array."

Replacing a Controller in a Single-Controller Storage Array

1

Ensure that your replacement controller matches the controller in the storage array. If you do not have a controller with the appropriate replacement part number, contact your technical support representative.

2

Stop all I/O to this storage array.

3

Turn off power to the affected tray.

4

Remove the affected controller. Refer to the Enterprise Management Window (EMW) to view which management method you are using to manage this storage array.

If...

Then...

You are using In-Band management for ALL hosts attached to this storage array

Go to step 5.

You are using Out-of-Band management for ANY host attached to this storage array

Before you insert a new controller canister into the storage array, you must update the DHCP/BOOTP server so that it will associate the new controller's hardware Ethernet (MAC) address with the DNS/network name and IP address previously assigned to the removed controller.

To update the DHCP/BOOTP server, find the entry associated with the removed controller and replace its Ethernet (MAC) address with the new controller's Ethernet (MAC) address. The controller's Ethernet (MAC) address is located on an Ethernet ID label on the controller canister in the form xx.xx.xx.xx.xx.xx.

When you are finished, go to step 5.
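As a rough illustration of the DHCP/BOOTP update, the Python sketch below converts a MAC address from the dotted label form (xx.xx.xx.xx.xx.xx) to colon-separated form and prints a host entry. It assumes an ISC dhcpd-style host declaration; your DHCP/BOOTP server may use a different format. The host name, MAC address, and IP address are hypothetical placeholders.

```python
import re


def label_mac_to_colon(label: str) -> str:
    """Convert 'xx.xx.xx.xx.xx.xx' (as printed on the canister label) to 'xx:xx:xx:xx:xx:xx'."""
    parts = label.strip().split(".")
    if len(parts) != 6 or not all(re.fullmatch(r"[0-9A-Fa-f]{2}", p) for p in parts):
        raise ValueError(f"unexpected MAC label format: {label!r}")
    return ":".join(p.lower() for p in parts)


def dhcpd_host_entry(name: str, label_mac: str, ip: str) -> str:
    """Build a host declaration in ISC dhcpd syntax (an assumption about the server in use)."""
    return (
        f"host {name} {{\n"
        f"  hardware ethernet {label_mac_to_colon(label_mac)};\n"
        f"  fixed-address {ip};\n"
        f"}}"
    )


# Hypothetical values: keep the name and IP of the removed controller,
# and substitute the MAC address from the new controller's label.
print(dhcpd_host_entry("array-ctrl-a", "00.A0.B8.12.34.56", "192.0.2.10"))
```

Only the MAC address changes in the entry; the name and IP address stay as they were for the removed controller.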

5

If...

Then...

The controller for this storage array is located in a tray containing both controllers and drives

Check to see if the new controller canister contains a battery.
  • If your model of storage array does not contain batteries, go to step 6.
  • If your model of storage array is supposed to contain batteries and...
    • there is not a battery installed in the new controller canister, then install the battery from the old canister, and go to step 6.
    • there is a battery installed in the new controller canister, then go to step 6.

The controller for this storage array is located in a tray containing only controllers

Go to step 6.

6

a

Make sure at least one minute has elapsed. Then, insert the new controller canister firmly in place.

b

Turn on power to the affected tray.

c

Note the controller slot (A or B) of the affected controller listed in the Recovery Guru Details area. Highlight this controller slot in the Physical View of the Array Management Window (AMW).

d

If...

Then...

The controller indicates that it is Online

Go to step e.

The controller indicates that it is Offline

Select Advanced >> Recovery >> Place Controller >> Online, and then go to step e.

e

If...

Then...

The controller for this storage array is located in a tray containing both controllers and drives

Determine whether you need to reset the battery age.
  • If your model of storage array does not contain batteries, go to step 7.
  • If your model of storage array is supposed to contain batteries and...
    • you installed the battery from the old controller canister, then you do not need to reset the battery age. Go to step 7.
    • there was already a battery in the replacement controller canister, then you must reset the battery age using the following procedure:

      Select the Components button on the tray containing the controllers in the Physical View of the Array Management Window. Highlight the batteries option and select the Reset button associated with the new controller canister (A or B). Then, go to step 7.

The controller for this storage array is located in a tray containing only controllers

Go to step 7.
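The battery-age decision in step 6e reduces to three yes/no questions. The Python sketch below is only an illustration; the function and parameter names are hypothetical, and the answers come from inspecting the hardware, not from any software interface.

```python
def must_reset_battery_age(
    tray_has_drives: bool,
    model_uses_batteries: bool,
    battery_moved_from_old_canister: bool,
) -> bool:
    """Return True only when the replacement canister arrived with its own battery."""
    if not tray_has_drives:
        # Controller-only tray: the procedure goes straight to step 7.
        return False
    if not model_uses_batteries:
        return False
    # A battery carried over from the old canister keeps its existing age.
    return not battery_moved_from_old_canister


print(must_reset_battery_age(True, True, False))
```

In every case the procedure then continues at step 7; the result only determines whether the Reset button is used first.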

7

If you have volumes mapped to hosts that have Automatic Volume Transfer (AVT) disabled, it may be necessary to redistribute the volumes to their preferred controller. Use the following steps to determine the AVT status of the hosts connected to your storage array:

a

Open the Storage Array Profile by selecting the Storage Array >> View Profile menu option from the Array Management Window. Then, select the profile's Mappings tab.

b

Scroll to the NVSRAM Host Type Internal Definitions section.

If... Then...
There are hosts mapped to the volumes on this storage array that have an AVT status of disabled

OR

There are hosts mapped to the volumes on this storage array that are not running a host-based, multi-path failover driver

It may be necessary to redistribute the volumes to their preferred controller. If the Array Management Window's Advanced >> Recovery >> Redistribute Volumes menu option is available, select the option.

Note: If you have a mix of hosts with AVT enabled and AVT disabled, all volumes will be immediately assigned back to their preferred path. However, until the host-based multi-path failover driver detects the valid preferred path (which may take several minutes), the volumes mapped to the AVT-enabled hosts may temporarily return to the non-preferred path.

If the menu option is not available (grayed out), the volumes are already associated with their preferred controllers and no action is needed.

Go to step 8.

There are NO hosts mapped to the volumes on this storage array with an AVT status of disabled

OR

All hosts mapped to volumes on this storage array are running a host-based, multi-path failover driver

No action is required.

If volumes need to be redistributed to their preferred controller, the host-based, multi-path failover driver will automatically initiate the transfer.

Note that detection of a restored preferred path by the multi-path failover driver can take several minutes.

Go to step 8.

8

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area. If the failure appears again, contact your technical support representative.

Replacing a Controller in a Dual-Controller Storage Array

1

Determine which is the affected controller by locating the non-working channel. Refer to step 3 at the beginning of this recovery procedure for details on how to locate the non-working channel.

2

Place the affected controller offline.

a

Highlight the affected controller in the Physical View of the Array Management Window.

b

Select Advanced >> Recovery >> Place Controller >> Offline.

c

Select Yes in the Place Offline confirmation window.

d

Go to step 3.

3

Read all of the following steps before taking any action.

a

Click the Recheck button to rerun the Recovery Guru.

b

Select the Offline Controller problem that is being reported in the Summary area.

c

Complete the Recovery Steps for the Offline Controller problem to replace the controller.

4

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area. If the failure appears again, contact your technical support representative.


End of Failure Entry 5

Failure Entry 6: NO_REDUNDANCY_DRIVE-Recovery Failure Type Code: 32

Storage array: PSI_CMS_T3_CE5400
Component reporting problem: Drive in slot 4
Status: Optimal
Location: Drive tray 0, Drawer 2
Component requiring service: Drive in slot 4
Service action (removal) allowed: No
Service action LED on component: Yes
Working channel: 0
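When several failure entries share the same failure type, it can help to pull the header fields out of a saved copy of this output and compare them. The Python sketch below is a minimal illustration that assumes each header line has the "Name: value" shape shown above; the function name is hypothetical.

```python
def parse_entry_header(text: str) -> dict:
    """Split 'Name: value' lines into a dictionary, skipping lines without a value."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


header = """Storage array: PSI_CMS_T3_CE5400
Component reporting problem: Drive in slot 4
Status: Optimal
Location: Drive tray 0, Drawer 2
Component requiring service: Drive in slot 4
Service action (removal) allowed: No
Service action LED on component: Yes
Working channel: 0"""

fields = parse_entry_header(header)
print(fields["Component reporting problem"], "| working channel", fields["Working channel"])
```

The working channel field is the one the recovery steps depend on, since step 3 starts from the working channel to find the non-working one.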

Drive - Loss of Path Redundancy

What Caused the Problem?

A communication path with a drive has been lost. The Recovery Guru Details area provides specific information you will need as you follow the recovery steps.

Caution: Electronic discharge can damage sensitive components. Always use proper antistatic protection when handling components. Touching components without using a proper ground may damage the equipment.

Important Notes

Recovery Steps

1

Fix any other problems reported by the Recovery Guru before attempting to fix this problem.

2

If...

Then...

The affected tray listed in the Recovery Guru Details area contains both controllers and drives

Go to step 7.

The affected tray listed in the Recovery Guru Details area contains only drives

Go to step 3.

3

To determine the non-working channel, start at the drive port on the controller tray corresponding to the working channel (refer to the labels on the back of the controller tray if needed). Trace the cable from the working channel to the ESM canister in the affected drive tray reported in the details area.

Caution: Possible loss of data accessibility. Do not disconnect any cables on the working channel. Doing so may cause a loss of data accessibility.

4

Locate the other ESM canister in the affected drive tray (this is the canister on the non-working channel).

5

Replace the ESM canister on the non-working channel using the following steps:

a

Label the interface transceivers (GBICs or SFPs). The labels will help you correctly reconnect the cables to the new ESM canister.

While the cables are still connected, remove the interface transceivers from the ESM canister you are replacing.

b

Remove the ESM canister.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component is not failed. In this situation, it is acceptable to remove the ESM canister even though the Service Action Allowed status is NO.

c

Set all switches on the new ESM canister to the same values as the old ESM canister.

d

Insert the new ESM canister into the drive tray.

e

Using the labels created in step a, reconnect the cables to the replaced canister. Wait 40 seconds, then go to step 6.

6

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 7.

The problem has not been fixed

Go to step 7.

7

You must replace the drive. Which procedure you use depends on the RAID level of the volume group associated with the affected drive. To determine the associated volume group, highlight the affected drive in the Physical View of the Array Management Window and select View >> Associated Elements. Next highlight the associated volume group in the Logical View of the Array Management Window.

If...

Then...

The volume group is RAID 0

Go to "Recovery Steps for Replacing a Drive in a RAID 0 Volume Group."

The volume group is RAID 1, 3, or 5

Go to "Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group."

Recovery Steps for Replacing a Drive in a RAID 0 Volume Group

Use the following procedure if the affected volume group is RAID 0.

Fix any other problems reported by the Recovery Guru before continuing with this procedure. Note that all volumes in the Logical View of the Array Management Window must be Optimal.

1

Stop all I/O to the affected volumes.

2

Reseating the drive may clear up the path redundancy problem. Remove the drive and then re-insert it.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component is not failed. In this situation, it is acceptable to remove the drive even though the Service Action Allowed status is NO.

3

Wait 40 seconds, and then click the Recheck button to rerun the Recovery Guru to ensure that the problem has been fixed.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 4.

The problem has not been fixed

Go to step 4.

4

Back up all data on the affected volumes. (Step 7 will destroy all data on the affected volumes.)

Note: To the operating system (OS), a failed volume is the same as a failed non-RAID drive. Refer to the OS documentation for requirements concerning failed drives and apply them where necessary.

5

If any of the affected volumes are also source or target volumes in a copy operation that is either Pending or In Progress, you must stop the copy operation before continuing.

Go to the Copy Manager by selecting Volume >> Copy >> Copy Manager, then highlight each copy pair that contains an affected volume and select Copy >> Stop.

6

If you have snapshot volumes associated with the affected volumes, these snapshot volumes will no longer be valid once you fail the drive in step 7.

If necessary, perform any operations on the snapshot volumes and then delete them.

7

Caution: Possible loss of data accessibility. Transitioning volumes to failed may cause the loss of accessibility to data on the volumes. Make sure that you back up all data on the affected volumes before starting this step.

Highlight the affected drive in the Physical View of the Array Management Window and select Advanced >> Recovery >> Fail Drive. The affected volumes become Failed.

8

Remove the failed drive (its fault indicator light should be on).

Note: Make sure the replacement drive has a capacity equal to or greater than the failed drive.

9

Wait 30 seconds, then insert the new drive. Its fault indicator light may be lit for a short time (one minute or less).

Note: Wait until the replaced drive is ready (its fault indicator light must be off) before attempting to initialize the volumes in step 10.

10

Highlight the volume group associated with the replaced drive in the Logical View of the Array Management Window and select Advanced >> Recovery >> Initialize >> Volume Group.

  • The volumes in the volume group are initialized, one at a time.
  • To monitor initialization progress for a volume, highlight the volume in the Logical View of the Array Management Window and select Volume >> Properties. Note that when the initialization is completed, the progress bar is no longer displayed.
  • When initialization is completed, all volumes in the volume group are Optimal.

Important: Make sure you save this procedure by selecting Save As. Once you fix the failure, you will not be able to access the information from Recovery Guru.

11

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed.

a

If desired, create any snapshot volumes that you deleted in step 6.

b

If desired, re-create any copies you stopped by highlighting the copy pairs in the Copy Manager and selecting Copy >> Re-Copy.

c

Add the affected volumes back to the operating system. You may need to reboot the system to see the re-initialized volumes.

Note: Do not start I/O to these volumes until you have restored data from backup.

d

Restore the data for the affected volumes from backup.

e

You are finished with this procedure.

The problem has not been fixed.

There is a problem with the controller. Go to "Recovery Steps for Replacing a Controller."

Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group

Use the following procedure if the affected volume group is RAID 1, 3, or 5.

1

You should stop all I/O to all volumes in the volume group associated with the affected drive to reduce the possibility of data loss. If another drive fails in this volume group while you are performing this procedure, you will lose data.

2

Reseating the drive may clear up the path redundancy problem. Remove the drive and then re-insert it.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component is not failed. In this situation, it is acceptable to remove the drive even though the Service Action Allowed status is NO.

3

Wait 40 seconds, and then click the Recheck button to rerun the Recovery Guru to ensure that the problem has been fixed.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 4.

The problem has not been fixed

Go to step 4.

4

Although not required, you should back up all data on all volumes associated with the affected drive.

5

Highlight the affected drive in the Physical View of the Array Management Window and select Advanced >> Recovery >> Fail Drive. The associated volumes become Degraded.

6

Remove the failed drive (its fault indicator light should be on).

Note: Make sure the replacement drive has a capacity equal to or greater than the failed drive.

7

Wait 30 seconds, then insert the new drive. Its fault indicator light may be lit for a short time (one minute or less).

8

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed.

You are finished with this procedure.

The problem has not been fixed.

There is a problem with the controller. Go to "Recovery Steps for Replacing a Controller."

Recovery Steps for Replacing a Controller

Important: The controller replacement recovery steps should only be attempted after ALL other options have been exhausted.

Use the following procedure to replace a controller to resolve a loss of path redundancy condition.

If...

Then...

Your storage array has one controller

Go to "Replacing a Controller in a Single-Controller Storage Array."

Your storage array has two controllers

Go to "Replacing a Controller in a Dual-Controller Storage Array."

Replacing a Controller in a Single-Controller Storage Array

1

Ensure that your replacement controller matches the controller in the storage array. If you do not have a controller with the appropriate replacement part number, contact your technical support representative.

2

Stop all I/O to this storage array.

3

Turn off power to the affected tray.

4

Remove the affected controller. Refer to the Enterprise Management Window (EMW) to view which management method you are using to manage this storage array.

If...

Then...

You are using In-Band management for ALL hosts attached to this storage array

Go to step 5.

You are using Out-of-Band management for ANY host attached to this storage array

Before you insert a new controller canister into the storage array, you must update the DHCP/BOOTP server so that it will associate the new controller's hardware Ethernet (MAC) address with the DNS/network name and IP address previously assigned to the removed controller.

To update the DHCP/BOOTP server, find the entry associated with the removed controller and replace its Ethernet (MAC) address with the new controller's Ethernet (MAC) address. The controller's Ethernet (MAC) address is located on an Ethernet ID label on the controller canister in the form xx.xx.xx.xx.xx.xx.

When you are finished, go to step 5.

5

If...

Then...

The controller for this storage array is located in a tray containing both controllers and drives

Check to see if the new controller canister contains a battery.
  • If your model of storage array does not contain batteries, go to step 6.
  • If your model of storage array is supposed to contain batteries and...
    • there is not a battery installed in the new controller canister, then install the battery from the old canister, and go to step 6.
    • there is a battery installed in the new controller canister, then go to step 6.

The controller for this storage array is located in a tray containing only controllers

Go to step 6.

6

a

Make sure at least one minute has elapsed. Then, insert the new controller canister firmly in place.

b

Turn on power to the affected tray.

c

Note the controller slot (A or B) of the affected controller listed in the Recovery Guru Details area. Highlight this controller slot in the Physical View of the Array Management Window (AMW).

d

If...

Then...

The controller indicates that it is Online

Go to step e.

The controller indicates that it is Offline

Select Advanced >> Recovery >> Place Controller >> Online and then go to step e.

e

If...

Then...

The controller for this storage array is located in a tray containing both controllers and drives

Determine whether you need to reset the battery age.

  • If your model of storage array does not contain batteries, go to step 7.
  • If your model of storage array is supposed to contain batteries and...
    • you installed the battery from the old controller canister, then you do not need to reset the battery age. Go to step 7.
    • there was already a battery in the replacement controller canister, then you must reset the battery age using the following procedure:

      Select the Components button on the tray containing the controllers in the Physical View of the Array Management Window. Highlight the batteries option and select the Reset button associated with the new controller canister (A or B). Then, go to step 7.

The controller for this storage array is located in a tray containing only controllers

Go to step 7.

7

If you have volumes mapped to hosts that have Automatic Volume Transfer (AVT) disabled, it may be necessary to redistribute the volumes to their preferred controller. Use the following steps to determine the AVT status of the hosts connected to your storage array:

a

Open the Storage Array Profile by selecting the Storage Array >> View Profile menu option from the Array Management Window. Then, select the profile's Mappings tab.

b

Scroll to the NVSRAM Host Type Internal Definitions section.

If... Then...
There are hosts mapped to the volumes on this storage array that have an AVT status of disabled

OR

There are hosts mapped to the volumes on this storage array that are not running a host-based, multi-path failover driver

It may be necessary to redistribute the volumes to their preferred controller. If the Array Management Window's Advanced >> Recovery >> Redistribute Volumes menu option is available, select the option.

Note: If you have a mix of hosts with AVT enabled and AVT disabled, all volumes will be immediately assigned back to their preferred path. However, until the host-based multi-path failover driver detects the valid preferred path (may take several minutes), the volumes mapped to the AVT-enabled hosts may get temporarily returned back to the non-preferred path.

If the menu option is not available (grayed out), the volumes are already associated with their preferred controllers and no action is needed.

Go to step 8.

There are NO hosts mapped to the volumes on this storage array with an AVT status of disabled

OR

All hosts mapped to volumes on this storage array are running a host-based, multi-path failover driver

No action is required.

If volumes need to be redistributed to their preferred controller, the host-based, multi-path failover driver will automatically initiate the transfer.

Note that detection of a restored preferred path by the multi-path failover driver can take several minutes.

Go to step 8.
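The decision in step 7 can be expressed as a small check over the hosts listed in the profile. The sketch below is an assumption-laden illustration: the `Host` record and its fields are hypothetical and would have to be filled in by hand from the Mappings tab, since nothing here reads the Storage Array Profile directly.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """Hypothetical record for one host, transcribed from the Mappings tab."""
    name: str
    avt_enabled: bool
    has_failover_driver: bool


def may_need_redistribution(hosts: list[Host]) -> bool:
    """True if any host has AVT disabled or lacks a multi-path failover driver."""
    return any(not h.avt_enabled or not h.has_failover_driver for h in hosts)


# Example host list (names and values are made up).
hosts = [
    Host("hostA", avt_enabled=True, has_failover_driver=True),
    Host("hostB", avt_enabled=False, has_failover_driver=True),
]
if may_need_redistribution(hosts):
    print("Check Advanced >> Recovery >> Redistribute Volumes")
else:
    print("No action is required")
```

A True result only means the Redistribute Volumes menu option should be checked. If that option is grayed out, the volumes are already on their preferred controllers.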

8

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area. If the failure appears again, contact your technical support representative.

Replacing a Controller in a Dual-Controller Storage Array

1

Determine which is the affected controller by locating the non-working channel. Refer to step 3 at the beginning of this recovery procedure for details on how to locate the non-working channel.

2

Place the affected controller offline.

a

Highlight the affected controller in the Physical View of the Array Management Window.

b

Select Advanced >> Recovery >> Place Controller >> Offline.

c

Select Yes in the Place Offline confirmation window.

d

Go to step 3.

3

Read all of the following steps before taking any action.

a

Click the Recheck button to rerun the Recovery Guru.

b

Select the Offline Controller problem that is being reported in the Summary area.

c

Complete the Recovery Steps for the Offline Controller problem to replace the controller.

4

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area. If the failure appears again, contact your technical support representative.


End of Failure Entry 6

Failure Entry 7: NO_REDUNDANCY_DRIVE-Recovery Failure Type Code: 32

Storage array: PSI_CMS_T3_CE5400
Component reporting problem: Drive in slot 2
Status: Optimal
Location: Drive tray 0, Drawer 2
Component requiring service: Drive in slot 2
Service action (removal) allowed: No
Service action LED on component: Yes
Working channel: 0
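The header fields of each failure entry follow a simple "name: value" layout, which makes them easy to collect when several entries must be compared. The following is a minimal sketch under that assumption; the function name `parse_failure_header` is hypothetical and the code is not part of the Recovery Guru.

```python
def parse_failure_header(text: str) -> dict[str, str]:
    """Collect 'name: value' lines from a failure entry header into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values may contain commas or colons.
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


header = """Storage array: PSI_CMS_T3_CE5400
Component reporting problem: Drive in slot 2
Status: Optimal
Location: Drive tray 0, Drawer 2
Working channel: 0"""

fields = parse_failure_header(header)
print(fields["Location"], "| working channel", fields["Working channel"])
```

Lines without a colon, such as the problem title, are skipped rather than treated as errors.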

Drive - Loss of Path Redundancy

What Caused the Problem?

A communication path with a drive has been lost. The Recovery Guru Details area provides specific information you will need as you follow the recovery steps.

Caution: Electrostatic discharge can damage sensitive components. Always use proper antistatic protection when handling components. Touching components without using a proper ground may damage the equipment.

Important Notes

Recovery Steps

1

Fix any other problems reported by the Recovery Guru before attempting to fix this problem.

2

If...

Then...

The affected tray listed in the Recovery Guru Details area contains both controllers and drives

Go to step 7.

The affected tray listed in the Recovery Guru Details area contains only drives

Go to step 3.

3

To determine the non-working channel, start at the drive port on the controller tray corresponding to the working channel (refer to the labels on the back of the controller tray if needed). Trace the cable from the working channel to the ESM canister in the affected drive tray reported in the details area.

Caution: Possible loss of data accessibility. Do not disconnect any cables on the working channel. Doing so may cause a loss of data accessibility.

4

Locate the other ESM canister in the affected drive tray (this is the canister on the non-working channel).

5

Replace the ESM canister on the non-working channel using the following steps:

a

Label the interface transceivers (GBICs or SFPs). The labels will help you correctly reconnect the cables to the new ESM canister.

While the cables are still connected, remove the interface transceivers from the ESM canister you are replacing.

b

Remove the ESM canister.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component has not failed. In this situation, it is acceptable to remove the ESM canister even though the Service Action Allowed status is NO.

c

Set all switches on the new ESM canister to the same values as the old ESM canister.

d

Insert the new ESM canister into the drive tray.

e

Using the labels created in step a, reconnect the cables to the replaced canister. Wait 40 seconds, then go to step 6.

6

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 7.

The problem has not been fixed

Go to step 7.

7

You must replace the drive. Which procedure you use depends on the RAID level of the volume group associated with the affected drive. To determine the associated volume group, highlight the affected drive in the Physical View of the Array Management Window and select View >> Associated Elements. Next, highlight the associated volume group in the Logical View of the Array Management Window.

If...

Then...

The volume group is RAID 0

Go to "Recovery Steps for Replacing a Drive in a RAID 0 Volume Group."

The volume group is RAID 1, 3, or 5

Go to "Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group."
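The routing in step 7 amounts to a lookup on the RAID level. As a rough sketch (the function name `drive_replacement_procedure` is hypothetical, and only the RAID levels named in this procedure are handled):

```python
def drive_replacement_procedure(raid_level: int) -> tuple[str, str]:
    """Return (procedure title, volume state after Fail Drive) for a RAID level."""
    if raid_level == 0:
        # RAID 0 has no redundancy: failing the drive fails the volumes,
        # and data must be restored from backup afterwards.
        return ("Recovery Steps for Replacing a Drive in a RAID 0 Volume Group", "Failed")
    if raid_level in (1, 3, 5):
        # Redundant levels keep the volumes available in a degraded state.
        return ("Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group", "Degraded")
    raise ValueError(f"RAID level {raid_level} is not covered by this procedure")


title, state = drive_replacement_procedure(5)
print(f"{title} (volumes become {state})")
```

The second value is a reminder of why the RAID 0 path requires a full backup before the drive is failed, while the RAID 1, 3, or 5 path only recommends one.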

Recovery Steps for Replacing a Drive in a RAID 0 Volume Group

Use the following procedure if the affected volume group is RAID 0.

Fix any other problems reported by the Recovery Guru before continuing with this procedure. Note that all volumes in the Logical View of the Array Management Window must be Optimal.

1

Stop all I/O to the affected volumes.

2

Reseating the drive may clear up the path redundancy problem. Remove the drive and then re-insert it.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component has not failed. In this situation, it is acceptable to remove the drive even though the Service Action Allowed status is NO.

3

Wait 40 seconds, and then click the Recheck button to rerun the Recovery Guru to ensure that the problem has been fixed.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 4.

The problem has not been fixed

Go to step 4.

4

Back up all data on the affected volumes. (Step 7 will destroy all data on the affected volumes.)

Note: To the operating system (OS), a failed volume is the same as a failed non-RAID drive. Refer to the OS documentation for requirements concerning failed drives and apply them where necessary.

5

If any of the affected volumes are also source or target volumes in a copy operation that is either Pending or In Progress, you must stop the copy operation before continuing.

Go to the Copy Manager by selecting Volume >> Copy >> Copy Manager, then highlight each copy pair that contains an affected volume and select Copy >> Stop.

6

If you have snapshot volumes associated with the affected volumes, these snapshot volumes will no longer be valid once you fail the drive in step 7.

If necessary, perform any operations on the snapshot volumes and then delete them.

7

Caution: Possible loss of data accessibility. Transitioning volumes to failed may cause the loss of accessibility to data on the volumes. Make sure that you back up all data on the affected volumes before starting this step.

Highlight the affected drive in the Physical View of the Array Management Window and select Advanced >> Recovery >> Fail Drive. The affected volumes become Failed.

8

Remove the failed drive (its fault indicator light should be on).

Note: Make sure the replacement drive has a capacity equal to or greater than the failed drive.

9

Wait 30 seconds, then insert the new drive. Its fault indicator light may be lit for a short time (one minute or less).

Note: Wait until the replaced drive is ready (its fault indicator light must be off) before attempting to initialize the volumes in step 10.

10

Highlight the volume group associated with the replaced drive in the Logical View of the Array Management Window and select Advanced >> Recovery >> Initialize >> Volume Group.

  • The volumes in the volume group are initialized, one at a time.
  • To monitor initialization progress for a volume, highlight the volume in the Logical View of the Array Management Window and select Volume >> Properties. Note that when the initialization is completed, the progress bar is no longer displayed.
  • When initialization is completed, all volumes in the volume group are Optimal.

Important: Make sure you save this procedure by selecting Save As. Once you fix the failure, you will not be able to access the information from Recovery Guru.

11

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed.

a

If desired, re-create any snapshot volumes that you deleted in step 6.

b

If desired, re-create any copies you stopped by highlighting the copy pairs in the Copy Manager and selecting Copy >> Re-Copy.

c

Add the affected volumes back to the operating system. You may need to reboot the system to see the re-initialized volumes.

Note: Do not start I/O to these volumes until you have restored data from backup.

d

Restore the data for the affected volumes from backup.

e

You are finished with this procedure.

The problem has not been fixed.

There is a problem with the controller. Go to "Recovery Steps for Replacing a Controller."

Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group

Use the following procedure if the affected volume group is RAID 1, 3, or 5.

1

You should stop all I/O to all volumes in the volume group associated with the affected drive to reduce the possibility of data loss. If another drive fails in this volume group while you are performing this procedure, you will lose data.

2

Reseating the drive may clear up the path redundancy problem. Remove the drive and then re-insert it.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component has not failed. In this situation, it is acceptable to remove the drive even though the Service Action Allowed status is NO.

3

Wait 40 seconds, and then click the Recheck button to rerun the Recovery Guru to ensure that the problem has been fixed.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 4.

The problem has not been fixed

Go to step 4.

4

Although not required, you should back up all data on all volumes associated with the affected drive.

5

Highlight the affected drive in the Physical View of the Array Management Window and select Advanced >> Recovery >> Fail Drive. The associated volumes become Degraded.

6

Remove the failed drive (its fault indicator light should be on).

Note: Make sure the replacement drive has a capacity equal to or greater than the failed drive.

7

Wait 30 seconds, then insert the new drive. Its fault indicator light may be lit for a short time (one minute or less).

8

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed.

You are finished with this procedure.

The problem has not been fixed.

There is a problem with the controller. Go to "Recovery Steps for Replacing a Controller."

Recovery Steps for Replacing a Controller

Important: The controller replacement recovery steps should only be attempted after ALL other options have been exhausted.

Use the following procedure to replace a controller to resolve a loss of path redundancy condition.

If...

Then...

Your storage array has one controller

Go to "Replacing a Controller in a Single-Controller Storage Array."

Your storage array has two controllers

Go to "Replacing a Controller in a Dual-Controller Storage Array."

Replacing a Controller in a Single-Controller Storage Array

1

Ensure that your replacement controller matches the controller in the storage array. If you do not have a controller with the appropriate replacement part number, contact your technical support representative.

2

Stop all I/O to this storage array.

3

Turn off power to the affected tray.

4

Remove the affected controller. Refer to the Enterprise Management Window (EMW) to view which management method you are using to manage this storage array.

If...

Then...

You are using In-Band management for ALL hosts attached to this storage array

Go to step 5.

You are using Out-of-Band management for ANY host attached to this storage array

Before you insert a new controller canister into the storage array, you must update the DHCP/BOOTP server so that it will associate the new controller's hardware Ethernet (MAC) address with the DNS/network name and IP address previously assigned to the removed controller.

To update the DHCP/BOOTP server, find the entry associated with the removed controller and replace its Ethernet (MAC) address with the new controller's Ethernet (MAC) address. The controller's Ethernet (MAC) address is located on an Ethernet ID label on the controller canister in the form xx.xx.xx.xx.xx.xx.

When you are finished, go to step 5.

5

If...

Then...

The controller for this storage array is located in a tray containing both controllers and drives

Check to see if the new controller canister contains a battery.

  • If your model of storage array does not contain batteries, go to step 6.
  • If your model of storage array is supposed to contain batteries and...
    • there is not a battery installed in the new controller canister, then install the battery from the old canister, and go to step 6.
    • there is a battery installed in the new controller canister, then go to step 6.

The controller for this storage array is located in a tray containing only controllers

Go to step 6.
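The battery handling in step 5, together with the battery age check in step 6e, reduces to two yes/no outcomes. The sketch below is one reading of those tables, not an official rule set; the function name `battery_actions` and its parameter names are hypothetical.

```python
def battery_actions(tray_has_drives: bool,
                    model_uses_batteries: bool,
                    new_canister_has_battery: bool) -> tuple[bool, bool]:
    """Return (install_old_battery, reset_battery_age) for a controller swap."""
    # Controller-only trays and models without batteries need no battery work.
    if not tray_has_drives or not model_uses_batteries:
        return (False, False)
    # A battery shipped in the new canister stays in place, but its age must be reset.
    if new_canister_has_battery:
        return (False, True)
    # Otherwise move the old battery across; its recorded age remains valid.
    return (True, False)


install_old, reset_age = battery_actions(
    tray_has_drives=True, model_uses_batteries=True, new_canister_has_battery=False)
print("install old battery:", install_old, "| reset battery age:", reset_age)
```

The two outcomes are mutually exclusive: the battery age is reset only when the battery that came with the replacement canister is the one being used.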

6

a

Make sure at least one minute has elapsed. Then, insert the new controller canister firmly in place.

b

Turn on power to the affected tray.

c

Note the controller slot (A or B) of the affected controller listed in the Recovery Guru Details area. Highlight this controller slot in the Physical View of the Array Management Window (AMW).

d

If...

Then...

The controller indicates that it is Online

Go to step e.

The controller indicates that it is Offline

Select Advanced >> Recovery >> Place Controller >> Online and then go to step e.

e

If...

Then...

The controller for this storage array is located in a tray containing both controllers and drives

Determine whether you need to reset the battery age.

  • If your model of storage array does not contain batteries, go to step 7.
  • If your model of storage array is supposed to contain batteries and...
    • you installed the battery from the old controller canister, then you do not need to reset the battery age. Go to step 7.
    • there was already a battery in the replacement controller canister, then you must reset the battery age using the following procedure:

      Select the Components button on the tray containing the controllers in the Physical View of the Array Management Window. Highlight the batteries option and select the Reset button associated with the new controller canister (A or B). Then, go to step 7.

The controller for this storage array is located in a tray containing only controllers

Go to step 7.

7

If you have volumes mapped to hosts that have Automatic Volume Transfer (AVT) disabled, it may be necessary to redistribute the volumes to their preferred controller. Use the following steps to determine the AVT status of the hosts connected to your storage array:

a

Open the Storage Array Profile by selecting the Storage Array >> View Profile menu option from the Array Management Window. Then, select the profile's Mappings tab.

b

Scroll to the NVSRAM Host Type Internal Definitions section.

If... Then...
There are hosts mapped to the volumes on this storage array that have an AVT status of disabled

OR

There are hosts mapped to the volumes on this storage array that are not running a host-based, multi-path failover driver

It may be necessary to redistribute the volumes to their preferred controller. If the Array Management Window's Advanced >> Recovery >> Redistribute Volumes menu option is available, select the option.

Note: If you have a mix of hosts with AVT enabled and AVT disabled, all volumes will be immediately assigned back to their preferred path. However, until the host-based multi-path failover driver detects the valid preferred path (may take several minutes), the volumes mapped to the AVT-enabled hosts may get temporarily returned back to the non-preferred path.

If the menu option is not available (grayed out), the volumes are already associated with their preferred controllers and no action is needed.

Go to step 8.

There are NO hosts mapped to the volumes on this storage array with an AVT status of disabled

OR

All hosts mapped to volumes on this storage array are running a host-based, multi-path failover driver

No action is required.

If volumes need to be redistributed to their preferred controller, the host-based, multi-path failover driver will automatically initiate the transfer.

Note that detection of a restored preferred path by the multi-path failover driver can take several minutes.

Go to step 8.

8

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area. If the failure appears again, contact your technical support representative.

Replacing a Controller in a Dual-Controller Storage Array

1

Determine which is the affected controller by locating the non-working channel. Refer to step 3 at the beginning of this recovery procedure for details on how to locate the non-working channel.

2

Place the affected controller offline.

a

Highlight the affected controller in the Physical View of the Array Management Window.

b

Select Advanced >> Recovery >> Place Controller >> Offline.

c

Select Yes in the Place Offline confirmation window.

d

Go to step 3.

3

Read all of the following steps before taking any action.

a

Click the Recheck button to rerun the Recovery Guru.

b

Select the Offline Controller problem that is being reported in the Summary area.

c

Complete the Recovery Steps for the Offline Controller problem to replace the controller.

4

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area. If the failure appears again, contact your technical support representative.


End of Failure Entry 7

Failure Entry 8: NO_REDUNDANCY_DRIVE-Recovery Failure Type Code: 32

Storage array: PSI_CMS_T3_CE5400
Component reporting problem: Drive in slot 1
Status: Optimal
Location: Drive tray 0, Drawer 2
Component requiring service: Drive in slot 1
Service action (removal) allowed: No
Service action LED on component: Yes
Working channel: 0

Drive - Loss of Path Redundancy

What Caused the Problem?

A communication path with a drive has been lost. The Recovery Guru Details area provides specific information you will need as you follow the recovery steps.

Caution: Electrostatic discharge can damage sensitive components. Always use proper antistatic protection when handling components. Touching components without using a proper ground may damage the equipment.

Important Notes

Recovery Steps

1

Fix any other problems reported by the Recovery Guru before attempting to fix this problem.

2

If...

Then...

The affected tray listed in the Recovery Guru Details area contains both controllers and drives

Go to step 7.

The affected tray listed in the Recovery Guru Details area contains only drives

Go to step 3.

3

To determine the non-working channel, start at the drive port on the controller tray corresponding to the working channel (refer to the labels on the back of the controller tray if needed). Trace the cable from the working channel to the ESM canister in the affected drive tray reported in the details area.

Caution: Possible loss of data accessibility. Do not disconnect any cables on the working channel. Doing so may cause a loss of data accessibility.

4

Locate the other ESM canister in the affected drive tray (this is the canister on the non-working channel).

5

Replace the ESM canister on the non-working channel using the following steps:

a

Label the interface transceivers (GBICs or SFPs). The labels will help you correctly reconnect the cables to the new ESM canister.

While the cables are still connected, remove the interface transceivers from the ESM canister you are replacing.

b

Remove the ESM canister.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component has not failed. In this situation, it is acceptable to remove the ESM canister even though the Service Action Allowed status is NO.

c

Set all switches on the new ESM canister to the same values as the old ESM canister.

d

Insert the new ESM canister into the drive tray.

e

Using the labels created in step a, reconnect the cables to the replaced canister. Wait 40 seconds, then go to step 6.

6

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 7.

The problem has not been fixed

Go to step 7.

7

You must replace the drive. Which procedure you use depends on the RAID level of the volume group associated with the affected drive. To determine the associated volume group, highlight the affected drive in the Physical View of the Array Management Window and select View >> Associated Elements. Next, highlight the associated volume group in the Logical View of the Array Management Window.

If...

Then...

The volume group is RAID 0

Go to "Recovery Steps for Replacing a Drive in a RAID 0 Volume Group."

The volume group is RAID 1, 3, or 5

Go to "Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group."

Recovery Steps for Replacing a Drive in a RAID 0 Volume Group

Use the following procedure if the affected volume group is RAID 0.

Fix any other problems reported by the Recovery Guru before continuing with this procedure. Note that all volumes in the Logical View of the Array Management Window must be Optimal.

1

Stop all I/O to the affected volumes.

2

Reseating the drive may clear up the path redundancy problem. Remove the drive and then re-insert it.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component has not failed. In this situation, it is acceptable to remove the drive even though the Service Action Allowed status is NO.

3

Wait 40 seconds, and then click the Recheck button to rerun the Recovery Guru to ensure that the problem has been fixed.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 4.

The problem has not been fixed

Go to step 4.

4

Back up all data on the affected volumes. (Step 7 will destroy all data on the affected volumes.)

Note: To the operating system (OS), a failed volume is the same as a failed non-RAID drive. Refer to the OS documentation for requirements concerning failed drives and apply them where necessary.

5

If any of the affected volumes are also source or target volumes in a copy operation that is either Pending or In Progress, you must stop the copy operation before continuing.

Go to the Copy Manager by selecting Volume >> Copy >> Copy Manager, then highlight each copy pair that contains an affected volume and select Copy >> Stop.

6

If you have snapshot volumes associated with the affected volumes, these snapshot volumes will no longer be valid once you fail the drive in step 7.

If necessary, perform any operations on the snapshot volumes and then delete them.

7

Caution: Possible loss of data accessibility. Transitioning volumes to failed may cause the loss of accessibility to data on the volumes. Make sure that you back up all data on the affected volumes before starting this step.

Highlight the affected drive in the Physical View of the Array Management Window and select Advanced >> Recovery >> Fail Drive. The affected volumes become Failed.

8

Remove the failed drive (its fault indicator light should be on).

Note: Make sure the replacement drive has a capacity equal to or greater than the failed drive.

9

Wait 30 seconds, then insert the new drive. Its fault indicator light may be lit for a short time (one minute or less).

Note: Wait until the replaced drive is ready (its fault indicator light must be off) before attempting to initialize the volumes in step 10.

10

Highlight the volume group associated with the replaced drive in the Logical View of the Array Management Window and select Advanced >> Recovery >> Initialize >> Volume Group.

  • The volumes in the volume group are initialized, one at a time.
  • To monitor initialization progress for a volume, highlight the volume in the Logical View of the Array Management Window and select Volume >> Properties. Note that when the initialization is completed, the progress bar is no longer displayed.
  • When initialization is completed, all volumes in the volume group are Optimal.

Important: Make sure you save this procedure by selecting Save As. Once you fix the failure, you will not be able to access the information from Recovery Guru.

11

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed.

a

If desired, re-create any snapshot volumes that you deleted in step 6.

b

If desired, re-create any copies you stopped by highlighting the copy pairs in the Copy Manager and selecting Copy >> Re-Copy.

c

Add the affected volumes back to the operating system. You may need to reboot the system to see the re-initialized volumes.

Note: Do not start I/O to these volumes until you have restored data from backup.

d

Restore the data for the affected volumes from backup.

e

You are finished with this procedure.

The problem has not been fixed.

There is a problem with the controller. Go to "Recovery Steps for Replacing a Controller."

Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group

Use the following procedure if the affected volume group is RAID 1, 3, or 5.

1

You should stop all I/O to all volumes in the volume group associated with the affected drive to reduce the possibility of data loss. If another drive fails in this volume group while you are performing this procedure, you will lose data.

2

Reseating the drive may clear up the path redundancy problem. Remove the drive and then re-insert it.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component has not failed. In this situation, it is acceptable to remove the drive even though the Service Action Allowed status is NO.

3

Wait 40 seconds, and then click the Recheck button to rerun the Recovery Guru to ensure that the problem has been fixed.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 4.

The problem has not been fixed

Go to step 4.

4

Although not required, you should back up all data on all volumes associated with the affected drive.

5

Highlight the affected drive in the Physical View of the Array Management Window and select Advanced >> Recovery >> Fail Drive. The associated volumes become Degraded.

6

Remove the failed drive (its fault indicator light should be on).

Note: Make sure the replacement drive has a capacity equal to or greater than the failed drive.

7

Wait 30 seconds, then insert the new drive. Its fault indicator light may be lit for a short time (one minute or less).

8

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed.

You are finished with this procedure.

The problem has not been fixed.

There is a problem with the controller. Go to "Recovery Steps for Replacing a Controller."

Recovery Steps for Replacing a Controller

Important: The controller replacement recovery steps should only be attempted after ALL other options have been exhausted.

Use the following procedure to replace a controller to resolve a loss of path redundancy condition.

If...

Then...

Your storage array has one controller

Go to "Replacing a Controller in a Single-Controller Storage Array."

Your storage array has two controllers

Go to "Replacing a Controller in a Dual-Controller Storage Array."

Replacing a Controller in a Single-Controller Storage Array

1

Ensure that your replacement controller matches the controller in the storage array. If you do not have a controller with the appropriate replacement part number, contact your technical support representative.

2

Stop all I/O to this storage array.

3

Turn off power to the affected tray.

4

Remove the affected controller. Refer to the Enterprise Management Window (EMW) to view which management method you are using to manage this storage array.

If...

Then...

You are using In-Band management for ALL hosts attached to this storage array

Go to step 5.

You are using Out-of-Band management for ANY host attached to this storage array

Before you insert a new controller canister into the storage array, you must update the DHCP/BOOTP server so that it will associate the new controller's hardware Ethernet (MAC) address with the DNS/network name and IP address previously assigned to the removed controller.

To update the DHCP/BOOTP server, find the entry associated with the removed controller and replace its Ethernet (MAC) address with the new controller's Ethernet (MAC) address. The controller's Ethernet (MAC) address is located on an Ethernet ID label on the controller canister in the form xx.xx.xx.xx.xx.xx.

When you are finished, go to step 5.
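When you edit the DHCP/BOOTP entry, it can help to normalize the address copied from the Ethernet ID label first. The following is a minimal Python sketch, not part of the storage management software. It assumes the label uses the dotted form shown above and that your DHCP/BOOTP server expects lowercase, colon-separated hex pairs (an assumption; check your server's documentation). The function name normalize_mac and the sample address are hypothetical.

```python
import re


def normalize_mac(label, sep=":"):
    """Convert a MAC address as printed on the label (xx.xx.xx.xx.xx.xx)
    into lowercase hex pairs joined by `sep`."""
    # Accept dots, colons, dashes, or whitespace between the six pairs.
    parts = re.split(r"[.:\-\s]", label.strip())
    if len(parts) != 6 or not all(re.fullmatch(r"[0-9A-Fa-f]{2}", p) for p in parts):
        raise ValueError(f"not a MAC address in xx.xx.xx.xx.xx.xx form: {label!r}")
    return sep.join(p.lower() for p in parts)


# Hypothetical label value, for illustration only.
print(normalize_mac("00.A0.B8.12.34.56"))
```

Rejecting anything that is not exactly six hex pairs guards against a mistyped label ending up in the server configuration.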

5

If...

Then...

The controller for this storage array is located in a tray containing both controllers and drives

Check to see if the new controller canister contains a battery.
  • If your model of storage array does not contain batteries, go to step 6.
  • If your model of storage array is supposed to contain batteries and...
    • there is not a battery installed in the new controller canister, then install the battery from the old canister, and go to step 6.
    • there is a battery installed in the new controller canister, then go to step 6.

The controller for this storage array is located in a tray containing only controllers

Go to step 6.

6

a

Make sure at least one minute has elapsed. Then, insert the new controller canister firmly in place.

b

Turn on power to the affected tray.

c

Note the controller slot (A or B) of the affected controller listed in the Recovery Guru Details area. Highlight this controller slot in the Physical View of the Array Management Window (AMW).

d

If...

Then...

The controller indicates that it is Online

Go to step e.

The controller indicates that it is Offline

Select Advanced >> Recovery >> Place Controller >> Online, and then go to step e.

e

If...

Then...

The controller for this storage array is located in a tray containing both controllers and drives

Determine whether you need to reset the battery age.
  • If your model of storage array is not supposed to contain batteries, go to step 7.
  • If your model of storage array is supposed to contain batteries and...
    • you installed the battery from the old controller canister, then you do not need to reset the battery age. Go to step 7.
    • there was already a battery in the replacement controller canister, then you must reset the battery age using the following procedure:

      Select the Components button on the tray containing the controllers in the Physical View of the Array Management Window. Highlight the batteries option and select the Reset button associated with the new controller canister (A or B). Then, go to step 7.

The controller for this storage array is located in a tray containing only controllers

Go to step 7.

7

If you have volumes mapped to hosts that have Automatic Volume Transfer (AVT) disabled, it may be necessary to redistribute the volumes to their preferred controller. Use the following steps to determine the AVT status of the hosts connected to your storage array:

a

Open the Storage Array Profile by selecting the Storage Array >> View Profile menu option from the Array Management Window. Then, select the profile's Mappings tab.

b

Scroll to the NVSRAM Host Type Internal Definitions section.

If...

Then...
There are hosts mapped to the volumes on this storage array that have an AVT status of disabled

OR

There are hosts mapped to the volumes on this storage array that are not running a host-based, multi-path failover driver

It may be necessary to redistribute the volumes to their preferred controller. If the Array Management Window's Advanced >> Recovery >> Redistribute Volumes menu option is available, select the option.

Note: If you have a mix of hosts with AVT enabled and AVT disabled, all volumes will be immediately assigned back to their preferred path. However, until the host-based multi-path failover driver detects the valid preferred path (may take several minutes), the volumes mapped to the AVT-enabled hosts may get temporarily returned back to the non-preferred path.

If the menu option is not available (grayed out), the volumes are already associated with their preferred controllers and no action is needed.

Go to step 8.

There are NO hosts mapped to the volumes on this storage array with an AVT status of disabled

OR

All hosts mapped to volumes on this storage array are running a host-based, multi-path failover driver

No action is required.

If volumes need to be redistributed to their preferred controller, the host-based, multi-path failover driver will automatically initiate the transfer.

Note that detection of a restored preferred path by the multi-path failover driver can take several minutes.

Go to step 8.
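The decision in step 7 can be summarized as a small check over the hosts listed in the profile. This is a minimal Python sketch of the rule as stated above, not a feature of the storage management software; the Host record and its field names are hypothetical, and the values would have to be entered by hand from the Mappings tab.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str                  # hypothetical label for the host
    avt_enabled: bool          # AVT status from the NVSRAM Host Type definitions
    has_failover_driver: bool  # host-based, multi-path failover driver running


def manual_redistribution_may_be_needed(hosts):
    """True if any mapped host has AVT disabled or lacks a failover driver."""
    return any(not h.avt_enabled or not h.has_failover_driver for h in hosts)


hosts = [Host("host-a", True, True), Host("host-b", False, True)]
if manual_redistribution_may_be_needed(hosts):
    print("Check Advanced >> Recovery >> Redistribute Volumes")
else:
    print("No action is required")
```

A True result only means the Redistribute Volumes menu option should be checked; if the option is grayed out, the volumes are already on their preferred controllers.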

8

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area. If the failure appears again, contact your technical support representative.

Replacing a Controller in a Dual-Controller Storage Array

1

Determine which is the affected controller by locating the non-working channel. Refer to step 3 at the beginning of this recovery procedure for details on how to locate the non-working channel.

2

Place the affected controller offline.

a

Highlight the affected controller in the Physical View of the Array Management Window.

b

Select Advanced >> Recovery >> Place Controller >> Offline.

c

Select Yes in the Place Offline confirmation window.

d

Go to step 3.

3

Read all of the following steps before taking any action.

a

Click the Recheck button to rerun the Recovery Guru.

b

Select the Offline Controller problem that is being reported in the Summary area.

c

Complete the Recovery Steps for the Offline Controller problem to replace the controller.

4

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area. If the failure appears again, contact your technical support representative.


End of Failure Entry 8

Failure Entry 9: NO_REDUNDANCY_DRIVE-Recovery Failure Type Code: 32

Storage array: PSI_CMS_T3_CE5400
Component reporting problem: Drive in slot 12
Status: Optimal
Location: Drive tray 0, Drawer 2
Component requiring service: Drive in slot 12
Service action (removal) allowed: No
Service action LED on component: Yes
Working channel: 0
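The header fields of each failure entry follow a simple "Name: value" layout, so a saved copy of this report can be read programmatically. The following is a minimal Python sketch, not part of the storage management software; the function name parse_entry_fields is hypothetical, and the sample lines are copied from this entry.

```python
def parse_entry_fields(lines):
    """Return a dict of the "Name: value" fields in a failure entry header."""
    fields = {}
    for line in lines:
        # Split on the first colon only, so any later colon stays in the value.
        name, sep, value = line.partition(":")
        # Skip lines without a colon and names whose value is on another line.
        if sep and value.strip():
            fields[name.strip()] = value.strip()
    return fields


header = [
    "Storage array: PSI_CMS_T3_CE5400",
    "Component reporting problem: Drive in slot 12",
    "Status: Optimal",
    "Location: Drive tray 0, Drawer 2",
    "Component requiring service: Drive in slot 12",
    "Service action (removal) allowed: No",
    "Service action LED on component: Yes",
    "Working channel: 0",
]
fields = parse_entry_fields(header)
print(fields["Component requiring service"], "on channel", fields["Working channel"])
```

Fields whose value continues on the following line are deliberately ignored by this sketch.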

Drive - Loss of Path Redundancy

What Caused the Problem?

A communication path with a drive has been lost. The Recovery Guru Details area provides specific information you will need as you follow the recovery steps.

Caution: Electronic discharge can damage sensitive components. Always use proper antistatic protection when handling components. Touching components without using a proper ground may damage the equipment.

Important Notes

Recovery Steps

1

Fix any other problems reported by the Recovery Guru before attempting to fix this problem.

2

If...

Then...

The affected tray listed in the Recovery Guru Details area contains both controllers and drives

Go to step 7.

The affected tray listed in the Recovery Guru Details area contains only drives

Go to step 3.

3

To determine the non-working channel, start at the drive port on the controller tray corresponding to the working channel (refer to the labels on the back of the controller tray if needed). Trace the cable from the working channel to the ESM canister in the affected drive tray reported in the details area.

Caution: Possible loss of data accessibility. Do not disconnect any cables on the working channel. Doing so may cause a possible loss of data accessibility.

4

Locate the other ESM canister in the affected drive tray (this is the canister on the non-working channel).

5

Replace the ESM canister on the non-working channel using the following steps:

a

Label the interface transceivers (GBICs or SFPs). The labels will help you correctly reconnect the cables to the new ESM canister.

While the cables are still connected, remove the interface transceivers from the ESM canister you are replacing.

b

Remove the ESM canister.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component is not failed. In this situation, it is acceptable to remove the ESM canister even though the Service Action Allowed is NO.

c

Set all switches on the new ESM canister to the same values as the old ESM canister.

d

Insert the new ESM canister into the drive tray.

e

Using the labels created in step a, reconnect the cables to the replaced canister. Wait 40 seconds, then go to step 6.

6

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 7.

The problem has not been fixed

Go to step 7.

7

You must replace the drive. Which procedure you use depends on the RAID level of the volume group associated with the affected drive. To determine the associated volume group, highlight the affected drive in the Physical View of the Array Management Window and select View >> Associated Elements. Next highlight the associated volume group in the Logical View of the Array Management Window.

If...

Then...

The volume group is RAID 0

Go to "Recovery Steps for Replacing a Drive in a RAID 0 Volume Group."

The volume group is RAID 1, 3, or 5

Go to "Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group."
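The table above maps the RAID level of the associated volume group to a procedure. A minimal Python sketch of that mapping follows; the function name is hypothetical, and RAID levels not listed in the table are rejected rather than guessed at.

```python
def drive_replacement_procedure(raid_level):
    """Return the title of the procedure to follow for a given RAID level."""
    if raid_level == 0:
        return "Recovery Steps for Replacing a Drive in a RAID 0 Volume Group"
    if raid_level in (1, 3, 5):
        return "Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group"
    # The procedure above does not cover other RAID levels.
    raise ValueError(f"no procedure listed for RAID {raid_level}")


print(drive_replacement_procedure(5))
```

The distinction matters because RAID 0 has no redundancy: its procedure requires a backup and a full re-initialization, while the RAID 1, 3, or 5 procedure leaves the volumes Degraded but accessible.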

Recovery Steps for Replacing a Drive in a RAID 0 Volume Group

Use the following procedure if the affected volume group is RAID 0.

Fix any other problems reported by the Recovery Guru before continuing with this procedure. Note that all volumes in the Logical View of the Array Management Window must be Optimal.

1

Stop all I/O to the affected volumes.

2

Reseating the drive may clear up the path redundancy problem. Remove the drive and then re-insert it.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component is not failed. In this situation, it is acceptable to remove the drive even though the Service Action Allowed is NO.

3

Wait 40 seconds, and then click the Recheck button to rerun the Recovery Guru to ensure that the problem has been fixed.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 4.

The problem has not been fixed

Go to step 4.

4

Back up all data on the affected volumes. (Step 7 will destroy all data on the affected volumes.)

Note: To the operating system (OS), a failed volume is the same as a failed non-RAID drive. Refer to the OS documentation for requirements concerning failed drives and apply them where necessary.

5

If any of the affected volumes are also source or target volumes in a copy operation that is either Pending or In Progress, you must stop the copy operation before continuing.

Go to the Copy Manager by selecting Volume >> Copy >> Copy Manager, then highlight each copy pair that contains an affected volume and select Copy >> Stop.

6

If you have snapshot volumes associated with the affected volumes, these snapshot volumes will no longer be valid once you fail the drive in step 7.

If necessary, perform any operations on the snapshot volumes and then delete them.

7

Caution: Possible loss of data accessibility. Transitioning volumes to failed may cause the loss of accessibility to data on the volumes. Make sure that you back up all data on the affected volumes before starting this step.

Highlight the affected drive in the Physical View of the Array Management Window and select Advanced >> Recovery >> Fail Drive. The affected volumes become Failed.

8

Remove the failed drive (its fault indicator light should be on).

Note: Make sure the replacement drive has a capacity equal to or greater than the failed drive.

9

Wait 30 seconds, then insert the new drive. Its fault indicator light may be lit for a short time (one minute or less).

Note: Wait until the replaced drive is ready (its fault indicator light must be off) before attempting to initialize the volumes in step 10.

10

Highlight the volume group associated with the replaced drive in the Logical View of the Array Management Window and select Advanced >> Recovery >> Initialize >> Volume Group.

  • The volumes in the volume group are initialized, one at a time.
  • To monitor initialization progress for a volume, highlight the volume in the Logical View of the Array Management Window and select Volume >> Properties. Note that when the initialization is completed, the progress bar is no longer displayed.
  • When initialization is completed, all volumes in the volume group are Optimal.

Important: Make sure you save this procedure by selecting Save As. Once you fix the failure, you will not be able to access the information from Recovery Guru.

11

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed.

a

If desired, create any snapshot volumes that you deleted in step 6.

b

If desired, re-create any copies you stopped by highlighting the copy pairs in the Copy Manager and selecting Copy >> Re-Copy.

c

Add the affected volumes back to the operating system. You may need to reboot the system to see the re-initialized volumes.

Note: Do not start I/O to these volumes until you have restored data from backup.

d

Restore the data for the affected volumes from backup.

e

You are finished with this procedure.

The problem has not been fixed.

There is a problem with the controller. Go to "Recovery Steps for Replacing a Controller."

Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group

Use the following procedure if the affected volume group is RAID 1, 3, or 5.

1

You should stop all I/O to all volumes in the volume group associated with the affected drive to reduce the possibility of data loss. If another drive fails in this volume group while you are performing this procedure, you will lose data.

2

Reseating the drive may clear up the path redundancy problem. Remove the drive and then re-insert it.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component is not failed. In this situation, it is acceptable to remove the drive even though the Service Action Allowed is NO.

3

Wait 40 seconds, and then click the Recheck button to rerun the Recovery Guru to ensure that the problem has been fixed.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 4.

The problem has not been fixed

Go to step 4.

4

Although not required, you should back up all data on all volumes associated with the affected drive.

5

Highlight the affected drive in the Physical View of the Array Management Window and select Advanced >> Recovery >> Fail Drive. The associated volumes become Degraded.

6

Remove the failed drive (its fault indicator light should be on).

Note: Make sure the replacement drive has a capacity equal to or greater than the failed drive.

7

Wait 30 seconds, then insert the new drive. Its fault indicator light may be lit for a short time (one minute or less).

8

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed.

You are finished with this procedure.

The problem has not been fixed.

There is a problem with the controller. Go to "Recovery Steps for Replacing a Controller."

Recovery Steps for Replacing a Controller

Important: The controller replacement recovery steps should only be attempted after ALL other options have been exhausted.

Use the following procedure to replace a controller to resolve a loss of path redundancy condition.

If...

Then...

Your storage array has one controller

Go to "Replacing a Controller in a Single-Controller Storage Array."

Your storage array has two controllers

Go to "Replacing a Controller in a Dual-Controller Storage Array."

Replacing a Controller in a Single-Controller Storage Array

1

Ensure that your replacement controller matches the controller in the storage array. If you do not have a controller with the appropriate replacement part number, contact your technical support representative.

2

Stop all I/O to this storage array.

3

Turn off power to the affected tray.

4

Remove the affected controller. Refer to the Enterprise Management Window (EMW) to view which management method you are using to manage this storage array.

If...

Then...

You are using In-Band management for ALL hosts attached to this storage array

Go to step 5.

You are using Out-of-Band management for ANY host attached to this storage array

Before you insert a new controller canister into the storage array, you must update the DHCP/BOOTP server so that it will associate the new controller's hardware Ethernet (MAC) address with the DNS/network name and IP address previously assigned to the removed controller.

To update the DHCP/BOOTP server, find the entry associated with the removed controller and replace its Ethernet (MAC) address with the new controller's Ethernet (MAC) address. The controller's Ethernet (MAC) address is located on an Ethernet ID label on the controller canister in the form xx.xx.xx.xx.xx.xx.

When you are finished, go to step 5.
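The ALL/ANY wording in step 4 means that a single out-of-band host is enough to require the DHCP/BOOTP update. A minimal Python sketch of that rule follows; the function name and the "in-band" / "out-of-band" strings are hypothetical labels for what you read in the Enterprise Management Window.

```python
def dhcp_update_required(management_methods):
    """management_methods: one entry per attached host,
    either "in-band" or "out-of-band"."""
    methods = [m.strip().lower() for m in management_methods]
    unknown = set(methods) - {"in-band", "out-of-band"}
    if unknown:
        raise ValueError(f"unrecognized management method(s): {sorted(unknown)}")
    # One out-of-band host is enough to require the server update.
    return "out-of-band" in methods


print(dhcp_update_required(["in-band", "out-of-band"]))
```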

5

If...

Then...

The controller for this storage array is located in a tray containing both controllers and drives

Check to see if the new controller canister contains a battery.
  • If your model of storage array does not contain batteries, go to step 6.
  • If your model of storage array is supposed to contain batteries and...
    • there is not a battery installed in the new controller canister, then install the battery from the old canister, and go to step 6.
    • there is a battery installed in the new controller canister, then go to step 6.

The controller for this storage array is located in a tray containing only controllers

Go to step 6.

6

a

Make sure at least one minute has elapsed. Then, insert the new controller canister firmly in place.

b

Turn on power to the affected tray.

c

Note the controller slot (A or B) of the affected controller listed in the Recovery Guru Details area. Highlight this controller slot in the Physical View of the Array Management Window (AMW).

d

If...

Then...

The controller indicates that it is Online

Go to step e.

The controller indicates that it is Offline

Select Advanced >> Recovery >> Place Controller >> Online, and then go to step e.

e

If...

Then...

The controller for this storage array is located in a tray containing both controllers and drives

Determine whether you need to reset the battery age.
  • If your model of storage array is not supposed to contain batteries, go to step 7.
  • If your model of storage array is supposed to contain batteries and...
    • you installed the battery from the old controller canister, then you do not need to reset the battery age. Go to step 7.
    • there was already a battery in the replacement controller canister, then you must reset the battery age using the following procedure:

      Select the Components button on the tray containing the controllers in the Physical View of the Array Management Window. Highlight the batteries option and select the Reset button associated with the new controller canister (A or B). Then, go to step 7.

The controller for this storage array is located in a tray containing only controllers

Go to step 7.
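The battery age decision in step e reduces to three yes/no questions. The following is a minimal Python sketch of that logic as read from the table above; the function and parameter names are hypothetical.

```python
def battery_age_reset_needed(tray_has_drives, model_uses_batteries, reused_old_battery):
    """Return True only when the battery age must be reset after the swap.

    tray_has_drives:      the controller tray contains both controllers and drives
    model_uses_batteries: this storage array model is supposed to contain batteries
    reused_old_battery:   the battery from the old canister was moved to the new one
    """
    if not tray_has_drives:
        return False  # tray contains only controllers: go straight to step 7
    if not model_uses_batteries:
        return False  # nothing to reset
    # A battery that shipped in the replacement canister needs its age reset;
    # a battery carried over from the old canister keeps its recorded age.
    return not reused_old_battery


print(battery_age_reset_needed(True, True, reused_old_battery=False))
```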

7

If you have volumes mapped to hosts that have Automatic Volume Transfer (AVT) disabled, it may be necessary to redistribute the volumes to their preferred controller. Use the following steps to determine the AVT status of the hosts connected to your storage array:

a

Open the Storage Array Profile by selecting the Storage Array >> View Profile menu option from the Array Management Window. Then, select the profile's Mappings tab.

b

Scroll to the NVSRAM Host Type Internal Definitions section.

If...

Then...
There are hosts mapped to the volumes on this storage array that have an AVT status of disabled

OR

There are hosts mapped to the volumes on this storage array that are not running a host-based, multi-path failover driver

It may be necessary to redistribute the volumes to their preferred controller. If the Array Management Window's Advanced >> Recovery >> Redistribute Volumes menu option is available, select the option.

Note: If you have a mix of hosts with AVT enabled and AVT disabled, all volumes will be immediately assigned back to their preferred path. However, until the host-based multi-path failover driver detects the valid preferred path (may take several minutes), the volumes mapped to the AVT-enabled hosts may get temporarily returned back to the non-preferred path.

If the menu option is not available (grayed out), the volumes are already associated with their preferred controllers and no action is needed.

Go to step 8.

There are NO hosts mapped to the volumes on this storage array with an AVT status of disabled

OR

All hosts mapped to volumes on this storage array are running a host-based, multi-path failover driver

No action is required.

If volumes need to be redistributed to their preferred controller, the host-based, multi-path failover driver will automatically initiate the transfer.

Note that detection of a restored preferred path by the multi-path failover driver can take several minutes.

Go to step 8.

8

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area. If the failure appears again, contact your technical support representative.

Replacing a Controller in a Dual-Controller Storage Array

1

Determine which is the affected controller by locating the non-working channel. Refer to step 3 at the beginning of this recovery procedure for details on how to locate the non-working channel.

2

Place the affected controller offline.

a

Highlight the affected controller in the Physical View of the Array Management Window.

b

Select Advanced >> Recovery >> Place Controller >> Offline.

c

Select Yes in the Place Offline confirmation window.

d

Go to step 3.

3

Read all of the following steps before taking any action.

a

Click the Recheck button to rerun the Recovery Guru.

b

Select the Offline Controller problem that is being reported in the Summary area.

c

Complete the Recovery Steps for the Offline Controller problem to replace the controller.

4

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area. If the failure appears again, contact your technical support representative.


End of Failure Entry 9

Failure Entry 10: SAS_PORT_FAILED-Recovery Failure Type Code: 113

Storage array: PSI_CMS_T3_CE5400
Port reporting problem:
Drive tray 0, ESM: B (Bottom), Channel 2, Internal connection
Status: SAS Port Failed

Failed or Degraded SAS Port

What Caused the Problem?

A Serial Attached SCSI (SAS) port has failed or is in a degraded state. The Recovery Guru Details area provides specific information you will need as you follow the Recovery Steps.

Important Note

When a port reports a failed or degraded status, the actual problem could be with any of the following:

Recovery Steps

1

If...

Then...

The Details area reports that the port is Degraded

Go to step 2.

The Details area reports that the port is Failed

Go to step 3.

2

The problem may be a faulty cable that is connected to the affected port. Replace the cable and ensure that there is a secure connection on both ends of the cable.

Click the Recheck button to see if the problem has been fixed.

If this problem still appears in the Summary area, go to step 3.

3

Select the Advanced >> Troubleshooting >> Support Data >> Collect menu option from the Array Management Window (AMW), and take the appropriate steps to save the support data to a .zip file.

4

Contact your technical support representative and indicate that a "Failed or Degraded SAS port" problem is being reported. Send your representative the file you saved and wait for further instruction.
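The order in which the four steps above apply depends on the reported port status and on whether the cable replacement clears the problem. A minimal Python sketch of that flow follows; the function name and its arguments are hypothetical.

```python
def sas_port_plan(port_status, fixed_by_cable=False):
    """Return the step numbers to follow, in order, after step 1.

    port_status:    "Degraded" or "Failed", as shown in the Details area
    fixed_by_cable: True if Recheck no longer reports the problem after
                    the cable was replaced in step 2
    """
    status = port_status.strip().lower()
    if status == "degraded":
        # Try the cable first; escalate only if the problem persists.
        return [2] if fixed_by_cable else [2, 3, 4]
    if status == "failed":
        # Go straight to collecting support data and contacting support.
        return [3, 4]
    raise ValueError(f"unexpected port status: {port_status!r}")


print(sas_port_plan("Failed"))
```

For the entry above, whose status is "SAS Port Failed", the Failed branch would apply.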

End of Failure Entry 10

Failure Entry 11: NO_REDUNDANCY_DRIVE-Recovery Failure Type Code: 32

Storage array: PSI_CMS_T3_CE5400
Component reporting problem: Drive in slot 9
Status: Optimal
Location: Drive tray 0, Drawer 2
Component requiring service: Drive in slot 9
Service action (removal) allowed: No
Service action LED on component: Yes
Working channel: 0
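Each entry in this report opens with a heading line that carries the entry number, the failure name, and the failure type code. The following is a minimal Python sketch for pulling those three values out of a saved report. It assumes the heading always has the exact shape seen in this report, with the failure name immediately followed by "-Recovery Failure Type Code:"; the function name is hypothetical.

```python
import re

# Shape assumed from the headings in this report, for example:
#   Failure Entry 11: NO_REDUNDANCY_DRIVE-Recovery Failure Type Code: 32
HEADING = re.compile(
    r"^Failure Entry (\d+): (\S+?)-Recovery Failure Type Code: (\d+)$"
)


def parse_heading(line):
    """Return (entry_number, failure_name, type_code), or None if the
    line is not a failure entry heading."""
    match = HEADING.match(line.strip())
    if match is None:
        return None
    return int(match.group(1)), match.group(2), int(match.group(3))


print(parse_heading("Failure Entry 11: NO_REDUNDANCY_DRIVE-Recovery Failure Type Code: 32"))
```

Grouping entries by failure name this way shows quickly that several entries, such as the loss of path redundancy on slots 9, 10, and 12, share one recovery procedure.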

Drive - Loss of Path Redundancy

What Caused the Problem?

A communication path with a drive has been lost. The Recovery Guru Details area provides specific information you will need as you follow the recovery steps.

Caution: Electronic discharge can damage sensitive components. Always use proper antistatic protection when handling components. Touching components without using a proper ground may damage the equipment.

Important Notes

Recovery Steps

1

Fix any other problems reported by the Recovery Guru before attempting to fix this problem.

2

If...

Then...

The affected tray listed in the Recovery Guru Details area contains both controllers and drives

Go to step 7.

The affected tray listed in the Recovery Guru Details area contains only drives

Go to step 3.

3

To determine the non-working channel, start at the drive port on the controller tray corresponding to the working channel (refer to the labels on the back of the controller tray if needed). Trace the cable from the working channel to the ESM canister in the affected drive tray reported in the details area.

Caution: Possible loss of data accessibility. Do not disconnect any cables on the working channel. Doing so may cause a possible loss of data accessibility.

4

Locate the other ESM canister in the affected drive tray (this is the canister on the non-working channel).

5

Replace the ESM canister on the non-working channel using the following steps:

a

Label the interface transceivers (GBICs or SFPs). The labels will help you correctly reconnect the cables to the new ESM canister.

While the cables are still connected, remove the interface transceivers from the ESM canister you are replacing.

b

Remove the ESM canister.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component is not failed. In this situation, it is acceptable to remove the ESM canister even though the Service Action Allowed is NO.

c

Set all switches on the new ESM canister to the same values as the old ESM canister.

d

Insert the new ESM canister into the drive tray.

e

Using the labels created in step a, reconnect the cables to the replaced canister. Wait 40 seconds, then go to step 6.

6

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 7.

The problem has not been fixed

Go to step 7.

7

You must replace the drive. Which procedure you use depends on the RAID level of the volume group associated with the affected drive. To determine the associated volume group, highlight the affected drive in the Physical View of the Array Management Window and select View >> Associated Elements. Next highlight the associated volume group in the Logical View of the Array Management Window.

If...

Then...

The volume group is RAID 0

Go to "Recovery Steps for Replacing a Drive in a RAID 0 Volume Group."

The volume group is RAID 1, 3, or 5

Go to "Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group."

Recovery Steps for Replacing a Drive in a RAID 0 Volume Group

Use the following procedure if the affected volume group is RAID 0.

Fix any other problems reported by the Recovery Guru before continuing with this procedure. Note that all volumes in the Logical View of the Array Management Window must be Optimal.

1

Stop all I/O to the affected volumes.

2

Reseating the drive may clear up the path redundancy problem. Remove the drive and then re-insert it.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component is not failed. In this situation, it is acceptable to remove the drive even though the Service Action Allowed is NO.

3

Wait 40 seconds, and then click the Recheck button to rerun the Recovery Guru to ensure that the problem has been fixed.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 4.

The problem has not been fixed

Go to step 4.

4

Back up all data on the affected volumes. (Step 7 will destroy all data on the affected volumes.)

Note: To the operating system (OS), a failed volume is the same as a failed non-RAID drive. Refer to the OS documentation for requirements concerning failed drives and apply them where necessary.

5

If any of the affected volumes are also source or target volumes in a copy operation that is either Pending or In Progress, you must stop the copy operation before continuing.

Go to the Copy Manager by selecting Volume >> Copy >> Copy Manager, then highlight each copy pair that contains an affected volume and select Copy >> Stop.

6

If you have snapshot volumes associated with the affected volumes, these snapshot volumes will no longer be valid once you fail the drive in step 7.

If necessary, perform any operations on the snapshot volumes and then delete them.

7

Caution: Possible loss of data accessibility. Transitioning volumes to failed may cause the loss of accessibility to data on the volumes. Make sure that you back up all data on the affected volumes before starting this step.

Highlight the affected drive in the Physical View of the Array Management Window and select Advanced >> Recovery >> Fail Drive. The affected volumes become Failed.

8

Remove the failed drive (its fault indicator light should be on).

Note: Make sure the replacement drive has a capacity equal to or greater than the failed drive.

9

Wait 30 seconds, then insert the new drive. Its fault indicator light may be lit for a short time (one minute or less).

Note: Wait until the replaced drive is ready (its fault indicator light must be off) before attempting to initialize the volumes in step 10.

10

Highlight the volume group associated with the replaced drive in the Logical View of the Array Management Window and select Advanced >> Recovery >> Initialize >> Volume Group.

  • The volumes in the volume group are initialized, one at a time.
  • To monitor initialization progress for a volume, highlight the volume in the Logical View of the Array Management Window and select Volume >> Properties. Note that when the initialization is completed, the progress bar is no longer displayed.
  • When initialization is completed, all volumes in the volume group are Optimal.

Important: Make sure you save this procedure by selecting Save As. Once you fix the failure, you will not be able to access the information from Recovery Guru.

11

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed.

a

If desired, create any snapshot volumes that you deleted in step 6.

b

If desired, re-create any copies you stopped by highlighting the copy pairs in the Copy Manager and selecting Copy >> Re-Copy.

c

Add the affected volumes back to the operating system. You may need to reboot the system to see the re-initialized volumes.

Note: Do not start I/O to these volumes until you have restored data from backup.

d

Restore the data for the affected volumes from backup.

e

You are finished with this procedure.

The problem has not been fixed.

There is a problem with the controller. Go to "Recovery Steps for Replacing a Controller."

Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group

Use the following procedure if the affected volume group is RAID 1, 3, or 5.

1

You should stop all I/O to all volumes in the volume group associated with the affected drive to reduce the possibility of data loss. If another drive fails in this volume group while you are performing this procedure, you will lose data.

2

Reseating the drive may clear up the path redundancy problem. Remove the drive and then re-insert it.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component is not failed. In this situation, it is acceptable to remove the drive even though the Service Action Allowed is NO.

3

Wait 40 seconds, and then click the Recheck button to rerun the Recovery Guru to ensure that the problem has been fixed.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 4.

The problem has not been fixed

Go to step 4.

4

Although not required, you should back up all data on all volumes associated with the affected drive.

5

Highlight the affected drive in the Physical View of the Array Management Window and select Advanced >> Recovery >> Fail Drive. The associated volumes become Degraded.

6

Remove the failed drive (its fault indicator light should be on).

Note: Make sure the replacement drive has a capacity equal to or greater than the failed drive.

7

Wait 30 seconds, then insert the new drive. Its fault indicator light may be lit for a short time (one minute or less).

8

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed.

You are finished with this procedure.

The problem has not been fixed.

There is a problem with the controller. Go to "Recovery Steps for Replacing a Controller."

Recovery Steps for Replacing a Controller

Important: The controller replacement recovery steps should only be attempted after ALL other options have been exhausted.

Use the following procedure to replace a controller to resolve a loss of path redundancy condition.

If...

Then...

Your storage array has one controller

Go to "Replacing a Controller in a Single-Controller Storage Array."

Your storage array has two controllers

Go to "Replacing a Controller in a Dual-Controller Storage Array."

Replacing a Controller in a Single-Controller Storage Array

1

Ensure that your replacement controller matches the controller in the storage array. If you do not have a controller with the appropriate replacement part number, contact your technical support representative.

2

Stop all I/O to this storage array.

3

Turn off power to the affected tray.

4

Remove the affected controller. Refer to the Enterprise Management Window (EMW) to view which management method you are using to manage this storage array.

If...

Then...

You are using In-Band management for ALL hosts attached to this storage array

Go to step 5.

You are using Out-of-Band management for ANY host attached to this storage array

Before you insert a new controller canister into the storage array, you must update the DHCP/BOOTP server so that it will associate the new controller's hardware Ethernet (MAC) address with the DNS/network name and IP address previously assigned to the removed controller.

To update the DHCP/BOOTP server, find the entry associated with the removed controller and replace its Ethernet (MAC) address with the new controller's Ethernet (MAC) address. The controller's Ethernet (MAC) address is located on an Ethernet ID label on the controller canister in the form xx.xx.xx.xx.xx.xx.

When you are finished, go to step 5.
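The MAC address swap described above can be sketched in Python. This is only an illustration under stated assumptions: it assumes the server entry stores the address in colon-separated form (as ISC dhcpd `host` declarations do), and the host name, IP address, and MAC values are hypothetical placeholders. Your DHCP/BOOTP server may use a different file format.

```python
import re


def label_to_colon_mac(label: str) -> str:
    """Convert a label in xx.xx.xx.xx.xx.xx form to lowercase xx:xx:xx:xx:xx:xx."""
    parts = label.strip().split(".")
    if len(parts) != 6 or not all(re.fullmatch(r"[0-9A-Fa-f]{2}", p) for p in parts):
        raise ValueError(f"not a MAC in xx.xx.xx.xx.xx.xx form: {label!r}")
    return ":".join(p.lower() for p in parts)


def swap_mac(entry: str, old_label: str, new_label: str) -> str:
    """Replace the removed controller's MAC with the new controller's MAC."""
    old_mac = label_to_colon_mac(old_label)
    new_mac = label_to_colon_mac(new_label)
    if old_mac not in entry.lower():
        raise ValueError("MAC of the removed controller not found in entry")
    return re.sub(re.escape(old_mac), new_mac, entry, flags=re.IGNORECASE)


# Hypothetical entry for the removed controller (assumed dhcpd-style syntax).
old_entry = "host ctrl-a { hardware ethernet 00:A0:B8:11:22:33; fixed-address 192.0.2.10; }"
new_entry = swap_mac(old_entry, "00.a0.b8.11.22.33", "00.a0.b8.44.55.66")
print(new_entry)
```

The name and IP address in the entry stay unchanged, so the new controller inherits the network identity of the removed one.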

5

If...

Then...

The controller for this storage array is located in a tray containing both controllers and drives

Check to see if the new controller canister contains a battery.
  • If your model of storage array does not contain batteries, go to step 6.
  • If your model of storage array is supposed to contain batteries and...
    • there is not a battery installed in the new controller canister, then install the battery from the old canister, and go to step 6.
    • there is a battery installed in the new controller canister, then go to step 6.

The controller for this storage array is located in a tray containing only controllers

Go to step 6.

6

a

Make sure at least one minute has elapsed. Then, insert the new controller canister firmly in place.

b

Turn on power to the affected tray.

c

Note the controller slot (A or B) of the affected controller listed in the Recovery Guru Details area. Highlight this controller slot in the Physical View of the Array Management Window (AMW).

d

If...

Then...

The controller indicates that it is Online

Go to step e.

The controller indicates that it is Offline

Select Advanced >> Recovery >> Place Controller >> Online and then go to step e.

e

If...

Then...

The controller for this storage array is located in a tray containing both controllers and drives

Determine whether you need to reset the battery age.
  • If your model of storage array does not contain batteries, go to step 7.
  • If your model of storage array is supposed to contain batteries and...
    • you installed the battery from the old controller canister, then you do not need to reset the battery age. Go to step 7.
    • there was already a battery in the replacement controller canister, then you must reset the battery age using the following procedure:

      Select the Components button on the tray containing the controllers in the Physical View of the Array Management Window. Highlight the batteries option and select the Reset button associated with the new controller canister (A or B). Then, go to step 7.

The controller for this storage array is located in a tray containing only controllers

Go to step 7.

7

If you have volumes mapped to hosts that have Automatic Volume Transfer (AVT) disabled, it may be necessary to redistribute the volumes to their preferred controller. Use the following steps to determine the AVT status of the hosts connected to your storage array:

a

Open the Storage Array Profile by selecting the Storage Array >> View Profile menu option from the Array Management Window. Then, select the profile's Mappings tab.

b

Scroll to the NVSRAM Host Type Internal Definitions section.

If...

Then...

There are hosts mapped to the volumes on this storage array that have an AVT status of disabled

OR

There are hosts mapped to the volumes on this storage array that are not running a host-based, multi-path failover driver

It may be necessary to redistribute the volumes to their preferred controller. If the Array Management Window's Advanced >> Recovery >> Redistribute Volumes menu option is available, select the option.

Note: If you have a mix of hosts with AVT enabled and AVT disabled, all volumes will be immediately assigned back to their preferred path. However, until the host-based multi-path failover driver detects the valid preferred path (may take several minutes), the volumes mapped to the AVT-enabled hosts may get temporarily returned back to the non-preferred path.

If the menu option is not available (grayed out), the volumes are already associated with their preferred controllers and no action is needed.

Go to step 8.

There are NO hosts mapped to the volumes on this storage array with an AVT status of disabled

OR

All hosts mapped to volumes on this storage array are running a host-based, multi-path failover driver

No action is required.

If volumes need to be redistributed to their preferred controller, the host-based, multi-path failover driver will automatically initiate the transfer.

Note that detection of a restored preferred path by the multi-path failover driver can take several minutes.

Go to step 8.

8

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area. If the failure appears again, contact your technical support representative.

Replacing a Controller in a Dual-Controller Storage Array

1

Determine which is the affected controller by locating the non-working channel. Refer to step 3 at the beginning of this recovery procedure for details on how to locate the non-working channel.

2

Place the affected controller offline.

a

Highlight the affected controller in the Physical View of the Array Management Window.

b

Select Advanced >> Recovery >> Place Controller >> Offline.

c

Select Yes in the Place Offline confirmation window.

d

Go to step 3.

3

Read all of the following steps before taking any action.

a

Click the Recheck button to rerun the Recovery Guru.

b

Select the Offline Controller problem that is being reported in the Summary area.

c

Complete the Recovery Steps in the Offline Controller procedure to replace the controller.

4

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area. If the failure appears again, contact your technical support representative.


End of Failure Entry 11

Failure Entry 12: NO_REDUNDANCY_DRIVE-Recovery Failure Type Code: 32

Storage array: PSI_CMS_T3_CE5400
Component reporting problem: Drive in slot 5
Status: Optimal
Location: Drive tray 0, Drawer 2
Component requiring service: Drive in slot 5
Service action (removal) allowed: No
Service action LED on component: Yes
Working channel: 0
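Because each failure entry opens with the same "Name: value" fields, the header can be read programmatically, for example to pull out the working channel before tracing cables. The following is a minimal sketch; the function name `parse_failure_fields` is hypothetical and not part of any storage management tool.

```python
def parse_failure_fields(text: str) -> dict:
    """Split the 'Name: value' lines of a failure entry header into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values may contain commas or colons.
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


header = """\
Storage array: PSI_CMS_T3_CE5400
Component reporting problem: Drive in slot 5
Status: Optimal
Location: Drive tray 0, Drawer 2
Component requiring service: Drive in slot 5
Service action (removal) allowed: No
Service action LED on component: Yes
Working channel: 0
"""

fields = parse_failure_fields(header)
print(fields["Working channel"], "|", fields["Location"])
```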

Drive - Loss of Path Redundancy

What Caused the Problem?

A communication path with a drive has been lost. The Recovery Guru Details area provides specific information you will need as you follow the recovery steps.

Caution: Electronic discharge can damage sensitive components. Always use proper antistatic protection when handling components. Touching components without using a proper ground may damage the equipment.

Important Notes

Recovery Steps

1

Fix any other problems reported by the Recovery Guru before attempting to fix this problem.

2

If...

Then...

The affected tray listed in the Recovery Guru Details area contains both controllers and drives

Go to step 7.

The affected tray listed in the Recovery Guru Details area contains only drives

Go to step 3.

3

To determine the non-working channel, start at the drive port on the controller tray corresponding to the working channel (refer to the labels on the back of the controller tray if needed). Trace the cable from the working channel to the ESM canister in the affected drive tray reported in the details area.

Caution: Possible loss of data accessibility. Do not disconnect any cables on the working channel. Doing so may cause a possible loss of data accessibility.

4

Locate the other ESM canister in the affected drive tray (this is the canister on the non-working channel).

5

Replace the ESM canister on the non-working channel using the following steps:

a

Label the interface transceivers (GBICs or SFPs). The labels will help you correctly reconnect the cables to the new ESM canister.

While the cables are still connected, remove the interface transceivers from the ESM canister you are replacing.

b

Remove the ESM canister.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component is not failed. In this situation, it is acceptable to remove the ESM canister even though the Service Action Allowed is NO.

c

Set all switches on the new ESM canister to the same values as the old ESM canister.

d

Insert the new ESM canister into the drive tray.

e

Using the labels created in step a, reconnect the cables to the replaced canister. Wait 40 seconds, then go to step 6.

6

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 7.

The problem has not been fixed

Go to step 7.

7

You must replace the drive. Which procedure you use depends on the RAID level of the volume group associated with the affected drive. To determine the associated volume group, highlight the affected drive in the Physical View of the Array Management Window and select View >> Associated Elements. Next highlight the associated volume group in the Logical View of the Array Management Window.

If...

Then...

The volume group is RAID 0

Go to "Recovery Steps for Replacing a Drive in a RAID 0 Volume Group."

The volume group is RAID 1, 3, or 5

Go to "Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group."
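The RAID-level decision above can be expressed as a small lookup. This is a sketch only: the function name is hypothetical, and it deliberately rejects any RAID level that the table does not list rather than guessing a procedure.

```python
RAID0_PROCEDURE = "Recovery Steps for Replacing a Drive in a RAID 0 Volume Group"
REDUNDANT_PROCEDURE = "Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group"


def drive_replacement_procedure(raid_level: int) -> str:
    """Return the procedure title that applies to the volume group's RAID level."""
    if raid_level == 0:
        # No redundancy: failing the drive fails the volumes, so data must be restored.
        return RAID0_PROCEDURE
    if raid_level in (1, 3, 5):
        # Redundant levels: failing the drive leaves the volumes Degraded.
        return REDUNDANT_PROCEDURE
    raise ValueError(f"RAID level {raid_level} is not covered by this procedure")


print(drive_replacement_procedure(5))
```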

Recovery Steps for Replacing a Drive in a RAID 0 Volume Group

Use the following procedure if the affected volume group is RAID 0.

Fix any other problems reported by the Recovery Guru before continuing with this procedure. Note that all volumes in the Logical View of the Array Management Window must be Optimal.

1

Stop all I/O to the affected volumes.

2

Reseating the drive may clear up the path redundancy problem. Remove the drive and then re-insert it.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component is not failed. In this situation, it is acceptable to remove the drive even though the Service Action Allowed is NO.

3

Wait 40 seconds, and then click the Recheck button to rerun the Recovery Guru to ensure that the problem has been fixed.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 4.

The problem has not been fixed

Go to step 4.

4

Back up all data on the affected volumes. (Step 7 will destroy all data on the affected volumes.)

Note: To the operating system (OS), a failed volume is the same as a failed non-RAID drive. Refer to the OS documentation for requirements concerning failed drives and apply them where necessary.

5

If any of the affected volumes are also source or target volumes in a copy operation that is either Pending or In Progress, you must stop the copy operation before continuing.

Go to the Copy Manager by selecting Volume >> Copy >> Copy Manager, then highlight each copy pair that contains an affected volume and select Copy >> Stop.

6

If you have snapshot volumes associated with the affected volumes, these snapshot volumes will no longer be valid once you fail the drive in step 7.

If necessary, perform any operations on the snapshot volumes and then delete them.

7

Caution: Possible loss of data accessibility. Transitioning volumes to failed may cause the loss of accessibility to data on the volumes. Make sure that you back up all data on the affected volumes before starting this step.

Highlight the affected drive in the Physical View of the Array Management Window and select Advanced >> Recovery >> Fail Drive. The affected volumes become Failed.

8

Remove the failed drive (its fault indicator light should be on).

Note: Make sure the replacement drive has a capacity equal to or greater than the failed drive.

9

Wait 30 seconds, then insert the new drive. Its fault indicator light may be lit for a short time (one minute or less).

Note: Wait until the replaced drive is ready (its fault indicator light must be off) before attempting to initialize the volumes in step 10.

10

Highlight the volume group associated with the replaced drive in the Logical View of the Array Management Window and select Advanced >> Recovery >> Initialize >> Volume Group.

  • The volumes in the volume group are initialized, one at a time.
  • To monitor initialization progress for a volume, highlight the volume in the Logical View of the Array Management Window and select Volume >> Properties. Note that when the initialization is completed, the progress bar is no longer displayed.
  • When initialization is completed, all volumes in the volume group are Optimal.

Important: Make sure you save this procedure by selecting Save As. Once you fix the failure, you will not be able to access the information from Recovery Guru.

11

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed.

a

If desired, re-create any snapshot volumes that you deleted in step 6.

b

If desired, re-create any copies you stopped by highlighting the copy pairs in the Copy Manager and selecting Copy >> Re-Copy.

c

Add the affected volumes back to the operating system. You may need to reboot the system to see the re-initialized volumes.

Note: Do not start I/O to these volumes until you have restored data from backup.

d

Restore the data for the affected volumes from backup.

e

You are finished with this procedure.

The problem has not been fixed.

There is a problem with the controller. Go to "Recovery Steps for Replacing a Controller."

Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group

Use the following procedure if the affected volume group is RAID 1, 3, or 5.

1

You should stop all I/O to all volumes in the volume group associated with the affected drive to reduce the possibility of data loss. If another drive fails in this volume group while you are performing this procedure, you will lose data.

2

Reseating the drive may clear up the path redundancy problem. Remove the drive and then re-insert it.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component is not failed. In this situation, it is acceptable to remove the drive even though the Service Action Allowed is NO.

3

Wait 40 seconds, and then click the Recheck button to rerun the Recovery Guru to ensure that the problem has been fixed.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 4.

The problem has not been fixed

Go to step 4.

4

Although not required, you should back up all data on all volumes associated with the affected drive.

5

Highlight the affected drive in the Physical View of the Array Management Window and select Advanced >> Recovery >> Fail Drive. The associated volumes become Degraded.

6

Remove the failed drive (its fault indicator light should be on).

Note: Make sure the replacement drive has a capacity equal to or greater than the failed drive.

7

Wait 30 seconds, then insert the new drive. Its fault indicator light may be lit for a short time (one minute or less).

8

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed.

You are finished with this procedure.

The problem has not been fixed.

There is a problem with the controller. Go to "Recovery Steps for Replacing a Controller."

Recovery Steps for Replacing a Controller

Important: The controller replacement recovery steps should only be attempted after ALL other options have been exhausted.

Use the following procedure to replace a controller to resolve a loss of path redundancy condition.

If...

Then...

Your storage array has one controller

Go to "Replacing a Controller in a Single-Controller Storage Array."

Your storage array has two controllers

Go to "Replacing a Controller in a Dual-Controller Storage Array."

Replacing a Controller in a Single-Controller Storage Array

1

Ensure that your replacement controller matches the controller in the storage array. If you do not have a controller with the appropriate replacement part number, contact your technical support representative.

2

Stop all I/O to this storage array.

3

Turn off power to the affected tray.

4

Remove the affected controller. Refer to the Enterprise Management Window (EMW) to view which management method you are using to manage this storage array.

If...

Then...

You are using In-Band management for ALL hosts attached to this storage array

Go to step 5.

You are using Out-of-Band management for ANY host attached to this storage array

Before you insert a new controller canister into the storage array, you must update the DHCP/BOOTP server so that it will associate the new controller's hardware Ethernet (MAC) address with the DNS/network name and IP address previously assigned to the removed controller.

To update the DHCP/BOOTP server, find the entry associated with the removed controller and replace its Ethernet (MAC) address with the new controller's Ethernet (MAC) address. The controller's Ethernet (MAC) address is located on an Ethernet ID label on the controller canister in the form xx.xx.xx.xx.xx.xx.

When you are finished, go to step 5.

5

If...

Then...

The controller for this storage array is located in a tray containing both controllers and drives

Check to see if the new controller canister contains a battery.
  • If your model of storage array does not contain batteries, go to step 6.
  • If your model of storage array is supposed to contain batteries and...
    • there is not a battery installed in the new controller canister, then install the battery from the old canister, and go to step 6.
    • there is a battery installed in the new controller canister, then go to step 6.

The controller for this storage array is located in a tray containing only controllers

Go to step 6.

6

a

Make sure at least one minute has elapsed. Then, insert the new controller canister firmly in place.

b

Turn on power to the affected tray.

c

Note the controller slot (A or B) of the affected controller listed in the Recovery Guru Details area. Highlight this controller slot in the Physical View of the Array Management Window (AMW).

d

If...

Then...

The controller indicates that it is Online

Go to step e.

The controller indicates that it is Offline

Select Advanced >> Recovery >> Place Controller >> Online and then go to step e.

e

If...

Then...

The controller for this storage array is located in a tray containing both controllers and drives

Determine whether you need to reset the battery age.
  • If your model of storage array does not contain batteries, go to step 7.
  • If your model of storage array is supposed to contain batteries and...
    • you installed the battery from the old controller canister, then you do not need to reset the battery age. Go to step 7.
    • there was already a battery in the replacement controller canister, then you must reset the battery age using the following procedure:

      Select the Components button on the tray containing the controllers in the Physical View of the Array Management Window. Highlight the batteries option and select the Reset button associated with the new controller canister (A or B). Then, go to step 7.

The controller for this storage array is located in a tray containing only controllers

Go to step 7.

7

If you have volumes mapped to hosts that have Automatic Volume Transfer (AVT) disabled, it may be necessary to redistribute the volumes to their preferred controller. Use the following steps to determine the AVT status of the hosts connected to your storage array:

a

Open the Storage Array Profile by selecting the Storage Array >> View Profile menu option from the Array Management Window. Then, select the profile's Mappings tab.

b

Scroll to the NVSRAM Host Type Internal Definitions section.

If...

Then...

There are hosts mapped to the volumes on this storage array that have an AVT status of disabled

OR

There are hosts mapped to the volumes on this storage array that are not running a host-based, multi-path failover driver

It may be necessary to redistribute the volumes to their preferred controller. If the Array Management Window's Advanced >> Recovery >> Redistribute Volumes menu option is available, select the option.

Note: If you have a mix of hosts with AVT enabled and AVT disabled, all volumes will be immediately assigned back to their preferred path. However, until the host-based multi-path failover driver detects the valid preferred path (may take several minutes), the volumes mapped to the AVT-enabled hosts may get temporarily returned back to the non-preferred path.

If the menu option is not available (grayed out), the volumes are already associated with their preferred controllers and no action is needed.

Go to step 8.

There are NO hosts mapped to the volumes on this storage array with an AVT status of disabled

OR

All hosts mapped to volumes on this storage array are running a host-based, multi-path failover driver

No action is required.

If volumes need to be redistributed to their preferred controller, the host-based, multi-path failover driver will automatically initiate the transfer.

Note that detection of a restored preferred path by the multi-path failover driver can take several minutes.

Go to step 8.

8

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area. If the failure appears again, contact your technical support representative.
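Step 7 of this procedure comes down to two checks: whether any mapped host has AVT disabled or lacks a multi-path failover driver, and whether the Redistribute Volumes menu option is available. The sketch below models that decision. The `Host` record and function names are hypothetical, and it assumes the first table row takes precedence when a host matches both rows.

```python
from dataclasses import dataclass

REDISTRIBUTE = "Select Advanced >> Recovery >> Redistribute Volumes."
ALREADY_PREFERRED = "No action needed; volumes are already on their preferred controllers."
DRIVER_HANDLES_IT = "No action required; the failover driver initiates any transfer."


@dataclass
class Host:
    name: str
    avt_enabled: bool
    has_failover_driver: bool


def needs_manual_redistribution(hosts) -> bool:
    """True if any host has AVT disabled or no multi-path failover driver."""
    return any(not h.avt_enabled or not h.has_failover_driver for h in hosts)


def step7_action(hosts, redistribute_option_available: bool) -> str:
    """Return the step 7 action for the mapped hosts."""
    if not needs_manual_redistribution(hosts):
        return DRIVER_HANDLES_IT
    # A grayed-out menu option means the volumes are already where they belong.
    return REDISTRIBUTE if redistribute_option_available else ALREADY_PREFERRED


hosts = [Host("h1", True, True), Host("h2", False, True)]
print(step7_action(hosts, redistribute_option_available=True))
```

In every case the procedure then continues with step 8.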

Replacing a Controller in a Dual-Controller Storage Array

1

Determine which is the affected controller by locating the non-working channel. Refer to step 3 at the beginning of this recovery procedure for details on how to locate the non-working channel.

2

Place the affected controller offline.

a

Highlight the affected controller in the Physical View of the Array Management Window.

b

Select Advanced >> Recovery >> Place Controller >> Offline.

c

Select Yes in the Place Offline confirmation window.

d

Go to step 3.

3

Read all of the following steps before taking any action.

a

Click the Recheck button to rerun the Recovery Guru.

b

Select the Offline Controller problem that is being reported in the Summary area.

c

Complete the Recovery Steps in the Offline Controller procedure to replace the controller.

4

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area. If the failure appears again, contact your technical support representative.


End of Failure Entry 12

Failure Entry 13: NO_REDUNDANCY_DRIVE-Recovery Failure Type Code: 32

Storage array: PSI_CMS_T3_CE5400
Component reporting problem: Drive in slot 11
Status: Optimal
Location: Drive tray 0, Drawer 2
Component requiring service: Drive in slot 11
Service action (removal) allowed: No
Service action LED on component: Yes
Working channel: 0

Drive - Loss of Path Redundancy

What Caused the Problem?

A communication path with a drive has been lost. The Recovery Guru Details area provides specific information you will need as you follow the recovery steps.

Caution: Electronic discharge can damage sensitive components. Always use proper antistatic protection when handling components. Touching components without using a proper ground may damage the equipment.

Important Notes

Recovery Steps

1

Fix any other problems reported by the Recovery Guru before attempting to fix this problem.

2

If...

Then...

The affected tray listed in the Recovery Guru Details area contains both controllers and drives

Go to step 7.

The affected tray listed in the Recovery Guru Details area contains only drives

Go to step 3.

3

To determine the non-working channel, start at the drive port on the controller tray corresponding to the working channel (refer to the labels on the back of the controller tray if needed). Trace the cable from the working channel to the ESM canister in the affected drive tray reported in the details area.

Caution: Possible loss of data accessibility. Do not disconnect any cables on the working channel. Doing so may cause a possible loss of data accessibility.

4

Locate the other ESM canister in the affected drive tray (this is the canister on the non-working channel).

5

Replace the ESM canister on the non-working channel using the following steps:

a

Label the interface transceivers (GBICs or SFPs). The labels will help you correctly reconnect the cables to the new ESM canister.

While the cables are still connected, remove the interface transceivers from the ESM canister you are replacing.

b

Remove the ESM canister.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component is not failed. In this situation, it is acceptable to remove the ESM canister even though the Service Action Allowed is NO.

c

Set all switches on the new ESM canister to the same values as the old ESM canister.

d

Insert the new ESM canister into the drive tray.

e

Using the labels created in step a, reconnect the cables to the replaced canister. Wait 40 seconds, then go to step 6.

6

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 7.

The problem has not been fixed

Go to step 7.
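The wait-then-recheck pattern in steps 5e and 6 can be sketched as follows. The Recovery Guru is driven from the GUI in this procedure, so `recheck` here is a hypothetical callable that stands in for whatever returns the current list of reported failures; the `sleep` parameter exists only so the delay can be replaced in a dry run.

```python
import time

FINISHED = "You are finished with this procedure."
CONTINUE = "Go to step 7."


def recheck_after_delay(recheck, failure_id, delay_s=40.0, sleep=time.sleep) -> bool:
    """Wait delay_s seconds, then report whether failure_id is still listed."""
    sleep(delay_s)
    return failure_id in recheck()


def next_step(still_failing: bool) -> str:
    """Map the recheck result to the If/Then table in step 6."""
    return CONTINUE if still_failing else FINISHED


# Dry run: record the requested delay instead of actually sleeping.
waited = []
still_failing = recheck_after_delay(
    lambda: ["NO_REDUNDANCY_DRIVE"], "NO_REDUNDANCY_DRIVE", sleep=waited.append
)
print(next_step(still_failing))
```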

7

You must replace the drive. Which procedure you use depends on the RAID level of the volume group associated with the affected drive. To determine the associated volume group, highlight the affected drive in the Physical View of the Array Management Window and select View >> Associated Elements. Next highlight the associated volume group in the Logical View of the Array Management Window.

If...

Then...

The volume group is RAID 0

Go to "Recovery Steps for Replacing a Drive in a RAID 0 Volume Group."

The volume group is RAID 1, 3, or 5

Go to "Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group."

Recovery Steps for Replacing a Drive in a RAID 0 Volume Group

Use the following procedure if the affected volume group is RAID 0.

Fix any other problems reported by the Recovery Guru before continuing with this procedure. Note that all volumes in the Logical View of the Array Management Window must be Optimal.

1

Stop all I/O to the affected volumes.

2

Reseating the drive may clear up the path redundancy problem. Remove the drive and then re-insert it.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component is not failed. In this situation, it is acceptable to remove the drive even though the Service Action Allowed is NO.

3

Wait 40 seconds, and then click the Recheck button to rerun the Recovery Guru to ensure that the problem has been fixed.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 4.

The problem has not been fixed

Go to step 4.

4

Back up all data on the affected volumes. (Step 7 will destroy all data on the affected volumes.)

Note: To the operating system (OS), a failed volume is the same as a failed non-RAID drive. Refer to the OS documentation for requirements concerning failed drives and apply them where necessary.

5

If any of the affected volumes are also source or target volumes in a copy operation that is either Pending or In Progress, you must stop the copy operation before continuing.

Go to the Copy Manager by selecting Volume >> Copy >> Copy Manager, then highlight each copy pair that contains an affected volume and select Copy >> Stop.

6

If you have snapshot volumes associated with the affected volumes, these snapshot volumes will no longer be valid once you fail the drive in step 7.

If necessary, perform any operations on the snapshot volumes and then delete them.

7

Caution: Possible loss of data accessibility. Transitioning volumes to failed may cause the loss of accessibility to data on the volumes. Make sure that you back up all data on the affected volumes before starting this step.

Highlight the affected drive in the Physical View of the Array Management Window and select Advanced >> Recovery >> Fail Drive. The affected volumes become Failed.

8

Remove the failed drive (its fault indicator light should be on).

Note: Make sure the replacement drive has a capacity equal to or greater than the failed drive.

9

Wait 30 seconds, then insert the new drive. Its fault indicator light may be lit for a short time (one minute or less).

Note: Wait until the replaced drive is ready (its fault indicator light must be off) before attempting to initialize the volumes in step 10.

10

Highlight the volume group associated with the replaced drive in the Logical View of the Array Management Window and select Advanced >> Recovery >> Initialize >> Volume Group.

  • The volumes in the volume group are initialized, one at a time.
  • To monitor initialization progress for a volume, highlight the volume in the Logical View of the Array Management Window and select Volume >> Properties. Note that when the initialization is completed, the progress bar is no longer displayed.
  • When initialization is completed, all volumes in the volume group are Optimal.

Important: Make sure you save this procedure by selecting Save As. Once you fix the failure, you will not be able to access the information from Recovery Guru.

11

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed.

a

If desired, create any snapshot volumes that you deleted in step 6.

b

If desired, re-create any copies you stopped by highlighting the copy pairs in the Copy Manager and selecting Copy >> Re-Copy.

c

Add the affected volumes back to the operating system. You may need to reboot the system to see the re-initialized volumes.

Note: Do not start I/O to these volumes until you have restored data from backup.

d

Restore the data for the affected volumes from backup.

e

You are finished with this procedure.

The problem has not been fixed.

There is a problem with the controller. Go to "Recovery Steps for Replacing a Controller."

Recovery Steps for Replacing a Drive in a RAID 1, 3, or 5 Volume Group

Use the following procedure if the affected volume group is RAID 1, 3, or 5.

1

You should stop all I/O to all volumes in the volume group associated with the affected drive to reduce the possibility of data loss. If another drive fails in this volume group while you are performing this procedure, you will lose data.

2

Reseating the drive may clear up the path redundancy problem. Remove the drive and then re-insert it.

Note: The Service Action Allowed status in the Details area is always NO for this problem because the component is not failed. In this situation, it is acceptable to remove the drive even though the Service Action Allowed is NO.

3

Wait 40 seconds, and then click the Recheck button to rerun the Recovery Guru to ensure that the problem has been fixed.

If...

Then...

The problem has been fixed

You are finished with this procedure. Do NOT go to step 4.

The problem has not been fixed

Go to step 4.

4

Although not required, you should back up all data on all volumes associated with the affected drive.

5

Highlight the affected drive in the Physical View of the Array Management Window and select Advanced >> Recovery >> Fail Drive. The associated volumes become Degraded.

6

Remove the failed drive (its fault indicator light should be on).

Note: Make sure the replacement drive has a capacity equal to or greater than the failed drive.

7

Wait 30 seconds, then insert the new drive. Its fault indicator light may be lit for a short time (one minute or less).

8

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area.

If...

Then...

The problem has been fixed.

You are finished with this procedure.

The problem has not been fixed.

There is a problem with the controller. Go to "Recovery Steps for Replacing a Controller."

Recovery Steps for Replacing a Controller

Important: The controller replacement recovery steps should only be attempted after ALL other options have been exhausted.

Use the following procedure to replace a controller to resolve a loss of path redundancy condition.

If... Then...
Your storage array has one controller Go to "Replacing a Controller in a Single-Controller Storage Array."
Your storage array has two controllers Go to "Replacing a Controller in a Dual-Controller Storage Array."

Replacing a Controller in a Single-Controller Storage Array

1

Ensure that your replacement controller matches the controller in the storage array. If you do not have a controller with the appropriate replacement part number, contact your technical support representative.

2

Stop all I/O to this storage array.

3

Turn off power to the affected tray.

4

Remove the affected controller. Refer to the Enterprise Management Window (EMW) to view which management method you are using to manage this storage array.

If... Then...
You are using In-Band management for ALL hosts attached to this storage array Go to step 5.
You are using Out-of-Band management for ANY host attached to this storage array Before you insert a new controller canister into the storage array, you must update the DHCP/BOOTP server so that it will associate the new controller's hardware Ethernet (MAC) address with the DNS/network name and IP address previously assigned to the removed controller.

To update the DHCP/BOOTP server, find the entry associated with the removed controller and replace its Ethernet (MAC) address with the new controller's Ethernet (MAC) address. The controller's Ethernet (MAC) address is located on an Ethernet ID label on the controller canister in the form xx.xx.xx.xx.xx.xx.

When you are finished, go to step 5.
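
Some DHCP/BOOTP servers expect the hardware address in a different notation than the dotted form printed on the Ethernet ID label. As a minimal sketch, assuming your server wants colon-separated lowercase notation (check your server's documentation), the hypothetical helper `label_mac_to_colon` below validates a label value and converts it. The address in the usage note is made up.

```python
import re


def label_mac_to_colon(label_mac: str) -> str:
    """Convert a label-style MAC 'xx.xx.xx.xx.xx.xx' to 'xx:xx:xx:xx:xx:xx'."""
    octets = label_mac.strip().split(".")
    # A valid address has exactly six two-digit hexadecimal octets.
    if len(octets) != 6 or not all(
        re.fullmatch(r"[0-9A-Fa-f]{2}", octet) for octet in octets
    ):
        raise ValueError(
            f"not a MAC address in xx.xx.xx.xx.xx.xx form: {label_mac!r}"
        )
    return ":".join(octet.lower() for octet in octets)
```

For example, `label_mac_to_colon("00.11.22.AA.BB.CC")` returns `"00:11:22:aa:bb:cc"`, which can then replace the removed controller's address in the server entry.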

5

If... Then...
The controller for this storage array is located in a tray containing both controllers and drives Check to see if the new controller canister contains a battery.
  • If your model of storage array does not contain batteries, go to step 6.
  • If your model of storage array is supposed to contain batteries and...
    • there is not a battery installed in the new controller canister, then install the battery from the old canister, and go to step 6.
    • there is a battery installed in the new controller canister, then go to step 6.
The controller for this storage array is located in a tray containing only controllers Go to step 6.

6

a

Make sure at least one minute has elapsed. Then, insert the new controller canister firmly in place.

b

Turn on power to the affected tray.

c

Note the controller slot (A or B) of the affected controller listed in the Recovery Guru Details area. Highlight this controller slot in the Physical View of the Array Management Window (AMW).

d

If... Then...
The controller indicates that it is Online Go to step e.
The controller indicates that it is Offline Select Advanced >> Recovery >> Place Controller >> Online and then go to step e.

e

If... Then...
The controller for this storage array is located in a tray containing both controllers and drives Determine whether you need to reset the battery age.
  • If your model of storage array does not contain batteries, go to step 7.
  • If your model of storage array is supposed to contain batteries and...
    • you installed the battery from the old controller canister, then you do not need to reset the battery age. Go to step 7.
    • there was already a battery in the replacement controller canister, then you must reset the battery age using the following procedure:

      Select the Components button on the tray containing the controllers in the Physical View of the Array Management Window. Highlight the batteries option and select the Reset button associated with the new controller canister (A or B). Then, go to step 7.

The controller for this storage array is located in a tray containing only controllers Go to step 7.
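
The battery age decision in step e reduces to three yes/no questions. The sketch below is illustrative only: the function `must_reset_battery_age` and its parameter names are hypothetical, and the answers must come from inspecting your own hardware.

```python
def must_reset_battery_age(
    tray_has_controllers_and_drives: bool,
    model_uses_batteries: bool,
    battery_came_with_new_canister: bool,
) -> bool:
    """Return True only when the battery age must be reset in step e."""
    # Trays containing only controllers skip the battery check entirely.
    if not tray_has_controllers_and_drives:
        return False
    # Models without batteries have nothing to reset.
    if not model_uses_batteries:
        return False
    # A battery moved over from the old canister keeps its existing age;
    # a battery that shipped in the replacement canister needs a reset.
    return battery_came_with_new_canister
```

In every case the procedure continues at step 7; the result only tells you whether to use the Reset button first.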

7

If you have volumes mapped to hosts that have Automatic Volume Transfer (AVT) disabled, it may be necessary to redistribute the volumes to their preferred controller. Use the following steps to determine the AVT status of the hosts connected to your storage array:

a

Open the Storage Array Profile by selecting the Storage Array >> View Profile menu option from the Array Management Window. Then, select the profile's Mappings tab.

b

Scroll to the NVSRAM Host Type Internal Definitions section.

If... Then...
There are hosts mapped to the volumes on this storage array that have an AVT status of disabled

OR

There are hosts mapped to the volumes on this storage array that are not running a host-based, multi-path failover driver

It may be necessary to redistribute the volumes to their preferred controller. If the Array Management Window's Advanced >> Recovery >> Redistribute Volumes menu option is available, select the option.

Note: If you have a mix of hosts with AVT enabled and AVT disabled, all volumes will be immediately assigned back to their preferred path. However, until the host-based multi-path failover driver detects the valid preferred path (which may take several minutes), the volumes mapped to the AVT-enabled hosts may be temporarily returned to the non-preferred path.

If the menu option is not available (grayed out), the volumes are already associated with their preferred controllers and no action is needed.

Go to step 8.

There are NO hosts mapped to the volumes on this storage array with an AVT status of disabled

OR

All hosts mapped to volumes on this storage array are running a host-based multi-path failover driver

No action is required.

If volumes need to be redistributed to their preferred controller, the host-based, multi-path failover driver will automatically initiate the transfer.

Note that detection of a restored preferred path by the multi-path failover driver can take several minutes.

Go to step 8.
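
The decision table in step 7 can be summarized in a short sketch. This is one reading of the table, not a definitive rule: the `Host` record and `redistribution_action` function are hypothetical names, and the sketch assumes the first row takes precedence, so any host with AVT disabled or without a failover driver triggers the manual path.

```python
from dataclasses import dataclass


# Hypothetical record, filled in by hand from the Storage Array Profile.
@dataclass(frozen=True)
class Host:
    name: str
    avt_enabled: bool
    has_failover_driver: bool


def redistribution_action(hosts, menu_option_available: bool) -> str:
    """Return the action step 7 calls for, given the mapped hosts."""
    manual_needed = any(
        (not h.avt_enabled) or (not h.has_failover_driver) for h in hosts
    )
    if not manual_needed:
        # The failover driver initiates any needed transfer on its own.
        return "no action: failover driver redistributes automatically"
    if menu_option_available:
        return "select Advanced >> Recovery >> Redistribute Volumes"
    # A grayed-out option means volumes are already on preferred controllers.
    return "no action: volumes already on preferred controllers"
```

Whatever the result, the procedure continues at step 8.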

8

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area. If the failure appears again, contact your technical support representative.

Replacing a Controller in a Dual-Controller Storage Array

1

Determine which is the affected controller by locating the non-working channel. Refer to step 3 at the beginning of this recovery procedure for details on how to locate the non-working channel.

2

Place the affected controller offline.

a

Highlight the affected controller in the Physical View of the Array Management Window.

b

Select Advanced >> Recovery >> Place Controller >> Offline.

c

Select Yes in the Place Offline confirmation window.

d

Go to step 3.

3

Read all of the following steps before taking any action.

a

Click the Recheck button to rerun the Recovery Guru.

b

Select the Offline Controller problem that is being reported in the Summary area.

c

Complete the Recovery Steps for the Offline Controller problem to replace the controller.

4

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area. If the failure appears again, contact your technical support representative.


End of Failure Entry 13