SIP3152

Use this procedure to resolve possible failed connection problems.

This procedure is used to resolve the following problems:

The possible causes are:
  • A failed connection caused by a failing component in the serial attached SCSI (SAS) fabric between, and including, the adapter and device enclosure.
  • A failed connection caused by a failing component within the device enclosure, including the device itself.
Note: For SRC xxxx4060, the failed connection was previously working, and may have already recovered.
Considerations:
Attention:
  • When SAS fabric problems exist, replacing RAID adapters is not recommended without assistance from your service provider. Because the adapter might contain nonvolatile write cache data and configuration data for the attached disk arrays, additional problems can be created by replacing an adapter.
  • Removing functioning disk units in a disk array is not recommended without assistance from your service provider. A disk array might become unprotected or might fail if functioning disk units are removed. The removal of functioning disk units might also result in additional problems in the disk array.
  1. Determine the resource name of the adapter that reported the problem by performing the following:
    1. Access SST or DST.
    2. Access the product activity log and record the resource name that this error is logged against. If the resource name is an adapter resource name, use it and continue with the next step. If the resource name is a disk unit resource name, use the Hardware Service Manager to determine the resource name of the adapter that is controlling this disk unit.
  2. Determine if a problem still exists for the adapter that logged this error by examining the SAS connections as follows:
    1. On the System Service Tools (SST) screen, select Start a Service Tool and then press Enter.
    2. Select Display/Alter/Dump.
    3. Select Display/Alter storage.
    4. Select Licensed Internal Code (LIC) data.
    5. Select Advanced Analysis.
    6. Type FABQUERY on the entry line and then select it with option 1.
    7. On the Specify Advanced Analysis Options screen, type -SUB 01 -IOA DCxx -DSP 0 in the Options field, where DCxx is the adapter resource name. Press Enter.
      Note: More information is available by returning to the Specify Advanced Analysis Options screen and typing -SUB 01 -IOA DCxx -DSP 2 in the Options field, where DCxx is the adapter resource name. Press Enter.
      Do all expected devices appear in the list and are all paths marked as Operational?
      • No: Continue with the next step.
      • Yes: The error condition no longer exists. This ends the procedure.
  3. Perform the following to cause the adapter to rediscover the devices and connections:
    1. Use Hardware Service Manager to perform another IPL of the virtual I/O processor that is associated with this adapter.
    2. Vary on any other resources attached to the virtual I/O processor.
  4. To determine if the problem still exists for the adapter that logged this error, examine the SAS connections by performing the actions in step 2 again. Do all expected devices appear in the list and are all paths marked as Operational?
    • No: Continue with the next step.
    • Yes: The error condition no longer exists. This ends the procedure.
  5. Perform only one of the following corrective actions (listed in the order of preference). If one of the corrective actions has previously been attempted, then proceed to the next one in the list.
    • Reseat the cables, if present, on the adapter and the device enclosure. Perform the following steps:
      1. Use adapter concurrent maintenance to power off the adapter slot, or power off the system or partition.
      2. Reseat the cables.
      3. Use adapter concurrent maintenance to power on the adapter slot, or power on the system or partition.
    • Replace the cable, if present, from adapter to device enclosure. Perform the following steps:
      1. Use adapter concurrent maintenance to power off the adapter slot, or power off the system or partition.
      2. Replace the cable.
      3. Use adapter concurrent maintenance to power on the adapter slot, or power on the system or partition.
    • Replace the device. See Disk drive.
      Note: If there are multiple devices with a path that is not Operational, then the problem is not likely to be with a device.
    • Replace the internal device enclosure or refer to the service documentation for an external expansion drawer. Perform the following steps:
      1. Power off the system or partition. If the enclosure is external, adapter concurrent maintenance can be used instead to power off the adapter slot.
      2. Replace the device enclosure.
      3. Power on the system or partition. If the enclosure is external, adapter concurrent maintenance can be used instead to power on the adapter slot.
    • Replace the adapter. The procedure to replace the adapter can be found in PCI adapter.
    • Contact your service provider.
  6. To determine if the problem still exists for the adapter that logged this error, examine the SAS connections by performing the actions in step 2 again. Do all expected devices appear in the list and are all paths marked as Operational?
    • No: Go to step 5.
    • Yes: The error condition no longer exists. This ends the procedure.
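
The decision logic of this procedure can be sketched in a few lines of Python, which may help when keeping notes during a service call. This is only an illustration and is not an IBM-provided tool: the function names, the shortened action labels, the example resource name DC01, and the dictionary of path states are all assumptions. The path states are expected to be transcribed by hand from the FABQUERY display, because the sketch does not assume any particular output format.

```python
# Illustrative sketch only. Every name below is hypothetical; none of this
# is an IBM API, and nothing here talks to SST, DST, or the adapter.

# Corrective actions from step 5, in the order of preference given there.
CORRECTIVE_ACTIONS = [
    "Reseat the cables",
    "Replace the cable",
    "Replace the device",
    "Replace the device enclosure",
    "Replace the adapter",
    "Contact your service provider",
]


def fabquery_options(adapter_resource, display_level=0):
    """Build the text typed in the Options field in step 2.

    adapter_resource is the adapter resource name (DCxx in the procedure).
    The procedure only mentions -DSP 0 and -DSP 2, so other values are rejected.
    """
    if display_level not in (0, 2):
        raise ValueError("the procedure only uses -DSP 0 and -DSP 2")
    return f"-SUB 01 -IOA {adapter_resource} -DSP {display_level}"


def connection_ok(expected_devices, observed_paths):
    """Answer the question asked in steps 2, 4, and 6.

    observed_paths maps a device name to the list of path states that were
    read off the display by hand (an assumed, hand-entered structure).
    Returns True only if every expected device is listed and every listed
    path is marked Operational.
    """
    for device in expected_devices:
        if not observed_paths.get(device):
            return False  # an expected device is missing from the list
    for states in observed_paths.values():
        if any(state != "Operational" for state in states):
            return False  # at least one path is not Operational
    return True


def next_corrective_action(already_attempted):
    """Return the first step 5 action that has not been attempted yet."""
    for action in CORRECTIVE_ACTIONS:
        if action not in already_attempted:
            return action
    return None  # every action, including contacting the provider, was tried
```

For example, `fabquery_options("DC01", 2)` produces the more detailed variant described in the note under step 2, and calling `next_corrective_action` with the actions already tried mirrors the loop from step 6 back to step 5.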