A failed connection caused by a failing component in the SAS fabric between,
and including, the adapter and device enclosure.
A failed connection caused by a failing component within the device enclosure,
including the device itself.
Note:
For SRC xxxx4060, the failed connection was previously working,
and may have already recovered.
Considerations:
Power off the system, partition, or card slot before connecting and disconnecting
cables or devices, as appropriate, to prevent hardware damage.
Some systems have SAS and PCI-X/PCIe bus interface logic integrated onto
the system boards and use a pluggable RAID Enablement Card (a non-PCI form
factor card) for these SAS/PCI-X/PCIe buses. For these configurations, replacing
the RAID Enablement Card is unlikely to solve a SAS-related problem because
the SAS interface logic is on the system board.
Some systems have the disk enclosure or removable media enclosure integrated
in the system with no cables. For these configurations the SAS connections
are integrated onto the system boards and a failed connection can be the result
of a failed system board or integrated device enclosure.
Some configurations involve a SAS adapter connecting to internal SAS disk
enclosures within a system using a cable card. Keep in mind that when the
procedure refers to a device enclosure, it could be referring to the internal
SAS disk slots or media slots. Also, when the procedure refers to a cable,
it could include a cable card.
When using SAS adapters in a Dual Storage IOA configuration, ensure that
the actions in this procedure are performed against the primary adapter (not
the secondary adapter).
Attention:
When SAS fabric problems exist, replacing RAID adapters is not recommended
without assistance from your service provider. Because the adapter might contain
non-volatile write cache data and configuration data for the attached disk
arrays, additional problems can be created by replacing an adapter.
Removing functioning disk units in a disk array is not recommended without
assistance from your service provider. A disk array might become unprotected
or failed if functioning disk units are removed. The removal of functioning
disk units might also result in additional problems in the disk array.
Determine the resource name of the adapter that reported the problem
by performing the following:
Access SST/DST.
Access the Product Activity Log and record the resource name
that this error is logged against. If the resource name is an adapter resource
name, use it and continue with the next step. If the resource name is a disk
unit resource name, use the Hardware Service Manager to determine the resource
name of the adapter that is controlling this disk unit.
Determine if a problem still exists for the adapter
that logged this error by examining the SAS connections as follows:
On the System Service Tools (SST) screen, select Start a Service Tool,
then press Enter.
Select Display/Alter/Dump.
Select Display/Alter storage.
Select Licensed Internal Code (LIC) data.
Select Advanced Analysis.
Type FABQUERY on the entry line, and then select it with option 1.
On the Specify Advanced Analysis Options screen, type -SUB 01 -IOA DCxx -DSP 0
in the Options field, where DCxx is the adapter resource name. Press Enter.
Note:
More information is available by returning to the Specify Advanced
Analysis Options screen and typing -SUB 01 -IOA DCxx -DSP 2 in
the Options field, where DCxx is the adapter resource name.
Press Enter.
Do all expected devices appear in the list and are all
paths marked as Operational?
No: Continue with the next step.
Yes: The error condition no longer exists. This ends the procedure.
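The check in this step can be sketched in Python as a small aid for recording what was typed and what was seen. This is a minimal sketch, not part of the service tools: the helper names (fabquery_options, PathEntry, connection_recovered) are hypothetical, the path entries are assumed to be transcribed by hand from the FABQUERY display, and any status text other than Operational is only an illustrative placeholder. The -SUB 01 -IOA DCxx -DSP n option layout is taken from the step above.

```python
from dataclasses import dataclass


def fabquery_options(adapter_resource: str, display_level: int = 0) -> str:
    """Format the text to type in the Options field for FABQUERY.

    The option layout comes from the procedure; this helper is hypothetical
    and only builds the string.
    """
    if display_level not in (0, 2):
        raise ValueError("the procedure only uses -DSP 0 and -DSP 2")
    return f"-SUB 01 -IOA {adapter_resource} -DSP {display_level}"


@dataclass
class PathEntry:
    """One hand-transcribed row of the display (assumed shape)."""
    device: str
    path_status: str


def connection_recovered(expected_devices, entries) -> bool:
    """True only if every expected device is listed and every path is Operational."""
    listed = {entry.device for entry in entries}
    missing = set(expected_devices) - listed
    not_operational = [e for e in entries if e.path_status != "Operational"]
    return not missing and not not_operational
```

A result of True corresponds to the "Yes" answer (the error condition no longer exists); False corresponds to "No" (continue with the next step).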
Perform the following to cause the adapter to rediscover the devices
and connections:
Use Hardware Service Manager to re-IPL the virtual I/O processor
that is associated with this adapter.
Vary on any other resources attached to the virtual I/O processor.
To determine if the problem still exists for the adapter that logged
this error, examine the SAS connections by performing the actions in step 2 again. Do all expected devices
appear in the list and are all paths marked as Operational?
No: Continue with the next step.
Yes: The error condition no longer exists. This ends the procedure.
Perform only one of the following corrective actions
(listed in the order of preference). If one of the corrective actions has
previously been attempted, then proceed to the next one in the list.
Reseat the cables, if present, on the adapter and device enclosure.
Perform the following:
Use adapter concurrent maintenance to power off the adapter slot, or power
off the system or partition.
Reseat the cables.
Use adapter concurrent maintenance to power on the adapter slot, or power
on the system or partition.
Replace the cable, if present, from the adapter to the device enclosure.
Perform the following:
Use adapter concurrent maintenance to power off the adapter slot, or power
off the system or partition.
Replace the cables.
Use adapter concurrent maintenance to power on the adapter slot, or power
on the system or partition.
If there are multiple devices with a path that is not Operational, then the
problem is not likely to be with a device.
Replace the internal device enclosure or refer to the service documentation
for an external expansion drawer. Perform the following:
Power off the system or partition. If the enclosure is external, adapter
concurrent maintenance can be used instead to power off the adapter slot.
Replace the device enclosure.
Power on the system or partition. If the enclosure is external, adapter
concurrent maintenance can be used instead to power on the adapter slot.
Replace the adapter. For the replacement procedure, see the PCI adapter
topic.
Contact your service provider.
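The "perform only one, in order of preference, skipping any already attempted" rule for the corrective actions can be sketched as follows. This is an illustrative sketch only: the function name next_corrective_action is hypothetical, and the action strings simply paraphrase the actions as listed in this section.

```python
# Corrective actions in the order of preference given in this section.
CORRECTIVE_ACTIONS = [
    "Reseat the cables on the adapter and device enclosure",
    "Replace the cable from the adapter to the device enclosure",
    "Replace the internal device enclosure",
    "Replace the adapter",
    "Contact your service provider",
]


def next_corrective_action(already_attempted):
    """Return the first action not yet attempted, or None if all were tried."""
    for action in CORRECTIVE_ACTIONS:
        if action not in already_attempted:
            return action
    return None
```

For example, if reseating and replacing the cable have both been tried, the function returns the enclosure replacement. After performing the single selected action, the procedure returns to the connection check rather than moving straight to the next action.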
To determine if the problem still exists for the adapter that logged
this error, examine the SAS connections by performing the actions in step 2 again. Do all expected devices
appear in the list and are all paths marked as Operational?