MABIP55

Use this procedure to isolate a failing I/O adapter.

Attention: To isolate a PCI bus problem from AIX, Linux, or the HMC, do not use this procedure. Go to PCI bus isolation using AIX, Linux, or the HMC instead.
  1. If the system is not IPLed, will it IPL to DST?
    No:
    Perform MABIP54. This ends the procedure.
    Yes:
    From the SAL display for the reference code, record the count. Continue with the next step.
  2. Go to the SST/DST display in the partition which reported the problem. Use STRSST if i5/OS® is running; use function 21 if STRSST does not work; or IPL the partition to DST.
  3. On the Start Service Tools Sign On display, type a user ID that has QSRV authority, and its password.
  4. Select Start a service tool > Hardware service manager > Logical hardware resources > System bus resources.
  5. Is there a resource name logged in the SAL entry?
    No:
    Continue with the next step.
    Yes:
    Go to step 12.
  6. Do you have a location for the I/O processor?
    No:
    Record the Direct Select Address (DSA), word 7 of the reference code, from the SAL display. Then continue with the next step.
    Yes:
    Go to step 10.
  7. Return to the HSM System bus resources display.
  8. Locate the I/O processor by performing the following:
    1. Select Display detail.
    2. Compare the DSA with the bus, card, and board information for the IOP.
      Note: The card information on the HSM display is in decimal format. You must convert the decimal card information to hexadecimal format to match the DSA format.
    3. Repeat this step until you find the IOP with the same DSA.
  9. Select Cancel, and then go to step 13.
  10. Locate the I/O processor in HSM by performing the following for each IOP:
    1. Select Associated packaging resource(s) > Display detail.
    2. Repeat until you find the IOP with the same location.
  11. Select Cancel > Cancel and go to step 13.
  12. Page forward until you find the multi-adapter bridge and IOP where the problem exists. Verify that the multi-adapter bridge and IOP are correct by matching the resource name(s) on the display with the resource name(s) in the SAL for the problem you are working on.
  13. For the IOP you are working on, select Resources associated with IOP (if the I/O adapters are not already displayed).
  14. If any IOA is listed in a state other than "operational", move the cursor to that IOA and perform steps 15 through 18, starting with that IOA. Otherwise, move the cursor to the first IOA that is assigned to the IOP.
  15. Select Associated packaging resource(s) > Concurrent maintenance > Power off domain. Record the unit ID of the slot you are powering off. Did the domain power off successfully?
    No:
    Choose from the following options:
    • If only one IOA was listed as failing, power down the system and replace the IOA. Re-IPL the system. If a different reference code occurred, perform problem analysis and work that reference code. If there was no reference code, go to Verify a repair. This ends the procedure.
    • If there were multiple failed IOAs and concurrent maintenance did not work on one, then move to the next failed IOA and repeat steps 15 through 18.
    • If concurrent maintenance does not work for multiple failed IOAs, this procedure will not be able to identify a failing I/O adapter. Return to the procedure that sent you here. This ends the procedure.
    Yes:
    Perform MABIP05 and then return here and continue with the next step.
  16. Did the IOP reset and IPL successfully?
    No:
    This procedure will not be able to identify a failing I/O adapter. Return to the procedure that sent you here. This ends the procedure.
    Yes:
    Check for the same failure that sent you to this procedure. Check the system control panel, the SAL for the partition that reported the problem, or the Work with partition status display for the partition that reported the problem. In the SAL, the count will increase if the reference code occurred again. Continue with the next step.
  17. Did the same reference code occur after the IOP was reset and IPLed?
    No:
    Go to step 19.
    Yes:
    Perform the following:
    1. Go to the Hardware Service Manager display.
    2. Go to Packaging Hardware Resources.
    3. Power on the IOA by selecting Power on domain.
    4. Reassign the IOA to the IOP.
    5. Return to the HSL resource display, showing the IOP and associated resources.
    6. Continue with the next step.
  18. Is there any other IOA, assigned to the IOP, that you have not already powered off and on?
    No:
    Go to step 21.
    Yes:
    Move the cursor to another IOA assigned to the IOP, choosing IOAs with a status of "unknown" or "disabled" before moving on to IOAs with a status of "operational". Go to step 15.
  19. The failing IOA is located. Exchange the I/O adapter that you just powered off. Use the location you recorded in step 15 to locate the IOA.
  20. Power on the IOA that you just exchanged. Does the same reference code that sent you to this procedure still occur?
    No:
    You have exchanged the failing IOA. Go to Verify a repair. This ends the procedure.
    Yes:
    The IOA is not the failing item. Remove the IOA and reinstall the original IOA. Continue with the next step.
  21. No failing IOAs were identified. Return to the procedure that sent you here. This ends the procedure.
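
Example: The note in step 8 requires converting the decimal card value shown on the HSM display to hexadecimal before comparing it with the DSA. The following is a minimal sketch of that conversion only. The function name card_decimal_to_hex is hypothetical, and the two-digit output width is an assumption made for illustration; check the DSA shown in word 7 of the reference code for the actual field width.

```python
def card_decimal_to_hex(card_decimal: int) -> str:
    """Convert a decimal card value to uppercase hexadecimal.

    Assumption: the card field is rendered as two hex digits (0-255).
    This width is illustrative and is not taken from the procedure.
    """
    if not 0 <= card_decimal <= 255:
        raise ValueError("card value does not fit in two hexadecimal digits")
    return format(card_decimal, "02X")


# Sample decimal card values as they might appear on the HSM display.
for card in (5, 16, 33):
    print(card, "->", card_decimal_to_hex(card))
```

For example, a card value of 16 on the HSM display corresponds to hexadecimal 10 in the DSA, and a card value of 33 corresponds to hexadecimal 21.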