Explanation
The offline solid state drive (SSD) identified by this error must be repaired. The SAN Volume Controller (SVC) error log identifies the drive by its managed disk (MDisk) ID. To determine that ID, run the maintenance procedure on the 1202 error.
Action
Using the SVC GUI or command-line interface (CLI), determine whether the managed disk referred to in the error is currently offline. If the managed disk is online, the problem was transient and should be investigated by IBM technical support before you replace any hardware.
If the managed disk indicated in the error is still offline, the fault is still present. Perform the following steps:
1. Submit the command 'svcinfo lsmdisk -filtervalue status=offline' to identify all of the offline solid state drives.
2. Submit the command 'svcinfo lsmdisk (mdisk id)', where (mdisk id) is the ID of the offline MDisk. Record the 'controller_name', 'node_name' and 'location' properties of the managed disk.
3. Submit the command 'svcinfo lsnodevpd (node_name)', where (node_name) is the value of that property recorded in step 2. Record the 'front_panel_id' property of this node.
4. If there are multiple solid state drives in a single node and all of them are offline, the faulty component is unlikely to be the solid state drive. Before you determine how many solid state drives are in the affected node, ensure that the node is online: submit the command 'svcinfo lsnode (node_name)', where (node_name) is the name of the node identified in step 2, and ensure that the 'status' property value is Online. If the node is offline, follow standard service procedures to resolve the node offline status.
5. Submit the command 'svcinfo lsmdisk -filtervalue controller_name=(controller name)', where (controller name) is the value of that property recorded in step 2.
6. If the command in step 5 displays multiple managed disks and all of them are 'offline', replace the following components in sequence: high speed SAS adapter and SAS cable, PCIe riser card, SAS drive backplane.
7. If there is a spare drive slot in any of slots 0-3 in another SVC node in the same cluster that contains the high speed SAS adapter, swap the solid state drive into a spare drive slot using the solid state drive remove/replace instructions in the note below. If the managed disk is also offline in the new node, the solid state drive must be replaced by following the procedure in MAP 'Replacing an offline SSD'. If the managed disk comes online in this new drive bay, the drive has not failed. Swap the solid state drive back into its original location to determine whether the SAS components in the original node have failed. If the drive stays offline in the original node, the faulty component is either the high speed SAS adapter, the SAS cable or the disk drive backplane. Otherwise, the problem has been resolved by reseating the drive.
Note: A solid state drive can be swapped into a spare drive bay on any node that contains a high speed SAS adapter, but installing the drive into a different node will introduce a performance penalty because I/Os must be forwarded between nodes. To restore performance, the drive should be returned to its original node as soon as possible once the issue has been resolved.
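The check made after listing the managed disks on one controller (are there multiple managed disks, and are all of them offline?) can be sketched in Python. This is only an illustrative sketch, not part of the service procedure. It assumes the lsmdisk output was captured as colon-delimited text with a header row; the helper names and the sample rows are hypothetical, and real output contains more columns.

```python
import csv
import io
from typing import Dict, List


def parse_delimited(output: str, delim: str = ":") -> List[Dict[str, str]]:
    """Parse header-plus-rows text into one dict per row (assumed format)."""
    reader = csv.DictReader(io.StringIO(output.strip()), delimiter=delim)
    return [dict(row) for row in reader]


def all_offline_on_controller(mdisks: List[Dict[str, str]], controller_name: str) -> bool:
    """True when the controller has multiple managed disks and every one is offline."""
    on_ctrl = [m for m in mdisks if m["controller_name"] == controller_name]
    return len(on_ctrl) > 1 and all(m["status"] == "offline" for m in on_ctrl)


# Hypothetical sample data, for illustration only.
SAMPLE = """\
id:name:status:controller_name
4:mdisk4:offline:controller2
5:mdisk5:offline:controller2
6:mdisk6:online:controller3
"""

mdisks = parse_delimited(SAMPLE)
if all_offline_on_controller(mdisks, "controller2"):
    print("All SSDs on controller2 are offline: suspect the SAS components first.")
else:
    print("Not all SSDs are offline: continue with the drive swap test.")
```

A result of True corresponds to the case where the SAS adapter, cable, riser card and backplane are replaced in sequence before the drive itself is suspected.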
Possible Cause-FRUs or other:
- High speed SAS adapter (30%)
- SAS Cable (30%)
- Solid state drive (30%)
- Disk drive backplane (10%)
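The outcome of the drive swap test described above narrows this FRU list. The following Python sketch encodes that reasoning as a small decision function; the function and parameter names are hypothetical and serve only to illustrate the logic.

```python
from typing import List

# FRUs on the SAS path of the original node, taken from the list above.
SAS_PATH_FRUS = ["High speed SAS adapter", "SAS cable", "Disk drive backplane"]


def suspects_after_swap(offline_in_spare_node: bool,
                        offline_after_return: bool = False) -> List[str]:
    """Return the FRUs still under suspicion after the drive swap test."""
    if offline_in_spare_node:
        # The drive fails in a second node as well, so the drive is at fault.
        return ["Solid state drive"]
    if offline_after_return:
        # The drive works elsewhere but not in its original node.
        return list(SAS_PATH_FRUS)
    # The drive came back online in its original slot: reseating resolved it.
    return []


print(suspects_after_swap(offline_in_spare_node=False, offline_after_return=True))
```

An empty list means no replacement is indicated, because reseating the drive cleared the fault.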