SAN Volume Controller nodes can notify their hosts of errors for SCSI commands that are issued.
Some errors are part of the SCSI architecture and are handled by the host application or device drivers without reporting an error. Some errors, such as read and write I/O errors and errors that are associated with the loss of nodes or loss of access to backend devices, cause application I/O to fail. To help troubleshoot these errors, SCSI commands are returned with the Check Condition status and a 32-bit event identifier is included with the sense information. The identifier relates to a specific error in the SAN Volume Controller cluster error log.
If the host application or device driver captures and stores this error information, you can relate the application failure to the error log.
Table 1 describes the SCSI status and codes that are returned by the SAN Volume Controller nodes.
| Status | Code | Description |
|---|---|---|
| Good | 00h | The command was successful. |
| Check condition | 02h | The command failed and sense data is available. |
| Condition met | 04h | N/A |
| Busy | 08h | An Auto-Contingent Allegiance condition exists and the command specified NACA=0. |
| Intermediate | 10h | N/A |
| Intermediate - condition met | 14h | N/A |
| Reservation conflict | 18h | Returned as specified in SPC-2 and SAM-2 where a reserve or persistent reserve condition exists. |
| Task set full | 28h | The initiator has at least one task queued for that LUN on this port. |
| ACA active | 30h | This is reported as specified in SAM-2. |
| Task aborted | 40h | This is returned if TAS is set in the control mode page 0Ch. The SAN Volume Controller node has a default setting of TAS=0, which cannot be changed; therefore, the SAN Volume Controller node does not report this status. |
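A host-side tool that logs SCSI completions could translate the status byte with a lookup built from Table 1. The following is a minimal sketch; the names `SCSI_STATUS`, `describe_status`, and `has_sense_data` are hypothetical and are not part of any SAN Volume Controller interface.

```python
# Status byte values and names copied from Table 1.
SCSI_STATUS = {
    0x00: "Good",
    0x02: "Check condition",
    0x04: "Condition met",
    0x08: "Busy",
    0x10: "Intermediate",
    0x14: "Intermediate - condition met",
    0x18: "Reservation conflict",
    0x28: "Task set full",
    0x30: "ACA active",
    0x40: "Task aborted",
}


def describe_status(status: int) -> str:
    """Return a readable label for a SCSI status byte (hypothetical helper)."""
    name = SCSI_STATUS.get(status)
    if name is None:
        return f"Unknown status {status:02X}h"
    return f"{name} ({status:02X}h)"


def has_sense_data(status: int) -> bool:
    """Only Check Condition (02h) indicates that sense data is available."""
    return status == 0x02


if __name__ == "__main__":
    for value in (0x00, 0x02, 0x18):
        print(describe_status(value))
```

When `has_sense_data` returns true, the sense data is worth capturing, because it carries the identifier that relates the failure to the cluster error log.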
SAN Volume Controller nodes notify the hosts of errors on SCSI commands. Table 2 defines the SCSI sense keys, codes and qualifiers that are returned by the SAN Volume Controller nodes.
| Key | Code | Qualifier | Definition | Description |
|---|---|---|---|---|
| 2h | 04h | 01h | Not Ready. The logical unit is in the process of becoming ready. | The node lost sight of the cluster and cannot perform I/O operations. The additional sense does not have additional information. |
| 2h | 04h | 0Ch | Not Ready. The target port is in the unavailable state. | The following conditions are possible: |
| 3h | 00h | 00h | Medium error | This is only returned for read or write I/Os. The I/O suffered an error at a specific LBA within its scope. The location of the error is reported within the sense data. The additional sense also includes a reason code that relates the error to the corresponding error log entry, for example, a RAID controller error or a migrated medium error. |
| 4h | 08h | 00h | Hardware error. A command to logical unit communication failure has occurred. | The I/O suffered an error that is associated with an I/O error that is returned by a RAID controller. The additional sense includes a reason code that points to the sense data that is returned by the controller. This is only returned for I/O type commands. This error is also returned from FlashCopy target VDisks in the prepared and preparing state. |
| 5h | 25h | 00h | Illegal request. The logical unit is not supported. | The logical unit does not exist or is not mapped to the sender of the command. |
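The key, code, and qualifier in Table 2 can be read from captured sense data as sketched below. This sketch assumes fixed-format sense data (response code 70h or 71h), in which the sense key is the low four bits of byte 2 and the code and qualifier are bytes 12 and 13, counting from zero. The names `SENSE_TABLE` and `decode_sense` are hypothetical, and the labels are abbreviated from the Definition column.

```python
# (key, code, qualifier) combinations from Table 2, with abbreviated labels.
SENSE_TABLE = {
    (0x2, 0x04, 0x01): "Not Ready: logical unit is becoming ready",
    (0x2, 0x04, 0x0C): "Not Ready: target port unavailable",
    (0x3, 0x00, 0x00): "Medium error",
    (0x4, 0x08, 0x00): "Hardware error: logical unit communication failure",
    (0x5, 0x25, 0x00): "Illegal request: logical unit not supported",
}


def decode_sense(sense: bytes):
    """Return (key, code, qualifier, label) from fixed-format sense data."""
    if len(sense) < 14:
        raise ValueError("sense data too short for key, code and qualifier")
    # Bit 7 of byte 0 is the VALID bit; mask it off to get the response code.
    response_code = sense[0] & 0x7F
    if response_code not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    key = sense[2] & 0x0F   # sense key: low nibble of byte 2
    code = sense[12]        # additional sense code
    qualifier = sense[13]   # additional sense code qualifier
    label = SENSE_TABLE.get((key, code, qualifier), "Not listed in Table 2")
    return key, code, qualifier, label


if __name__ == "__main__":
    sample = bytearray(24)
    sample[0] = 0x70
    sample[2] = 0x03
    print(decode_sense(bytes(sample)))
```

Keying the lookup on the full triple means that an unlisted combination is reported as unlisted rather than being matched on the sense key alone.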
The reason code appears in bytes 20-23 of the sense data. It identifies the log entry that is specific to the SAN Volume Controller node. The field is a 32-bit unsigned number that is presented with the most significant byte first. Table 3 lists the reason codes and their definitions.
If the reason code is not listed in Table 3, the code refers to a specific error in the SAN Volume Controller cluster error log that corresponds to the sequence number of the relevant error log entry.
| Reason code (decimal) | Description |
|---|---|
| 40 | The resource is part of a stopped FlashCopy mapping. |
| 50 | The resource is part of a Metro Mirror or Global Mirror relationship and the secondary LUN is offline. |
| 51 | The resource is part of a Metro Mirror or Global Mirror relationship and the secondary LUN is read-only. |
| 60 | The node is offline. |
| 71 | The resource is not bound to any domain. |
| 72 | The resource is bound to a domain that has been recreated. |
| 73 | Running on a node that has been contracted out for some reason that is not attributable to any path going offline. |
| 80 | Wait for the repair to complete, or delete the virtual disk. |
| 81 | Wait for the validation to complete, or delete the virtual disk. |
| 82 | An offline space-efficient VDisk has caused data to be pinned in the directory cache. Adequate performance cannot be achieved for other space-efficient VDisks, so they have been taken offline. |
| 85 | The VDisk has been taken offline because checkpointing to the quorum disk failed. |
| 86 | The svctask repairvdiskcopy -medium command has created a virtual medium error where the copies differed. |
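Extracting the reason code from captured sense data can be sketched as follows. The sketch assumes that bytes 20-23 are counted from zero, as is conventional for SCSI sense data, and reads them as a big-endian unsigned integer. The names `REASON_CODES`, `reason_code`, and `describe_reason` are hypothetical, and the labels are shortened paraphrases of the Table 3 descriptions.

```python
# Reason codes from Table 3, with shortened labels.
REASON_CODES = {
    40: "Part of a stopped FlashCopy mapping",
    50: "Metro Mirror or Global Mirror secondary LUN is offline",
    51: "Metro Mirror or Global Mirror secondary LUN is read-only",
    60: "Node is offline",
    71: "Resource is not bound to any domain",
    72: "Resource is bound to a domain that has been recreated",
    73: "Node has been contracted out",
    80: "Repair in progress",
    81: "Validation in progress",
    82: "Space-efficient VDisks taken offline (pinned directory cache data)",
    85: "Checkpointing to the quorum disk failed",
    86: "Virtual medium error created by repairvdiskcopy -medium",
}


def reason_code(sense: bytes) -> int:
    """Return the 32-bit reason code held in bytes 20-23 of the sense data."""
    if len(sense) < 24:
        raise ValueError("sense data does not include bytes 20-23")
    # Most significant byte first, so decode as big-endian unsigned.
    return int.from_bytes(sense[20:24], byteorder="big", signed=False)


def describe_reason(code: int) -> str:
    """Map a reason code to Table 3, or point at the cluster error log."""
    if code in REASON_CODES:
        return REASON_CODES[code]
    return f"Not in Table 3; check the cluster error log for entry {code}"


if __name__ == "__main__":
    sample = bytes(20) + (60).to_bytes(4, "big")
    code = reason_code(sample)
    print(code, describe_reason(code))
```

Values that are not in Table 3 are passed through unchanged so that they can be matched against the cluster error log.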