Each pair of nodes is known as an input/output (I/O) group. An I/O group is defined during the cluster configuration process.
Each node can only be in one I/O group. The I/O groups are connected to the SAN so that all backend storage and all application servers are visible to all of the I/O groups. Each pair of nodes has the responsibility to serve I/O operations on a particular set of virtual disks (VDisks).
VDisks are logical disks that the SAN Volume Controller nodes present to the SAN. Each VDisk is also associated with an I/O group.

Nodes do not contain internal battery backup units and therefore must be connected to an uninterruptible power supply to ensure data integrity in the event of a cluster-wide power failure. The uninterruptible power supply provides power only long enough for the SAN Volume Controller cluster to shut down and save its cache data; it is not intended to keep the nodes running during an outage.
When an application server performs I/O to a VDisk, it can access the VDisk through either of the nodes in the I/O group. When you create a VDisk, you can specify a preferred node. Many of the multipathing driver implementations that SAN Volume Controller supports use this information to direct I/O to the preferred node. The other node in the I/O group is used only if the preferred node is not accessible.
If you do not specify a preferred node for a VDisk, the SAN Volume Controller selects the node in the I/O group that has the fewest VDisks to be the preferred node.
After the preferred node is chosen, it can be changed only when the VDisk is moved to a different I/O group.
To view the current preferred node assignment, run the svcinfo lsvdisk command.
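The default selection rule can be sketched as follows. This is a minimal illustration of the "fewest VDisks" rule, not SAN Volume Controller source code, and the node and VDisk names are invented for the example:

```python
# Sketch: when no preferred node is specified, pick the node in the
# I/O group that currently has the fewest VDisks assigned to it.

def choose_preferred_node(io_group):
    """io_group: dict mapping node name -> list of VDisks it prefers."""
    return min(io_group, key=lambda node: len(io_group[node]))

io_group = {
    "node1": ["vdisk_a", "vdisk_b", "vdisk_c"],
    "node2": ["vdisk_d"],
}
print(choose_preferred_node(io_group))  # node2 (it has fewer VDisks)
```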
An I/O group consists of two nodes. When a write operation is performed to a VDisk, the node that processes the I/O duplicates the data into the cache of its partner node in the I/O group. After the data is protected on the partner node, the write is reported to the host application as complete. The data is physically written to disk at a later time.
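The fast-write sequence described above can be sketched as follows. This is an illustration of the concept, not SAN Volume Controller code; the class and names are invented for the example:

```python
# Sketch of the fast-write sequence: the receiving node caches the data,
# mirrors it into the partner node's cache, and only then acknowledges
# the write to the host. De-staging to physical disk happens later.

class Node:
    def __init__(self, name):
        self.name = name
        self.cache = {}      # pending (not yet de-staged) writes
        self.partner = None  # the other node in the I/O group

def write(node, vdisk, block, data):
    node.cache[(vdisk, block)] = data          # 1. cache locally
    node.partner.cache[(vdisk, block)] = data  # 2. mirror to partner cache
    return "complete"                          # 3. acknowledge the host

node1, node2 = Node("node1"), Node("node2")
node1.partner, node2.partner = node2, node1

status = write(node1, "vdisk_a", 0, b"payload")
# Both caches now hold the data, so a single node failure loses nothing.
```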
Read I/O is processed by referencing the cache in the node that receives the I/O. If the data is not found in the cache, it is read from disk into the cache. The read cache provides better performance when the same node consistently services the I/O for a particular VDisk.
I/O traffic for a particular VDisk is, at any one time, managed exclusively by the nodes in a single I/O group. Thus, although a cluster can have up to eight nodes, the nodes manage I/O in independent pairs. This means that the I/O capability of the SAN Volume Controller scales well, because additional throughput can be obtained by adding I/O groups.
Figure 1 shows a write operation from a host (1) that is targeted at VDisk A. The write is directed to the preferred node, Node 1 (2). The write is cached and a copy of the data is placed in the cache of the partner node, Node 2 (3). The host then views the write as complete. At some later time, the data is written, or de-staged, to storage (4). Figure 1 also shows two uninterruptible power supply units correctly configured so that each node is in a different power domain.
When a node fails within an I/O group, the other node in the I/O group assumes the I/O responsibilities of the failed node. Data loss during a node failure is prevented by mirroring the I/O read and write data cache between the two nodes in an I/O group.
If only one node is assigned to an I/O group, or if a node in an I/O group has failed, the write cache is flushed to disk and the remaining node operates in write-through mode. Therefore, any writes for the VDisks that are assigned to this I/O group are not cached; they are sent directly to the storage device. If both nodes in an I/O group go offline, the VDisks that are assigned to the I/O group cannot be accessed.
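The relationship between node availability and caching behavior can be summarized as follows. This is a conceptual sketch of the rules stated above, not SAN Volume Controller behavior code:

```python
# Sketch: derive an I/O group's caching mode from how many of its two
# nodes are online. With both nodes up, writes are mirrored in cache;
# with one node, the cache is flushed and writes go straight through
# to storage; with none, the group's VDisks are inaccessible.

def caching_mode(online_nodes):
    if online_nodes == 2:
        return "mirrored write cache"
    if online_nodes == 1:
        return "write-through"
    return "offline"

for n in (2, 1, 0):
    print(n, "node(s) online ->", caching_mode(n))
```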
When a VDisk is created, you must specify the I/O group that provides access to the VDisk. However, VDisks can be created in, and added to, I/O groups that contain offline nodes. I/O access is not possible until at least one of the nodes in the I/O group is online.
The cluster also provides a virtual recovery I/O group that can be used for certain service actions. You can move the VDisks to the recovery I/O group and then into a working I/O group. I/O access is not possible when VDisks are assigned to a recovery I/O group.