MAP 5350: Powering off a SAN Volume Controller node

MAP 5350: Powering off a SAN Volume Controller node helps you power off a single SAN Volume Controller node to perform a service action without disrupting the host's access to disks.

Powering off a single node will not normally disrupt the operation of a SAN Volume Controller cluster. This is because, within a SAN Volume Controller cluster, nodes operate in pairs, and each pair is called an I/O group. An I/O group will continue to handle I/O to the disks it manages with only a single node powered on. There will, however, be degraded performance and reduced resilience to error.

Care must be taken when powering off a node to ensure the cluster is not impacted more than necessary. If the procedures outlined here are not followed, it is possible your application hosts will lose access to their data or, in the worst case, data will be lost.

You can use the following preferred methods to power off a node that is a member of a cluster and not offline:
  1. Use the Shut Down a Node option on the SAN Volume Controller Console
  2. Use the CLI command svctask stopcluster -node name

It is preferable to use either the SAN Volume Controller Console or the command-line interface (CLI) to power off a node, as these methods provide a controlled handover to the partner node and provide better resilience to other faults in the system.

If a node is offline or not a member of a cluster, it must be powered off using the power button.

To provide the least disruption when powering off a node, the following should all apply:
  • The other node in the I/O group should be powered on and active in the cluster.
  • The other node in the I/O group should have SAN fibre channel connections to all the hosts and disk controllers managed by the I/O group.
  • All the virtual disks handled by this I/O group should be online.
  • Host multipathing should be online to the other node in the I/O group.
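
The four conditions above can be treated as a simple checklist. The following is a minimal sketch, not part of the SAN Volume Controller software: the PartnerState class, its field names, and the unmet_conditions function are hypothetical names chosen for illustration, and the values would have to be gathered by hand from the Console, the CLI, and the host multipathing software.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PartnerState:
    """Hypothetical snapshot of the I/O group before a power off."""
    partner_powered_on_and_active: bool
    partner_has_fc_connections: bool   # to all hosts and disk controllers
    all_vdisks_online: bool
    host_multipathing_online: bool     # paths to the partner node


def unmet_conditions(state: PartnerState) -> List[str]:
    """Return a message for each least-disruption condition that is not met."""
    checks = [
        (state.partner_powered_on_and_active,
         "partner node is not powered on and active in the cluster"),
        (state.partner_has_fc_connections,
         "partner node lacks fibre channel connections to all hosts and disk controllers"),
        (state.all_vdisks_online,
         "not all virtual disks in the I/O group are online"),
        (state.host_multipathing_online,
         "host multipathing is not online to the partner node"),
    ]
    return [message for ok, message in checks if not ok]
```

An empty list means all four conditions hold. A non-empty list does not forbid the power off; it only identifies what to weigh and discuss with the system administrator before proceeding.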

In some circumstances, the reason you are powering off the node might make meeting these conditions impossible; for instance, if you are replacing a broken fibre channel card, the virtual disks will not be showing an online status. You should use your judgment to decide when it is safe to proceed when a condition has not been met. Always check with the system administrator before proceeding with a power off that you know will disrupt I/O access, as they might prefer to either wait until a more suitable time or suspend the host applications.

To ensure a smooth restart, a node must save the data structures it cannot recreate to its local, internal, disk drive. The amount of data it saves to local disk can be high, so this operation might take several minutes. Do not attempt to interrupt the controlled power off.

Attention: The following actions do not allow the node to save data to its local disk. Therefore, you should NOT power off a node using these methods:
  • Removing the power cable between the node and the uninterruptible power supply. Normally the uninterruptible power supply provides sufficient power to allow the write to local disk in the event of a power failure, but obviously it is unable to provide power in this case.
  • Holding down the node's power button. When the power button is pressed and released, the SAN Volume Controller indicates this to the software and the node can write its data to local disk before it powers off. If the power button is held down, the SAN Volume Controller hardware interprets this as an emergency power off and shuts down immediately without giving the node the opportunity to save its data to local disk. The emergency power off occurs approximately four seconds after the power button is pressed and held down.
  • Pressing the reset button on the light path diagnostics panel.

The following topics describe the methods for powering off a node:

Using the SAN Volume Controller Console to power off a node

This topic describes how to power off a node using the SAN Volume Controller Console.

Perform the following steps to use the SAN Volume Controller Console to power off a node:
  1. Sign on to the IBM® System Storage® Productivity Center or master console as an administrator and then launch the SAN Volume Controller Console for the cluster that you are servicing.
  2. Click Work with Nodes > Nodes in the My Work pane. The Viewing Nodes panel is displayed.
  3. Find the node that you are about to shut down.

    If the node that you want to power off is shown as Offline, then the node is not participating in the cluster. In these circumstances, you must use the power button on the node to power off the node.

    If the node that you want to power off is shown as Online, powering off the node can cause its dependent VDisks to go offline as well. Verify whether or not the node has any dependent VDisks.

  4. Select the node and click Show Dependent VDisks from the drop-down menu.
  5. Make sure that the status of each virtual disk in the I/O group is Online. You might need to view more than one page.
    This figure shows an example of the Virtual Disks Status panel

    If any VDisks are shown as degraded, only one node in the I/O group is processing I/O requests for that VDisk. If that node is powered off, it impacts all the hosts that are submitting I/O requests to the degraded VDisks.

    If any virtual disks are degraded and you believe this might be because the partner node in the I/O group has been powered off recently, wait until a refresh of the screen shows all the virtual disks online. All the virtual disks should be online within 30 minutes of the partner node being powered off.

    Note: If, after waiting 30 minutes, you have a degraded VDisk and all of the associated nodes and MDisks are online, contact the IBM Support Center for assistance.

    Ensure that all VDisks that are being used by hosts are online before you continue.

  6. If possible, check that all the hosts that access VDisks that are managed by this I/O group are able to fail over to use paths that are provided by the other node in the group.

    Perform this check using the host system's multipathing device driver software. The commands to use differ, depending on the multipathing device driver being used. If you are using the System Storage Multipath Subsystem Device Driver (SDD), the command to query paths is datapath query device. It can take some time for the multipathing device drivers to rediscover paths after a node is powered on. If you are unable to check on the host that all paths to both nodes in the I/O group are available, do not power off a node within 30 minutes of the partner node being powered on or you might lose access to VDisks.

  7. If you have decided it is okay to continue and power off the node, select the node that you want to power off, and then select Shut Down a Node from the drop-down menu.
    Figure 1. Shut Down a Node option
  8. Click OK. If you have selected a node that is the last remaining node that provides access to a VDisk, for example, a node that contains solid-state drives (SSDs) with unmirrored VDisks, the Shutting Down a Node-Force panel is displayed with a list of VDisks that will go offline if this node is shut down.
  9. Check that no host applications are accessing the VDisks that will go offline; only continue with the shut down if the loss of access to these VDisks is acceptable. To continue with shutting down the node, click Force Shutdown.

During the shut down, the node saves its data structures to its local disk and destages all the write data held in cache to the SAN disks; this processing can take several minutes.

At the end of this process, the node powers off.
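
The timing guidance in step 6 can be written as a small decision rule: either the paths to both nodes have been verified on the host, or at least 30 minutes have passed since the partner node was powered on. The following is a minimal sketch of that rule only; the function and parameter names are hypothetical, and the timestamps would have to be recorded by the person doing the service action.

```python
from datetime import datetime, timedelta

# Interval taken from the procedure: multipathing drivers can need time to
# rediscover paths after a node is powered on.
PATH_REDISCOVERY_WAIT = timedelta(minutes=30)


def ok_to_power_off(partner_powered_on_at: datetime,
                    now: datetime,
                    host_paths_verified: bool) -> bool:
    """Apply the step 6 rule for when powering off the node is acceptable.

    host_paths_verified: True if the host multipathing software (for example,
    SDD's `datapath query device`) showed paths to both nodes as available.
    """
    if host_paths_verified:
        return True
    # Paths could not be checked on the host: fall back to the waiting period.
    return now - partner_powered_on_at >= PATH_REDISCOVERY_WAIT
```

This covers only the host path condition. The VDisk status checks in step 5 still apply independently.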

Using the SAN Volume Controller CLI to power off a node

This topic describes how to power off a node using the SAN Volume Controller CLI.

  1. Issue the svcinfo lsnode CLI command to display a list of nodes in the cluster and their properties. Find the node that you are about to shut down and write down the name of the I/O group it belongs to. Confirm that the other node in the I/O group is online.
    svcinfo lsnode -delim : 
    
    id:name:UPS_serial_number:WWNN:status:IO_group_id:IO_group_name:config_node:UPS_unique_id
    1:group1node1:10L3ASH:500507680100002C:online:0:io_grp0:yes:202378101C0D18D8 
    2:group1node2:10L3ANF:5005076801000009:online:0:io_grp0:no:202378101C0D1796 
    3:group2node1:10L3ASH:5005076801000001:online:1:io_grp1:no:202378101C0D18D8 
    4:group2node2:10L3ANF:50050768010000F4:online:1:io_grp1:no:202378101C0D1796

    If the node that you want to power off is shown as Offline, the node is not participating in the cluster and is not processing I/O requests. In these circumstances, you must use the power button on the node to power off the node.

    If the node that you want to power off is shown as Online but the other node in the I/O group is not online, powering off the node impacts all the hosts that are submitting I/O requests to the VDisks that are managed by the I/O group. Ensure that the other node in the I/O group is online before you continue.

  2. Issue the svcinfo lsnodedependentvdisks CLI command to list the VDisks that are dependent on the status of a specified node.
    svcinfo lsnodedependentvdisks group1node1 
    
    vdisk_id       vdisk_name
    0              vdisk0
    1              vdisk1

    If the node goes offline or is removed from the cluster, the dependent VDisks also go offline. Before taking a node offline or removing it from the cluster, you can use the command to ensure that you do not lose access to any VDisks.

  3. If you have decided that it is okay to continue and that you can power off the node, issue the svctask stopcluster -node <name> CLI command to power off the node. Ensure that you use the -node parameter, because you do not want to power off the whole cluster:
    svctask stopcluster -node group1node1
    Are you sure that you want to continue with the shut down? yes

    Note: If there are dependent VDisks and you want to shut down the node anyway, add the -force parameter to the svctask stopcluster command. The force parameter forces continuation of the command even though any node-dependent VDisks will be taken offline. Use the force parameter with caution; access to data on node-dependent VDisks will be lost.

    During the shut down, the node saves its data structures to its local disk and destages all the write data held in the cache to the SAN disks; this process can take several minutes.

    At the end of this process, the node powers off.
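
The checks in steps 1 through 3 can be scripted against captured CLI output. The following is a minimal sketch that assumes the output has the shape shown in the examples above: a colon-delimited header line on a single line for svcinfo lsnode -delim :, and whitespace-separated columns for svcinfo lsnodedependentvdisks. All function names are hypothetical. The sketch only parses text and builds a command string; it does not connect to a cluster or run anything.

```python
from typing import Dict, List, Optional


def parse_lsnode(output: str) -> List[Dict[str, str]]:
    """Parse `svcinfo lsnode -delim :` output into one dict per node.

    Assumes the header is on a single line (not wrapped).
    """
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    header = lines[0].split(":")
    return [dict(zip(header, line.split(":"))) for line in lines[1:]]


def partner_of(nodes: List[Dict[str, str]], name: str) -> Optional[Dict[str, str]]:
    """Return the other node in the same I/O group, or None if there is none.

    Raises StopIteration if `name` is not in the node list.
    """
    target = next(node for node in nodes if node["name"] == name)
    for node in nodes:
        if node["IO_group_id"] == target["IO_group_id"] and node["name"] != name:
            return node
    return None


def parse_dependent_vdisks(output: str) -> List[str]:
    """Return the vdisk_name column of `svcinfo lsnodedependentvdisks` output."""
    lines = [line for line in output.splitlines() if line.strip()]
    return [line.split()[1] for line in lines[1:]]  # skip the header row


def stop_node_command(name: str, force: bool = False) -> str:
    """Build the power-off command; -node is always included so that the
    command never targets the whole cluster."""
    parts = ["svctask", "stopcluster"]
    if force:
        parts.append("-force")  # dependent VDisks will go offline
    parts += ["-node", name]
    return " ".join(parts)
```

With the example output from step 1, partner_of(parse_lsnode(text), "group1node1") returns the entry for group1node2, whose status field can then be compared with "online". Defaulting force to False reflects the caution in the note above: the caller has to opt in explicitly after reviewing the dependent VDisks.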

Using the SAN Volume Controller Power control button

Do not use the power control button to power off a node unless it is an emergency or you have been directed to do so by another procedure.

With this method, you cannot check the cluster status from the front panel, so you cannot tell if the power off is liable to cause excessive disruption to the cluster. Instead, use the SAN Volume Controller Console or the CLI commands, described in the previous topics, to power off an active node.

If you must use this method, notice in Figure 2 that each SAN Volume Controller model type has a power control button, labeled 1 in the figure, on the front.

Figure 2. SAN Volume Controller models 2145-CF8, 2145-8A4, 2145-8G4, and 2145-8F4 or 2145-8F2 power control button and the SAN Volume Controller 2145-4F2 power switch

When you have determined it is safe to do so, press and immediately release the power button. The front panel display changes to display Powering Off, and a progress bar is displayed.

The 2145-CF8 requires that you remove a power button cover before you can press the power button. The 2145-8A4, the 2145-8G4, the 2145-8F4, or 2145-8F2 might require you to use a pointed device to press the power button.

If you press the power button for too long, the node cannot write all the data to its local disk. An extended service procedure is required to restart the node, which involves deleting the node from the cluster and adding it back into the cluster.

This figure shows how Powering Off is displayed on the front panel.

The node saves its data structures to disk while powering off. The power off process can take up to five minutes.

When a node is powered off by using the power button (or because of a power failure), the partner node in its I/O group immediately stops using its cache for new write data and destages any write data already in its cache to the SAN attached disks. The time taken by this destage depends on the speed and utilization of the disk controllers; it should complete in less than 15 minutes, but it could be longer, and it cannot complete if there is data waiting to be written to a disk that is offline.

If a node powers off and restarts while its partner node continues to process I/O, it might not be able to become an active member of the I/O group immediately. It has to wait until the partner node completes its destage of the cache. If the partner node is powered off during this period, access to the SAN storage that is managed by this I/O group is lost. If one of the nodes in the I/O group is unable to service any I/O, for example, because the partner node in the I/O group is still flushing its write cache, the VDisks that are managed by that I/O group will have a status of Degraded.

© Copyright IBM Corporation 2003, 2009. All Rights Reserved.