You can use the command-line interface (CLI)
to re-add a failed node to a cluster after it has been repaired.
Special procedures when adding a node to a cluster
Applications
on the host systems direct I/O operations to file systems or logical
volumes that are mapped by the operating system to virtual paths (vpaths),
which are pseudo disk objects supported by the Subsystem Device Driver
(SDD). SDD maintains an association between a vpath and a SAN Volume Controller virtual
disk (VDisk). This association uses a unique identifier (UID) that is specific
to the VDisk and is never reused. The UID allows SDD to directly associate
vpaths with VDisks.
SDD operates within a protocol stack that
contains disk and fibre-channel device drivers, which allow it to communicate
with the SAN Volume Controller using
the SCSI protocol over Fibre Channel as defined by the ANSI FCS standard.
The addressing scheme provided by these SCSI and fibre-channel device
drivers uses a combination of a SCSI logical unit number (LUN) and
the worldwide node name (WWNN) for the fibre-channel node
and ports.
If an error occurs, error recovery procedures
(ERPs) operate at various tiers in the protocol stack. Some of these
ERPs cause I/O to be redriven using the same WWNN and LUN numbers
that were previously used.
SDD does not check the association
of the VDisk with the vpath on every I/O operation that it performs.
Before
you add a node to the cluster, you must check whether any of the
following conditions are true:
- The cluster has more than one I/O group.
- The node being added to the cluster uses physical node hardware
or a slot that has previously been used for a node in the cluster.
- The node being added to the cluster uses physical node hardware
or a slot that has previously been used for a node in another cluster,
and both clusters have visibility to the same hosts and back-end storage.
If any of the previous conditions are true, the following
special procedures apply:
- The node must be added to the same I/O group that it was previously
in. You can use the command-line interface (CLI) command svcinfo
lsnode or the SAN Volume Controller Console to
determine the WWNN of the cluster nodes.
- Before you add the node back into the cluster, you must shut down
all of the hosts that use the cluster. The node must then be added before
the hosts are rebooted. If the I/O group information is unavailable
or it is inconvenient to shut down and reboot all of the hosts that use
the cluster, then do the following:
- On all of the hosts that are connected to the cluster, unconfigure the
fibre-channel adapter device driver, the disk device driver, and the multipathing
driver before you add the node to the cluster.
- Add the node to the cluster, and then reconfigure the fibre-channel
adapter device driver, the disk device driver, and the multipathing driver.
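If the cluster is still accessible, you can determine which I/O group the failed node previously belonged to by parsing the colon-delimited svcinfo lsnode output. The following is a minimal sketch that uses awk on sample output (the node names, status values, and field order are taken from the lsnode example later in this topic); in practice, pipe the live command output instead of the sample string:

```shell
# Find the I/O group of an offline node from colon-delimited
# "svcinfo lsnode -delim :" output. The sample data below is
# illustrative; replace it with live command output in practice.
lsnode_output='1:node1:10L3ASH:0000000000000000:offline:0:io_grp0:no
2:node2:10L3ASH:50050768010050B0:online:0:io_grp0:yes'

# Fields: id:name:UPS_serial_number:WWNN:status:IO_group_id:IO_group_name:config_node
echo "$lsnode_output" |
  awk -F: '$5 == "offline" { print $2 " was in I/O group " $7 }'
```

This prints the node name and I/O group name for any node whose status field is offline, which is the I/O group you must re-add the node to.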
Scenarios where the special procedures can apply
The
following two scenarios describe situations where the special procedures
can apply:
- Four nodes of an eight-node cluster have been lost because of the
failure of a pair of 2145 UPS units or
four 2145 UPS-1U units.
In this case, the four nodes must be added back into the cluster using
the CLI command svctask addnode or the SAN Volume Controller Console.
- A user decides to delete four nodes from the cluster and add them
back into the cluster using the CLI command svctask addnode or
the SAN Volume Controller Console.
For 5.1.0 nodes,
the SAN Volume Controller automatically
re-adds failed nodes to the cluster. If the cluster
reports an error for a missing node (error code 1195) and that node
has been repaired and restarted, the cluster automatically re-adds
the node. Because this process can take up to 20 minutes,
you can instead re-add the node manually by completing the following steps:
- Issue the svcinfo lsnode CLI command
to list the nodes that are currently part of the cluster and to determine
the I/O group to which to add the node.
The following
is an example of the output that is displayed:
svcinfo lsnode -delim :
id:name:UPS_serial_number:WWNN:status:IO_group_id:IO_group_name
:config_node:UPS_unique_id:hardware:iscsi_name:iscsi_alias
1:node1:10L3ASH:0000000000000000:offline:0:io_grp0:no:1000000000003206:
8A4:iqn.1986-03.com.ibm:2145.ndihill.node1:
2:node2:10L3ASH:50050768010050B0:online:0:io_grp0:yes:10000000000050B0:
8A4:iqn.1986-03.com.ibm:2145.ndihill.node2:
- Issue the svcinfo lsnodecandidate CLI
command to list the nodes that are not assigned to a cluster and to verify
that a second node can be added to the I/O group.
The following
is an example of the output that is displayed:
svcinfo lsnodecandidate -delim :
id:panel_name:UPS_serial_number:UPS_unique_id:hardware
5005076801000001:000341:10L3ASH:202378101C0D18D8:8A4
5005076801000009:000237:10L3ANF:202378101C0D1796:8A4
50050768010000F4:001245:10L3ANF:202378101C0D1796:8A4
....
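The panel name that you pass to svctask addnode in the next step can be read from this output. The following sketch looks up the panel name for a given WWNN in sample colon-delimited output (the values are taken from the lsnodecandidate example above); in practice, pipe the live command output and substitute your node's WWNN:

```shell
# Look up the panel_name for a candidate node by its WWNN in
# colon-delimited "svcinfo lsnodecandidate -delim :" output.
# Fields: id(WWNN):panel_name:UPS_serial_number:UPS_unique_id:hardware
candidates='5005076801000001:000341:10L3ASH:202378101C0D18D8:8A4
5005076801000009:000237:10L3ANF:202378101C0D1796:8A4'

wwnn='5005076801000009'   # the repaired node's WWNN (example value)
echo "$candidates" | awk -F: -v w="$wwnn" '$1 == w { print $2 }'
```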
- Issue the svctask addnode CLI command to add a node
to the cluster.
Important: Each node in an
I/O group must be attached to a different uninterruptible power supply.
The
following is an example of the CLI command that you can issue to add a
node to the cluster using the panel name parameter:
svctask addnode -panelname 000237 -iogrp io_grp0
Where 000237 is the panel
name of the node and io_grp0 is the name of the I/O group to which
you are adding the node.
The following is an example of the
CLI command that you can issue to add a node to the cluster using the WWNN
parameter:
svctask addnode -wwnodename 5005076801000001 -iogrp io_grp1
Where 5005076801000001 is
the WWNN of the node and io_grp1 is the name of the I/O group to which
you are adding the node.
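The two steps above can be scripted together: take the panel name found from lsnodecandidate and the I/O group found from lsnode, then construct the addnode command. A minimal sketch using the sample values from this topic; the command is only echoed here for review, not executed against a cluster:

```shell
# Build (but do not run) the addnode command for a candidate node.
panel='000237'      # from svcinfo lsnodecandidate (example value)
iogrp='io_grp0'     # I/O group the failed node previously belonged to

cmd="svctask addnode -panelname $panel -iogrp $iogrp"
echo "$cmd"         # review the command before running it on the cluster
```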
- Issue the svcinfo lsnode CLI command
to verify the final configuration.
The following example
shows output that is displayed:
svcinfo lsnode -delim :
id:name:UPS_serial_number:WWNN:status:IO_group_id:IO_group_name:config_node:UPS_unique_id:
hardware:iscsi_name:iscsi_alias
1:node1:10L3ASH:0000000000000000:offline:0:io_grp0:no:1000000000003206:
8A4:iqn.1986-03.com.ibm:2145.ndihill.node1:
Record the
following information for the new node:
- Node name
- Node serial number
- WWNN
- IQNs (if hosts are attached
using iSCSI connections)
- All WWPNs
- I/O group that contains the node
Note: If this command is issued soon after
you have added nodes to the cluster, the status of the nodes might
be shown as adding while the process of adding
the nodes to the cluster is still in progress. You do not have to
wait for the status of all the nodes to be online before you continue
with the configuration process.
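To check a node's status from a script, you can extract the status field from the colon-delimited lsnode output. A minimal sketch using the sample data from this topic (node2 is shown as online there); in practice, pipe the live svcinfo lsnode -delim : output and check for a status of online or adding:

```shell
# Print the status field for a named node from colon-delimited
# "svcinfo lsnode -delim :" output. Sample data from this topic;
# replace it with live command output in practice.
lsnode='1:node1:10L3ASH:0000000000000000:offline:0:io_grp0:no
2:node2:10L3ASH:50050768010050B0:online:0:io_grp0:yes'

node='node2'   # node to check (example value)
status=$(echo "$lsnode" | awk -F: -v n="$node" '$2 == n { print $5 }')
echo "$node is $status"
```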
The nodes have been added to the cluster.