Moving production to Site B after planned outages (failover)

When you schedule a planned outage at your production site (Site A), you can switch production to your recovery site (Site B), allowing the processing of data to resume at Site B. This process is known as a failover recovery.

The storage units at both Site A and Site B must be functional and accessible.
In a disaster recovery environment, when two storage units are set up in two geographically distinct locations, the storage unit at the production or local site is referred to as Site A and the storage unit at the remote or recovery site as Site B.

For this scenario, assume that all I/O to Site A has ceased because of a planned outage, such as scheduled maintenance. The failover operation is issued to the storage unit that will become the primary, which is the storage unit at Site B. Moving production to Site B during the outage converts the target volumes at Site B to source volumes and places them in a suspended state. Your original source volumes at Site A remain in the state they were in at the time of the site switch. Table 1 provides an example of the implementation of failover and failback operations.

Note: The failover recovery operation does not reverse the direction of a remote mirror and copy pair. It changes a target volume into a suspended source volume, while leaving the source volume in its current state.
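The effect that the note describes can be illustrated with a small model. The following Python sketch is not storage unit or command-line code; the VolumeView class and the failover function are hypothetical names that are introduced only for this illustration, and the states are simplified to "duplex" and "suspended".

```python
from dataclasses import dataclass


@dataclass
class VolumeView:
    """One site's view of its own volume in a remote mirror and copy pair."""
    name: str
    role: str   # "source" or "target"
    state: str  # simplified: "duplex" or "suspended"


def failover(local: VolumeView, remote: VolumeView) -> None:
    """Model a failover that is issued to the storage unit that owns `local`.

    Only the local view changes: a target becomes a suspended source.
    The remote view is deliberately left untouched, because the pair
    direction is not reversed by this operation.
    """
    if local.role != "target":
        # Simplification of this model: it only accepts a target volume.
        raise ValueError(f"{local.name} is not a target volume")
    local.role = "source"
    local.state = "suspended"


# Before the site switch: A1 is the source, B1 is the target.
a1 = VolumeView("A1", role="source", state="duplex")
b1 = VolumeView("B1", role="target", state="duplex")

failover(local=b1, remote=a1)
print(a1)
print(b1)
```

In this model, both volumes report the source role after the failover. That is the situation that a later failback recovery operation resolves.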
The following assumptions are made about the environment before the planned outage begins:
  • Applications continue to update the source volumes that are located at Site A.
  • Paths are established from Site A to Site B.
  • Volume pairs are in duplex state.

The following steps summarize the actions that you must take to move production to Site B after you initiate a planned outage at Site A.

  1. Quiesce applications to stop all write I/O to the source volumes when the planned outage window is reached. Quiescing your applications might occur as part of a planned outage, but the delay in processing caused by the quiesce action should be brief.
    Note: On some host systems, such as AIX®, Windows®, and Linux®, before you perform FlashCopy operations, you must quiesce the applications that access FlashCopy source volumes. The source volumes must be unmounted (depending on the host operating system) during FlashCopy operations. This ensures that there is no data in the buffers that might be flushed to the target volumes and potentially corrupt them.
  2. Perform a failover recovery operation to Site B. After the failover operation has processed successfully, the volumes at Site B transition from target to source volumes.
  3. Create paths in the opposite direction, from Site B to Site A. When you create them depends on your path design and on when the source storage unit becomes available. You need these paths to transfer the updates back to Site A.
  4. Rescan your fibre-channel devices (the procedure depends on your operating system). The rescan removes device objects for the Site A volumes and recognizes the new source volumes.
  5. Mount your target volumes (now the new source volumes) on the target storage unit at Site B.
  6. Start all applications. After the applications start, all write I/O operations to the source volumes are tracked. Depending on your plans for Site A, the volume pairs can remain suspended, for example while you perform offline maintenance.
  7. Initiate a failback recovery operation when your scheduled maintenance is complete. The failback recovery operation initiates the transfer of data back to Site A. This process resynchronizes the volumes at Site A with the volumes at Site B.
    Note: Failback recovery operations are usually used after a failover recovery has been issued to restart mirroring either in the reverse direction (remote site to local site) or original direction (local site to remote site).
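The order of steps 1 through 6 matters, because each step relies on the one before it (for example, the volumes at Site B cannot be mounted as source volumes until the failover operation has processed successfully). The following Python sketch shows one way to enforce that order in a runbook script. It is a minimal illustration under stated assumptions: run_site_switch and placeholder are hypothetical names, and each placeholder stands in for whatever tooling you use to perform and verify the step. Step 7 is left out because it is initiated later, when the scheduled maintenance is complete.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], bool]]


def run_site_switch(steps: List[Step]) -> List[str]:
    """Run the steps in order and stop at the first one that fails.

    Returns the labels of the steps that completed.
    """
    completed: List[str] = []
    for label, action in steps:
        if not action():
            print(f"Stopped: '{label}' did not complete")
            break
        completed.append(label)
    return completed


def placeholder() -> bool:
    # Hypothetical stand-in: a real step would call your own tooling
    # and report whether it succeeded.
    return True


steps: List[Step] = [
    ("quiesce applications", placeholder),
    ("fail over to Site B", placeholder),
    ("create paths from Site B to Site A", placeholder),
    ("rescan fibre-channel devices", placeholder),
    ("mount volumes at Site B", placeholder),
    ("start applications", placeholder),
]

completed = run_site_switch(steps)
print(completed)
```

Stopping at the first failed step keeps later steps, such as mounting volumes or starting applications, from running against volumes that are not yet source volumes.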

    Table 1 provides an example of the implementation of failover and failback operations:

Table 1. Failover and failback implementation
| Step | Operation | MC connectivity required to | Format of source volume and target volume | Format of source and target volume pair | Result: Site A | Result: Site B |
|---|---|---|---|---|---|---|
| 1. Disaster at Site A | Failover | Site B | Volume B, Volume A | Volume B1 : Volume A1 | Volume A1 -> Volume B1 (Suspended). The volume pair might display as full or pending duplex state if host write operations have stopped. | Volume B1 -> Volume A1 (Suspended) |
| 2 (Site A volumes must be in a suspended state). Return production to Site A | Failback | Site A | Volume A, Volume B | Volume A1 : Volume B1 | Volume A1 -> Volume B1 | Volume A1 -> Volume B1 |
| 3a (Site B volumes must be in a suspended state). Return to production (Site B). Note: If Site A is still not operational, production can continue at Site B. | Failback | Site B | Volume B, Volume A | Volume B1 : Volume A1 | Volume B1 -> Volume A1 | Volume B1 -> Volume A1 |
| 3b. Prepare to return to production (Site A) from production (Site B) | Failover | Site A | Volume A, Volume B | Volume A1 : Volume B1 | Volume A1 -> Volume B1 | Volume B1 -> Volume A1 (Suspended state; the volume pair might display full or pending state if host write operations have stopped.) |
| 3c (Site A volumes must be in a suspended state). Return to production (Site A) | Failback | Site A | Volume A, Volume B | Volume A1 : Volume B1 | Volume A1 -> Volume B1 | Volume A1 -> Volume B1 |
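The sequence of steps 1, 3a, 3b, and 3c in Table 1 can be traced with a small model of how each site reports the pair. The following Python sketch is an illustration only: SiteView and MirrorPair are hypothetical names, the states are simplified to "duplex" and "suspended", and the model assumes that a failback is issued at the site whose volumes are suspended source volumes. Step 2 is not traced because it would require an additional suspend action at Site A that this model does not include.

```python
from dataclasses import dataclass


@dataclass
class SiteView:
    """How one site reports the pair: direction plus a simplified state."""
    source: str
    target: str
    state: str  # simplified to "duplex" or "suspended" in this model

    def __str__(self) -> str:
        return f"{self.source} -> {self.target} ({self.state})"


class MirrorPair:
    """Hypothetical model of the volume pair A1/B1 as seen from each site."""

    VOLUMES = {"A": "A1", "B": "B1"}

    def __init__(self) -> None:
        # Starting point: A1 mirrors to B1 and both sites report the same.
        self.views = {site: SiteView("A1", "B1", "duplex") for site in self.VOLUMES}

    def _local_and_remote(self, site: str):
        other = "B" if site == "A" else "A"
        return self.VOLUMES[site], self.VOLUMES[other]

    def failover(self, site: str) -> None:
        """Only the site that receives the failover changes its view."""
        local, remote = self._local_and_remote(site)
        self.views[site] = SiteView(local, remote, "suspended")

    def failback(self, site: str) -> None:
        """Resynchronize from `site`; afterward both sites report that direction."""
        local, remote = self._local_and_remote(site)
        view = self.views[site]
        if view.source != local or view.state != "suspended":
            raise ValueError(f"Site {site} volumes must be suspended source volumes")
        for name in self.views:
            self.views[name] = SiteView(local, remote, "duplex")


pair = MirrorPair()
history = []
for operation, site in [("failover", "B"), ("failback", "B"),
                        ("failover", "A"), ("failback", "A")]:
    getattr(pair, operation)(site)
    history.append((operation, site, str(pair.views["A"]), str(pair.views["B"])))
    print(f"{operation} at Site {site}: Site A sees {pair.views['A']}; "
          f"Site B sees {pair.views['B']}")
```

After each failover in this model, the site that did not receive the operation still reports its previous direction in a duplex state. This corresponds to the table remark that the pair might display as full or pending duplex when host write operations have stopped.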
© Copyright IBM Corporation 2004, 2007. All Rights Reserved.