Zonal VM, Primary Region, VM Zone (R1Z1) Failure
Figure: Zonal VM, Primary Region, VM Zone (R1Z1) Failure.
If a failure impacts all VMs of a Protected Domain, fail over the entire Domain to the secondary region. If the failure impacts only some of the VMs, separate the failed VMs from the unaffected ones before failing over. Perform the following steps.
- Detect the VM failure (detection is outside the scope of AROVA) and identify the affected VM(s).
  - This example illustrates a VM failure in R1Z1.
  - The affected VM will be failed over to the secondary region.
  - AROVA remains in the primary region.
- From the AROVA UI, create a new Protected Domain.
- Separate the failed VMs using one of the following options:
  - Move all VMs to be failed over to the newly created Domain, or
  - Move the VMs that are NOT to be failed over to the newly created Domain.
  - No data copying is required.
  - If several VMs are impacted by the same zone failure, they can be combined under a single Domain for failover.
  - It is recommended to name the new Domain as a derivative of the original Domain name and to merge the two Domains when possible.
- From the AROVA UI, initiate failover of the Domain that contains the VMs to be failed over.
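
The decision described in the steps above can be sketched in Python for planning purposes. This is only an illustration and not part of AROVA: the `Vm` class, the `plan_failover` function, the `-failover` name suffix, and the rule of choosing the option that moves fewer VMs are all assumptions made for this example. The actual operations are performed in the AROVA UI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vm:
    """Hypothetical record of a protected VM and the zone it runs in."""
    name: str
    zone: str


def plan_failover(domain_name, vms, failed_zone):
    """Return a plan describing which Domain to fail over and which VMs to move.

    Assumption: a VM counts as failed when it runs in the failed zone.
    """
    failed = [vm.name for vm in vms if vm.zone == failed_zone]
    healthy = [vm.name for vm in vms if vm.zone != failed_zone]

    if not failed:
        # Nothing in this Domain is affected by the zone failure.
        return {"action": "none", "new_domain": None,
                "move": [], "failover_domain": None}

    if not healthy:
        # Every VM is impacted: fail over the entire Domain, no split needed.
        return {"action": "failover_entire_domain", "new_domain": None,
                "move": [], "failover_domain": domain_name}

    # Partial failure: a new Domain is needed. Its name is derived from the
    # original name (the suffix is an arbitrary choice for this sketch).
    new_domain = f"{domain_name}-failover"

    if len(failed) <= len(healthy):
        # Option 1: move the failed VMs out and fail over the new Domain.
        return {"action": "split", "new_domain": new_domain,
                "move": failed, "failover_domain": new_domain}

    # Option 2: move the healthy VMs out and fail over the original Domain.
    return {"action": "split", "new_domain": new_domain,
            "move": healthy, "failover_domain": domain_name}


if __name__ == "__main__":
    vms = [Vm("app-1", "R1Z1"), Vm("app-2", "R1Z2"), Vm("app-3", "R1Z2")]
    print(plan_failover("prod", vms, failed_zone="R1Z1"))
```

Either option leads to the same result, because moving VMs between Domains requires no data copying. Preferring the smaller move is only a convenience heuristic for this sketch.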
The following screenshots illustrate the above steps:
- Identify the impacted VMs with replication issues.
Figure: Identify impacted VMs.
- Create a new Protected Domain to fail over the VMs with replication issues.
Figure: Create a new Protected Domain.
- Select and move the VMs (with replication issues) to the new Domain.
Figure: Move selected VMs.
Figure: Move the VMs to the new Domain.
Figure: The VMs have been successfully moved.
- Fail over the new Domain containing the moved VMs.
Figure: Failover the Domain.
Figure: Start Failover.
If any of the Protected VM instances are still present in the GCP inventory, AROVA will display a warning message. To continue failover, the VMs must first be deleted from GCP.
Note: If it is not possible to delete the VMs, an option is available to ignore the warning and continue with failover anyway. Proceeding with failover without first deleting the VMs from the primary region can result in the same VM running at both sites once the primary region or zone is restored.
Figure: Warning message to remove VMs from the GCP inventory.
Figure: Failover of the new Domain has been started.
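
The inventory warning described above can be anticipated with a simple pre-check. The following Python sketch is an illustration only and does not call AROVA or GCP: the function names and the `ignore_warning` flag are hypothetical, and the list of instance names in the GCP inventory is assumed to have been collected separately (for example, from the GCP console or from the output of `gcloud compute instances list`).

```python
def vms_blocking_failover(protected_vms, gcp_inventory):
    """Return the protected VM names that still exist in the GCP inventory."""
    inventory = set(gcp_inventory)
    return sorted(name for name in protected_vms if name in inventory)


def can_start_failover(protected_vms, gcp_inventory, ignore_warning=False):
    """Return (allowed, remaining) for a planned failover.

    allowed is True when no protected VM remains in the inventory, or when
    the operator explicitly chooses to ignore the warning.
    """
    remaining = vms_blocking_failover(protected_vms, gcp_inventory)
    allowed = not remaining or ignore_warning
    return allowed, remaining


if __name__ == "__main__":
    protected = ["app-1", "app-2"]
    inventory = ["app-1", "db-7"]  # hypothetical inventory listing
    allowed, remaining = can_start_failover(protected, inventory)
    print("failover allowed:", allowed)
    print("delete these VMs first:", remaining)
```

Passing `ignore_warning=True` mirrors the option to continue anyway, with the same risk: the listed VMs may come back up in the primary region alongside their failed-over copies.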
Also see: