Evaluating JetStream DR Software
Overview
- Set up a VMware cluster to be “protected” for the evaluation. Select applications to run on protected VMs during “normal” operation (no failure events) and for business continuity testing (during failover and failback). For testing, use a standard test suite (e.g., benchmark tests such as TPC-C or HammerDB) or a custom test suite of your own choosing. Start the VMs/applications before installing JetStream DR software.
- Subscribe to JetStream DR for AVS from the Azure Marketplace.
- Review product documentation to become familiar with installing, configuring, and operating JetStream DR software.
- Define JetStream DR components and required resources (e.g., number of protected domains and their content, number of DR Virtual Appliances and their resources, required replication log device capacity and performance, and required storage site capacity and replication bandwidth).
- Smaller tests involving a low number of VMs (10-15 VMs) can be performed manually using the JetStream UI. Larger tests can be automated using Capacity Planning Tool (CPT) scripts.
- Allocate the Azure Blob storage account under the Azure subscription.
- Install and configure JetStream DR software.
- Confirm that key VMware capabilities (e.g., snapshots and vMotion) are fully supported and that data consistency is as expected.
Installation and Configuration
- If more than 10 VMs will be tested, use the CPT script to determine the resources required for VM protection, both on-premises and in Azure. Using statistics captured from vCenter Server, the tool provides recommendations regarding required DR resources and JetStream components, including: number of protected domains and their content, number of DR Virtual Appliances and their resources, required replication log device capacity and performance, and required Blob store capacity and replication bandwidth.
- Connect the protected environment to the Azure Blob storage account.
- Note the replication bandwidth recommendation and compare it to the actual bandwidth. Actual bandwidth can be measured using the Bandwidth Tester, another tool available from the Automation Toolkit. Bandwidth Tester also checks that the Azure Blob storage account is available. Ensure the Azure Blob storage account has sufficient capacity and is provisioned at the appropriate performance level.
- Install JetStream DR software in the test cluster. This includes installation of the Management Server Appliance (MSA), the IO Filters on all hosts of the cluster, and at least one DR Virtual Appliance (DRVA) or as many DRVAs as recommended.
- Optional alerts can be configured to send e-mail notifications.
- Define protected domains and their corresponding DR resources (both on-premises and in Azure). For best results, consult the output of the CPT script to ensure that sufficient high-performance storage resources (e.g., NVMe) are available for the replication log store.
- If not already running in AVS, create an account for a “pilot light” vSphere cluster (minimum 3-node cluster) in AVS. Check with Microsoft whether this could be provisioned as a temporary “trial” account for testing.
- For continued protection at the AVS recovery site (after failover), JetStream DR must be running in the pilot light cluster. Once again, obtain JetStream DR software from the Azure Marketplace and install it in the AVS cluster in the same manner as in the on-premises environment. However, do not install VMs or configure protected domains in the pilot light cluster. Installation and configuration of the pilot light cluster are automated.
- Confirm the network topology. Note that AVS uses NSX-T internally. This may require additional work if the protected environment is not already using NSX.
- Define runbooks for failover, failback, and failover testing.
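The bandwidth comparison in the steps above comes down to simple arithmetic. The following is a minimal Python sketch, not part of the JetStream Automation Toolkit: the function name, the 20% headroom default, and the sample figures are assumptions made for illustration. The recommended value (from the CPT output) and the measured value (from Bandwidth Tester) are entered by hand as plain numbers.

```python
def bandwidth_shortfall_mbps(recommended_mbps: float,
                             measured_mbps: float,
                             headroom: float = 0.2) -> float:
    """Return how many Mbps are missing, or 0.0 if the link is sufficient.

    recommended_mbps: replication bandwidth recommendation (entered manually).
    measured_mbps:    bandwidth actually measured (entered manually).
    headroom:         extra margin on top of the recommendation; the 20%
                      default is an illustrative assumption, not vendor guidance.
    """
    if recommended_mbps <= 0 or measured_mbps < 0 or headroom < 0:
        raise ValueError("bandwidth values must be positive and headroom non-negative")
    required = recommended_mbps * (1.0 + headroom)
    return max(0.0, required - measured_mbps)


if __name__ == "__main__":
    # Hypothetical figures for illustration only.
    shortfall = bandwidth_shortfall_mbps(recommended_mbps=400, measured_mbps=450)
    if shortfall > 0:
        print(f"Measured bandwidth is short by about {shortfall:.0f} Mbps")
    else:
        print("Measured bandwidth covers the recommendation plus headroom")
```

A nonzero result suggests revisiting the network path or the scope of the test before protection is started.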
Testing Under Normal Operation
- Another output of the CPT script is a Start Protection Plan. Also included with the Automation Toolkit are dedicated scripts to execute the plan. They can automate the creation and configuration of DRVAs and protected domains and start protection. Alternatively, these steps can be manually performed operating the JetStream MSA using its UI.
- Start protection to begin replicating data to the Blob store. For enhanced testing, initiate protection while test workloads are running and notice that operation of the VMs is not paused. Note any impact on application performance when protection is initiated.
- If desired, observe the DR statistics through the GUI (incoming/outgoing data rates, RPO, garbage collection, replication logging). Reports can also be generated. It may be helpful to observe statistics as VM protection is initiated, as well as after the VMs’ status has changed to “recoverable.” In the latter case, note the RPO data to confirm it meets your target RPO SLA.
- If desired, verify application performance and sustained protection while performing normal vSphere operations, such as vMotion, snapshots, etc. (Consult the Admin Guide for details about any vSphere operations that may have special requirements or limitations operating with JetStream DR software.)
- For near-zero RTO, continuous rehydration should be started at the recovery site. If the CPT script is used, it generates a plan for the recovery site that can be used to start continuous rehydration automatically. Alternatively, continuous rehydration can be started manually from the recovery site using the MSA UI.
- Using the runbook created for failover testing, perform a non-disruptive failover (rehydration) of the protected VMs from Azure Blob storage into the AVS cluster recovery site. Note that this task does not affect the continued operation and protection of the on-premises VMs. Observe RPO and data consistency of the recovered VMs at the AVS recovery site, as well as the continued operation of JetStream DR in the protected on-premises cluster.
- Create a partial failure or complete failure of the protected cluster in the on-premises environment. Execute the failover runbook to recover the VMs from Azure Blob storage into the AVS cluster recovery site. Observe RPO and data consistency. Notice that the recovered VMs in AVS continue to replicate their data into the appropriate containers in Azure Blob storage. (Different methods can be used to simulate failure of the protected site, such as powering off test VMs, deleting JetStream DR components (MSA, DRVAs), or turning off protected site hosts.)
- Restore the on-premises test environment. Depending upon the steps used to simulate the disaster incident, it may be necessary to restore and/or verify the configuration of the protected cluster. JetStream DR software may also need to be reinstalled.
- The recovery_utility_prepare_failback script provided in the Automation Toolkit can be used to help clean the original protected site of any obsolete VMs, domain information, etc.
- Once the protected site is ready for failback, use the failback runbook to initiate the return of the VMs and their data from the object store back to the original VMware environment (or an alternate VMware environment, if desired). Observe application performance at the recovery site during the background replication process.
- Complete the failback process, then confirm resumed VM protection and data consistency.
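Several of the steps above ask you to note RPO values and confirm they meet your target RPO SLA. One way to keep that check consistent across test runs is sketched below in Python. It is an illustration only: it does not query JetStream DR, the function name and sample values are hypothetical, and the RPO readings (in seconds) are assumed to have been copied by hand from the GUI or from generated reports.

```python
from statistics import mean


def summarize_rpo(samples_s, target_s):
    """Summarize manually recorded RPO readings against a target.

    samples_s: RPO observations in seconds (entered by hand).
    target_s:  target RPO from your SLA, in seconds.
    """
    if not samples_s:
        raise ValueError("at least one RPO sample is required")
    if target_s <= 0:
        raise ValueError("target RPO must be positive")
    worst = max(samples_s)
    within = sum(1 for s in samples_s if s <= target_s)
    return {
        "worst_s": worst,
        "mean_s": mean(samples_s),
        "fraction_within_target": within / len(samples_s),
        "meets_target": worst <= target_s,  # strict: every sample must comply
    }


if __name__ == "__main__":
    # Hypothetical readings noted during a test run.
    observed = [12, 30, 45, 20, 75]
    print(summarize_rpo(observed, target_s=60))
```

Treating the worst observed value as the pass/fail criterion is a deliberately strict choice; an evaluation that tolerates brief excursions could judge by the fraction of samples within target instead.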