Overview
This is an overview of key elements of the JetStream DR architecture:
- IO Filter: VMware vSphere IO Filters are deployed to each vSphere host to continuously intercept data written to each protected virtual machine’s virtual disks. The IO Filters are deployed by vSphere and run in the storage data path between the VMs and their virtual disks. The IO Filters also monitor relevant events, such as vMotion, Storage vMotion, and snapshots.
- JetStream Management Server Appliance (MSA): The JetStream MSA is a virtual appliance that runs as a plug-in to vCenter. It collects and maintains statistics relevant to the protection of the VMs in the cluster(s) managed through the vCenter Server. It also provides administrative functions, such as selecting VMs for protection. The MSA can be accessed through the vCenter Web Client GUI, and its functions are also available via CLI or RESTful APIs.
- JetStream DR Virtual Appliance (DRVA): While the IO Filters capture data for replication, they do not communicate directly with the object store. The JetStream DRVA is a virtual appliance that maintains the replication log store and manages the transfer of the VMs and their data to the object store. The DRVA also manages functions such as in-line compression and garbage collection. There must be at least one DRVA per protected cluster, and there can be up to one DRVA per host.
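The DRVA placement rule above (at least one per protected cluster, at most one per host) can be expressed as a simple check. This is a minimal sketch for planning purposes only; the function name `validate_drva_count` and its parameters are hypothetical and are not part of any JetStream DR interface.

```python
def validate_drva_count(host_count: int, drva_count: int) -> bool:
    """Check a planned DRVA count for one protected cluster.

    Hypothetical helper: encodes only the two limits stated in the text,
    namely at least one DRVA per cluster and at most one DRVA per host.
    """
    if host_count < 1:
        raise ValueError("a cluster needs at least one host")
    return 1 <= drva_count <= host_count


if __name__ == "__main__":
    # A four-host cluster may run between one and four DRVAs.
    for planned in (0, 1, 4, 5):
        print(planned, validate_drva_count(host_count=4, drva_count=planned))
```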
Key Concepts
Concept | Description |
--- | --- |
Protected Site | The environment where the protected VMs normally run. If considered in terms of data flow, the protected site can be thought of as the source. |
Storage Site / Object Store | The environment where the object store maintains the continuously updated objects containing the VMs and their data. The Storage Site may be located with the Recovery Site, or it may be in a different location. |
Recovery Site | The environment where the VMs will run when rehydrated from the object store. |
Background Replication | The process of reading existing data from the virtual disk at the protected site and copying it to the object store. Background replication is especially important when protection is first initiated, because the data already on disk must be copied before the object store holds a complete copy of the VM. |
Foreground Replication | The process of continuously identifying newly generated data at the protected site and copying it to the appropriate object store destination. |
Write Throttling (Backpressure) | Controlled slowing of an application’s write operations at the protected site, typically when network bandwidth is insufficient for the amount of foreground replication induced by the application’s current write activity. |
Replication Rate (Backpressure) | Controlled slowing of background replication to avoid negatively impacting application performance. |
Garbage Collection | The process in which invalidated data that is no longer needed at the protected site is removed from the object store. This minimizes unnecessary consumption of storage space at the object storage site. |
Protected Domain | A group of VMs the administrator has determined should be protected and restored together. All VMs in a protected domain are replicated to the same bucket/container in the Storage Site. |
Runbook | A set of instructions that are followed as part of failover or failback to specify VM startup sequence and configuration parameters. |
Recovery from Object Cloud Virtual Appliance (RocVA) | A virtual appliance in the data center at the Recovery Site. During a failover process, the RocVA runs temporarily to facilitate the rehydration of the VMs and their data from the object store. |
Representation VM (RVM) | During failover, an RVM is created for each VM being rehydrated. RVMs are temporary. They are created during the recovery process (failover, failback, restore) and are automatically deleted when no longer needed. |
Replication Log | The replication log is the record of data being replicated to the object store. Each protected domain uses a single replication log store for all the VMs belonging to the protected domain. |
Replication Log Store | The shared non-volatile memory resource in the protected site that is dedicated to the JetStream DR software. Exposed to the DRVA(s) as an iSCSI LUN, the replication log store holds the replication logs, garbage collection metadata, and other metadata created by the JetStream DR software. |
IO Filter | A “filter” that runs in ESXi and provides direct access to the IO path between a VM and its corresponding virtual disk(s). The IO Filters also monitor relevant events, such as vMotion, Storage vMotion, and snapshots. In JetStream DR, the IO Filters manage the flow of data between protected VMs, primary storage, and the replication log store. |
Protection Modes | Two methods are available to write protected data to primary storage and the replication log store. (1) “Write-through” method: The IO Filter acknowledges completion of a write operation back to a protected VM only after it has received acknowledgement of the write from both the replication log store and the primary storage. (2) “Write-back” method: The IO Filter acknowledges completion of the write operation to the protected VM upon receiving acknowledgement from the replication log store only; the write to primary storage is asynchronous. |
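The difference between the two protection modes comes down to when the IO Filter acknowledges a write to the VM. The following is a minimal in-memory sketch of that ordering, not JetStream DR code: the classes `Store` and `IOFilterModel`, the `pending` queue, and the `flush` method are all hypothetical names introduced for illustration.

```python
from enum import Enum


class Mode(Enum):
    WRITE_THROUGH = "write-through"
    WRITE_BACK = "write-back"


class Store:
    """Hypothetical stand-in for a storage target that records writes."""

    def __init__(self, name: str):
        self.name = name
        self.blocks = []

    def write(self, data: bytes) -> bool:
        self.blocks.append(data)
        return True  # acknowledgement from this store


class IOFilterModel:
    """Toy model of the acknowledgement rules described in the table."""

    def __init__(self, mode: Mode, log_store: Store, primary: Store):
        self.mode = mode
        self.log_store = log_store
        self.primary = primary
        self.pending = []  # acknowledged to the VM, not yet on primary storage

    def write(self, data: bytes) -> bool:
        log_ack = self.log_store.write(data)
        if self.mode is Mode.WRITE_THROUGH:
            # Acknowledge only after BOTH stores have acknowledged.
            primary_ack = self.primary.write(data)
            return log_ack and primary_ack
        # Write-back: acknowledge on the log store alone; primary follows later.
        self.pending.append(data)
        return log_ack

    def flush(self) -> None:
        """Apply the deferred (asynchronous) writes to primary storage."""
        while self.pending:
            self.primary.write(self.pending.pop(0))


if __name__ == "__main__":
    wb = IOFilterModel(Mode.WRITE_BACK, Store("log"), Store("primary"))
    wb.write(b"block-0")
    print(len(wb.primary.blocks))  # primary has not been written yet
    wb.flush()
    print(len(wb.primary.blocks))
```

In this model, a write-back acknowledgement depends only on the replication log store, which is why the write to primary storage can complete later.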
Architecture
The diagram below illustrates the relationships among VMs, protected domains, and DRVAs. As noted previously, each protected VM has an IO Filter attached to it, to capture data as it is written to primary storage. Groups of VMs can be protected together in a single protected domain.
Multiple protected domains, each with its own replication log store (for replication logs and garbage collection metadata), can be maintained under a single DRVA. Each protected domain replicates all its data to a discrete bucket in the object store.
In configuring VM protection, the administrator may want to consider when it is helpful to protect multiple VMs in the same protected domain, and when it is helpful to establish multiple separate protected domains. For example:
- The protected domain is the smallest unit of failover. If one VM within a protected domain needs to fail over, all the VMs in the domain must fail over.
- Certain clustered applications and databases are best protected together. If all nodes in an application cluster fail over, they can be recovered with consistency if they are all in the same protected domain.
- Some applications with large amounts of data may be best protected in a single-VM domain.
- Each protected domain has its own replication log store, and multiple replication log stores can share the same LUN. If the replication log store requirements of protected domains change, JetStream DR can automatically allocate space appropriately for each domain.
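The grouping considerations above can be made concrete with a small model in which each protected domain maps a set of VMs to one bucket, and a failover request for any VM expands to its whole domain. This is an illustrative sketch under those assumptions; `ProtectedDomain`, `failover_set`, and the VM and bucket names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ProtectedDomain:
    """Hypothetical model: a named group of VMs replicated to one bucket."""
    name: str
    bucket: str
    vms: Set[str] = field(default_factory=set)


def failover_set(domains: List[ProtectedDomain], vm_name: str) -> Set[str]:
    """Return every VM that must fail over when `vm_name` fails over.

    Because failover applies to the whole protected domain, the result is
    the full membership of the domain that contains the VM.
    """
    for domain in domains:
        if vm_name in domain.vms:
            return set(domain.vms)
    raise KeyError(f"{vm_name} is not in any protected domain")


if __name__ == "__main__":
    domains = [
        # Clustered database nodes are kept together for consistent recovery.
        ProtectedDomain("db-cluster", "bucket-db", {"db1", "db2", "db3"}),
        # A data-heavy application gets its own single-VM domain.
        ProtectedDomain("big-app", "bucket-app", {"app1"}),
    ]
    print(sorted(failover_set(domains, "db2")))
    print(sorted(failover_set(domains, "app1")))
```

Keeping a large VM in its own domain means that a failover of that VM does not pull unrelated VMs along with it.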
Security
JetStream DR utilizes the following methods to enhance the security of communications between its components and the storage site:
- All communications between the management server and appliances (DRVA, RocVA) are conducted through HTTPS-based REST API calls to protect the privacy and integrity of data.
- The management server and appliances authenticate using trusted certificates to safeguard against man-in-the-middle attacks.
- SSH keys are protected using a passphrase to prevent unauthorized users from gaining access to the keys.
- Appliances communicate with storage via HTTPS to protect all data transferred over the public network.
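As a generic illustration of what certificate-verified HTTPS means for a client, the sketch below builds a TLS context with Python's standard `ssl` module. It does not show how JetStream DR components are implemented or configured; the helper names `make_verified_context` and `fetch` and the optional `ca_file` parameter are assumptions made for the example.

```python
import ssl
import urllib.request
from typing import Optional


def make_verified_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a client TLS context that verifies the server certificate.

    ssl.create_default_context() enables certificate verification and
    hostname checking by default; `ca_file` optionally points at a CA
    bundle to trust instead of the system defaults.
    """
    return ssl.create_default_context(cafile=ca_file)


def fetch(url: str, ca_file: Optional[str] = None) -> bytes:
    """Fetch an HTTPS URL, failing if the certificate cannot be verified."""
    context = make_verified_context(ca_file)
    with urllib.request.urlopen(url, context=context, timeout=10) as response:
        return response.read()


if __name__ == "__main__":
    ctx = make_verified_context()
    print(ctx.verify_mode == ssl.CERT_REQUIRED, ctx.check_hostname)
```

A client configured this way rejects a connection when the server presents a certificate it does not trust, which is the property that guards against man-in-the-middle attacks.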