JetStream Software Portal

Triggering JetStream DR Alarm Conditions

The following guidelines describe conditions and methods for triggering various JetStream DR alarms, which may be helpful for troubleshooting and testing.

DRVA Restarted

  • Reboot the DRVA VM or restart the DRVA service.
  • An alarm will be triggered and can be viewed from DRVA VM > Monitor > Events.

DRVA High CPU Usage Duration Exceeded

  • In preparation, configure the DRVA with a minimum of 4 CPUs and 8 GB of memory.
  • Open the DRVA console and enable SSHD service.
    • Connect via SSH to the DRVA using PuTTY.
  • Use the top command to monitor CPU usage by system processes.
  • Create multiple (duplicate) SSH sessions running the following command in each:
    • cat /dev/zero > /dev/null

This command reads an endless stream of zero bytes from /dev/zero and discards them in /dev/null. Because the data is generated and thrown away in a tight loop, each running copy keeps the CPU busy until it is stopped.
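A single cat process is single-threaded, so several copies have to run at once to load all CPUs. The following is a minimal sketch, assuming a POSIX shell on the DRVA, that starts the copies as background jobs from one session instead of opening a separate session for each. The function names start_load and stop_load are hypothetical helpers, not part of JetStream DR.

```shell
#!/bin/sh
# Sketch only: start_load and stop_load are illustrative helper names.

LOAD_PIDS=""

start_load() {
    # $1 = number of background workers to start
    started=0
    while [ "$started" -lt "$1" ]; do
        cat /dev/zero > /dev/null &
        LOAD_PIDS="$LOAD_PIDS $!"
        started=$((started + 1))
    done
}

stop_load() {
    # Terminate every worker started by start_load and reap it.
    for pid in $LOAD_PIDS; do
        kill "$pid" 2>/dev/null
    done
    wait 2>/dev/null
    LOAD_PIDS=""
}

# Example (the worker count and duration are assumptions, adjust as needed):
# start_load 16; sleep 1200; stop_load
```

Calling stop_load when the test is finished avoids leaving stray cat processes behind on the appliance.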

You may need as many as 15 to 20 concurrent SSH sessions. The alarm should trigger after CPU usage has remained above 90% for about 15 minutes.
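To check from the shell whether the load is actually above the 90% mark, overall CPU usage can be estimated from two samples of /proc/stat. This is a rough sketch and an assumption about measurement: the article does not say how the DRVA itself computes CPU usage for the alarm, and cpu_busy_percent is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: estimates CPU busy time between two /proc/stat samples.

cpu_busy_percent() {
    # $1 and $2: the first line ("cpu ...") of /proc/stat, sampled at two times.
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            # Fields 2-9: user, nice, system, idle, iowait, irq, softirq, steal.
            for (f = 2; f <= 9 && f <= NF; f++) total += $f
            idle = $5 + $6    # idle + iowait
        }
        NR == 1 { total0 = total; idle0 = idle }
        NR == 2 {
            dt = total - total0
            if (dt > 0) printf "%d\n", (dt - (idle - idle0)) * 100 / dt
        }'
}

# Example: sample five seconds apart and print the busy percentage.
# a=$(head -n 1 /proc/stat); sleep 5; b=$(head -n 1 /proc/stat)
# cpu_busy_percent "$a" "$b"
```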

DRVA High Memory Usage Duration Exceeded

Method 1
The DRVA high memory usage alarm may be triggered by conditions related to DRVA high CPU usage (described above).

  • The following command makes dd allocate a 4 GB buffer while copying zero bytes to /dev/null.
    • dd if=/dev/zero of=/dev/null bs=4G count=1
  • Repeating this command can produce a spike in memory usage for testing alarm conditions.
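Repeating the command by hand can be replaced with a small loop. The sketch below assumes a POSIX shell. repeat_dd is a hypothetical helper name, and the repetition count is an illustrative value rather than a documented requirement.

```shell
#!/bin/sh
# Sketch only: repeat_dd is an illustrative helper name.

repeat_dd() {
    # $1 = number of repetitions, $2 = dd block size (the article uses 4G)
    runs=0
    while [ "$runs" -lt "$1" ]; do
        dd if=/dev/zero of=/dev/null bs="$2" count=1 2>/dev/null || return 1
        runs=$((runs + 1))
    done
    # Report how many dd runs completed.
    echo "$runs"
}

# Example (the count of 50 is an assumption):
# repeat_dd 50 4G
```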

Method 2
An alternative method is to consume the memory configured for the DRVA until the alarm is triggered.

  1. Create a temporary mount point:
    • # sudo mkdir /mnt/tmpfs
    • # sudo mount -t tmpfs -o size=8G tmpfs /mnt/tmpfs
  2. Use the dd command to allocate required memory.
    • The following command writes a 7.5 GB file filled with zero bytes into the tmpfs mount, thus consuming about 7.5 GB (7500 MB) of RAM.
      • # dd if=/dev/zero of=/mnt/tmpfs/testfile bs=1M count=7500
  3. To clean up after testing:
    • Reboot the DRVA, or
    • Unmount the tmpfs mount and remove the mount point:
      • # sudo umount /mnt/tmpfs
      • # sudo rmdir /mnt/tmpfs
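While the test file is in place, memory usage can be checked from the shell. The sketch below derives a used-memory percentage from the MemTotal and MemAvailable fields of /proc/meminfo. This is an assumption about measurement: the article does not state how the DRVA computes memory usage for the alarm, and mem_used_percent is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: mem_used_percent is an illustrative helper name.

mem_used_percent() {
    # $1 = path to a meminfo-format file (defaults to /proc/meminfo)
    awk '
        /^MemTotal:/     { total = $2 }
        /^MemAvailable:/ { avail = $2 }
        END { if (total > 0) printf "%d\n", (total - avail) * 100 / total }
    ' "${1:-/proc/meminfo}"
}

# Example: print the current value every 10 seconds.
# while true; do mem_used_percent; sleep 10; done
```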

DR Store IO Error

  • Remove the replication log from a DRVA that contains an actively replicating protected domain.
    • The DRVA should be configured to use the replication log that gets removed for the test.
  • This action should trigger the alarm.

DRVA Unreachable Duration Exceeded

  • Power off the DRVA.
    • Or, disconnect the DRVA network.
  • This action should trigger the alarm.

DR Store Unavailable

  • It is not possible to trigger this error specifically. It is similar in behavior to an IO error.

Bitmap Mode ‘On’ Duration Exceeded

  • Create a new protected domain.
  • Protect multiple VMs (two or three should be sufficient).
  • Wait for the VMs to enter the initial sync phase.
  • From the DRVA Edit settings screen, disconnect the replication volume disk.
  • This action should trigger the alarm.

Protected Domain Recovery Failure

Method 1
This alarm condition can be triggered while performing planned failover:

  • Initiate a planned failover from the recovery site.
  • While failover is in progress, shut down the primary MSA.
  • After a period of time, the alarm should be triggered from the primary site.

Method 2
This alarm condition can be triggered while performing continuous failover:

  • Start a continuous failover.
  • Terminate the task from the task log.
  • After a period of time, the alarm should be triggered from the primary site.

Failback Interrupted Due to Issue at Failover Site

  • Initiate a failover.
  • After the failover successfully completes, open an MSA SSH session on the recovery site (where the domain has failed over).
  • Start the failback process and concurrently stop the VME2 service on the recovery site by issuing the command:
    • # service vme2 stop
  • This action should trigger the alarm.
  • After the test, the VME2 service can be restarted by issuing the command:
    • # service vme2 start
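To reduce the risk of leaving the service stopped after the test, the stop and start commands above can be wrapped in one helper. This is a minimal sketch. stop_then_start and the SERVICE_CMD override are hypothetical, and the pause length is an assumption that should be long enough for the failback to be interrupted.

```shell
#!/bin/sh
# Sketch only: stop_then_start and SERVICE_CMD are illustrative names.

# Command used to control services; can be overridden for a dry run.
SERVICE_CMD="${SERVICE_CMD:-service}"

stop_then_start() {
    # $1 = service name, $2 = seconds to keep the service stopped
    $SERVICE_CMD "$1" stop || return 1
    sleep "$2"
    $SERVICE_CMD "$1" start
}

# Example (run as root on the recovery-site MSA; 300 seconds is an assumption):
# stop_then_start vme2 300
```

Setting SERVICE_CMD=echo prints the commands instead of running them, which is a safe way to rehearse the sequence.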

Protected Domain Test Failover Failed

  • Initiate a test failover at the recovery site.
  • As test failover is being performed, power off the MSA at the primary site.
  • This action should trigger the alarm.

Application Write Backpressure On

  • If the incoming VM network speed is high compared to the outgoing replication traffic, this can create backpressure leading to the alarm being triggered.

DR Virtual Appliance Network IP Not Available

  • Disconnect the DRVA network.
  • After a period of time, the alarm should be triggered.

Test Failover Site Ready

  • Conduct a test failover and perform the steps to the point where VMs can be tested at the recovery site.
  • An alert message will appear in the UI of the recovery site where VMs can be tested.

Replication Log Reserved Space Running Low

  • Deploy a DRVA and add a replication log volume with a minimal configuration.
  • Create a protected domain configured with a large total estimated data size to be protected.
    • Set the metadata size to be greater than half the capacity of the replication log disk.
  • Once the protected domain is created, protect the VM.
    • Navigate to the replication log and change the reserved space alarm threshold to 10% (the default is 5%).
  • This action should trigger the alarm.

Protected Domain Recovery Runbook Execution Failed

  • Create a protected domain and protect a VM that doesn’t have VMware Tools installed.
    • If necessary, uninstall VMware Tools from the VM.
  • Configure a runbook for ReIP of the primary site.
    • ReIP allows IP addresses of protected VMs to be changed via runbooks during failover or failback.
  • Initiate a failover.
    • After a period of time, the alarm should be triggered and can be viewed from Cluster > Monitor > Events.

VM Protection Cancelled

  • This condition could occur in earlier versions of JetStream DR (version 4.1.x and prior) when a protected VM underwent a snapshot revert.
  • Current versions of JetStream DR have addressed the underlying issue, so it is no longer possible to create this condition or trigger this alarm.

DR Store Degraded in Multi-Pathing Mode

  • This condition occurs if the replication log uses an iSCSI volume that relies on multi-pathing to storage and one of the paths becomes broken or degraded.
    • In such a case, the alarm will be triggered.
  • This issue is not applicable in AVS environments, which do not use iSCSI multi-pathing to storage.

Protected Domain Recovery (Failover / Failback / Restore) Completed

  • Initiate any failover, failback, or restore operation.
  • Upon successful completion of the task, the alarm will be triggered.