Using Replication Gateway tunables

 

About Replication Gateway tunables

If you are using Resiliency Platform Data Mover to replicate your data across data centers, you have an option to tune some replication parameters to optimize the performance and scalability of the replication.

Resiliency Platform Data Mover replicates the data of the protected virtual machines from the source data center to the target data center. The data of a protected virtual machine is called a Veritas Replication Set. Replication Gateway is a staging server that aggregates and batches data from multiple Veritas Replication Sets during replication. The data is transmitted in units called update sets, and all the replication tunables relate to the update set.

 

What is an update set

To replicate data from the source data center to the target data center, the Data Mover driver taps the workload I/Os and sends them to the source Replication Gateway over LAN. The source Replication Gateway applies a data optimization technique (data de-duplication) and groups the workload I/Os into a set. This set of workload I/Os, collected over a period of time, is called an update set. You can change the size of the update set.

The update set is stored on the staging storage disk in its optimized format. When the update set is sent over WAN, data optimization (compression) algorithms are applied. As a result of these optimizations, the amount of data sent over WAN is much lower than the amount of data that the Replication Gateway processes for replication.
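To make the effect of the two optimizations concrete, here is a minimal sketch. It does not reproduce the Replication Gateway's actual algorithms, which this article does not describe: content-hash de-duplication and zlib compression are stand-ins chosen for illustration, and the function names and block sizes are hypothetical.

```python
import hashlib
import zlib


def deduplicate(blocks):
    """Keep one copy of each distinct block (generic content-hash de-duplication)."""
    unique = {}
    for block in blocks:
        unique.setdefault(hashlib.sha256(block).digest(), block)
    return list(unique.values())


def compress_for_wan(blocks):
    """Compress the staged blocks before transfer (zlib used as a stand-in)."""
    return zlib.compress(b"".join(blocks))


# Three tapped writes; the third repeats the first (illustrative data only).
writes = [b"A" * 4096, b"B" * 4096, b"A" * 4096]
processed_bytes = sum(len(b) for b in writes)   # data processed by the gateway
staged = deduplicate(writes)
staged_bytes = sum(len(b) for b in staged)      # data held in the update set
wan_bytes = len(compress_for_wan(staged))       # data actually sent over WAN
print(processed_bytes, staged_bytes, wan_bytes)
```

In this toy run, the data staged is smaller than the data processed, and the data sent over WAN is smaller again, which is the relationship the paragraph above describes.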

Update sets are created periodically for each protected workload virtual machine. The WAN link is used effectively by sending multiple update sets in parallel. Resiliency Platform Data Mover is resilient to network and infrastructure faults. In the event of a network fault or a Replication Gateway reboot, it does not send any data over WAN. Replication resumes from the point where it was paused (before the fault) using an optimized handshake algorithm. In case of an infrastructure fault such as the failure of a Replication Gateway, replication can be resumed through another Replication Gateway. In this case, the amount of data that must be re-sent equals the update set lost on the failed Replication Gateway.

 

Parameters that can be tuned

You can tune the following three Replication Gateway parameters to optimize the Recovery Point Objective (RPO), replication performance, and scalability:

  1. Update set size: The size of an update set. You can change it from its default value.
  2. Replication Frequency: The time interval after which the current update set is closed (cut) and a new update set is started. Once an update set is cut, it is scheduled for transfer to the target Replication Gateway.
  3. Quota per Veritas Replication Set: The space reserved for each virtual machine in the reserve storage.
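The interaction between update set size and Replication Frequency can be sketched as follows. As described later in this article, an update set is created when either of the two thresholds is reached. This is a minimal model under stated assumptions, not product code: the class name, field names, and the choices to start the timer at the first I/O and to skip empty update sets are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class UpdateSetBuilder:
    """Hypothetical model of when an update set is cut (not product code)."""
    max_bytes: int           # stands in for the update set size tunable
    max_age_seconds: float   # stands in for the Replication Frequency tunable
    opened_at: float = 0.0   # time the first I/O entered the open update set
    pending_bytes: int = 0   # data collected in the open update set
    cut_sets: list = field(default_factory=list)  # (reason, bytes) per cut set

    def write(self, now: float, nbytes: int) -> None:
        """Record one tapped workload I/O of nbytes at time now."""
        if self.pending_bytes == 0:
            self.opened_at = now  # assumption: the timer starts at the first I/O
        self.pending_bytes += nbytes
        self._maybe_cut(now)

    def tick(self, now: float) -> None:
        """Periodic timer check; assumption: empty update sets are never cut."""
        if self.pending_bytes > 0:
            self._maybe_cut(now)

    def _maybe_cut(self, now: float) -> None:
        too_big = self.pending_bytes >= self.max_bytes
        too_old = (now - self.opened_at) >= self.max_age_seconds
        if too_big or too_old:
            reason = "size" if too_big else "frequency"
            self.cut_sets.append((reason, self.pending_bytes))
            self.pending_bytes = 0


builder = UpdateSetBuilder(max_bytes=100, max_age_seconds=10)
builder.write(now=0, nbytes=60)
builder.write(now=1, nbytes=50)   # 110 bytes pending: cut on size
builder.write(now=2, nbytes=10)
builder.tick(now=5)               # 3 seconds old: no cut
builder.tick(now=12)              # 10 seconds old: cut on frequency
print(builder.cut_sets)           # [('size', 110), ('frequency', 10)]
```

A busy workload tends to hit the size threshold first, while a quiet workload is cut by the timer, which is why Replication Frequency bounds the RPO when the I/O rate is low.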

 

How replication works with update set, Replication Frequency, and quota per Veritas Replication Set

The staging storage on the Replication Gateway component of Veritas Resiliency Platform consists of two parts:

  1. Reserve storage: Replication Gateway allocates a certain part of the staging storage disk to each protected virtual machine. This part of the staging storage is called reserve storage.
  2. Shared pool: This part of the staging storage is shared among all the virtual machines.

Following is the sequence in which update sets get created and stored in various areas of Replication Gateway storage:

  • When replication starts at the source data center, update sets are created and stored in the reserve storage according to the Quota per Veritas Replication Set. As data arrives from the protected host, both the update set size and the replication frequency are checked, and as soon as either threshold is reached, the update set is created on the source Replication Gateway.
  • These update sets are then transmitted over WAN to the target Replication Gateway. If the LAN speed is higher than the WAN speed, the update sets are not transmitted at the same speed as they are created and stored. As a result, the reserve storage may fill up quickly.
  • When the reserve storage is full, the Replication Gateway checks whether space is available in the shared pool. If it is, the update sets are stored in the shared pool storage area.
  • If both the shared pool storage area and the reserve storage are full, replication goes into flow control mode. In flow control mode, the Data Mover driver keeps track of the changed blocks but does not replicate the data.
  • Once shared pool storage or reserve storage is available for storing update sets, replication is resumed. All the blocks that changed while replication was in flow control mode are then sent to the Replication Gateway to be stored in the available storage area.
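The placement sequence above can be sketched as a small model. This is a simplified illustration, not the Replication Gateway implementation: the class, method names, and the unit-less sizes are hypothetical, and the model assumes an update set is placed entirely in one area.

```python
class StagingStorage:
    """Hypothetical model of reserve storage plus shared pool (arbitrary units)."""

    def __init__(self, quota_per_set, shared_pool_size, vm_names):
        # Each protected virtual machine gets its own reserved quota.
        self.reserve_free = {vm: quota_per_set for vm in vm_names}
        self.shared_free = shared_pool_size

    def store(self, vm, size):
        """Place one update set and return the area used, or 'flow-control'."""
        if self.reserve_free[vm] >= size:
            self.reserve_free[vm] -= size
            return "reserve"
        if self.shared_free >= size:
            self.shared_free -= size
            return "shared-pool"
        # Both areas are full: the driver only tracks changed blocks.
        return "flow-control"

    def release(self, vm, size, area):
        """Free space once an update set has been sent to the target gateway."""
        if area == "reserve":
            self.reserve_free[vm] += size
        elif area == "shared-pool":
            self.shared_free += size


staging = StagingStorage(quota_per_set=2, shared_pool_size=1, vm_names=["vm1"])
placements = [staging.store("vm1", 1) for _ in range(4)]
print(placements)  # ['reserve', 'reserve', 'shared-pool', 'flow-control']

staging.release("vm1", 1, "reserve")   # one update set reaches the target
after_release = staging.store("vm1", 1)
print(after_release)                   # 'reserve'
```

The fourth update set finds no space in either area, which corresponds to flow control mode. Once space is released, storage of update sets resumes.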

 

Factors governing the replication characteristics

The default size of the update set is chosen in such a way that the Replication Gateway gets maximum possible data optimization as well as maximum WAN bandwidth for replication. Following are some of the factors that affect the size of the update set or can be affected when update set size is changed:

  • Workload I/O pattern
  • Maximum WAN throughput
  • Maximum data optimization
  • Maximum number of workload hosts protected per Replication Gateway

 

When do you need to change the replication tunables

The default size of the update set suits most applications that have moderately sized disks and moderate throughput. The default values are tuned to achieve the service level agreement (RPO/RTO) for the resiliency group.

You may consider changing the size of the update set in the following situations:

  • If you want to protect more virtual machines without allocating more staging storage on the Replication Gateway.
  • If the size of the staging storage in the customer environment is smaller than expected.
  • If the workload I/O pattern is such that not much optimization is possible with the default value.

Note that changing these tunables to values that do not fit your I/O pattern might violate the service level agreement. For example, if the size of the update set is changed to a considerably smaller size and the workload I/O rate is very high, the Replication Gateway tries to optimize as much as possible. But if the WAN bandwidth is low, the Replication Gateway may not be able to provide the desired RPO. In such cases, replication goes into flow control mode. Once enough space is available on the Replication Gateway, replication is automatically resumed.

Therefore, the WAN link is expected to have sufficient bandwidth to sustain the workload I/O traffic. Also, to achieve the best possible optimization, keep the update set size at the default value. In cases where the workload I/O rate is not high or the virtual machine disk size is not large, you can set the update set size to a lower value.

If you decrease the size of the update set, the number of virtual machines that can be protected increases, and vice versa.

Tunable type                                      | Impact of the change
Quota per Veritas Replication Set (Quota-per-CG)  | Scale (number of protected virtual machines)
Size of update set (Update-set-size)              | Performance (local deduplication, compression, RPO)
Replication Frequency                             | RPO for all the resiliency groups configured on the gateway

 

The following formulae are used to calculate these tunables.

  • Reserve storage = Number of virtual machines × Quota per Veritas Replication Set
  • Quota per Veritas Replication Set = Minimum number of update sets required × Size of one update set (as cut by the update set size or the Replication Frequency, whichever is reached first)
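A worked example of these formulae, with the trade-off between update set size and scale made explicit. All numbers are illustrative and are not product defaults; the function names are hypothetical, and the sketch assumes every update set is cut at its full configured size.

```python
def quota_per_replication_set(min_update_sets, update_set_size_mb):
    """Quota = minimum number of update sets required x size of one update set."""
    return min_update_sets * update_set_size_mb


def reserve_storage(num_vms, quota_mb):
    """Reserve storage = number of virtual machines x quota per replication set."""
    return num_vms * quota_mb


def max_protected_vms(reserve_storage_mb, quota_mb):
    """Invert the reserve storage formula to get the supported number of VMs."""
    return reserve_storage_mb // quota_mb


# Illustrative values only: 4 update sets of 512 MB each, 10 virtual machines.
quota = quota_per_replication_set(min_update_sets=4, update_set_size_mb=512)
reserve = reserve_storage(num_vms=10, quota_mb=quota)
print(quota, reserve)      # 2048 20480

# Halving the update set size halves the quota, so the same reserve
# storage can hold twice as many virtual machines.
smaller_quota = quota_per_replication_set(min_update_sets=4, update_set_size_mb=256)
vms_after = max_protected_vms(reserve, smaller_quota)
print(smaller_quota, vms_after)   # 1024 20
```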

 

How to change the replication tunables

You can change the values of all the replication tunables using the Replication Gateway KLISH menu. You can change these tunables at run time based on the factors described above. Note that the parameter values must be the same on all the peer gateways.

  • To modify Quota per Veritas Replication Set:

manage> datamover operation modify-quota-size

  • To modify update set size for Replication Gateway:

manage> datamover operation modify-updateset-size

  • To modify replication frequency:

manage> datamover operation modify-replication-frequency

 

Note: This article is also published at: https://origin-www.veritas.com/support/en_US/article.100044091

You can access Veritas Resiliency Platform documentation at: https://sort.veritas.com/documents?prod=vrp