Deduplicated Replication
Deduplicated replication reduces the amount of data that replication jobs send over the network. This feature lowers the network bandwidth that replication requires, which is especially useful on a high-latency, low-bandwidth, or high-cost network.
Note:
This feature imposes a cost in the form of pre-processing and increased memory overhead. The effectiveness of deduplication is highly data dependent, so it is strongly recommended to verify the deduplication savings with representative datasets prior to using this feature in a production environment. Deduplicated replication is more efficient when there is more duplicate data.
Deduplicated replication is disabled by default. To enable deduplicated replication for individual replication actions, click Enable deduplication in the Add Replication Action dialog box in the BUI, or set the dedup property to true in the CLI.
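As a rough way to see why the savings are data dependent, the following minimal sketch counts how many fixed-size blocks in a byte string repeat an earlier block. It is an illustration only: the function name duplicate_block_fraction, the 4096-byte block size, and the use of SHA-256 are assumptions made for this example and do not describe how the appliance builds its deduplication tables.

```python
import hashlib


def duplicate_block_fraction(data: bytes, block_size: int = 4096) -> float:
    """Return the fraction of fixed-size blocks that repeat an earlier block."""
    seen = set()
    total = 0
    duplicates = 0
    for offset in range(0, len(data), block_size):
        digest = hashlib.sha256(data[offset:offset + block_size]).digest()
        total += 1
        if digest in seen:
            duplicates += 1  # this block was already seen earlier in the data
        else:
            seen.add(digest)
    return duplicates / total if total else 0.0


# Hypothetical sample data: eight identical blocks versus eight distinct blocks.
repetitive = b"A" * 4096 * 8
distinct = b"".join(bytes([i]) * 4096 for i in range(8))

print(duplicate_block_fraction(repetitive))  # 0.875
print(duplicate_block_fraction(distinct))    # 0.0
```

A dataset that behaves like the second sample gains little from deduplication while still paying the pre-processing and memory cost, which is why testing with representative data is recommended.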
Deduplicated Replication Statistics
In the CLI, each replication update action has a stats node. The stats node records information about the most recent replication update, as well as the accumulated statistics over the lifetime of the replication action. To view statistics for a specific update that was not the most recent update, see the finish alert for the update as described in Start and Finish Alerts.
These replication action stats node properties quantify:
- On-disk compression benefits
- Deduplication benefits
- Replication data stream compression benefits
- Replication update duration
- Deduplication tables construction time (before sending data)
- Deduplication tables maximum memory consumption
Table "Replication Action stats Node Properties (CLI Read-Only)" in Replication Action Properties describes the action stats node properties of a deduplicated replication stream. See especially properties with dedup and dd_ in their names.
Measuring Deduplicated Replication Statistics
When deduplication is enabled for a replication stream, the data is transformed through several layers of deduplication and compression. Data rates are measured and recorded as the data is transformed.
To determine whether deduplication was effective for the replication action, examine the replication statistics in the stats node of a replication action in the CLI, or in finish alerts in the BUI or the CLI.
Single Deduplicated Replication Update Benefits Comparison
- In the BUI, use the replication finish alerts to compare the phys_bytes and after_dedup statistics to evaluate the benefit of deduplicated replication. For information about replication finish alerts, see Start and Finish Alerts.
- In the CLI, use the replication finish alerts to compare the phys_bytes and after_dedup statistics, or use the replication action stats node to compare the last_phys_bytes and last_after_dedup statistics, to evaluate the benefit of deduplicated replication. For information about statistics in the stats node, see table "Replication Action stats Node Properties (CLI Read-Only)" in Replication Action Properties.
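The comparison for a single update can be expressed as a simple ratio. The sketch below assumes that phys_bytes is the data size before deduplication and after_dedup is the size after deduplication, both in bytes; check the property table for the exact definitions. The function name dedup_savings and the sample values are hypothetical, and the values would be copied by hand from a finish alert or from last_phys_bytes and last_after_dedup.

```python
def dedup_savings(phys_bytes: int, after_dedup: int) -> float:
    """Return the fraction of bytes removed by deduplication (0.0 to 1.0)."""
    if phys_bytes <= 0:
        return 0.0  # nothing was sent, so there is nothing to save
    return 1.0 - after_dedup / phys_bytes


# Hypothetical values for one replication update.
phys_bytes = 10_000_000_000
after_dedup = 6_500_000_000

print(f"{dedup_savings(phys_bytes, after_dedup):.1%}")  # 35.0%
```

A result near zero suggests that the update contained little duplicate data, so the pre-processing and memory cost of deduplication may outweigh the bandwidth saved.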
Averaged Deduplicated Replication Updates Benefits Comparison
To determine the average benefit of all deduplicated replication updates performed by this replication action, use the replication action stats node to compare the dd_total_phys_bytes and dd_total_after_dedup statistics. For information about statistics in the stats node, see table "Replication Action stats Node Properties (CLI Read-Only)" in Replication Action Properties.
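When reading the lifetime figures, it helps to remember that a ratio of totals is weighted by bytes, so large updates dominate the result. The sketch below assumes that dd_total_phys_bytes and dd_total_after_dedup are running sums of the per-update values; that interpretation, the helper name savings, and the sample numbers are assumptions for illustration.

```python
def savings(before: int, after: int) -> float:
    """Return the fraction of bytes removed by deduplication; 0.0 if nothing was sent."""
    return 1.0 - after / before if before > 0 else 0.0


# Hypothetical (phys_bytes, after_dedup) pairs for two updates:
# a small update that deduplicates well and a large one that does not.
updates = [(1_000, 500), (100_000, 90_000)]

# Assumed equivalents of dd_total_phys_bytes and dd_total_after_dedup.
dd_total_phys_bytes = sum(phys for phys, _ in updates)
dd_total_after_dedup = sum(after for _, after in updates)

lifetime = savings(dd_total_phys_bytes, dd_total_after_dedup)
mean_of_updates = sum(savings(p, a) for p, a in updates) / len(updates)

print(f"lifetime savings:   {lifetime:.1%}")         # 10.4%
print(f"mean of per-update: {mean_of_updates:.1%}")  # 30.0%
```

Under these assumptions the lifetime ratio reflects the bandwidth actually saved across all updates, which is usually the more relevant number when judging whether to keep deduplication enabled.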