Capacity saving function: data deduplication and compression
When the capacity saving function is in use, the controller of the storage system performs data deduplication and compression to reduce the size of data to be stored. The capacity saving function can be used for data stored on all drive types, including data stored on encrypted drives and external storage (UVM).

Deduplication: The data deduplication function deletes duplicate copies of data written to different addresses in the same pool and maintains only a single copy of the data at one address. The deduplication function is enabled first on a Dynamic Provisioning pool and then on the DP-VOLs in that pool. When deduplication is enabled, data that is duplicated across the DP-VOLs assigned to that pool is reduced to a single copy.
When you enable deduplication on a pool, the deduplication system data volume for that pool is created. The deduplication system data volume is used exclusively by the storage system to manage the data deduplication function. A search table in the deduplication system data volume is used to locate redundant data in the pool.
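The role of a search table can be illustrated with a minimal sketch. This is a toy model only, not the storage system's internal design: the SHA-256 fingerprint, the 8 KiB block size, and the names DedupPool, search_table, and mapping are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 8192  # hypothetical block size; the source does not state one


class DedupPool:
    """Toy model of a pool that keeps one copy of each unique block."""

    def __init__(self):
        self.search_table = {}  # fingerprint -> index of the stored copy
        self.physical = []      # the single stored copy of each unique block
        self.mapping = {}       # (volume, address) -> index of the stored copy

    def write(self, volume, address, block):
        # Assumption: blocks are matched by a SHA-256 fingerprint.
        fingerprint = hashlib.sha256(block).digest()
        index = self.search_table.get(fingerprint)
        if index is None:
            # First time this content is seen: store it and record it.
            index = len(self.physical)
            self.physical.append(block)
            self.search_table[fingerprint] = index
        # Duplicate writes only add a reference to the existing copy.
        self.mapping[(volume, address)] = index

    def read(self, volume, address):
        return self.physical[self.mapping[(volume, address)]]


pool = DedupPool()
pool.write("vol0", 0, b"A" * BLOCK_SIZE)
pool.write("vol1", 5, b"A" * BLOCK_SIZE)  # same data, different volume and address
pool.write("vol1", 6, b"B" * BLOCK_SIZE)
print(len(pool.physical))  # 2 unique blocks stored for 3 writes
```

In this sketch, the two identical blocks written to different volumes share one stored copy, which mirrors the description of a single copy being kept at one address in the pool.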
Compression: The data compression function uses the LZ4 compression algorithm. The compression function can be enabled per DP-VOL.
The capacity overheads associated with the capacity saving function include:
Capacity consumed by metadata
The capacity consumed by metadata for the capacity saving function (deduplication and compression) is approximately 3% of the consumed DP-VOL capacity that has been processed by capacity saving. For example, if the consumed capacity of a DP-VOL is 150 TB and the capacity saving function has processed 100 TB of that capacity, reducing it to 30 TB, the capacity consumed by metadata for the capacity saving function is approximately 3 TB (3% of 100 TB). The total consumed capacity of this DP-VOL at this instant is 83 TB (30 TB of reduced data + 50 TB of unprocessed data + 3 TB of metadata).
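The arithmetic in the example above can be reproduced with a short sketch. The function and parameter names are hypothetical, and the 3% figure is the approximate ratio stated above, so the result is an estimate rather than an exact value reported by the storage system.

```python
METADATA_PERCENT = 3  # approximate metadata overhead stated in the text


def total_consumed_tb(consumed_tb, processed_tb, reduced_tb):
    """Estimate the total consumed capacity of a DP-VOL, in TB.

    consumed_tb:  consumed capacity before capacity saving
    processed_tb: part of consumed_tb already processed by capacity saving
    reduced_tb:   size of the processed data after reduction
    """
    unprocessed_tb = consumed_tb - processed_tb
    # Metadata is sized against the processed capacity, not the reduced size.
    metadata_tb = processed_tb * METADATA_PERCENT / 100
    return reduced_tb + unprocessed_tb + metadata_tb


print(total_consumed_tb(150, 100, 30))  # 83.0
```

Note that the metadata overhead is calculated from the 100 TB that was processed, not from the 30 TB that remains after reduction.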
Capacity consumed by garbage (invalid) data
The capacity consumed by garbage data is approximately 7% of the total consumed capacity of all DP-VOLs with capacity saving enabled. The capacity is dynamically consumed based on garbage data created by the capacity saving process and cleaned by the background garbage collection process. The garbage collection process is a background process with a lower priority than host I/O, so the capacity consumed by garbage data depends on both the garbage created and the host I/O rate.
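As a rough planning aid, the garbage data estimate can be sketched in the same way. The function name and the sample volume sizes are hypothetical, and because the actual amount varies with garbage creation and host I/O load, the 7% figure should be treated as an approximation only.

```python
GARBAGE_PERCENT = 7  # approximate garbage data overhead stated in the text


def estimated_garbage_tb(consumed_tb_per_volume):
    """Estimate garbage data capacity, in TB.

    consumed_tb_per_volume: consumed capacity of each DP-VOL that has
    capacity saving enabled.
    """
    total_consumed_tb = sum(consumed_tb_per_volume)
    return total_consumed_tb * GARBAGE_PERCENT / 100


# Hypothetical pool with two DP-VOLs consuming 83 TB and 17 TB.
print(estimated_garbage_tb([83, 17]))  # 7.0
```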