Capacity saving function: data deduplication and compression

When the capacity saving function is in use, the controller of the storage system performs data deduplication and compression to reduce the size of data to be stored. The capacity saving function can be used for data stored on flash drives, including data stored on encrypted drives.

When you enable deduplication on a pool, the deduplication system data volume for that pool is created. The deduplication system data volume is used exclusively by the storage system to manage the data deduplication function. A search table in the deduplication system data volume is used to locate redundant data in the pool.

Caution: Do not use capacity saving and NAS deduplication on the same volumes, because the additional processing substantially decreases I/O performance. For information about the NAS deduplication function, see the File Services Administration Guide.

Note: The capacity overheads associated with the capacity saving function include:

The capacity consumed by metadata for the capacity saving function (deduplication and compression) is approximately 3% of the consumed DP-VOL capacity that has been processed by capacity saving. For example, if the consumed capacity of a DP-VOL is 150 TB and the capacity saving function has processed 100 TB of that capacity, reducing it to 30 TB, the capacity consumed by metadata for the capacity saving function is approximately 3 TB (3% of 100 TB). The total consumed capacity of this DP-VOL at that point is 83 TB: 30 TB of reduced data, 50 TB of data not yet processed, and 3 TB of metadata.

The capacity consumed by garbage data is approximately 7% of the total consumed capacity of all DP-VOLs with capacity saving enabled. This capacity is consumed dynamically: garbage data is created by the capacity saving process and reclaimed by the garbage collection process. Because garbage collection runs in the background at a lower priority than host I/O, the capacity consumed by garbage data depends on both the amount of garbage created and the host I/O rate.
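
The overhead figures above can be worked through with a short calculation. The following Python sketch is illustrative only: the function and constant names are hypothetical and are not part of any storage system interface, and the 3% and 7% ratios are the approximate values stated in this section, not guaranteed figures. It also assumes that the caller supplies the total consumed capacity to which the 7% garbage ratio is applied.

```python
# Approximate overhead ratios taken from this section (not exact values).
METADATA_RATIO = 0.03  # share of the processed capacity consumed by metadata
GARBAGE_RATIO = 0.07   # share of the total consumed capacity held as garbage


def estimate_dpvol_consumed_tb(consumed_tb, processed_tb, reduced_tb):
    """Estimate metadata and total consumed capacity (TB) for one DP-VOL.

    consumed_tb:  consumed capacity before capacity saving
    processed_tb: part of consumed_tb already processed by capacity saving
    reduced_tb:   size of the processed part after deduplication/compression
    """
    if not 0 <= processed_tb <= consumed_tb:
        raise ValueError("processed_tb must be between 0 and consumed_tb")
    if reduced_tb < 0:
        raise ValueError("reduced_tb must not be negative")
    unprocessed_tb = consumed_tb - processed_tb
    metadata_tb = METADATA_RATIO * processed_tb
    total_tb = reduced_tb + unprocessed_tb + metadata_tb
    return metadata_tb, total_tb


def estimate_garbage_tb(total_consumed_tb):
    """Estimate garbage capacity (TB) across all DP-VOLs with capacity saving.

    total_consumed_tb is assumed to be the combined consumed capacity of
    those DP-VOLs, as supplied by the caller.
    """
    if total_consumed_tb < 0:
        raise ValueError("total_consumed_tb must not be negative")
    return GARBAGE_RATIO * total_consumed_tb


if __name__ == "__main__":
    # Values from the example in this section: 150 TB consumed,
    # 100 TB processed and reduced to 30 TB.
    metadata, total = estimate_dpvol_consumed_tb(150, 100, 30)
    print(f"metadata: {metadata:.1f} TB, total consumed: {total:.1f} TB")
```

With the example values from this section, the sketch reproduces the approximately 3 TB of metadata and 83 TB of total consumed capacity. Because the actual garbage capacity varies with the host I/O rate, treat the result of estimate_garbage_tb as a planning estimate rather than a measured value.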