Use cases for capacity saving

The results of enabling the capacity saving functions (deduplication and compression) depend on the properties and access patterns of the stored data. In addition, when capacity saving is enabled, some storage behaviors differ from conventional behaviors because data scanning and the garbage collection triggered by data updates increase the processing load on the storage controller. Before implementing capacity saving, confirm whether it should be applied to your specific storage environment.
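One rough way to see how a data sample might respond is sketched below. It splits the sample into fixed 8 KB chunks, keeps one copy of each distinct chunk (identified by its SHA-256 digest), and compresses each kept chunk with zlib. This is only an approximation under stated assumptions: the storage system's actual deduplication and compression algorithms are not described in this section, and the function name estimate_saving and the choice of SHA-256 and zlib are illustrative, not part of the product.

```python
import hashlib
import zlib

CHUNK_SIZE = 8 * 1024  # assumed chunk size: the 8 KB processing unit


def estimate_saving(data: bytes, chunk_size: int = CHUNK_SIZE) -> float:
    """Return the rough fraction of capacity saved for a data sample.

    Hypothetical helper: approximates deduplication by keeping one copy
    of each distinct chunk, and compression by zlib on each kept chunk.
    """
    if not data:
        return 0.0

    # Deduplication stand-in: one stored copy per distinct chunk digest.
    unique_chunks = {}
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        unique_chunks.setdefault(hashlib.sha256(chunk).digest(), chunk)

    # Compression stand-in: a chunk is never stored larger than its raw size.
    stored = sum(
        min(len(zlib.compress(chunk)), len(chunk))
        for chunk in unique_chunks.values()
    )
    return 1.0 - stored / len(data)


if __name__ == "__main__":
    # Four identical 8 KB chunks: at least three of the four are saved.
    sample = (bytes(range(256)) * 32) * 4
    print(f"estimated saving: {estimate_saving(sample):.0%}")
```

Running a sketch like this against representative samples only gives a first impression. It does not account for metadata, garbage data, or the I/O load on the storage controller.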

The following table lists several storage use cases and describes the application of capacity saving to each use case.

Use case          Application                    Description
----------------  -----------------------------  ------------------------------------------------------------
Office            Deduplication and compression  Because there are many identical file copies, deduplication is effective.
VDI               Deduplication and compression  Deduplication is very effective because of OS area cloning.
Database (TPC-H)  Compression                    Deduplication is not effective because the database has unique information for each block.
                                                 For a database that has many data updates, garbage data is increased, so it is not suitable.
Database (TPC-C)  Compression                    Deduplication is not effective because the database has unique information for each block.
                                                 For a database that has many data updates, garbage data is increased, so it is not suitable.
Image/video       Not suitable                   Compressed by application
Backup/archive    Deduplication and compression  Deduplication is effective between backups.

Caution
  • I/O performance is degraded for data to which compression and deduplication are applied. Verify the performance before using the capacity saving function.
  • Because 10% is used for metadata and garbage data, apply capacity saving only when the expected saving is 20% or higher.
  • Even within the same use case, you need to plan for two temporary areas: one that is required when data is created before operation starts, and one that is used for data updates after operation starts.
  • Deduplication and compression process data in 8 KB units. Therefore, capacity saving is likely to be effective if the block size of the file system is an integer multiple of 8 KB.
  • If update writes continue at a rate that exceeds the capability of garbage collection, the used pool capacity increases and writes must wait for free capacity. This increases Cache Write Pending and might affect the system. Therefore, do not apply the capacity saving function to operations that continuously generate update writes.
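The numeric guidance in the cautions above can be expressed as a small pre-check. The sketch below is illustrative only: the function names are hypothetical, and net_saving assumes, as a simplification, that the 10% used for metadata and garbage data subtracts directly from the expected saving.

```python
CHUNK_SIZE = 8 * 1024   # processing unit for deduplication and compression
OVERHEAD = 0.10         # share used for metadata and garbage data
MIN_SAVING = 0.20       # minimum expected saving before applying the function


def is_block_size_aligned(block_size: int, chunk_size: int = CHUNK_SIZE) -> bool:
    """True if the file system block size is an integer multiple of 8 KB."""
    return block_size > 0 and block_size % chunk_size == 0


def meets_saving_threshold(expected_saving: float) -> bool:
    """True if the expected saving (as a fraction) is 20% or higher."""
    return expected_saving >= MIN_SAVING


def net_saving(expected_saving: float) -> float:
    """Expected saving minus the overhead (simplifying assumption)."""
    return expected_saving - OVERHEAD


if __name__ == "__main__":
    for block_size in (4096, 8192, 65536):
        print(block_size, is_block_size_aligned(block_size))
    for saving in (0.15, 0.35):
        print(saving, meets_saving_threshold(saving), round(net_saving(saving), 2))
```

For example, a 4 KB block size fails the alignment check while 8 KB and 64 KB pass, and an expected saving of 15% falls below the 20% threshold. Such a check does not replace verifying I/O performance or the update write rate on the actual system.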