Determine correct system storage settings for compression, deduplication, erasure coding and validate expected savings

virtualramblings

Mar 31, 2020

I covered Compression, Deduplication, and Erasure Coding in the NCP Study Guide. However, there’s always more to learn!

Compression

Workloads Recommended for Compression

Almost all

Workloads Not Ideal for Compression

Encrypted datasets
Already compressed datasets (for example, images, audio, or video)
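
To see why encrypted or already compressed data is a poor fit, here is a minimal sketch that compares how well two buffers compress. It uses Python's zlib purely as a stand-in and makes no claim about the codec the platform itself uses; the sample buffers are invented for illustration, with random bytes standing in for encrypted or pre-compressed data.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return original size divided by compressed size (higher is better)."""
    return len(data) / len(zlib.compress(data))


# Repetitive, text-like data stands in for a typical compressible workload.
text_like = b"INFO service started on host-01 port 8080\n" * 10_000

# Random bytes stand in for encrypted or already compressed data (assumption).
random_like = os.urandom(len(text_like))

print(f"text-like:   {compression_ratio(text_like):.1f}x")
print(f"random-like: {compression_ratio(random_like):.2f}x")
```

The text-like buffer shrinks dramatically, while the random buffer stays at roughly its original size, so compressing it costs CPU for no capacity gain.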

Deduplication

Workloads Recommended for Deduplication

Base images (cache)—you can manually fingerprint them using vdisk_manipulator
P2V and V2V when using Hyper-V (ODX uses a full data copy)
Cross-container clones (not usually recommended because single containers are preferred)

Workloads Not Ideal for Deduplication

Anything outside the recommendations above. In most cases, compression yields the highest capacity savings and should be used instead.
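
As a rough way to reason about whether a dataset would benefit from deduplication, the sketch below fingerprints fixed-size chunks and compares the total chunk count with the number of unique chunks. This is a generic illustration, not the platform's implementation: the 4 KiB chunk size, the SHA-256 hash, and the sample data are all assumptions chosen for the example.

```python
import hashlib


def dedup_ratio(data: bytes, chunk_size: int = 4096) -> float:
    """Return logical chunks divided by unique chunks for fixed-size chunking."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    if not chunks:
        return 1.0  # nothing to deduplicate
    # A set of fingerprints keeps one entry per distinct chunk.
    unique = {hashlib.sha256(chunk).digest() for chunk in chunks}
    return len(chunks) / len(unique)


# Hypothetical base image: 100 distinct 4 KiB chunks.
base_image = b"".join(i.to_bytes(4, "big") * 1024 for i in range(100))

# Ten full copies of the base image, as with full-copy clones.
ten_copies = base_image * 10

# 1000 distinct chunks, standing in for data with no repetition.
unique_data = b"".join(i.to_bytes(4, "big") * 1024 for i in range(1000))

print(f"ten copies of a base image: {dedup_ratio(ten_copies):.1f}x")
print(f"unique data:                {dedup_ratio(unique_data):.1f}x")
```

Full copies of the same image deduplicate well, which matches the recommended workloads above; data with little repetition gains nothing from fingerprinting.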

Erasure Coding

EC-X pairs perfectly with inline compression; you can safely enable them together for maximum efficiency.

The savings from erasure coding depend on the cluster size and the coldness of the data.

Consider a 6-node cluster configured with redundancy factor 2. A strip size of 5 is possible: 4 nodes for data and 1 node for parity. Because the erasure-coded strip spans only five of the six nodes, one node is left free so that a rebuild target is available if a node fails. With a (4, 1) strip, the overhead is 25% (1 parity block for every 4 data blocks). Without erasure coding, the overhead is 100%.
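
The overhead figures in this example can be checked with a few lines of arithmetic. The sketch below is only a calculator for the numbers quoted above; the function names are made up for illustration and are not part of any Nutanix tool.

```python
def ec_overhead(data_blocks: int, parity_blocks: int) -> float:
    """Extra capacity consumed by parity, as a fraction of the data stored."""
    return parity_blocks / data_blocks


def replication_overhead(copies: int) -> float:
    """Extra capacity consumed by full copies; 2 copies means one extra copy."""
    return float(copies - 1)


# The (4, 1) strip from the 6-node example.
print(f"EC (4, 1) overhead: {ec_overhead(4, 1):.0%}")

# Redundancy factor 2 without erasure coding keeps two full copies.
print(f"RF2 overhead:       {replication_overhead(2):.0%}")
```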

So what savings can be attributed to EC-X?

Replication factor 2 (RF2) allows about 50% of raw storage capacity to be used. With a (4, 1) strip, EC-X can raise this to as much as 80%.
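
To validate expected savings, it helps to remember that only part of the data ends up erasure coded. The sketch below estimates usable capacity as a fraction of raw capacity under a simplifying assumption of my own: a given "cold" fraction of the data is stored in (4, 1) strips and the rest stays at RF2. The function name, its parameters, and the blending model are hypothetical.

```python
def usable_fraction(cold_fraction: float,
                    data_blocks: int = 4,
                    parity_blocks: int = 1,
                    copies: int = 2) -> float:
    """Usable share of raw capacity when only cold data is erasure coded."""
    if not 0.0 <= cold_fraction <= 1.0:
        raise ValueError("cold_fraction must be between 0 and 1")
    # Raw bytes consumed per logical byte under each protection scheme.
    ec_cost = (data_blocks + parity_blocks) / data_blocks  # 1.25 for (4, 1)
    replica_cost = float(copies)                           # 2.0 for RF2
    blended = cold_fraction * ec_cost + (1.0 - cold_fraction) * replica_cost
    return 1.0 / blended


for cold in (0.0, 0.5, 1.0):
    print(f"{cold:.0%} cold -> {usable_fraction(cold):.1%} of raw is usable")
```

With no cold data the result is the 50% RF2 figure, and with everything cold it reaches the 80% ceiling; real clusters land somewhere in between, which is why the observed savings should be checked against the actual data profile.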

Workloads Recommended for Erasure Coding

Write once, read many (WORM) workloads
Backups
Archives
File servers
Log servers
Email (depending on usage)

Workloads Not Ideal for Erasure Coding

Anything write- or overwrite-intensive
VDI
