Erasure Coding is a data protection technique, similar to RAID, in which storage systems generate parity chunks alongside data chunks. It provides the same resiliency as the triple replication that is standard in enterprise storage while requiring a much smaller capacity footprint, and it additionally allows fault protection and data rebuilding to be distributed.

A protection scheme is defined as “data chunks:parity chunks:algorithm”. For example, “4:2:rs” defines a protection scheme in which, for every 4 data chunks added, 2 parity chunks are generated by the Reed-Solomon algorithm, and all 6 chunks are placed in different failure domains. The replication factor must always be at least the number of parity chunks + 1; in this example, 3 replicas are required on ingest.
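The arithmetic behind a scheme string such as “4:2:rs” can be sketched in a few lines of Python. This is an illustrative sketch only, not EdgeFS code: the names `ProtectionScheme` and `parse_scheme` are hypothetical, and the overhead figure assumes that, once encoding completes, exactly one copy of each data chunk and each parity chunk remains.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtectionScheme:
    """Hypothetical helper type; not part of EdgeFS."""
    data_chunks: int
    parity_chunks: int
    algorithm: str

    @property
    def min_replication_count(self) -> int:
        # Rule from the text: replication factor >= parity chunks + 1.
        return self.parity_chunks + 1

    @property
    def storage_overhead(self) -> float:
        # Assumption: after encoding, one copy of every data chunk and
        # every parity chunk remains, so overhead is (data + parity) / data.
        return (self.data_chunks + self.parity_chunks) / self.data_chunks


def parse_scheme(text: str) -> ProtectionScheme:
    """Parse a 'data:parity:algorithm' string such as '4:2:rs'."""
    parts = text.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected 'data:parity:algorithm', got {text!r}")
    data, parity, algorithm = parts
    scheme = ProtectionScheme(int(data), int(parity), algorithm)
    if scheme.data_chunks < 1 or scheme.parity_chunks < 1:
        raise ValueError("chunk counts must be positive integers")
    return scheme


if __name__ == "__main__":
    scheme = parse_scheme("4:2:rs")
    print(scheme.min_replication_count)  # replicas required on ingest
    print(scheme.storage_overhead)       # stored bytes per byte of data
```

Under these assumptions, “4:2:rs” requires 3 replicas on ingest and stores 1.5 bytes per byte of data after encoding, compared with 3 bytes per byte under triple replication.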

EdgeFS applies Erasure Coding only after an object has come to rest. This complements traditional mirroring (replication): ingested data is first replicated in a highly performant manner, using high-speed Replicast with inline deduplication and compression, and is encoded later. Once the protection scheme’s properties have been satisfied, the replicas that are no longer necessary are removed and their space is reclaimed. Currently supported models:

To configure Erasure Coding for the cluster:

$ efscli cluster create demo -o ccow-ec-enabled=1,ccow-ec-datamode=4:2:rs
$ efscli cluster list demo -x
System response:
CLUSTER EC_ON EC_MODE REPCNT
clu1 0 6:3:rs 3