What is deduplication?
Deduplication is a technology that reduces data to its essentials by identifying redundant patterns within a block, within a file, and across files, and storing only the unique data segments. Take a typical Word document as an example. When you compress or zip a file, the compressor looks for common patterns within that one file, reduces them, and creates an archive. Deduplication takes this a step further: it looks at chunks of data it has already seen, stores pointers back to previously stored segments, and writes out only the segments that are new. As a result, data that is largely static but backed up repeatedly can be greatly reduced (a simple sketch of this segment-and-pointer idea follows the list below). According to our vendor, Data Domain, their global compression process, which includes deduplication, can achieve the following:
- Maximum data reduction, averaging 20x over time, which minimizes the physical storage capacity required for storing backup images.
- Extended retention with petabytes (PB) of protection storage: Data Domain systems offer up to 56.7 PB of usable capacity in a very small footprint.
- Economy of tape by storing multiple months of backup images for less than $0.35/GB. This extended retention also improves the performance of both backups and recoveries without any tradeoffs.
- Network-efficient replication by transferring only unique data segments, reducing bandwidth requirements by as much as 99%.
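To make the segment-and-pointer idea concrete, here is a minimal Python sketch of deduplicated storage. It is an illustration only, not how Data Domain implements global compression: it uses fixed-size 4 KB chunks and an in-memory dictionary keyed by SHA-256 hashes, whereas production systems use variable-length, content-defined chunking and persistent segment stores. The names (DedupStore, write, read) are hypothetical.

```python
import hashlib

CHUNK_SIZE = 4096  # fixed-size chunks for simplicity; real systems use
                   # variable-length, content-defined chunking


class DedupStore:
    """Toy segment store: keeps each unique chunk exactly once, keyed by hash."""

    def __init__(self):
        self.segments = {}   # SHA-256 digest -> chunk bytes (unique data)
        self.raw_bytes = 0   # total bytes ingested before deduplication

    def write(self, data: bytes) -> list:
        """Ingest a stream and return a 'recipe': an ordered list of chunk
        hashes (pointers) from which the original stream can be rebuilt."""
        recipe = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            # Store the chunk only if it has never been seen before;
            # otherwise the recipe simply points at the existing segment.
            if digest not in self.segments:
                self.segments[digest] = chunk
            recipe.append(digest)
        self.raw_bytes += len(data)
        return recipe

    def read(self, recipe: list) -> bytes:
        """Rebuild the original stream by following the pointers."""
        return b"".join(self.segments[d] for d in recipe)

    def reduction_ratio(self) -> float:
        stored = sum(len(c) for c in self.segments.values())
        return self.raw_bytes / stored if stored else 0.0


if __name__ == "__main__":
    store = DedupStore()
    # Two "nightly backups" of a file that barely changes between runs:
    monday = b"static header" * 1000 + b"log entry 1\n"
    tuesday = b"static header" * 1000 + b"log entry 1\nlog entry 2\n"
    r1 = store.write(monday)
    r2 = store.write(tuesday)
    assert store.read(r1) == monday and store.read(r2) == tuesday
    print(f"Reduction: {store.reduction_ratio():.1f}x")
```

Running the sketch ingests two nearly identical "nightly backups"; the second backup adds almost no new segments, which is exactly why repeated backups of mostly static data deduplicate so well.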
What's more, this process is done "in-line," meaning data is deduplicated on the fly as it arrives over the wire, with minimal impact on the client sending it. This also avoids the client CPU and memory cycles that source-side deduplication (the approach taken by EMC Avamar and Symantec PureDisk) requires. For more information on what OpSource has deployed, you may click on the following links: