Deduplication best practices and choosing the best dedupe technology
Data deduplication is a technique that reduces storage needs by eliminating redundant data in your backup environment. Only one copy of the data is retained on storage media, and redundant data is replaced with a pointer to the unique copy. Dedupe technology typically divides data sets into smaller chunks and uses algorithms to assign each chunk a hash identifier, which it compares against previously stored identifiers to determine whether the chunk has already been stored. Some vendors use delta differencing technology, which compares current backups to previous data at the byte level to remove redundant data.
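The chunk-and-hash approach described above can be sketched in a few lines of Python. This is an illustrative toy, not any vendor's implementation: the fixed 4 KB chunk size, the `dedupe_chunks` and `restore` function names, and the in-memory dictionary standing in for a chunk store are all assumptions made for the example.

```python
import hashlib

def dedupe_chunks(data: bytes, chunk_size: int = 4096):
    """Split data into fixed-size chunks, keep one copy of each unique
    chunk, and represent the stream as an ordered list of hash pointers."""
    store = {}      # hash identifier -> unique chunk bytes
    pointers = []   # ordered hashes that reconstruct the original stream
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:
            store[digest] = chunk   # first time seen: store the chunk
        pointers.append(digest)     # duplicates cost only a pointer
    return store, pointers

def restore(store, pointers) -> bytes:
    """Rebuild the original stream by following the pointers."""
    return b"".join(store[h] for h in pointers)
```

For a 16 KB stream containing three identical 4 KB chunks, the store holds only two unique chunks while the pointer list still reconstructs the full stream, which is the space saving deduplication delivers.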
Dedupe technology offers storage and backup administrators a number of benefits, including lower storage space requirements, more efficient disk space use, and less data sent across a WAN for remote backups, replication, and disaster recovery. Jeff Byrne, senior analyst for the Taneja Group, said deduplication technology can have a rapid return on investment (ROI): "In environments where you can achieve 70% to 90% reduction in needed capacity for your backups, you can pay back your investment in these dedupe solutions fairly quickly."
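To make the capacity-reduction figures above concrete, the following back-of-the-envelope helper applies a reduction percentage to a raw backup footprint. The function name and the 100 TB example are assumptions for illustration only, not figures from the analyst quoted above.

```python
def capacity_after_dedupe(raw_backup_tb: float, reduction_pct: float) -> float:
    """Physical capacity still needed after a given percentage reduction."""
    return raw_backup_tb * (1 - reduction_pct / 100)

# At the 70% to 90% reductions cited, 100 TB of raw backup data
# needs only roughly 10 to 30 TB of physical storage.
seventy = capacity_after_dedupe(100, 70)   # 30.0 TB
ninety = capacity_after_dedupe(100, 90)    # ~10.0 TB
```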
While the overall concept of data deduplication is relatively easy to understand, a number of different techniques are used to eliminate redundant backup data, and certain techniques may be better suited to your environment than others. So when you are ready to invest in dedupe technology, consider the following technology differences and data deduplication best practices to ensure that you implement the best solution for your needs.
In this guide on deduplication best practices, learn what you need to know to choose the right dedupe technology for your data backup and recovery needs. Learn about source vs. target deduplication, inline vs. post-processing deduplication, and the pros and cons of global deduplication.