Hash values are stored in an index that is referenced when subsequent backups are performed. When
data generates a hash value that already exists in the index, the data is not stored a second time.
Rather, an entry with the hash value is simply added to the "recipe file" for that backup session.
Over the course of many backups, the same hash values are generated many times.
However, the underlying data is stored only once, so duplicate backup data consumes less storage.
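The index and recipe file behavior described above can be sketched as a small in-memory model. This is an illustration only, not the D2D implementation: the class and method names (DedupStore, backup, restore) are hypothetical, SHA-256 is an assumed hash function (the guide does not name one), and the stream is cut at fixed 4K offsets for simplicity.

```python
import hashlib

CHUNK_SIZE = 4096  # 4K chunks, the chunk size this guide gives for Figure 25


class DedupStore:
    """Toy in-memory model of hash-based deduplication (hypothetical names)."""

    def __init__(self):
        self.index = set()   # hash values seen so far
        self.store = {}      # hash value -> chunk data, written once
        self.recipes = {}    # backup session -> ordered list of hash values

    def backup(self, session, data):
        """Record one backup session; return how many chunks were newly stored."""
        recipe = []
        new_chunks = 0
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            # Assumption: SHA-256 stands in for whatever hash the product uses.
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.index:
                # New hash value: add it to the index and store the data.
                self.index.add(digest)
                self.store[digest] = chunk
                new_chunks += 1
            # Known or new, the session's recipe always records the hash value.
            recipe.append(digest)
        self.recipes[session] = recipe
        return new_chunks

    def restore(self, session):
        """Rebuild a session's data by following its recipe."""
        return b"".join(self.store[d] for d in self.recipes[session])


store = DedupStore()
monday = b"A" * CHUNK_SIZE + b"B" * CHUNK_SIZE
tuesday = b"A" * CHUNK_SIZE + b"C" * CHUNK_SIZE
print(store.backup("monday", monday))    # 2: both chunks are new
print(store.backup("tuesday", tuesday))  # 1: the "A" chunk is already indexed
```

In this sketch the second session adds only one chunk to the store, yet its recipe still lists two hash values, so the full session can be restored from the shared data.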
Figure 25 illustrates the process that backup data undergoes with Dynamic deduplication. The
numbered list that follows corresponds with the image.
Figure 25 Hash-based Chunking
Item  Description
1     The backup stream is analyzed in 4K chunks that generate unique hash values. These hash values are placed in an index in memory.
2     When a subsequent backup contains a hash value that is already in the index, a second instance of the data is not stored.
3     When new hash values are generated, they are added to the index and the data is written to the deduplication store.
Dynamic Deduplication Implementation
Dynamic deduplication is enabled per library on the D2D. See Figure 26.
When you configure the library, deduplication is enabled by default. If you disable it, you cannot
selectively apply deduplication to any data on the library device. Compression is also disabled if
deduplication is disabled.