Fundamentals of deduplication and how it works

Amit_Pendse
Level 3

Hi all,

Can you please explain the fundamentals of deduplication? What I know is that dedup segments the data in the stream, compares each segment with previously stored data, and does not store a segment again if it is a duplicate.
Restores are faster because the data is on disk, and it reduces the amount of space used because it compresses the data.
For example:
If I back up files 1 to 5 daily and files 1 to 10 weekly, or if there is a huge Excel file that is updated daily, how does dedup determine the duplicates at file level?

1 ACCEPTED SOLUTION

V4
Level 6
Partner Accredited

Only the changed blocks would be backed up.

Dedupe breaks each file into segments, and each segment is given a hash value (a fingerprint) and indexed. These fingerprints are maintained in a fingerprint DB. The next time a backup runs for the same file with some incremental changes, the file is segmented again and each segment is matched against the fingerprint DB. Because only a few segments have changed, only those are backed up.
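To make the segment-and-fingerprint idea concrete, here is a minimal Python sketch. It is only an illustration and not how any product implements it: the `DedupStore` class, the 4-byte segment size, fixed-size segmentation, and the choice of SHA-256 are all assumptions made for the demo.

```python
import hashlib

# Assumed, deliberately tiny segment size so the demo is easy to read;
# real deduplication engines work with much larger segments.
SEGMENT_SIZE = 4


class DedupStore:
    """Toy fingerprint DB (hypothetical): maps fingerprint -> segment bytes."""

    def __init__(self):
        self.segments = {}

    def backup(self, data: bytes):
        """Split data into fixed-size segments and store only unseen ones.

        Returns (recipe, new_segment_count), where recipe is the ordered
        list of fingerprints needed to rebuild the data.
        """
        recipe = []
        new_segments = 0
        for offset in range(0, len(data), SEGMENT_SIZE):
            segment = data[offset:offset + SEGMENT_SIZE]
            fingerprint = hashlib.sha256(segment).hexdigest()
            if fingerprint not in self.segments:
                # First time this content is seen: store it once.
                self.segments[fingerprint] = segment
                new_segments += 1
            recipe.append(fingerprint)
        return recipe, new_segments

    def restore(self, recipe):
        """Rebuild the original data from its list of fingerprints."""
        return b"".join(self.segments[fp] for fp in recipe)


store = DedupStore()
monday = b"AAAABBBBCCCCDDDD"   # first backup of the "file"
tuesday = b"AAAABBBBXXXXDDDD"  # same file with one segment edited in place

recipe_mon, new_mon = store.backup(monday)
recipe_tue, new_tue = store.backup(tuesday)
print(new_mon, new_tue)  # 4 1
```

The second backup stores a single new segment even though the whole file was read again, which is the case of the large Excel file that changes a little every day. Note that with fixed-size segments an insertion near the start of a file shifts every later segment, so this sketch only shows the in-place edit case.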

Dedupe is great, but it is still time-consuming at the initial stage (indexing, fingerprinting, etc.).

With 7.5 this time is further reduced by using a change journal tracking log, which eliminates repeating that cycle during incrementals (segmenting, matching against the fingerprint DB, etc.). The backup simply checks the change journal tracking log for what has changed since the last backup, and vroooom....
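The shortcut can be sketched roughly as follows. This is a hedged illustration only: the `files_to_process` function and the `track_log` dictionary are hypothetical, and file size plus modification time stand in for a real change journal, which is maintained by the file system itself rather than by the backup client.

```python
import os
import tempfile


def files_to_process(paths, track_log):
    """Return the paths whose (size, mtime) differ from the track log.

    track_log is a plain dict (hypothetical) mapping path -> (size, mtime_ns)
    as recorded at the previous backup. It is updated in place.
    """
    changed = []
    for path in paths:
        st = os.stat(path)
        signature = (st.st_size, st.st_mtime_ns)
        if track_log.get(path) != signature:
            # New or modified file: only these go on to segmenting
            # and fingerprint matching.
            changed.append(path)
            track_log[path] = signature
    return changed


with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name in ("report.xlsx", "notes.txt"):
        path = os.path.join(tmp, name)
        with open(path, "wb") as f:
            f.write(b"original")
        paths.append(path)

    track_log = {}
    first = files_to_process(paths, track_log)   # initial backup: every file is new
    second = files_to_process(paths, track_log)  # nothing changed since the last run

    with open(paths[0], "ab") as f:              # edit one file (its size grows)
        f.write(b" plus an edit")
    third = files_to_process(paths, track_log)   # only the edited file is picked up

print(len(first), len(second), len(third))
```

Unchanged files never reach the segmenting and fingerprint lookup step at all, which is where the time saving during incrementals comes from.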

 

Hope this answers your question.

2 REPLIES

Marianne
Level 6
Partner VIP Accredited Certified

Amit, I believe speedfreak has given a very good explanation.

Please let us know if your question has been answered. If so, please mark speedfreak's post as the solution.

If not, please tell us what is still unclear and we will try again.