Understanding deduplication?

MIXIT
Level 6
Partner Accredited

Hi all.  I seem to be confused about how deduplication is implemented in real-world scenarios.

 

I do understand the very basic idea that duplicate blocks are identified during the backup, and the backup then backs up only one copy of a block, presumably with some kind of reference in the catalogue that tells the backup software where other copies of that block should go to during a restore.  (correct me if anything is wrong here so far). 
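The basic idea described above can be sketched in a few lines of Python. This is only an illustration of block-level deduplication in general, not how Backup Exec implements it: the fixed 4-byte block size, the use of SHA-256, and the in-memory `store` dictionary standing in for the dedupe disk are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4  # assumption: tiny fixed-size blocks so the example is easy to follow


def dedupe_backup(data: bytes, store: dict) -> list:
    """Split data into fixed-size blocks, keep one copy of each unique block
    in `store`, and return the ordered list of block hashes (the "recipe")."""
    recipe = []
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # a block already seen is not stored again
        recipe.append(digest)            # but every occurrence is still recorded
    return recipe


def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the original data by looking up each recorded hash in order."""
    return b"".join(store[digest] for digest in recipe)


store = {}  # hypothetical stand-in for the deduplication storage
data = b"AAAABBBBAAAAAAAACCCC"
recipe = dedupe_backup(data, store)

print(len(recipe), "blocks referenced,", len(store), "blocks stored")
print(restore(recipe, store) == data)
```

The recipe plays the role of the "reference in the catalogue": it records where every block belongs, while the store holds each distinct block only once.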

 

I have not implemented this in the real world yet, but have thought about doing it a few times.  My type of clients don't typically require it, which is why all these years have gone by with no usage of it.

 

So I'm doing the Veritas training on BE 15 and, as per usual, the training material tells you that you have integrated deduplication across the board and life is beautiful. Then, when you get down to actually talking about how it's used, you find you must use a disk for this.  I guess a USB-attached drive; I'm not sure of the specifics.

 

So this seems to rule out tape drives and libraries completely, effectively taking away your ability to rotate media to offsite storage.  Or perhaps the deduplication can be done as a second backup stage, so you first back up to tape, then copy that to a deduplication disk.  But why?  Unless your dedupe disk is meant to be your local DR restore source.

 

So it's just that deduplication is marketed as this fantastic way to reduce data storage across the board, and yet really it has very limited use if it is restricted to disk storage.  I might be omitting SANs here, as I'm not sure what BE considers a SAN to be as far as the dedupe discussion goes.

 

Anyway, needless to say, I am at a total loss for a complete understanding of this.  Please help :)

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions

CraigV
Moderator
Partner    VIP    Accredited

Dedupe for any vendor is done to disk first. You can duplicate to tape, but the data is then rehydrated to the original size. So this raises the (rhetorical) question: why duplicate to tape when you can duplicate to another disk/array?
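To make the rehydration point concrete, here is a rough Python sketch comparing what a deduplicated disk holds with what would be written out when the same data is expanded again for tape. The block size, the SHA-256 hashing, and the sample data are illustrative assumptions and do not reflect any vendor's actual on-disk format.

```python
import hashlib


def chunk(data: bytes, size: int = 4) -> list:
    """Split data into fixed-size blocks (the size is an assumption for the demo)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


original = b"AAAA" * 6 + b"BBBB" * 2  # 32 bytes of highly redundant sample data

# Deduplicated form on disk: unique blocks plus an ordered list of references.
unique_blocks = {}
references = []
for block in chunk(original):
    key = hashlib.sha256(block).hexdigest()
    unique_blocks.setdefault(key, block)
    references.append(key)

stored_on_disk = sum(len(b) for b in unique_blocks.values())

# Rehydration: every reference is expanded back into its full block,
# so the copy going to tape is as large as the original data.
rehydrated = b"".join(unique_blocks[k] for k in references)
written_to_tape = len(rehydrated)

print("bytes held on dedupe disk:", stored_on_disk)
print("bytes written to tape:    ", written_to_tape)
```

The space saving exists only while the data sits in the deduplicated store; the tape copy is a full-size, self-contained copy.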

The disk can be an array like EMC VNXe/VNX, HP MSA, Seagate NAS etc, or USB drive, or even internal to the server. As long as it is disk accessible by the server.

Dedupe doesn't make backups faster; it makes them smaller. The key is to run the backups for a longer retention period. Only then do you realise the full value of dedupe. It isn't for everyone either.
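A back-of-the-envelope calculation shows why retention matters. The sketch below assumes a full backup every day, a 1000 GB data set, and 2% of blocks being new each day; these figures and the `dedupe_ratio` helper are hypothetical, chosen only to show the trend, and real ratios depend heavily on the data.

```python
def dedupe_ratio(days: int, full_size_gb: float, daily_change: float) -> float:
    """Ratio of logical data backed up to unique data actually stored,
    assuming one full backup per day and a fixed fraction of new blocks daily."""
    logical = days * full_size_gb  # what the backups add up to without dedupe
    stored = full_size_gb + (days - 1) * full_size_gb * daily_change
    return logical / stored


# Illustrative assumptions: 1000 GB protected, 2% of blocks change per day.
for days in (1, 7, 30, 90):
    ratio = dedupe_ratio(days, 1000, 0.02)
    print(f"{days:>3} days retained: about {ratio:.1f}x reduction")
```

With a single backup there is nothing to deduplicate against, so the ratio starts at 1 and only climbs as more backups of mostly unchanged data are retained.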

SAN in the general sense is shared storage, so it could be SAS/FC/iSCSI; in other words, block-level storage.

To understand how dedupe actually works, read pkh's articles below:

https://www.veritas.com/community/articles/deduplication-simplified-part-1-backup

https://www.veritas.com/community/articles/deduplication-simplified-part-2-restore

Thanks!

3 REPLIES 3

MIXIT
Level 6
Partner Accredited

Thanks for the info, CraigV.  Today I did the VSE and VSE+ training and exams, so now I am a total master of Backup Exec (snicker).  But at least now I know a thing or two more.  I'm also set to read the Dedupe section of the BE Admin guide, but I'll first check out pkh's stuff you mentioned.

 

On a side note, I'm actually surprised my post got posted above, since I spent 45 minutes yesterday fighting with this god-awful Veritas site, getting HTTP 500 errors every time I'd try to post (from different machines/different browsers).  I was ready to track down a Veritas manager and stomp my feet petulantly.  Anyway, glad my post made it after all.  Hopefully this reply does too.  I was 0 for 5 yesterday (3-4PM Eastern Feb 9 2016).

CraigV
Moderator
Partner    VIP    Accredited

Yeah, there was an issue with the forums for all of yesterday, and they only came back online sometime last night South African time.

Glad this helped sort out your questions!

Cheers!