
How do I calculate what size I need to make my Dedupe store?

MissLL
Level 3

How do I calculate what size my dedupe store should be and should I dedupe everything?

 

I currently have around 11TB of data, made up of around 70-odd virtual machines with various functions (domain controllers, file servers, Exchange 2010, etc.) which I need to get my head around for a project. I don't know how to calculate how big the dedupe store should be, nor whether I should dedupe everything.

 

Also, is there a tool which would look at the data on a file store and work out where the bulk of it lies? Currently we have one job to back up a whole file server and the job just takes too long: it starts on Friday and is still running on Monday night when I am meant to change the tapes for the Monday job. We are now skipping the Monday backup, and sometimes it is still running on the Tuesday night! I figure that if I break the job down into several smaller jobs it might be more successful.

 

Any help gratefully received!

15 REPLIES

Backup_Exec1
Level 6
Employee Accredited Certified
Hi,

You can use the tool below for deduplication assessment:
https://www-secure.symantec.com/connect/blogs/new-backup-exec-deduplication-assessment-tool

Please also see the links below, which describe the requirements for deduplication with Backup Exec:
http://www.symantec.com/docs/TECH76487
http://www.symantec.com/docs/HOWTO23357

This link describes best practices for deduplication:
http://www.symantec.com/docs/HOWTO21767

Hope this helps. Thanks

AmolB
Moderator
Employee Accredited Certified

Backup Exec 2010 supports a dedup folder of up to 16TB, and a dedup folder of X TB requires X * 1.5GB of RAM (recommended). It is also recommended to use a dedicated volume for the dedup folder.

Break larger jobs down into smaller ones and then run those jobs simultaneously (your media server hardware should support multiple concurrent jobs). You may also use the client-side dedup feature if your media server is running short of resources.

Expect a better dedup ratio from a flat file backup as compared to database backups.

Also make sure you are using BE 2010 R3 with the latest patches; previous versions had a few issues which are fixed in R3.
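
As a rough illustration of the sizing rule quoted above, here is a minimal sketch. It is not an official Symantec calculator; the folder sizes fed in are arbitrary examples, and the 16TB cap and the X TB -> X * 1.5GB RAM figure are simply taken from the reply above:

```python
# Rough sizing sketch based on the rule of thumb quoted above:
# a dedup folder of X TB needs roughly X * 1.5 GB of RAM, and
# Backup Exec 2010 caps the dedup folder at 16 TB.
# The example sizes below are illustrative, not official guidance.

MAX_DEDUP_FOLDER_TB = 16
RAM_GB_PER_TB = 1.5

def recommended_ram_gb(dedup_folder_tb: float) -> float:
    """Return the recommended RAM (GB) for a dedup folder of the given size (TB)."""
    if dedup_folder_tb > MAX_DEDUP_FOLDER_TB:
        raise ValueError("BE 2010 dedup folders are limited to 16 TB")
    return dedup_folder_tb * RAM_GB_PER_TB

if __name__ == "__main__":
    for size_tb in (4, 8, 11, 16):
        print(f"{size_tb} TB dedup folder -> ~{recommended_ram_gb(size_tb):.1f} GB RAM recommended")
```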

CraigV
Moderator
Partner    VIP    Accredited

I think your work colleague above mentioned this quite a while back in the first TN listed ;)

pkh
Moderator
   VIP    Certified

There is no way to calculate the size of a dedup folder beforehand. There is the dedup overhead, which is unknown, and it also depends on how well your data deduplicates. VMs do not dedupe well compared to files.

To track down the cause of your slow job, you can either break it up like you want to, or expand the job log and check the backup and verify timing of each of the resources that are backed up.
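
On the original question of finding where the bulk of the data lies, a minimal sketch along these lines can help decide how to split one big file server job into several smaller ones. This is an illustrative script, not a Backup Exec or Symantec tool, and the share path is a placeholder assumption:

```python
# Minimal sketch: report how much data sits under each top-level folder of a
# file server share, to decide how to split one big backup job into several
# smaller ones. The share path below is a placeholder - point it at your own
# file server, or run the script locally on that server.
import os

SHARE_ROOT = r"\\fileserver\data"  # placeholder path - change to your share

def folder_size_bytes(path):
    """Total size of all files under 'path', skipping files we cannot stat."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                pass  # skip unreadable or locked files
    return total

if __name__ == "__main__":
    sizes = []
    for entry in os.scandir(SHARE_ROOT):
        if entry.is_dir(follow_symlinks=False):
            sizes.append((folder_size_bytes(entry.path), entry.name))
    for size, name in sorted(sizes, reverse=True):
        print(f"{size / 1024**3:10.1f} GB  {name}")
```

The output is a ranking of the biggest top-level folders, which is a reasonable starting point for carving the single large job into several smaller selections.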

MissLL
Level 3

OK, so rather than backing up entire machines, would we be better off not deduping these VMs and just doing full and incremental backups, saving the deduping for, say, the file servers?

 

Are there any hard and fast rules for dedupe?

pkh
Moderator
   VIP    Certified

No. I think you should just go ahead and dedupe everything and then see whether you are getting the benefits that you are expecting. If not, then revert that particular server/VM to a normal backup.

robnicholson
Level 6

Craig isn't a Symantec employee ;)

robnicholson
Level 6

Not a scientific method, but on our typical office-type file server containing ~1.2TB of data, six months of weekly-full and daily-diff backups is using 2.5TB of dedupe space, which is pretty impressive IMO. It means we've got plenty of expansion space on our 7.2TB dedupe disk storage, or we could extend the retention period beyond six months.

Typically, the differential backup grows to ~100GB (10%) by the end of the week, slowly ramping up as the week goes on.

We'd like to be able to use our Advanced Disk Backup licence to switch to daily incremental backups (using True Image for restore), but it doesn't support backup of distributed file systems (grumble grumble).

We don't back up Exchange to dedupe as GRT Exchange backups don't dedupe - they simply get stored as Exchange image files. Hopefully compressed, but I doubt it... so we use B2D for Exchange as it's faster than dedupe.

Cheers, Rob.
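
To put these figures in rough perspective, here is a back-of-envelope sketch using only the numbers quoted in the reply above; the ~50GB average daily diff is an assumption (the diffs ramp from near zero to ~100GB over the week):

```python
# Back-of-envelope comparison of raw vs deduplicated footprint, using the
# figures quoted above: ~1.2 TB of source data, weekly fulls, daily diffs
# that grow to ~100 GB by the end of the week, ~6 months (26 weeks) of
# retention, and ~2.5 TB of dedupe space actually used.
SOURCE_TB = 1.2
WEEKS_RETAINED = 26
DIFFS_PER_WEEK = 6
AVG_DIFF_TB = 0.05           # assumed average daily diff (~50 GB)
DEDUPE_USED_TB = 2.5         # reported actual dedupe usage

raw_fulls_tb = SOURCE_TB * WEEKS_RETAINED
raw_diffs_tb = AVG_DIFF_TB * DIFFS_PER_WEEK * WEEKS_RETAINED
raw_total_tb = raw_fulls_tb + raw_diffs_tb

print(f"Raw (no dedupe) footprint : ~{raw_total_tb:.1f} TB")
print(f"Reported dedupe footprint : ~{DEDUPE_USED_TB:.1f} TB")
print(f"Effective reduction       : ~{raw_total_tb / DEDUPE_USED_TB:.0f}x")
```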

CraigV
Moderator
Partner    VIP    Accredited

...?

robnicholson
Level 6

Ignore me - got my threads mixed up!

MissLL
Level 3

It looks like I need to be a partner to download this program - is that the case?

CraigV
Moderator
Partner    VIP    Accredited

That's correct, MissLL, but contact this person at the email address below. If you read further down the page, they actually suggest mailing them:

raj_bakshi@symantec.com

Thanks!

MissLL
Level 3

Do you clear your dedupe store at all? It's just that you say it ramps up as the week goes on.

 

I am pretty new to all this, so forgive the daft questions.

robnicholson
Level 6

No, we don't manually clear it up, as I hope that BE does that for us. There is an automatic management task that runs once a day, and I think one of its tasks is to flag expired files/blocks in the dedupe database for re-use.

I guess that if we had a temporary huge blip in storage use then we might consider clearing it manually, but to be honest, I'd just let it expire automatically.

Cheers, Rob.

robnicholson
Level 6

So, continuing the very-back-of-the-envelope calculation, you could say roughly 2x disk usage for ~6 months of deduplicated storage for an average office-document-based organisation.

Cheers, Rob.
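
Applying that rule of thumb to the original poster's ~11TB estate gives only a very coarse planning figure; as noted earlier in the thread, VMs and Exchange GRT backups dedupe far less well than flat files, so the sketch below is a hedged estimate rather than a sizing guarantee:

```python
# Rough application of the "~2x source data for ~6 months of deduplicated
# retention" rule of thumb above to the original poster's ~11 TB estate.
# Treat the result as a coarse planning figure only: VMs and Exchange GRT
# backups dedupe far less well than flat files.
SOURCE_DATA_TB = 11            # original poster's stated data size
RULE_OF_THUMB_MULTIPLIER = 2   # from the back-of-envelope above
MAX_DEDUP_FOLDER_TB = 16       # BE 2010 limit quoted earlier in the thread

estimate_tb = SOURCE_DATA_TB * RULE_OF_THUMB_MULTIPLIER
print(f"Rough dedupe store estimate: ~{estimate_tb} TB for ~6 months retention")
if estimate_tb > MAX_DEDUP_FOLDER_TB:
    print("Note: this exceeds the 16 TB dedup folder limit mentioned above,")
    print("so retention, job selection, or scope would need adjusting.")
```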