Windows Server 2012r2 Deduplication and Backup Exec 15

dgingeri
Level 3

I'm attempting to migrate our current Backup Exec 2014 system to a new pair of Backup Exec 15 servers, one CASO in the remote datacenter and one managed in our headquarters.  We had planned on keeping the disk stored backups for 30 days and tape stored backups for 90 days.  

However, I have discovered that many of the things the sales people told my management Backup Exec could do are things it simply isn't able to do.  So, in order to keep our word on keeping backup data for the advertised 90 days, I need to keep it all on our disk storage as well as offsite tape storage.  The big problem I have is that Backup Exec's deduplication isn't good enough for that.  It dedups the data over the course of the job, but not across all the jobs.  So, data builds up much faster in storage than it should.  Very poor design, and not true deduplication.

I've also discovered that, despite claims otherwise, I cannot duplicate backup jobs from our CASO to our managed server for tape backup.  No matter what I do, it just doesn't give me the option to duplicate it to another server, only to local storage.  The best I can do is back up the whole CASO from the managed server directly to tape, and call it our weekly full backup.  If we need something from 80 days ago, we'll have to pull it from the disk storage, and only have the tape backups for disaster recovery.  If we need something from tape, we'll have to restore the jobs from the tape back to disk, and then restore the data from the job.  (If that much even works.)

So, I have to find workarounds.  (Needless to say, I am extremely disappointed in the Backup Exec product and sales people.)

I was thinking maybe we could use the Windows Server deduplication.  Windows Dedup would be able to dedup the data from the various Windows VMs we have and keep the data storage low, if it would work.  I have discovered that Backup Exec whines like crazy and refuses to use storage on a local volume with Windows Server dedup enabled unless the Backup Exec folders are specifically excluded from the dedup process.  (Not good, since that is exactly the data I wanted deduped.)  So, I thought maybe I could put the backup destination on a Windows server and enable dedup on that server.  (We're using Linux-based iSCSI storage right now.  I could get Windows 2012r2 licenses for them and just remake them into file servers.)  In a pinch, I could even use the free Starwind VSAN to get deduplication over iSCSI.

Has anyone tried either of these?  Would this work?

8 REPLIES

DarthBilly
Level 5
Employee

Let me try to break these down.

1. Each storage has its own media protection. On the storage tab of each type of job you can associate disk with 30 days of protection and tape with a media set with 90 days of protection (StorageProtection.png).

2. No, you do not need to keep disk and tape. Again, you can set the disk storage to expire at any time, and the same for tape storage.

3. Yes, you can dedupe from the CAS to the Managed media server. In house, we call this opt-dedupe. When you create the backup job, on your duplicate page, choose the Managed Backup Exec's dedupe storage as a destination.

I'm sorry for your frustration. These are simple tasks, but maybe I've been using the software so long it's second nature. Perhaps you could give us some insight on how to make this easier.

pkh
Moderator
   VIP    Certified

"It dedups the data over the course of the job, but not across all the jobs."

What is the basis for this statement?

How long have you been using the dedup folder?  Did you check what dedup ratio you are getting?

If you are getting a poor dedup ratio, it could be that your data is not suitable for dedup, e.g. it is compressed or encrypted.

The reason why you did not see the other dedup folder could be that you have not shared it.  Did you share both dedup folders?

I do need to keep both the disk backups, for local restores for ongoing operations, and the tape backups, for offsite storage in case of disaster recovery.  If the office burns down or the ViaWest building gets run over by a bulldozer, we need to be able to restore the systems from the missing site on the available site to get operations running again.  If both sites are destroyed in some awkward disaster, we can still get our tapes to restore the servers and get our customers running on our services again from another site.  We have customers all around the country who need our services in order to operate, according to the FCC.  If they don't have our services running, they get fined, repeatedly and heavily.  That makes them very cross with us.  So, yeah, we do need both.

I just discovered the "share" selection on the right-click menu for storage, available only on the CAS side for some reason, that would allow me to dup jobs between servers.  I'm now working on that.  It is annoying that the documentation for such things is so incredibly sparse and difficult to find.  So, I got that fixed.  We did discover that we had to swap roles between our two machines in order to get the tape library able to dup the jobs from the CAS.  (We didn't have the option to duplicate a job to tape from the CAS server while the tape library was attached to the managed server.  So, we had to swap roles between them to even try to get the option.  I have yet to really test that.)  I had some really annoying difficulties to solve from the system renames, including having to remove them from the domain and re-add them because of trust relationships breaking on the renaming, dedup storage that had to be deleted and recreated multiple times due to the way BUE keeps track of them, and finally all the service restarts.  I think I finally have it fixed now, as of 20 minutes ago.  I can now test what we need.

As for the dedup performance, I backed up the CAS to the managed server twice in quick succession to test.  It had 556GB of VHDs from 14 Windows 2008r2 Hyper-V VMs as test data, directly stored with no filesystem dedup and not backed up previously, mostly web servers and 2 SQL servers, as well as the whole OS of the main machine.  Both the CAS and the managed server are not yet production and have only test backups for history, and those jobs were deleted shortly after they were created.  The first backup job took 2 and a half hours and consumed 480GB on the test 900GB dedup storage.  (The 10.8TB iSCSI store is not yet available for this use.)  On the second backup, I ran out of space.  That's some pretty poor dedup performance.  A zip file of the VHDs is smaller than 480GB.  Using Windows Server dedup, on our IT file server, I had those VHDs down to 14GB, and the OS is only taking up 30GB.  I would have thought the initial backup would have taken 100GB at most, and with the second taking place less than 3 hours later on a nearly unused system it would have been less than 1GB for the second job.  Filling a 900GB 'deduped' store with 2 backups is pretty bad.  Maybe that's not 'optimum' circumstances, but it is closer to real world circumstances than anything else I could get.
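For reference, the figures quoted above can be turned into dedup ratios with a little arithmetic. The sketch below is only illustrative: the function names are made up for this example, and it assumes the 556GB of VHDs is the whole logical size, ignoring the host OS that was also part of the backup (so the Backup Exec figure is, if anything, slightly understated).

```python
def dedup_ratio(logical_gb: float, stored_gb: float) -> float:
    """Logical (pre-dedup) size divided by the space actually consumed."""
    if stored_gb <= 0:
        raise ValueError("stored_gb must be positive")
    return logical_gb / stored_gb


def space_savings(logical_gb: float, stored_gb: float) -> float:
    """Fraction of the logical size that dedup avoided storing."""
    return 1.0 - stored_gb / logical_gb


# Figures taken from the post; VHD_GB excludes the host OS (assumption).
VHD_GB = 556
be_first_job = dedup_ratio(VHD_GB, 480)   # Backup Exec dedup store, first job
windows_dedup = dedup_ratio(VHD_GB, 14)   # same VHDs under Windows Server dedup

print(f"Backup Exec, first job: {be_first_job:.2f}:1 "
      f"({space_savings(VHD_GB, 480):.1%} saved)")
print(f"Windows Server dedup:   {windows_dedup:.2f}:1 "
      f"({space_savings(VHD_GB, 14):.1%} saved)")
```

On these numbers the first Backup Exec job works out to roughly 1.16:1, against roughly 40:1 for the same VHDs under Windows Server dedup.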

Both my BUE systems, as stated above, are not yet production.  I'm finalizing the configurations and testing before migrating the backups to them.  I get to migrate all the backups from the current system (BUE 2014) onto the managed server at the ViaWest DC first, then move the current system's storage to the CAS, then configure it to duplicate the jobs to the CAS at HQ, which will then back them up to tape for offsite storage.  We planned on keeping local disk stored data for 30 days and tape stored data for 90 days.  Our current system is only set to hold data for 2 weeks, and has no redundancy or external tape storage, so this should be a vast improvement, if I can get it to work.

pkh
Moderator
   VIP    Certified

Are you using the Hyper-V agent to back up your VMs?

They aren't VMs, just the VHDs from VMs.

pkh
Moderator
   VIP    Certified

If you back up the VHDs just like ordinary files, then BE is not using the VHD filestreamer, so it does not understand the contents of the VHDs.  The contents of a VHD get shifted around when the VHD is used, thus VHD1 from Time A is different from VHD1 from Time B and they dedup badly, just like compressed or encrypted files.

To get a good dedup result, you need to use the Hyper-V agent to back up the VHDs, because the BE VHD filestreamer will then be at work.

Note that the mechanism to dedup files is different from the mechanism to compress files so there is no comparison.

When you use Windows dedup, a filestreamer is used to handle the VHD's.
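The point about VHD contents shifting can be illustrated with a toy fixed-size block comparison. This is a sketch under stated assumptions, not Backup Exec's actual algorithm: the 4096-byte block size and the helper names `block_hashes` and `shared_fraction` are hypothetical, and random bytes stand in for a VHD.

```python
import hashlib
import os

BLOCK = 4096  # hypothetical block size, chosen only for illustration


def block_hashes(data: bytes, block: int = BLOCK) -> set:
    """Hash every fixed-size block of data, cutting at offsets 0, block, 2*block, ..."""
    return {
        hashlib.sha256(data[i:i + block]).digest()
        for i in range(0, len(data), block)
    }


def shared_fraction(old: bytes, new: bytes) -> float:
    """Fraction of the new file's unique blocks already present in the old file."""
    old_hashes = block_hashes(old)
    new_hashes = block_hashes(new)
    return len(old_hashes & new_hashes) / len(new_hashes)


original = os.urandom(BLOCK * 256)   # 1 MiB of random bytes standing in for a VHD
unchanged = bytes(original)          # byte-for-byte identical copy
shifted = b"\x00" * 7 + original     # same content, moved 7 bytes later in the file

print("unchanged copy:", shared_fraction(original, unchanged))
print("shifted copy:  ", shared_fraction(original, shifted))
```

In this toy model the unchanged copy matches every block, while the shifted copy matches none, because every block boundary now cuts the data at a different place. A file that has not changed at all between two backups falls into the first case.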

But these VHDs aren't active.  They didn't change at all between the first and second backup jobs.  I would presume that, with completely duplicate data, the "deduplication" method would deduplicate those files at the very least, and cut 556GB of data off the stored data of the second backup.  It didn't.  That's a pretty poor job of deduplication.

Also, I directly copied these same VHD files, twice, to a test store using Starwind's deduplicating iSCSI VSAN, onto a 100GB store.  It took several jobs to copy them because Windows wouldn't believe they'd fit, but they fit just fine.  The iSCSI LUN files took up a total of 37GB after it was done.  It turned 1112GB of data into 37GB.  That's what deduplication is all about.  (It put the quad core Core i5 4570 processor on the test machine into 70-80% usage and 80+% of the 16GB of memory on the machine in use for the entire time it was copying, but it worked as expected.)
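The behaviour described here, identical data written twice and stored once, is what a content-addressed chunk store gives you. Below is a minimal sketch, assuming fixed-size chunks hashed with SHA-256. The `ChunkStore` class is hypothetical and is not how Starwind, Windows Server, or Backup Exec implement dedup; random bytes stand in for a VHD.

```python
import hashlib
import os


class ChunkStore:
    """Toy dedup store: each unique chunk is kept once, keyed by its SHA-256 digest."""

    def __init__(self, chunk_size: int = 4096):
        self.chunk_size = chunk_size
        self.chunks = {}          # digest -> chunk bytes
        self.logical_bytes = 0    # total bytes ever submitted for backup

    def backup(self, data: bytes) -> int:
        """Store data and return how many bytes of new chunks had to be written."""
        new_bytes = 0
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).digest()
            if digest not in self.chunks:
                self.chunks[digest] = chunk
                new_bytes += len(chunk)
        self.logical_bytes += len(data)
        return new_bytes

    def stored_bytes(self) -> int:
        """Space consumed by the unique chunks."""
        return sum(len(c) for c in self.chunks.values())


store = ChunkStore()
image = os.urandom(1 << 20)        # 1 MiB stand-in for an inactive VHD

first_job = store.backup(image)    # every chunk is new
second_job = store.backup(image)   # same bytes again: nothing new to write

print("first job wrote: ", first_job, "bytes")
print("second job wrote:", second_job, "bytes")
print("overall ratio:   ", store.logical_bytes / store.stored_bytes())
```

Because the chunk index is shared by both jobs, the second backup of the unchanged image adds nothing to the store. A store whose index was scoped to a single job would write the full image again.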