Verifying opt-dedupe operation across WAN

Hi.

 

I need some help in figuring out if I need to actually run a verify template in my deduplication policy.  So the scenario goes like this:

 

1  CASO Backup Exec server sitting at a colo site across a WAN from my office

1  Managed Media Server sitting in my office, local to the resource agents I want to use

Both Backup Exec servers have deduplication option enabled

Deduplication folders are located on an iSCSI connection to a Powervault MD3600i

I have a policy set up on the Managed Media Server that:

  *Does a backup job of the respective resources (with verify turned on, since it's local), to its local dedupe folder.

  *Does a duplicate job of the backup (with verify turned off), to the remote dedupe folder (on the CASO server).

  *Lastly, my issue has been with a verify job for the 'duplicate' operation.  I tried to set up a template so that it runs only once, on the weekend.  However, even this seems to take too long.

 

I have read that verify jobs are really only needed for tape.  Is that really the case?  Thank you for any and all help.

Accepted Solution!


I’m not sure I have an answer for you but here’s my experience.

When I create a job on the local server that runs the backup, verifies the local copy, copies it to the remote site over the WAN, and then verifies the remote copy, the verify runs very slowly and rehydrates the data from the remote site.  Network traffic saturates the pipe while the verify runs.

When I instead select data that has already been copied to the remote site, manually create a verify job for it (similar to what you would do to restore data), and run that verify from the remote server, the verify runs quickly and network traffic is very low.  The key is to go into the “Destination, Device and Media” settings for the verify job and select the remote media server for “Restore Device or Media Server”.  Run this way, the verify process appears to simply check the “hash” between the local and remote servers.  I have not yet figured out how to create an automated job that runs the verify from the remote server (rather than from the local server), but for me an occasional manually run verify might work.  My understanding may be entirely incorrect, too.  Hope this helps.
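To illustrate why a hash-style check can be so much lighter on the WAN than a rehydrating verify, here is a minimal Python sketch. It is not Backup Exec's actual mechanism, which this thread does not document; the function names, the tiny chunk size, and the use of SHA-256 are all assumptions made for the example. Each side fingerprints its own chunks, and only the short digests would need to cross the link.

```python
import hashlib

CHUNK_SIZE = 4  # assumption: tiny chunks for the demo; real products use far larger segments


def fingerprints(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[str]:
    """Split data into fixed-size chunks and return one SHA-256 digest per chunk."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def verify_by_fingerprint(local: list[str], remote: list[str]) -> list[int]:
    """Return the indexes of chunks whose digests differ (empty list means a match)."""
    mismatches = [i for i, (a, b) in enumerate(zip(local, remote)) if a != b]
    # Chunks that exist on only one side also count as mismatches.
    shorter, longer = sorted((len(local), len(remote)))
    mismatches.extend(range(shorter, longer))
    return mismatches


# Hypothetical data standing in for a backup set and its two remote copies.
source = b"backup set payload"
good_copy = b"backup set payload"
bad_copy = b"backup sEt payload"  # one byte differs, inside the third chunk

print(verify_by_fingerprint(fingerprints(source), fingerprints(good_copy)))  # []
print(verify_by_fingerprint(fingerprints(source), fingerprints(bad_copy)))   # [2]
```

In this sketch each chunk costs a fixed 64 hex characters on the wire no matter how large the chunk is, which would be consistent with the very low network traffic described above.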


4 Replies


During a dedupe verify, data is rehydrated, and that's why you notice the slow performance.

 

EDIT**

Reference KB - http://www.symantec.com/docs/HOWTO21767

Dedupe does not work.....

....if I have to verify the information.   I've been testing this for weeks now!  I have a 20MB WAN connection between my sites.  Per all the Best Practices, and all the Symantec employees on these forums...you MUST RUN A VERIFY JOB ON ALL BACKED UP DATA.

http://www.symantec.com/business/support/index?page=content&id=HOWTO59043

The reason we invested in Backup Exec deduplication was to better 'automize' getting our data off site.  We did not want to rely on scheduling tape pickups and send-offs, so we decided to give deduplication a shot and leverage our WAN circuit, so that we could simply transmit all backed up data.

Basically, we do this to protect ourselves against fire or destruction at our office.  What I would not want to hear from Symantec, in the remote case that a disaster DOES happen, is:  "Sorry, but since you didn't verify your backup, we can't get your data."  However, even though my duplicate jobs are lightning fast, verify jobs are UNBELIEVABLY slow.  In fact, verify jobs are impossible over a WAN if you have to verify in the 'hundreds of gigs of data' range.  I could not create any scenario where I could get them to finish over a weekend.

Unless someone out there can give me some slick advice on what I'm doing wrong (I have yet to get any), my advice to administrators looking to replace their tape setup by deduplicating their data to their colo is to look elsewhere than Symantec Backup Exec Dedupe.
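A rough back-of-the-envelope calculation shows why a verify that rehydrates data across the link struggles to fit in a weekend. The sketch below assumes the "20MB" connection means 20 megabits per second and that the full data set crosses the WAN once; both are assumptions, as are the sizes used and the function name `transfer_hours`.

```python
def transfer_hours(data_gb: float, link_mbps: float, efficiency: float = 1.0) -> float:
    """Hours needed to move data_gb gigabytes over a link of link_mbps megabits per second.

    efficiency is the fraction of the nominal link speed actually usable (assumed value).
    """
    bits = data_gb * 1e9 * 8
    usable_bits_per_second = link_mbps * 1e6 * efficiency
    return bits / usable_bits_per_second / 3600


# Hypothetical sizes in the "hundreds of gigs" range over an assumed 20 Mbit/s link.
for size_gb in (100, 300, 500):
    print(f"{size_gb} GB -> {transfer_hours(size_gb, 20):.1f} h")
```

Under these assumptions 100 GB takes about 11 hours, 300 GB about 33 hours, and 500 GB about 56 hours at the full line rate. At 80% efficiency the 500 GB case rises to roughly 69 hours, which is longer than a window running from Friday evening to Monday morning.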


I've never turned on verify for any B2D or dedupe job, only for tape.  Disk is a lot more reliable as a medium than tape that is rotated offsite to unknown conditions.  So my rotation would be B2D2T(v)  Or B2D<->2D2T(v)

Symantec's statement is a general catch-all and they will make a best case effort regardless of verify being on or off to help you recover, short of bad media.  They don't provide data recovery services.

A better alternative to verify is to schedule random restores of data on a bi-weekly or monthly basis.  Restore an entire server, or an entire tape, to an alternate location and test its integrity.

Other than that, perhaps an OST appliance that does a data integrity check in real time and on its own schedule, like an EMC DataDomain box?  Forget the antiquated BackupExec tape technology, and use what the DD box is good at: ensuring safe, non-corrupt data that can be replicated offsite more efficiently than BE.
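The periodic test restores suggested above can be partly automated once the data has been restored to an alternate location. The following Python sketch compares a random sample of files by checksum; it is independent of Backup Exec, and the function names, sample size, and directory layout are assumptions for illustration.

```python
import hashlib
import random
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB blocks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(block)
    return digest.hexdigest()


def spot_check(source_root: Path, restored_root: Path, sample_size: int = 25, seed=None) -> list[str]:
    """Compare a random sample of files; return relative paths that are missing or differ."""
    files = sorted(p for p in source_root.rglob("*") if p.is_file())
    rng = random.Random(seed)
    sample = rng.sample(files, min(sample_size, len(files)))
    problems = []
    for src in sample:
        rel = src.relative_to(source_root)
        restored = restored_root / rel
        if not restored.is_file() or sha256_of(src) != sha256_of(restored):
            problems.append(rel.as_posix())
    return sorted(problems)
```

A call such as `spot_check(Path("/data/share"), Path("/restore-test/share"))` (both paths hypothetical) returns an empty list when every sampled file matches. Files that changed on the live source after the backup ran will also show up as differences, so a mismatch is a prompt to investigate rather than proof of a bad backup.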
