Clarification: NBU 7.5 Dedupe retentions

Jess_M12
Level 2

Hi all,

Let’s say I have an SLP with the following config:

Backup           Dedupe STU                        FIXED 9 weeks
  Duplication    OFFSITE_DATA_CENTER_DEDUPE_STU    FIXED 9 weeks
    Duplication  OFFSITE_DATA_CENTER_TAPE          FIXED 9 weeks

Let’s say the initial backup is 30 GB and all steps complete successfully.

Now this runs every weekend and the data change is minimal. When the initial backup and its duplications expire, will the next backup/duplication job have to dedupe the full 30 GB across the WAN again?

 

Now take it a step further with a large data set of terabytes and a small WAN pipe of about 15 Mb. What should I do so I do not have to resend terabytes of data every 9 weeks? Just set a long retention?
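For scale, here is a rough back-of-the-envelope calculation (plain Python; it assumes the 15 Mb pipe means 15 Mbit/s and ignores protocol overhead and any dedupe savings) of what a full resend would cost:

```python
# Rough WAN transfer-time estimate -- illustrative only.
# Assumes a 15 Mbit/s link and no protocol overhead, compression,
# or deduplication savings.

def transfer_days(data_bytes: float, link_mbit_per_s: float) -> float:
    """Return the transfer time in days for data_bytes over the link."""
    seconds = (data_bytes * 8) / (link_mbit_per_s * 1_000_000)
    return seconds / 86_400

print(f"30 GB: {transfer_days(30e9, 15):.2f} days")  # ~0.19 days (~4.4 hours)
print(f" 1 TB: {transfer_days(1e12, 15):.2f} days")  # ~6.2 days
print(f" 5 TB: {transfer_days(5e12, 15):.2f} days")  # ~30.9 days
```

At that link speed even a single terabyte takes nearly a week, so avoiding a periodic full resend matters.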

Thanks for any information and clarification!

1 ACCEPTED SOLUTION

Andrew_Madsen
Level 6
Partner

Is the job a full or an incremental? If it is a full every week, there will be data that points to existing data in the dedupe pool. As long as there are pointers, the data will remain. Even if you run synthetic fulls against the dedupe data, it will remain as long as it is referenced.

In your scenario you back up locally to a deduplication pool and then duplicate (is it duplicate or replicate?) to a deduplication pool in another data center, keeping it for 9 months. We can ignore the tape-out. So the first full, say 100 GB, is taken in DC1 and replicated to DC2 undeduplicated, because everything is fresh. An incremental is then taken with little change, but it is deduplicated against what is in DC1. That is then duplicated to DC2, deduplicated against what is in DC2, and sent WAN-friendly. This happens for the rest of the week, and then your next full is processed.

This can be a synthetic or a standard full. If it is synthetic, the file pointers are changed to reflect the incrementals taken against the previous full, so the expiration timer that was set by the first full is reset for those objects that are used in the synthetic full.

If it is a standard full, the new full is deduplicated against the first dedupe pool in DC1. The data that was in the original full is now referenced by the second full as well, so it will need to stay for at least 9 more weeks. We then duplicate this to DC2, and the data stays around for 9 months there.

All in all, your backups will continue to be WAN-friendly even after the original images have expired. In a dedupe pool those images are basically a bunch of pointers to shared data blocks, nothing more.
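The pointer behaviour can be sketched in a few lines of Python. This is a toy model, not MSDP internals: a hypothetical content-addressed pool where an image is just a list of block fingerprints, and a block is freed only when no unexpired image references it.

```python
import hashlib
from collections import defaultdict

class DedupePool:
    """Toy content-addressed pool: an image is just a list of fingerprints."""
    def __init__(self):
        self.blocks = {}              # fingerprint -> block data
        self.refs = defaultdict(int)  # fingerprint -> reference count
        self.images = {}              # image name -> list of fingerprints

    def backup(self, name, data_blocks):
        """Store an image; return how many blocks were actually new."""
        new, fps = 0, []
        for block in data_blocks:
            fp = hashlib.sha256(block).hexdigest()
            if fp not in self.blocks:
                self.blocks[fp] = block
                new += 1
            self.refs[fp] += 1
            fps.append(fp)
        self.images[name] = fps
        return new

    def expire(self, name):
        """Expire an image; a block survives while any image still points at it."""
        for fp in self.images.pop(name):
            self.refs[fp] -= 1
            if self.refs[fp] == 0:
                del self.blocks[fp]
                del self.refs[fp]

pool = DedupePool()
week_data = [bytes([i]) * 4096 for i in range(100)]  # 100 distinct 4 KB blocks

print(pool.backup("full_week1", week_data))  # 100 -> everything is new
print(pool.backup("full_week2", week_data))  # 0   -> fully deduped
pool.expire("full_week1")                    # week2 still holds the pointers
print(pool.backup("full_week3", week_data))  # 0   -> still WAN-friendly
```

Expiring the first full changes nothing for the later fulls: the blocks stay in the pool because they are still referenced.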

 


6 REPLIES

Nicolai
Moderator
Moderator
Partner    VIP   

An SLP will try to complete the entire SLP as fast as possible. It will not wait 9 months before transferring data from OFFSITE_DATA_CENTER_DEDUPE_STU to OFFSITE_DATA_CENTER_TAPE.

I have heard that an SLP improvement underway in 7.6 will enable SLP duplication when a copy is near expiration.

In general, transferring data from an MSDP pool to tape requires the entire data set to be rehydrated.
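Continuing the toy DedupePool sketched in the accepted solution above (again, an illustration, not MSDP internals), rehydration is just rebuilding the full byte stream from the image's block pointers, which is why the tape copy always moves the image's full logical size:

```python
def rehydrate(pool, name):
    """Rebuild an image's full byte stream from its block pointers."""
    return b"".join(pool.blocks[fp] for fp in pool.images[name])

# "full_week3" added zero *new* blocks on disk, but duplicating it to
# tape still writes the full logical size of the image:
print(len(rehydrate(pool, "full_week3")))  # 409600 bytes, not ~0
```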

Jess_M12
Level 2

Hi Nicolai,

I just fixed my post for the retentions (9 months to 9 weeks). I can see that it muddled what I was asking.

I see it completing the SLP in the correct fashion: the backup completes, then it duplicates over the WAN, then it duplicates to tape.

So, just to confirm: after the 9-week retention on the original SLP, which deduped 30 GB, the data expires and the next backup will need to re-send the 30 GB over the WAN again, and the cycle begins again?

 

What are some suggestions for doing the same with terabytes of data? Should I set a retention of, say, 2 to 3 years on the initial backup, and then shorter retentions on subsequent backups?

 

Sorry if this is confusing or if I am overcomplicating it; I am just trying to wrap my head around the dedupe retentions.

Thanks for the reply!

 

 

Yasuhisa_Ishika
Level 6
Partner Accredited Certified

Well, it is not guaranteed that deduped data is kept after its expiration date. You should configure an infinite or long-term retention for the full backup, and keep subsequent full backups overlapping so that the full backups never all expire at once.
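This is the key caveat to the accepted answer: blocks survive only while some unexpired image references them. Rerunning the toy DedupePool from above (illustrative only) shows that weekly fulls with a 9-week retention keep an unbroken chain of references, while letting every referencing image expire forces a full resend:

```python
# Weekly fulls with a 9-week retention: at any moment several unexpired
# images reference the shared blocks, so the reference counts never hit
# zero and every new full dedupes to ~0 new data.
pool = DedupePool()
week_data = [bytes([i]) * 4096 for i in range(100)]

for week in range(1, 21):
    if week > 9:
        pool.expire(f"full_week{week - 9}")      # enforce the 9-week retention
    new = pool.backup(f"full_week{week}", week_data)
    assert new == (100 if week == 1 else 0)      # only week 1 sends data

# But if every referencing image is allowed to expire, the blocks are
# garbage-collected and the next full is a full resend over the WAN:
for name in list(pool.images):
    pool.expire(name)
print(pool.backup("full_restart", week_data))    # 100 -> full resend
```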

Jess_M12
Level 2

Hi Andrew,

To answer your question, and to help anyone else who might have it: that particular job is a standard full, duplicated to dedupe STUs.

Perfect explanation. I really appreciate the amount of effort you put into your response.

Thanks Andrew and Nicolai for your responses!

Jess_M12
Level 2

Hi Yasuhisa,

That was my thought for the original backup, and I will follow that rule for my large shares, provided the disk space is available.

Thanks for your response!