Paranoia - Simplify Long Term Data Retention

Casey_Sterling
Level 4
We have requirements to retain backup data for a long period of time. Unfortunately, the only way I can figure to capture the data is to keep the monthly full backups of our filers (which are > 10TB), which, of course, is quite annoying. What I'd like to do is:
 
Full backup monthly - retained short term (say 2 months)
Incrementals daily    - retained short term (2 months)
Run a 30 day incremental - which retains files modified over the past 30 days, and retain those for 'a long time'.
 
Of course... Backup Exec can do this; out of the box, NetBackup cannot (as far as I can tell). Yep, I could write something that produces a list of files modified in the month and pipe that into bpbackup... but host-side script backups are unnerving, as I'm not the only one managing these hosts.
 
Yep, I'm also aware I could run a full backup at the beginning of the month, run differential incrementals, then run a cumulative incremental before the next full... however, let's just say I've had 'results may vary' using this method. For instance, if the cumulative fails for whatever reason over a weekend and then a full kicks in, I'm basically hosed. Using the date method is a better guarantee.
 
I'm guessing I'm stuck with bpbackup... but figured I'd at least ask...
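
For reference, the "list of files modified in the month" idea from the post above could look roughly like the sketch below. It is only an illustration: the function names, the 30-day window, and the list-file format are assumptions, and the step that hands the list to bpbackup is left out on purpose.

```python
import os
import time


def files_modified_since(root, days=30, now=None):
    """Return sorted paths under `root` whose mtime falls within the last `days` days."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400
    recent = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.stat(path).st_mtime >= cutoff:
                    recent.append(path)
            except OSError:
                # File vanished or is unreadable; skip it rather than abort the scan.
                continue
    return sorted(recent)


def write_file_list(paths, list_path):
    """Write one path per line so the list can be handed to a backup command."""
    with open(list_path, "w", encoding="utf-8") as fh:
        for path in paths:
            fh.write(path + "\n")
```

Note that this keys on modification time only, so a file moved or restored with an old timestamp would be missed, which is part of why host-side scripts like this need care.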
16 REPLIES

Ron_Cohn
Level 6
Casey,
 
If you run a full at the first of the month and run "cumulative incrementals" (not differential incrementals) for the remainder of the month, then the full and the last cumulative are all that you would need.  Even if a cumulative fails during the month, the next cumulative will pick up those missed changes.  Also, if a differential tape goes bad, you have lost that data, whereas a cumulative will still give you protection.

Casey_Sterling
Level 4

Excellent point - it's just the retentions and keeping policies simple that's the tough part.

In other words, we'd need:

full backup on day one = 10000-year retention
cumulatives = 2-month retention
30th day cumulative = 10000-year retention... (hope it runs, per legal requirements)
next full runs = 2-month retention

OR

full backup on day 1
cumulatives run
full runs
write simple script to find the last cumulative backup before a full and bpexpdate it...

I'm picking up what you're putting down; however, it would be nice just to create a schedule in a policy that states "just get the last 30 days of modified data" and be done with it...
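
The "find the last cumulative backup before a full" step from the second option could be sketched as below. This is a minimal illustration over hypothetical records: the `Image` fields and schedule-type strings are made up for the example, and reading the real image catalog or calling bpexpdate is not shown.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Image:
    backup_id: str       # hypothetical identifier for a backup image
    sched_type: str      # "full", "cumulative" or "differential"
    started: datetime


def last_cumulative_before_each_full(images):
    """For every full after the first, return the most recent cumulative preceding it.

    Raises ValueError when a cycle has no cumulative at all, since that is
    exactly the case where the long-term copy would silently be missing.
    """
    keep = []
    last_cumulative = None
    seen_full = False
    for img in sorted(images, key=lambda i: i.started):
        if img.sched_type == "cumulative":
            last_cumulative = img
        elif img.sched_type == "full":
            if seen_full:
                if last_cumulative is None:
                    raise ValueError(f"no cumulative found before full {img.backup_id}")
                keep.append(last_cumulative)
            seen_full = True
            last_cumulative = None
    return keep
```

The images returned are the candidates for a longer retention. Anything changed between that cumulative and the next full would still only live in the short-retention full.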

Darren_Dunham
Level 6
Looks to me like this is pretty simple as long as you split it up into two separate policies.

I'd take your first two requirements and put them into their own policy:
Full backup monthly - retained short term (say 2 months)
Incrementals daily    - retained short term (2 months)
Pretty simple here, monthly full, daily incremental, both with 2 month retention.
Then separate your other to a different policy:
Run a 30 day incremental - which retains files modified over the past 30 days, and retain those for 'a long time'.
So policy 2 should only need a single schedule, a monthly incremental backup with infinite retention.

Obviously this policy is more fragile in the face of bad media and means that old restorations may require a lot of tapes (because the files may have been saved at different times), but the implementation is pretty simple.

Also policies like this where full backups are not run periodically are more prone to run into issues where expected backups are not done (like where older files are moved to a new directory...  The files may have been backed up in the old directory, but the person doing the restore might not be able to find them because they would never get saved from the new location).

Good luck!

--
Darren

Ron_Cohn
Level 6
A follow-up question:  Are your 2-month retention and 10,000-year retention stored at the same site?

Casey_Sterling
Level 4
"Pretty simple here, monthly full, daily incremental, both with 2 month retention.
Then separate your other to a different policy:
Run a 30 day incremental - which retains files modified over the past 30 days, and retain those for 'a long time'."

That's the trick - that 'button' does not seem to exist as far as I can tell (30-day incremental from mod date). I can run a cumulative at the end of the month before the full (in a different schedule or policy)... however, I CAN'T stop the full from running if the previous cumulative fails, which is important, as all the cumulatives/incrementals before the last one will have a shorter retention. That would 'hose up' my legal requirements. Hence, the paranoia thread.

Takes me back to bpbackup (script) or bpexpdate (script)... no biggie writing one... just annoying.

Casey_Sterling
Level 4
lol - yeah - daily I drive them home and keep them all in a sock drawer. I have a lot of drawers...

Darren_Dunham
Level 6


@Casey Sterling wrote:
"Pretty simple here, monthly full, daily incremental, both with 2 month retention.
Then separate your other to a different policy:
Run a 30 day incremental - which retains files modified over the past 30 days, and retain those for 'a long time'."

That's the trick - that 'button' does not seem to exist as far as I can tell (30-day incremental from mod date). I can run a cumulative at the end of the month before the full (in a different schedule or policy)... however, I CAN'T stop the full from running if the previous cumulative fails, which is important, as all the cumulatives/incrementals before the last one will have a shorter retention. That would 'hose up' my legal requirements. Hence, the paranoia thread.

Takes me back to bpbackup (script) or bpexpdate (script)... no biggie writing one... just annoying.


I'm not sure what problem you see.  In Netbackup (unlike Networker if you're familiar with it), policies are independent.  So the fact that your first policy may run a full in the meantime has no effect on your "long-term" policy.

Just set it to run an incremental backup once a month.  The incremental will reference only this policy, which last ran a month ago.  That gives you a backup with "all the files changed in the past month".

Or have I misunderstood your goal here?

--
Darren
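
To make the reply above concrete, here is a toy model of date-based incrementals where each policy keeps its own "last run" time. It is a simplification for illustration only; the policy names, file paths, and dates are assumptions, not NetBackup internals.

```python
from datetime import datetime, timedelta


def select_by_mtime(files, last_run):
    """Return paths whose modification time is later than the policy's last run."""
    return sorted(path for path, mtime in files.items() if mtime > last_run)


start = datetime(2008, 1, 1)

# One hypothetical file modified at noon on each of 30 consecutive days.
files = {f"/data/file{d:02d}": start + timedelta(days=d, hours=12) for d in range(30)}

# Each policy tracks its own reference time, independent of the other.
last_run = {"short_term": start, "long_term": start}

# The short-term policy runs every day and advances only its own reference time.
for d in range(1, 31):
    now = start + timedelta(days=d)
    select_by_mtime(files, last_run["short_term"])
    last_run["short_term"] = now

# The long-term policy runs once, a month later, against its own reference time.
long_term_selection = select_by_mtime(files, last_run["long_term"])
print(len(long_term_selection))  # 30
```

Under this model the daily runs of the short-term policy do not shrink what the monthly policy picks up, which is the behaviour the reply describes for timestamp-based incrementals.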

Casey_Sterling
Level 4
That's what I thought as well at first, but when I tested it (set up a separate policy), the backup in the second policy did not extend across 30 days. Meaning, setting a cumulative or differential monthly from month 1 to month 2 did not capture 30 days of information; it only captured 1 day.

test setup was, policy 1:
fulls + incs retained for 2 months, effectively run daily.

policy 2
monthly incremental, retained 10000years


Ron_Cohn
Level 6
Darren,
 
Are you sure about the statement:
 
"I'm not sure what problem you see.  In Netbackup (unlike Networker if you're familiar with it), policies are independent.  So the fact that your first policy may run a full in the meantime has no effect on your "long-term" policy."
 
The policies may be independent; however, if you are using the archive bit (like I do), that should not be true.  For me, I have cumulatives going to one robot and fulls going to another.  So, in my case, I have 2 policies for 1 server.

Casey_Sterling
Level 4
quagmire...hmmm. I'm going to recreate this early test and will update accordingly in a few days.


edit...oh, and thanks for the information guys...much appreciated...


Message Edited by Casey Sterling on 01-10-2008 11:59 AM

Darren_Dunham
Level 6


Are you sure about the statement:
 
"I'm not sure what problem you see.  In Netbackup (unlike Networker if you're familiar with it), policies are independent.  So the fact that your first policy may run a full in the meantime has no effect on your "long-term" policy."
 
The policies may be independent; however, if you are using the archive bit (like I do), that should not be true.  For me, I have cumulatives going to one robot and fulls going to another.  So, in my case, I have 2 policies for 1 server.

So don't use the archive bit.  No, I wasn't thinking of that possibility with my response.  Most of my machines aren't Windows servers (most are NDMP appliances), so archive bits don't apply.

Yes, if you're relying on the archive bit, then I don't see a way to make this work easily.   Since there's only one bit, any type of special backup can throw off your normal backups. 

I can imagine a scheme where you clone the "end of month" cumulative (which can automatically extend the retention on the clone) and then immediately do the next month's full.  But that's always going to leave a window where files can be changed between the two.  Such files will not be captured in any later cumulative.

--
Darren
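
The archive-bit interference discussed above can be shown with a small simulation. It assumes, as a simplification, that every backup in the short-term policy clears the bit on what it saves; the paths and the 30-day loop are made up for the example.

```python
def run_backup(archive_bits, clear_bit):
    """Select every file whose archive bit is set; optionally clear the bit afterwards."""
    selected = sorted(path for path, bit in archive_bits.items() if bit)
    if clear_bit:
        for path in selected:
            archive_bits[path] = False
    return selected


archive_bits = {}
for day in range(1, 31):
    # A hypothetical file is modified each day, which sets its archive bit.
    archive_bits[f"/data/day{day:02d}.doc"] = True
    if day < 30:
        # Policy 1: the daily incremental saves the file and clears the bit.
        run_backup(archive_bits, clear_bit=True)

# Policy 2: the "monthly" incremental runs on day 30 and sees only bits still set.
monthly = run_backup(archive_bits, clear_bit=True)
print(monthly)  # ['/data/day30.doc']
```

Because there is only one bit per file, the monthly policy sees one day of changes rather than thirty, which matches the test result reported earlier in the thread.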

Casey_Sterling
Level 4
Yep, the two policies are 'competing'. I'd prefer not to move the hosts from using the archive bit to modified date, as, that's a manual change that would make these hosts non-standard; at least in comparison to the rest of the environment. I am going to test it however; moving a few candidate hosts from archive bit to modified date to see if that does clear it up.

Ron_Cohn
Level 6
Casey,
 
NetBackup Terminology:
 
Full backup - self explanatory
Differential - Changes since the last backup (whether full or not)
Cumulative - Changes since the last FULL backup.
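
A small sketch of the two incremental definitions above, using day numbers instead of real timestamps. The data layout (`changes` keyed by day, `history` as day/type pairs) is an assumption made for the example.

```python
def changed_since(changes, since_day, upto_day):
    """Collect files modified after `since_day`, up to and including `upto_day`."""
    out = set()
    for day, files in changes.items():
        if since_day < day <= upto_day:
            out |= files
    return sorted(out)


def differential(changes, history, day):
    """Changes since the last backup of any type before `day`."""
    last_any = max(d for d, _kind in history if d < day)
    return changed_since(changes, last_any, day)


def cumulative(changes, history, day):
    """Changes since the last FULL backup before `day`."""
    last_full = max(d for d, kind in history if kind == "full" and d < day)
    return changed_since(changes, last_full, day)


# Full on day 1, differentials on days 2 and 3, one file changed per day.
changes = {2: {"a"}, 3: {"b"}, 4: {"c"}}
history = [(1, "full"), (2, "differential"), (3, "differential")]

print(differential(changes, history, 4))  # ['c']
print(cumulative(changes, history, 4))    # ['a', 'b', 'c']
```

The cumulative on day 4 carries everything since the full, which is why a single failed or lost incremental matters less with cumulatives than with differentials.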
 
Policy structure: - 1 policy with 3 different schedules
 
One schedule which performs a full at the 1st of the month.  Use in-line copy to create 2 sets of tapes with different expiration dates
 
One schedule which performs your Daily Differentials.  If there is an error - DO NOT PERFORM A FULL!  Let the next differential pick up the changes.
 
One schedule which runs on the last day and is a Cumulative (remember - all changes since the last FULL backup).
 
Also, you are going to have to change the number of copies of each image you will keep (2 should suffice - 1 for the Differential and 1 for the Cumulative).


Message Edited by Ron Cohn on 01-11-2008 11:04 AM

Darren_Dunham
Level 6
You can't guarantee all the files end up in the cumulatives.  There's always going to be a gap between the previous cumulative and the next full.

--
Darren

Ron_Cohn
Level 6
"You can't guarantee all the files end up in the cumulatives. There's always going to be a gap between the previous cumulative and the next full."

Darren,

That is true in any situation. There is no way to make it so nothing is missed. If you run a full at the end of the month, data will be missed if any activity occurs on the server between the time the full executes and midnight. For 99.99% of companies, this is not an issue. So a cumulative is still the best response.

Casey_Sterling
Level 4
"One schedule which performs a full at the 1st of the month.  Use in-line copy to create 2 sets of tapes with different expiration dates
 
One schedule which performs your Daily Differentials.  If there is an error - DO NOT PERFORM A FULL!  Let the next differential pick up the changes."

"If there is an error - DO NOT PERFORM A FULL!" - is exactly what i'm worried about. it's not 'me' kicking off the full - netbackup could do it on it's own.

Notice how many schedules will be needed - just to capture the last 30 days of modified files. it's actually kind of rediculous. I should have mentioned I've been using netbackup since 3.2 and have ~50k tapes offsite. Not newbie...my bad. I need to keep our environment as simple as possible as we have lot's of data and am not the only admin in the system.

Yeah...it appears i'm stuck with bpbackup (user backup) as there really isn't a guaranteed way to capture from the scheduler.

Thanks for you're assistance guys.