SLP

Arun_K
Level 6

We have SLPs in our environment:

Copy 1 goes to Data Domain.

Copy 2 goes to tape.

Our tape library was down for 4-5 days.

Now I want to initiate all my duplication jobs. How can I check which ones are pending, and how do I kick them off?

 

nbstlutil

1 ACCEPTED SOLUTION

mph999
Level 6
Employee Accredited
When I used to run an environment, we split the backups into multiple separate environments, with no single environment being busy for more than 16 hours in a day. That way we had capacity to catch up if we had any issues, plus headroom for extra backups and restores. Sure, backups failed, but most were successful on the rerun, so in any 24-hour period we maintained an average of 98.4% across about 2,500 servers. Additionally, all database servers had three days' worth of disk space for the redo logs, just in case of an issue with backups.

We never had any major issues, as there was always the ability to move a client to another server, and if a backup server did go down we only lost 1/25 of the backups as opposed to 100%. Further, the load for any server was spread out, with some fulls running on a Monday, some on a Tuesday, some on a Wednesday, and so on; that way we never came close to overloading the network by running all the fulls on a Friday. The systems were all designed to do what they needed to do; importantly, they were designed on paper, not by guessing.

All this perhaps seems excessive, but given the importance of the data, and the potential financial loss if systems were down, it was cheap. I have zero sympathy when I am told that backups are down and the database will stop in 30 minutes, or that there are 5 hours to duplicate 8 hours' worth of data, because there was no allowance in the system for unexpected downtime.

Martin


7 REPLIES

mph999
Level 6
Employee Accredited
You can't kick them off; there is no supported command to do this. They should start to duplicate on their own. You can check what is pending with nbstlutil list -U. Martin
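For reference, a sketch of that check (the install path assumes a default Unix master server, and the exact output format varies by NetBackup version; the block is guarded so it is runnable anywhere, but it only does anything useful on a master server):

```shell
# Sketch: inspect the pending SLP backlog on a NetBackup master server.
# On Unix masters, nbstlutil lives under the admincmd directory.
NBBIN=/usr/openv/netbackup/bin/admincmd

# Typical checks on a master server:
#   $NBBIN/nbstlutil list -U                      # all SLP-managed images, verbose
#   $NBBIN/nbstlutil list -image_incomplete -U    # only images with copies still pending

if [ -x "$NBBIN/nbstlutil" ]; then
  "$NBBIN/nbstlutil" list -image_incomplete -U
else
  echo "nbstlutil not found; run this on the master server"
fi
```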

Arun_K
Level 6

I read a document stating that the oldest image is duplicated first:

 

SLPs duplicate your oldest images first. While an old backlog exists, your newest backups will not be duplicated. This can cause you to miss your Service Level Agreements (SLAs) for getting a copy of a backup to offsite storage.

mph999
Level 6
Employee Accredited
No. The robot going down (or whatever the fault was) caused you to miss the SLA. This is how SLP currently works, and there is no way around it. You could cancel the images and duplicate them manually; this would be the only option.
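A sketch of that manual route. The backup ID and storage unit name here are hypothetical placeholders, and note that cancelling an image's SLP processing means the copy you then make by hand is no longer managed by the SLP:

```shell
# Sketch: take one image out of SLP control and duplicate it by hand.
# BACKUPID and STU are hypothetical placeholders - substitute your own values.
NBADMIN=/usr/openv/netbackup/bin/admincmd
BACKUPID="client01_1400000000"
STU="tape_stu"

# On a master server you would run:
#   $NBADMIN/nbstlutil cancel -backupid $BACKUPID
#   $NBADMIN/bpduplicate -backupid $BACKUPID -dstunit $STU
# Guarded so the sketch runs anywhere:
if [ -x "$NBADMIN/nbstlutil" ]; then
  "$NBADMIN/nbstlutil" cancel -backupid "$BACKUPID"
  "$NBADMIN/bpduplicate" -backupid "$BACKUPID" -dstunit "$STU"
else
  echo "NetBackup admin commands not found; run on the master server"
fi
```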

Marianne
Moderator
Partner    VIP    Accredited Certified

Keep on reading... The document that you are quoting from contains more than just the lines you have copied:

 

2. To reduce (and ultimately get rid of) a backlog, your duplications must be able to catch up. Your duplications must be able to process more images than the new backups that are coming in. For duplications to get behind in the first place, they may not have sufficient hardware resources to keep up. To turn this around so that the duplications can catch up, you may need to add even more hardware or processing power than you would have needed to stay balanced in the first place.
The key to avoiding backlog is to ensure that images are being duplicated as fast as new backup images are coming in, over a strategic period of time.
......
As you introduce SLPs into your environment, monitor the backlog and ensure that it declines during periods when there are no backups running. Do not put more jobs under the control of SLPs unless you are satisfied that the backlog is reducing adequately.
Consider the following questions to prepare for and avoid backlog:
- Under normal operations, how soon should backups be fully duplicated? What are your Service Level Agreements? Determine a metric that works for the environment.
- Is the duplication environment (that includes hardware, networks, servers, I/O bandwidth, and so on) capable of meeting your business requirements? If your SLPs are configured to use duplication to make more than one copy, do your throughput estimates and resource planning account for all of those duplications?
- Do you have enough backup storage and duplication bandwidth to allow for downtime in your environment if there are problems?
- Have you planned for the additional time it will take to recover if a backlog situation does occur? After making changes to address the backlog, additional time will be needed for duplications to catch up with backups.
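The catch-up requirement quoted above is simple arithmetic: the backlog only shrinks if duplication throughput exceeds backup ingest, and the drain time is the backlog divided by that surplus. A back-of-envelope sketch with hypothetical numbers:

```shell
# Back-of-envelope backlog drain time (all numbers are hypothetical).
BACKLOG_TB=20        # data that piled up while the library was down
INGEST_TB_DAY=8      # new backup data arriving per day
DUP_TB_DAY=12        # duplication throughput per day

# Backlog only shrinks when DUP_TB_DAY > INGEST_TB_DAY.
SPARE=$((DUP_TB_DAY - INGEST_TB_DAY))          # net drain rate: 4 TB/day
DAYS=$(( (BACKLOG_TB + SPARE - 1) / SPARE ))   # ceiling division: 5 days
echo "Backlog clears in ~${DAYS} days"
```

If the spare capacity were zero or negative, the backlog would never clear, which is exactly the document's point about designing in headroom.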

PS: "Strange" how you and Nikhil/Puneet have exactly the same environment??

mph999
Level 6
Employee Accredited
Thank you for posting that, Marianne, regarding spare capacity if there are issues. Any backup system using any backup software can have downtime for many reasons; there should always be spare capacity to catch up, and if you do not have this then, in my opinion (and that of many experienced people I know), it is bad design. Still, it's always easier to just blame NetBackup. Martin

revarooo
Level 6
Employee

Hear, hear!
