Netbackup AIR - restricting REPLICATION jobs

SYMAJ
Level 6
Partner Accredited

I have a situation where the link between two NBU sites / domains has been inactive for some time, and the link has now been upgraded and is available again.

I suspended secondary operations on the SLPs in the primary site during this period, and as a result now have a couple of thousand images awaiting replication to the secondary site.

Before I re-enable the secondary operations on the SLPs I want to ensure that I don't have hundreds of replication jobs starting at once, as I have seen this in the past and the system becomes unresponsive.

How can I limit the number of replication jobs that will be submitted? I want to keep this to a manageable number (say 20 concurrent). I know I can limit the number of streams on the Storage Server / Disk Pool, but I am afraid that the replications will all kick in and the backups will not have any available streams to work with.

Bottom line - I want to continue to allow my backups to run but also have a limited number of replication jobs running concurrently alongside them.

Any input appreciated.

AJ

13 REPLIES

Michael_G_Ander
Level 6
Certified

As I understand it, the number of replication jobs per SLP is the disk resource multiplier on the source (default 2) times the number of concurrent write jobs on the target STU.

So creating a STU with 1 concurrent write job and pointing the target SLP(s) to it should limit the active replication jobs to 2 per SLP. I would still suggest activating one SLP at a time.
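
To make that arithmetic concrete, here is a minimal shell sketch of the rule as described in this post (jobs per SLP = disk resource multiplier on the source times concurrent write jobs on the target STU). The function name is hypothetical, and the rule itself is this poster's understanding rather than a documented formula, so treat the result as an estimate only.

```shell
#!/bin/sh
# Hypothetical helper: estimates concurrent replication jobs per SLP using the
# rule of thumb from this post. It does not query NetBackup in any way.
# $1 = disk resource multiplier on the source (the post gives 2 as the default)
# $2 = maximum concurrent write jobs on the target STU
max_replication_jobs() {
    echo $(( $1 * $2 ))
}

max_replication_jobs 2 1    # prints 2
max_replication_jobs 2 10   # prints 20
```

If the rule holds, a target STU with 10 concurrent write jobs would correspond to the 20 concurrent replications asked about in the original question.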

Another way is to use nbstlutil to activate backup images one at a time. That gives more control, but is far more time consuming.

Hope this helps

 

The standard questions: Have you checked: 1) What has changed. 2) The manual 3) If there are any tech notes or VOX posts regarding the issue

SYMAJ
Level 6
Partner Accredited

Just to be sure I understand your post......

Are you saying there is no need to update anything on the source domain, and I should create a new STU on the remote (target) domain with only one concurrent write job, then point the IMPORT step of the SLP on the remote site to use this STU ?  This will result in 2 replications running concurrently from that SLP ?

Thanks,

AJ

Michael_G_Ander
Level 6
Certified

Yes, that is how I understand the replication to work.


SYMAJ
Level 6
Partner Accredited

OK - many thanks.

AJ

Phil_Scarfo1
Not applicable
Employee Accredited Certified

SYMAJ

I think what I would do is increase the max size per AIR replication job parameter to something like 5 TB ...

This way, no matter what is in the hopper, the larger job size will limit how many jobs get created.
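
As a rough illustration of why a larger maximum job size means fewer jobs, here is a small shell sketch. It assumes the backlog is simply packed into jobs of up to the maximum size, which ignores any other batching criteria NetBackup applies. The function name and the 40000 GB backlog figure are made up for the example.

```shell
#!/bin/sh
# Hypothetical estimate only: assumes the backlog is packed into jobs of up to
# the maximum size each. Real batching depends on other criteria as well, so
# treat the result as a rough figure.
# $1 = total backlog in GB (made-up figure)
# $2 = maximum size per replication job in GB
estimate_jobs() {
    # Integer ceiling division: (a + b - 1) / b
    echo $(( ($1 + $2 - 1) / $2 ))
}

estimate_jobs 40000 100    # prints 400
estimate_jobs 40000 5000   # prints 8
```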

Let us know what you finally decide to do ... Good Luck !!

*note ... it looks like all my symaccount info has been wiped.  I am an employee, accredited and certified.
 

Andrew_Madsen
Level 6
Partner

SYMAJ,

If you suspended operations, creating a different STU will not help you, as the original STU is in the version of the SLP the original backups were run under. Follow Phil's suggestion and you will create fewer but bigger jobs, which is good.

cruisen
Level 6
Partner Accredited

Hello Symaj,

From the NetBackup 7.1 Best Practice - Using SLPs and AIR.pdf

Reduce backlog by delaying or canceling the duplication of the older images.

SLPs duplicate the oldest images first. Until the backlog is cleared up, your new backups will not be
duplicated. The easiest way to reduce the backlog is to use nbstlutil to delay or cancel processing of
old images.

Delaying duplications is a good way of clearing a backlog caused by a temporary lack of resources (for example if a tape library is down), stopping work being queued while new resources are being installed and allowing more urgent duplications to be processed ahead of older, less urgent ones. It does not solve the problem of a continuously growing backlog caused by having more duplications than the available resources can manage.
 

nbstlutil inactive - using different qualifiers, this command can be used to delay pending duplications by suspending processing for a particular SLP (-lifecycle), storage destination (-destination) or image (-backupid).

The command nbstlutil pendimplist can be run on the target master server to view images that have been successfully duplicated but have not yet been imported into the target domain (for example because the import storage unit is marked 'down' for some reason).

Use the following entries in the LIFECYCLE_PARAMETERS file to tune the size and frequency of
duplication jobs. Changes to these parameters will take effect immediately without restarting any
NetBackup services:

Best regards,

Cruisen

SYMAJ
Level 6
Partner Accredited

Andrew - were we not talking about changing the SLP on the remote site only? Surely if this is the case and the secondary operations are suspended on the primary site, it will not have picked up the version of the SLP in the remote site yet? So - if I change the SLP on the remote site to use a different STU it should be honoured when I activate the secondary operations (replicate) on the primary site. Correct?

Cruisen - all of my images are in the same SLP, which makes your suggestion unworkable I think......

I think I will change the MAX SIZE PER AIR REPLICATION to a huge number as per Phil and see what happens.

I will let you know how I get on........

AJ

cruisen
Level 6
Partner Accredited

Hello, I do not understand what you mean by "all of my images are in the same SLP".

Did you run nbstlutil stlilist? It shows the status for incomplete copies of lifecycle managed images.

nbstlutil active | inactive | cancel -backupid id_value

# nbstlutil stlilist -U

Display the information for an incomplete lifecycle image in user-readable output.

# nbstlutil stlilist -U
Image abc_1225727928 for Lifecycle SLP_Test1 is IN_PROCESS

OPTIONS

-after mm/dd/yyyy HH:MM:SS => Restricts the SLP secondary operation to only those backups started after the specified date-time.

-b Lists only the backup IDs.

 
-backupid value 

Specifies the backup ID whose images are to be processed.

-before mm/dd/yyyy HH:MM:SS

etc ....
nbstlutil stlilist -image_incomplete -U ==> this command displays details of the unfinished copies sorted by age, and can be used to determine both how long the images have been in the backlog and the names of the individual images.
You need to put the corresponding backupid in the inactive state.
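
One way to combine this with the earlier suggestion of activating images a few at a time is to keep a list of the backlogged backup IDs and release them in small batches. The sketch below is a dry run only: it reads backup IDs from standard input and prints the nbstlutil active -backupid commands for the first batch without running anything. The function name, the batch size and the backup IDs are made up for illustration.

```shell
#!/bin/sh
# Dry-run sketch: reads backup IDs (one per line) on stdin and prints an
# "nbstlutil active -backupid" command for the first $1 of them.
# Nothing is executed; review the output before running it on the master server.
print_activate_batch() {
    batch_size=$1
    count=0
    while IFS= read -r backup_id; do
        # Skip blank lines in the input list
        [ -n "$backup_id" ] || continue
        # Stop once the batch is full
        [ "$count" -lt "$batch_size" ] || break
        echo "nbstlutil active -backupid $backup_id"
        count=$(( count + 1 ))
    done
}

# Example with made-up backup IDs and a batch size of 2:
printf '%s\n' client1_1225727928 client2_1225727999 client3_1225728100 |
    print_activate_batch 2
```

Once the first batch has replicated, the same list (minus the IDs already released) can be fed through again for the next batch.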
 
best regards,
Cruisen

 

 

onepranav
Level 4

Hello,

Did you ever find a way to achieve this?
I'd love to know if there is a way to let backups continue but limit the replication operations somehow.

sdo
Moderator
Partner    VIP    Certified

@onepranav limit in what way(s)? Max source concurrent replication jobs? Max target concurrent replication jobs? Run/scheduling windows? Max network bandwidth? Max quantity/volume of data GB/TB?

Going back to my original query -

Setup -
SLP tied to a storage unit -> that storage unit is against a disk pool -> disk pool is set up for MFR-based Data Domain to Data Domain replications.

The number of backups/replications that can run is controlled via the stream limit on the disk pools or the SLP parameters. When we change the stream count, it affects both backup and replication jobs. Say I have 100 streams set up for my disk pool. My question is whether there is a way to limit the replication jobs to only 20 of the streams and let backups use the other 80.