Is there any setting for failover in NBU??

liverpool2507
Level 4

Hi Experts,

 

I am testing the failover scenario with NBU 7.6.1.

I am backing up an SMB share (network share). My client, which is the source of my data, is mounted on the Windows 2008 NBU master server and has two nodes: node A and node B.

Issue:

If I run a backup from node A and do a failover from the cluster while the backup is running, the backup stops. It doesn't fail but is partially successful. NBU backs up the data up to the point when the failover is executed.

Activity monitor says "backup is partially successful" and says that "the specified path no longer exists". 

In an ideal scenario, when a failover is run on node A, it should automatically switch to node B without affecting the backup.

However, if I take a backup (without failover) that is successful, then do a restore and fail over node A while the restore is running, the restore is successful. So the issue occurs only when the failover is done while the backup is running.

 

I want to know if there is any setting in NBU that needs to be edited in order to do a successful backup when a failover is run in the middle of a running backup.

Kindly help and suggest.

 

Cheers!!

Accepted Solutions

Marianne
Moderator
Partner    VIP    Accredited Certified

A cluster does not give you 'always on'. It gives you failover.

Failover means that the application and all resources are taken offline on the failed node and brought online on the standby node. 
You now have a master server that can either restart or resume backups.

What happens to the jobs after failover depends on the type of job, the progress at the time of failure, and the 'Schedule backup attempts' setting (# tries per ## hours) in Global Attributes.

sdo
Moderator
Partner    VIP    Certified

When failover occurs, NetBackup services must be stopped and restarted, which interrupts *all* jobs (backups, duplications, replications, restores).

To limit the impact on the backup job, you could try setting 'checkpoint' on the backup policy, which should allow you to restart the backup job from the last successful checkpoint.

liverpool2507
Level 4

Hi sdo, 

As per your suggestion, I tried setting the checkpoint in the policy attributes. However, I encountered the same issue.

Below is the error message that I see in the Activity Monitor.

 

9/3/2015 5:46:22 AM - begin writing
9/3/2015 5:48:28 AM - Error bpbrm(pid=14720) from client ssdcl498.ibrix.hp: ERR - failure reading file: \\ISVUN01Y.USA.HP.COM\NKSHARE\ndmp\file3 (WIN32 64: The specified network name is no longer available. )
9/3/2015 5:49:46 AM - Info bpbkar32(pid=15668) bpbkar waited 48 times for empty buffer, delayed 48 times.   
9/3/2015 5:49:46 AM - Info bptm(pid=12916) waited for full buffer 3247 times, delayed 6989 times    
9/3/2015 5:49:48 AM - Info bptm(pid=12916) EXITING with status 0 <----------        
9/3/2015 5:49:48 AM - Info bpbrm(pid=14720) validating image for client ssdcl498.ibrix.hp        
9/3/2015 5:49:48 AM - Info bpbkar32(pid=15668) done. status: 1: the requested operation was partially successful    
9/3/2015 5:49:48 AM - end writing; write time: 0:03:26
the requested operation was partially successful (1)

The job was successfully completed, but some files may have been
busy or inaccessible. See the problems report or the client's logs for more details.
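
For anyone reading a job log like the one above, the following is a minimal Python sketch for pulling out the files that could not be read and the Win32 error code reported for each. It is not a NetBackup tool: the function name `find_read_failures` and the regular expression are assumptions based only on the format of the single error line shown in this post, so other log lines may need a different pattern.

```python
import re

# Pattern inferred from the one "failure reading file" line in this thread;
# it is an assumption, not a documented NetBackup log format.
READ_FAILURE = re.compile(
    r"failure reading file: (?P<path>.+?) \(WIN32 (?P<code>\d+): (?P<msg>.*?)\s*\)"
)


def find_read_failures(log_text):
    """Return a list of (path, win32_code, message) for each read failure."""
    return [
        (m.group("path"), int(m.group("code")), m.group("msg"))
        for m in READ_FAILURE.finditer(log_text)
    ]


# The error line from the Activity Monitor output, copied as-is.
sample = r"9/3/2015 5:48:28 AM - Error bpbrm(pid=14720) from client ssdcl498.ibrix.hp: ERR - failure reading file: \\ISVUN01Y.USA.HP.COM\NKSHARE\ndmp\file3 (WIN32 64: The specified network name is no longer available. )"

for path, code, msg in find_read_failures(sample):
    print(code, path, msg)
```

In this log the only read failure is Win32 error 64 on the UNC path, which suggests the share itself became unreachable during the failover.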

liverpool2507
Level 4

Hi Marianne,

The policy type that I have selected in the NBU policy is MS-Windows.

Schedule backup attempts in Global Attributes is 1 try per 24 hours.

I have tried setting up the checkpoint as suggested by sdo, but no gain.

I have pasted the output of the Activity Monitor in my reply to sdo. Kindly have a look and see if you can suggest anything.

 

Thanks

Marianne
Moderator
Partner    VIP    Accredited Certified

With 1 try every 24 hours, NetBackup will not try to restart or resume the backup.

Only when the number of tries is more than 1 within the backup window will the backup be retried.
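
To illustrate the point about tries, here is a rough Python sketch of a "tries per hours" limit. It is a simplified model of the idea only, not NetBackup's actual scheduler logic; the function `retry_allowed` and its parameters are hypothetical names introduced for this example, and it ignores the backup window.

```python
from datetime import datetime, timedelta


def retry_allowed(attempts, now, max_tries, window_hours):
    """Return True if another attempt fits within the tries-per-hours limit.

    attempts: datetimes of earlier attempts for this job (hypothetical input).
    """
    window_start = now - timedelta(hours=window_hours)
    # Count only the attempts that started inside the sliding window.
    recent = [t for t in attempts if t > window_start]
    return len(recent) < max_tries


# Times taken from the job log earlier in this thread.
first_try = datetime(2015, 9, 3, 5, 46, 22)
now = datetime(2015, 9, 3, 5, 50, 0)

print(retry_allowed([first_try], now, max_tries=1, window_hours=24))  # False
print(retry_allowed([first_try], now, max_tries=3, window_hours=1))   # True
```

With 1 try per 24 hours, the first attempt already uses up the allowance, so nothing is retried after the failover. With 3 tries per 1 hour, two more attempts remain.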

liverpool2507
Level 4

I have changed the tries per hour now (3 per 1 hour). I'm not sure if that's really going to help (I may be wrong).

What I do not understand is that with the same settings as before, a restore works fine (if a failover is performed while the restore is running).

 

sdo
Moderator
Partner    VIP    Certified

Ok - am I right in thinking that this UNC path:

\\ISVUN01Y.USA.HP.COM\NKSHARE\ndmp\file3

...is accessible from one of the clustered master server nodes, but not from the other node?  If so, then you'll need to investigate that, because that in itself is not a NetBackup issue.

RiaanBadenhorst
Moderator
Partner    VIP    Accredited Certified

What did you use for the client in the policy?

liverpool2507
Level 4

The client that I have selected is the master server, as my share is mounted on the NBU master server. It is an SMB network share that I need to back up.

Marianne
Moderator
Partner    VIP    Accredited Certified
Did you select the master server's virtual hostname in the policy? Is the UNC path accessible from both nodes? Is the client service on both nodes started with sufficient access permissions to the share?
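
One quick way to check UNC access from both nodes is a small script run on each node in turn. The Python sketch below is only an illustration: `check_path` is a hypothetical helper, and the directory is taken from the error line posted earlier. It tests access for whichever account runs the script, so to be meaningful it would need to run as the same account the NetBackup client service uses.

```python
import os
import socket


def check_path(path):
    """Return (ok, detail) for whether `path` can be listed by the current account."""
    try:
        entries = os.listdir(path)
        return True, f"{len(entries)} entries visible"
    except OSError as exc:
        # Covers a missing path, denied permissions, and network errors alike.
        return False, f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    # Directory from the job log in this thread; adjust for your share.
    unc = r"\\ISVUN01Y.USA.HP.COM\NKSHARE\ndmp"
    ok, detail = check_path(unc)
    status = "OK" if ok else "FAILED"
    print(f"{socket.gethostname()}: {unc} -> {status} ({detail})")
```

If one node reports OK and the other reports FAILED, the problem lies in share access from that node rather than in NetBackup.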