Netbackup Master server on VCS

tarmizi
Level 5
Partner Accredited Certified

One of our customers has recently had an issue with their NetBackup master server. Their environment has two NetBackup master server nodes clustered with Veritas Cluster Server. The problem occurs when the master server fails over to the passive node: the NetBackup services are not brought up automatically, even though the failover itself is successful. We have to run "bpup -v -f" to start all of the services manually.
Please advise. Thank you.

1 ACCEPTED SOLUTION

Accepted Solutions

tarmizi
Level 5
Partner Accredited Certified

Hi Marianne,

Thank you for the advice. We logged a case with Veritas for this issue under the VCS (SFW HA) product.

It turned out that the password for the VCS Helper service account needed to be re-entered on both nodes, and everything is fine now. We managed to perform a manual failover and run backups on each node.

Regards


13 REPLIES

RiaanBadenhorst
Moderator
Partner    VIP    Accredited Certified

You'll probably find that a single service is failing, or that it is taking longer to start up than VCS expects. Set the NetBackup Server resource to non-critical. Then restart or fail over the service group and monitor the Event Viewer application/system events to see if there is a complaint about any service that doesn't start. If there are none, it's probably just a timeout issue and you can increase the online timeout of the NetBackup resource.

You can also control which resource VCS monitors if you look in the registry but try the above first.
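A minimal sketch of the two changes suggested above, written as a small Python helper that only builds the VCS command lines (it does not run them). The resource name "NetBackup_Server-NBU", the type name "NetBackup" and the 600-second timeout are assumptions for illustration; check the actual names in your cluster configuration before using anything like this.

```python
def tuning_commands(resource, resource_type, online_timeout_s=None):
    """Return VCS CLI calls (as argv lists) that mark a resource non-critical
    and, optionally, raise the OnlineTimeout of its resource type."""
    if online_timeout_s is not None and online_timeout_s <= 0:
        raise ValueError("online_timeout_s must be a positive number of seconds")

    commands = [
        ["haconf", "-makerw"],  # open the cluster configuration for writing
        ["hares", "-modify", resource, "Critical", "0"],  # resource becomes non-critical
    ]
    if online_timeout_s is not None:
        # Only needed if Event Viewer shows no failing service (likely a timeout)
        commands.append(
            ["hatype", "-modify", resource_type, "OnlineTimeout", str(online_timeout_s)]
        )
    commands.append(["haconf", "-dump", "-makero"])  # save and close the configuration
    return commands


# Hypothetical names: replace with the resource and type from your own cluster.
for argv in tuning_commands("NetBackup_Server-NBU", "NetBackup", 600):
    print(" ".join(argv))
```

Printing the commands first, rather than executing them, leaves room to review each one before it is run on a cluster node.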

Marianne
Moderator
Partner    VIP    Accredited Certified

Are you referring to automatic failover due to a fault of a critical resource?
Or a switch to the other node (manually initiated)?

Can you confirm that the NetBackup resource was online at the time of failover?
Was the NetBackup resource perhaps in faulted state? If so, did you manually clear the fault?

Faults on non-persistent resources are not automatically cleared.
The reason for that is that 'something' happened for the resource to fault. You need to troubleshoot the reason and fix it, then clear the fault.

A manual switch to another node will online a service group to the same state as it was on the source node.

Are your backup staff using VCS commands or the VCS GUI to start/stop NBU?
In a clustered environment, VCS commands/utilities should be used, not NBU commands.

Well, I am only guessing here.
To really know what happened we need to see the Engine_A log.

Please extract all the lines from slightly before and after the failover and paste in a text file (e.g. engine.txt) and upload here. 
Please do not upload the entire file as that is normally too big.
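One way to pull out just the lines around the failover is sketched below. It assumes each engine log entry starts with a timestamp in the form YYYY/MM/DD HH:MM:SS; that layout, the 10-minute window and the sample lines are assumptions for illustration, not real log output.

```python
from datetime import datetime, timedelta

STAMP_FORMAT = "%Y/%m/%d %H:%M:%S"  # assumed timestamp layout at the start of each entry
STAMP_LENGTH = 19                   # len("2017/05/10 10:15:00")


def lines_around(lines, failover_time, minutes=10):
    """Keep entries stamped within +/- `minutes` of failover_time.
    Lines without a timestamp follow the entry they belong to."""
    window = timedelta(minutes=minutes)
    keep = False
    selected = []
    for line in lines:
        try:
            stamp = datetime.strptime(line[:STAMP_LENGTH], STAMP_FORMAT)
        except ValueError:
            pass  # continuation line: reuse the decision made for the previous entry
        else:
            keep = abs(stamp - failover_time) <= window
        if keep:
            selected.append(line)
    return selected


# Made-up sample entries, only to show the behaviour.
sample = [
    "2017/05/10 09:00:00 VCS INFO old entry",
    "2017/05/10 10:14:00 VCS ERROR resource faulted",
    "    continuation detail",
    "2017/05/10 10:20:00 VCS INFO group online",
    "2017/05/10 12:00:00 VCS INFO later entry",
]
for line in lines_around(sample, datetime(2017, 5, 10, 10, 15, 0)):
    print(line)
```

In practice the `lines` argument would be the engine log read with `open(...).read().splitlines()`, and the result written to a small file such as engine.txt.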

tarmizi
Level 5
Partner Accredited Certified

Hi all,

We went through the VCS nodes and can see that a service named "Veritas Cluster Server Helper" is stopped. What does it do? Could this be related?

Thank you

Veritas cluster server helper.JPG

Marianne
Moderator
Partner    VIP    Accredited Certified

You need to check Event Viewer logs to see why the service was stopped.

Here is an explanation of the VCS Helper service:

https://www.veritas.com/support/en_US/article.TECH31331

Please look at the Engine_A log, or extract the relevant lines and post them here.
The reason for VCS-related 'events' is normally clearly logged in this file.

tarmizi
Level 5
Partner Accredited Certified

Hi Marianne,

In response to your question: it was an automatic failover from node0 to node1. Out of curiosity, we then performed another failover (manually triggered) from node1 back to node0, but the NetBackup services did not come up, which is why we ran the "bpup -v -f" command. We also found an error in Event Viewer on node0 (shown below) that might be what triggered the automatic failover in VCS.

Thank you

EventViewer.JPG

Marianne
Moderator
Partner    VIP    Accredited Certified

The Event Viewer error is a Microsoft error.

Please provide the Engine log as requested earlier.

tarmizi
Level 5
Partner Accredited Certified

Hi Marianne,

One silly question, how can we collect the Engine log?

Thank you

Marianne
Moderator
Partner    VIP    Accredited Certified

The engine and agent logs are located at %VCS_HOME%\log\

VCS_HOME is typically C:\Program Files\Veritas\Cluster Server
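A small sketch of how that location could be resolved in a script. The file name "engine_A.txt" is an assumption (check the log directory for the actual name), and the default path is simply the typical location mentioned above.

```python
import ntpath  # Windows path rules, regardless of where the script runs
import os

DEFAULT_VCS_HOME = r"C:\Program Files\Veritas\Cluster Server"  # typical location


def engine_log_path(environ=os.environ, log_name="engine_A.txt"):
    """Build the engine log path from VCS_HOME, falling back to the typical default.
    log_name is an assumed file name, not confirmed in this thread."""
    vcs_home = environ.get("VCS_HOME", DEFAULT_VCS_HOME)
    return ntpath.join(vcs_home, "log", log_name)


print(engine_log_path())
```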

 

tarmizi
Level 5
Partner Accredited Certified

Hi Marianne,

Here is the log from node0.

Thank you


Marianne
Moderator
Partner    VIP    Accredited Certified

@tarmizi just another 2c from my side :

If you and your customer need to manage and maintain any kind of application on VCS, then it is extremely important that you understand the basics of VCS. There are various options available for online and classroom training.

 

Marianne
Moderator
Partner    VIP    Accredited Certified

There seems to be no resource for NetBackup itself. 

Resource dependencies work like this (I will try my best to sketch a text version of a service group here):

                               NetBackup
                           (NBU services)
                                   / \
                                 /     \
                          Lanman        Mount 
                     (virtual name)    (Drive letter)
                         /                     \
                       IP                  DiskGroup
                       /
                     NIC

Resources are onlined from the bottom up.

After Lanman and Mount resources have onlined successfully, the NetBackup resource must be onlined.
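The bottom-up ordering in the sketch above can be illustrated with a topological sort. This is only a model of the picture, using its generic names (NetBackup, Lanman, Mount, IP, DiskGroup, NIC) rather than the real resource names in the cluster, and it needs Python 3.9 or later for graphlib.

```python
from graphlib import TopologicalSorter

# Each resource maps to the resources it requires (its children in the sketch).
REQUIRES = {
    "NetBackup": {"Lanman", "Mount"},
    "Lanman": {"IP"},
    "IP": {"NIC"},
    "Mount": {"DiskGroup"},
}


def online_order(requires):
    """Return one valid online order: every resource comes after all it requires."""
    return list(TopologicalSorter(requires).static_order())


order = online_order(REQUIRES)
print(" -> ".join(order))
```

Whatever order is printed, NetBackup always comes last, because every other resource in the group sits below it. Offlining would run through the same list in reverse.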

The NetBackup resource seems to be missing from the cluster config.

I see these resources going offline and online (you can map them to my picture above):

NIC is a persistent resource (always online) - no need to online or offline.
NetBackup_Server-IP
NetBackup_Server-Lanman
NetBackup_Server-MountV-H
NetBackup_Server-VMDg-H

You can see that there is no NetBackup resource, i.e. something like NetBackup_server-NBU.

This command output will confirm configured resources: 
hastatus -sum

Who performed the NetBackup Clustered installation and configuration?

Was it not checked and tested before handing over to the customer? 

I found this TN that could explain the absence of a NetBackup resource:
https://www.veritas.com/support/en_US/article.000014833

(The TN is quite old, but it is worth checking with Veritas Support to see if it is still relevant.)

In all honesty, if this is the case, then someone has not done his/her job during initial testing.

 

tarmizi
Level 5
Partner Accredited Certified

Hi Marianne,

Thank you for the advice. We logged a case with Veritas for this issue under the VCS (SFW HA) product.

It turned out that the password for the VCS Helper service account needed to be re-entered on both nodes, and everything is fine now. We managed to perform a manual failover and run backups on each node.

Regards

Marianne
Moderator
Partner    VIP    Accredited Certified

Glad it was sorted.

I'm still wondering about the NBU resource. There was no reference to it in the Engine log.

Did it magically appear when the VCS helper account was fixed?