Forum Discussion

tarmizi
Level 5
7 years ago

Netbackup Master server on VCS

One of our customers has recently had an issue with their NetBackup master server. Their environment has two NetBackup master server nodes clustered with Veritas Cluster Server (VCS). The problem occurs when the master server fails over to the passive node: the NetBackup services are not brought up automatically, even though the failover itself was successful. We have to run the command "bpup -v -f" to start all of the services manually.
Please advise. Thank you.

  • tarmizi
    7 years ago

    Hi Marianne,

    Thank you for the advice. We logged a case with Veritas for this issue under the VCS (SFHWA) product.

    It turned out that the password for the VCS Helper service needed to be re-entered on both nodes, and everything is fine now. We managed to perform a manual failover and run a backup on each node.

    Regards

13 Replies

  • You'll probably find that a single service is failing, or that it is taking longer to start up than VCS expects. Set the NetBackup Server resource to non-critical. Then restart or fail over the service group and monitor the Event Viewer application/system events to see if there is a complaint about any service that doesn't start. If there are none, it's probably just a timeout issue and you can increase the online timeout of the NetBackup resource.

    You can also control which resource VCS monitors if you look in the registry but try the above first.
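As a rough sketch of the two changes suggested above (marking the resource non-critical and raising its online timeout), the following Python helper only builds the VCS command lines and prints them for review; it does not run anything. The resource name `NBU_Server_Res` and the 600-second value are hypothetical placeholders, and the sketch assumes the timeout is raised for this one resource by overriding the static `OnlineTimeout` attribute rather than changing it for the whole resource type.

```python
def build_vcs_tuning_commands(resource, online_timeout_seconds):
    """Return the VCS CLI commands, in order, as argument lists.

    `resource` is the name of the NetBackup resource in the service group
    (hypothetical here; look up the real name with `hares -list`).
    """
    if online_timeout_seconds <= 0:
        raise ValueError("online_timeout_seconds must be positive")
    return [
        # Open the cluster configuration for writing.
        ["haconf", "-makerw"],
        # Mark the resource non-critical so its fault does not fail the group over.
        ["hares", "-modify", resource, "Critical", "0"],
        # OnlineTimeout is a static (type-level) attribute, so override it
        # for this resource before setting a resource-specific value.
        ["hares", "-override", resource, "OnlineTimeout"],
        ["hares", "-modify", resource, "OnlineTimeout", str(online_timeout_seconds)],
        # Save the configuration and make it read-only again.
        ["haconf", "-dump", "-makero"],
    ]


def format_commands(commands):
    """Join each argument list into a printable command line."""
    return [" ".join(args) for args in commands]


# Hypothetical resource name and timeout, printed for review only.
for line in format_commands(build_vcs_tuning_commands("NBU_Server_Res", 600)):
    print(line)
```

Printing the commands first makes it easy to check the resource name against the actual service group before anything is changed. Remember to set `Critical` back to 1 once troubleshooting is finished.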

  • Are you referring to automatic failover due to a fault of a critical resource?
    Or a switch to the other node (manually initiated)?

    Can you confirm that the NetBackup resource was online at the time of failover?
    Was the NetBackup resource perhaps in faulted state? If so, did you manually clear the fault?

    Faults on non-persistent resources are not automatically cleared.
    The reason for that is that 'something' happened for the resource to fault. You need to troubleshoot the reason and fix it, then clear the fault.

    A manual switch to another node will online a service group to the same state as it was on the source node.

    Are your backup staff using VCS commands or the VCS GUI to start/stop NBU?
    In a clustered environment, VCS commands/utilities should be used, not NBU commands.

    Well, I am only guessing here.
    To really know what happened we need to see the Engine_A log.

    Please extract all the lines from slightly before and after the failover, paste them into a text file (e.g. engine.txt) and upload it here.
    Please do not upload the entire file as that is normally too big.
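One possible way to cut such an excerpt is sketched below in Python. It assumes each Engine log entry starts with a timestamp of the form `YYYY/MM/DD HH:MM:SS`; lines without a timestamp are treated as continuations of the entry before them. The 10-minute window, the function names and any file paths you pass in are illustrative choices, not anything prescribed by VCS.

```python
import re
from datetime import datetime, timedelta

# Assumed timestamp prefix of an Engine log entry, e.g. "2018/03/01 10:00:05".
TIMESTAMP = re.compile(r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})")


def extract_window(lines, failover_time, before_minutes=10, after_minutes=10):
    """Return the log lines whose timestamps fall around failover_time."""
    start = failover_time - timedelta(minutes=before_minutes)
    end = failover_time + timedelta(minutes=after_minutes)
    selected = []
    keep = False
    for line in lines:
        match = TIMESTAMP.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S")
            keep = start <= stamp <= end
        # Lines without a timestamp inherit the decision of the previous entry.
        if keep:
            selected.append(line)
    return selected


def write_excerpt(source_path, target_path, failover_time):
    """Read the Engine log at source_path and write the excerpt to target_path."""
    with open(source_path, "r", errors="replace") as src:
        excerpt = extract_window(src, failover_time)
    with open(target_path, "w") as dst:
        dst.writelines(excerpt)
    return len(excerpt)
```

Calling `write_excerpt` with the path to the Engine_A log, the name `engine.txt` and the approximate failover time as a `datetime` would produce a small file suitable for uploading.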

    • tarmizi
      Level 5

      Hi Marianne,

      In response to your query: it was an automatic failover from node0 to node1. Then, out of curiosity, we performed another failover (manually triggered) from node1 back to node0, but the NetBackup services did not come up, which is why we ran the "bpup -v -f" command. We also found an error in Event Viewer on node0 (shown below) that might be what triggered the automatic failover in VCS.

      Thank you

      • Marianne
        Level 6

        The Event Viewer error is a Microsoft error.

        Please provide the Engine log as per previous requests.

  • Hi all,

    We have gone through the VCS nodes and can see that a service named "Veritas Cluster Server Helper" is stopped. What does it do? Could this be related?

    Thank you