Forum Discussion

bert_geiger
Moderator
11 years ago

VCS: NBU failover impeded by nagios monitoring - any experience?

OS: SUSE Linux; VCS: 5.1; NBU: 7.1, clustered Master Server in its own SG

Scenario: Nagios regularly runs various CLI commands against NBU to check, e.g., for long-running jobs, whether nbemm is still responding, etc.

It seems that these checks prevent VCS from switching or failing over the NBU SG. The "workaround" is to disable the Master Server monitoring in Nagios; I am looking for a way to disable these checks locally. Once I know how, I can have VCS do it. Does anyone have experience with this scenario?

 

Regards,

Bert

  • Hello Bert,

    Honestly, I haven't had a chance to work with this exact scenario, and I am no expert in NBU. Thinking about this issue, though: do we already have, or can we build, scripts at the Unix level to check for long-running jobs and whether nbemm is still responding?

    If yes, we can keep the scripts running in a loop in the background and use the VCS "Process" agent to monitor the handwritten scripts.

    The scripts need to use specific exit codes (110 = successful / 100 = unsuccessful).

     

    G
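
A minimal sketch of the exit-code convention mentioned in this reply, assuming the check is an external command whose own exit status is 0 on success. The function name `vcs_exit_code`, the placeholder check command, and the choice to treat a timeout or a missing binary as "unsuccessful" are illustrative assumptions, not taken from the thread or from NBU/VCS documentation.

```python
import subprocess
import sys

# Exit codes as quoted in the thread.
VCS_ONLINE = 110   # "successful"
VCS_OFFLINE = 100  # "unsuccessful"

# Hypothetical placeholder check that always succeeds; replace it with
# the real command that tests, for example, whether nbemm responds.
DEFAULT_CHECK = [sys.executable, "-c", "raise SystemExit(0)"]


def vcs_exit_code(check_cmd, timeout_s=30):
    """Run check_cmd and map its result to 110 (ok) or 100 (not ok).

    Assumption: a hung or missing check command counts as "unsuccessful".
    """
    try:
        result = subprocess.run(
            check_cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        # Timeout, or the command could not be started at all.
        return VCS_OFFLINE
    return VCS_ONLINE if result.returncode == 0 else VCS_OFFLINE


def main(argv):
    """Use the command given on the command line, else the placeholder."""
    return vcs_exit_code(argv or DEFAULT_CHECK)
```

To run this as a standalone monitor script, end the file with `sys.exit(main(sys.argv[1:]))` so that the process exit status carries the 110/100 value.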

  • Hi Gaurav,

    Thanks for your input! I was thinking along the same lines and found that it's a bit easier: Nagios (at least the way it's set up here) does not have a local agent, just a link from the local machine to the Nagios server. I will use a Process resource to kill the link (and re-establish it at startup, naturally). And thanks for reminding me of the "100/110" exit codes.

    Further comments are welcome, but I will consider it "solved" for now.

    Regards,

    Bert