Need Help with VCS monitoring a process as a resource

Chotchki
Level 2

Hey Everyone,

I have a cluster main.cf file as follows:

group BATCH-SG (
        SystemList = { foo = 0, bar = 1 }
        AutoStartList = { foo }
        OnlineRetryLimit = 3
        )

        IP ip3-RES (
                Device = bond0
                Address = "0.0.0.1"
                NetMask = "255.255.254.0"
                )

        NIC nic1-RES (
                Device = bond0
                )

        ProcessOnOnly DB_Watchdog-RES (
                PathName = "/app/watchdog/bin/db-watchdog-wrapper"
                UserName = ezbatch
                )

        ProcessOnOnly NAS_Watchdog-RES (
                PathName = "/app/watchdog/bin/nas-watchdog-wrapper"
                UserName = ezbatch
                )

        requires group Cluster_Storage-SG online local firm
        ip3-RES requires nic1-RES


        // resource dependency tree
        //
        //      group BATCH-SG
        //      {
        //      ProcessOnOnly DB_Watchdog-RES
        //      ProcessOnOnly NAS_Watchdog-RES
        //      IP ip3-RES
        //          {
        //          NIC nic1-RES
        //          }
        //      }
 

I'm trying to set it up so that when either ProcessOnOnly resource dies, the group faults and VCS initiates a failover. However, when either watchdog exits with status 100, VCS just restarts it. Any ideas on what I'm doing wrong?

3 REPLIES

mikebounds
Level 6
Partner Accredited

Check the value of RestartLimit:

hatype -display ProcessOnOnly -attribute RestartLimit

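If this is non-zero, VCS restarts the resource on the same node instead of faulting it. Set it to zero if you want VCS to fail over straight away rather than restart (run haconf -makerw first so the configuration is writable):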

hatype -modify ProcessOnOnly RestartLimit 0
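
Note that hatype -modify changes RestartLimit for every ProcessOnOnly resource in the cluster. If only the two watchdogs should behave this way, a per-resource override may be a better fit. The following is a minimal sketch, assuming RestartLimit can be overridden at resource level on this VCS version. The helper name restart_limit_override_cmds is hypothetical, and the script only prints the commands so they can be reviewed before being run on a cluster node:

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the VCS commands that override
# RestartLimit for a single resource instead of the whole resource type.
restart_limit_override_cmds() {
    res=$1
    echo "haconf -makerw"                       # make the configuration writable
    echo "hares -override $res RestartLimit"    # allow a per-resource value
    echo "hares -modify $res RestartLimit 0"    # fault immediately, no restart
    echo "haconf -dump -makero"                 # save and set back to read-only
}

# Resource name taken from the main.cf posted above.
restart_limit_override_cmds DB_Watchdog-RES
```

Once the printed commands look right, the output can be piped to sh on a cluster node, and the same call repeated for NAS_Watchdog-RES.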

Mike

mikebounds
Level 6
Partner Accredited

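Ah, I've just seen that you have OnlineRetryLimit set on the group. With it set to 3, VCS tries to restart the faulted group on the same node up to three times before failing over, so you need to set it to zero:

haconf -makerw
hagrp -modify BATCH-SG OnlineRetryLimit 0
haconf -dump -makero

VCS will then fail over the group.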
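
For reference, here is a sketch of how the group header in main.cf would be expected to look after the dump. It assumes 0 is the default for OnlineRetryLimit and that haconf -dump leaves out attributes at their default value, so the line simply disappears rather than showing as 0:

```
group BATCH-SG (
        SystemList = { foo = 0, bar = 1 }
        AutoStartList = { foo }
        )
```

The resource definitions and dependency lines below the header stay as they were.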

Mike

Chotchki
Level 2

Thank you, that fixed the issue.