Forum Discussion

mkruer
Level 4
12 years ago

VCS AutoStartList ungraceful failover.

I have a cluster set up and everything seems to be working as expected except in one test case: an ungraceful shutdown. The outstanding issue seems to be with the ungraceful shutdown. At this tim...
  • arangari
    12 years ago

    The 'AutoStartList' attribute is used only on a node-joining event. For example, for a failover group, when all of the nodes in its SystemList have joined the cluster, AutoStartList is set, and the AutoStart attribute is 1 (the default), VCS initiates the online of the service group. The order of AutoStartList is considered when choosing the group's possible target.
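
To make the node-joining rule concrete, here is a minimal Python sketch of the selection described above. It is only an illustration of the logic in this reply, not VCS code: the function name `pick_autostart_target` and its parameters are hypothetical, and it assumes the target is simply the first joined system in AutoStartList order. Real VCS applies further conditions that are not modeled here.

```python
from typing import Optional, Sequence, Set


def pick_autostart_target(
    system_list: Sequence[str],
    auto_start_list: Sequence[str],
    joined: Set[str],
    auto_start: int = 1,
) -> Optional[str]:
    """Return the system to online a failover group on, or None.

    Hypothetical model of the rule in the reply above; not a VCS API.
    """
    # With AutoStart set to 0, or with no AutoStartList, nothing is started.
    if auto_start != 1 or not auto_start_list:
        return None
    # Wait until every node in SystemList has joined the cluster.
    if not all(node in joined for node in system_list):
        return None
    # Assumption: take the first joined system in AutoStartList order.
    for node in auto_start_list:
        if node in joined:
            return node
    return None


if __name__ == "__main__":
    # sysA and sysB are placeholder system names.
    print(pick_autostart_target(["sysA", "sysB"], ["sysB", "sysA"], {"sysA", "sysB"}))
```

Note that nothing in this sketch runs on a node fault. That matches the point of the reply: AutoStartList matters when nodes join, not when one fails.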

    On the node fault (for System B above), if System A has not brought the resources online, it is most likely that the fault was detected after ShutdownTimeout.

    The groups are failed over to another node on a node fault only when the following happens:

    1. Node A: port-h is closed ungracefully (HAD dies). The node goes into the DDNA (Daemon Dead, Node Alive) state, and all other nodes mark the SGs configured on this node as 'AutoDisabled' to avoid any concurrency violation.

    2. Node A leaves port-a membership within ShutdownTimeout seconds. At this point the other nodes consider node A to be down, 'AutoEnable' the SGs configured on node A, and start the failover action.

    2.a. If node A does not leave port-a membership within ShutdownTimeout, VCS keeps the groups in the AutoDisabled state to protect against concurrency violations. To get out of this situation, confirm that Node A is indeed down (or that the applications are not running on this node), then issue the 'hagrp -autoenable' command followed by the 'hagrp -online' command.
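
The three cases above can be summarized in a short Python sketch. It is a simplified model of the decision as described in this reply, not how HAD is implemented. The function names, the return strings, and the names `appsg`, `sysA` and `sysB` are hypothetical placeholders. The command strings assume the usual `hagrp -autoenable <group> -sys <system>` and `hagrp -online <group> -sys <system>` forms, so check them against the hagrp manual page for your VCS version before running anything.

```python
from typing import List, Optional


def node_fault_action(
    had_died: bool,
    port_a_left_after: Optional[float],
    shutdown_timeout: float,
) -> str:
    """Model what the surviving nodes do with the faulted node's groups.

    port_a_left_after is the number of seconds between HAD dying and the
    node leaving port-a membership, or None if it never left.
    """
    if not had_died:
        return "no-action"
    # Step 2: port-a membership dropped in time, so failover proceeds.
    if port_a_left_after is not None and port_a_left_after <= shutdown_timeout:
        return "autoenable-and-failover"
    # Step 2.a: groups stay AutoDisabled until an administrator steps in.
    return "stay-autodisabled"


def manual_recovery_commands(group: str, faulted_sys: str, target_sys: str) -> List[str]:
    """Commands for step 2.a, to run only after confirming the node is down."""
    return [
        f"hagrp -autoenable {group} -sys {faulted_sys}",
        f"hagrp -online {group} -sys {target_sys}",
    ]


if __name__ == "__main__":
    # 600 is just an example value for ShutdownTimeout here.
    if node_fault_action(True, None, 600) == "stay-autodisabled":
        for cmd in manual_recovery_commands("appsg", "sysA", "sysB"):
            print(cmd)
```

The sketch only prints the commands rather than executing them, because the confirmation that the faulted node is really down has to be done by a person.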