SFHA 5.1 for Windows - GCO failover query & Cluster Service Group

MikeCampbell
Level 3

Hello,

I wonder if anyone can help. We are trying to determine the correct failover behaviour we should expect between two clusters in a GCO config.

We have built an environment of two clusters (C1 & C2), each with 2 nodes, joined with GCO. Failover for the hosted applications is thus: A1 to A2 and B1 to B2. There is no local failover within a cluster, except for the Cluster Service group, which can run on either node within its local cluster.

============SITE A

Cluster 1 - C1                         

NODEA1 - Application Service Group A PRIMARY NODE

NODEB1 - Application Service Group B PRIMARY NODE

 

============SITE B

Cluster 2 - C2                        

NODEA2 - Application Service Group A FAILOVER NODE

NODEB2 - Application Service Group B FAILOVER NODE

 

Now if we shut down or reboot a node that is NOT running the Cluster Service group (just an application service group), then the application service group fails over to the remote cluster.

But...

If we shut down or reboot a node that IS running the Cluster Service group and an application service group, then that application service group does not fail over but remains offline on both clusters, and the Cluster Service group (of that particular cluster) fails over to the remaining local node.

It seems then that ownership of the Cluster Service group changes the failover behaviour of the application during a controlled shutdown / reboot of that node.

I have tried to explain it briefly! :)

ClusterFailOverPolicy is set to Auto.

 

Anyway, can we set it so that if the node carrying the Cluster Service group shuts down or reboots, it first moves its application service group to the remote cluster and then shuts itself down?

 

cheers

Accepted Solutions

Wally_Heim
Level 6
Employee

Hi Mike,

When you shut down the node that hosts the ClusterService group, the connection between the two servers is broken. In this case, Cluster 1 is not able to transmit the status of the group to the remote cluster for it to act on.

The issue is that GCO/VCS keeps track of the shutdown process, and a normal shutdown might not trigger a failover. There are some timeouts that come into play, and it can take about 10 minutes for GCO to determine whether a site failover is needed. A lot of servers will reboot within this 10-minute window.

You mentioned that the group stayed offline in both sites after this test was completed. You should set the AutoStartList on both service groups in both clusters. This will ensure that the cluster node will restart the service group if it is not already running elsewhere in the cluster or GCO configuration.
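
As a rough illustration only, the AutoStartList setting might look something like this in main.cf on cluster C1. The group name AppSG_A is hypothetical, and the SystemList and ClusterList values are assumptions based on the layout you described, so check them against your actual configuration:

```
// Hypothetical fragment for cluster C1; AppSG_A is an illustrative name.
// Resource definitions and other group attributes are omitted.
group AppSG_A (
    SystemList = { NODEA1 = 0 }        // assumed: no local failover target
    ClusterList = { C1 = 0, C2 = 1 }   // assumed: C1 preferred, C2 remote
    AutoStartList = { NODEA1 }         // node allowed to autostart the group
    ClusterFailOverPolicy = Auto       // as stated in the question
    )
```

On C2 the equivalent group definition would presumably list NODEA2 in its SystemList and AutoStartList, with a matching definition for Application Service Group B on NODEB1 and NODEB2.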

Thanks,

Wally
