
Increasing GAB stable timeout values

We're having issues with the switches that carry our heartbeats and are thinking of increasing the GAB stable timeout to prevent unnecessary failover attempts. The switches seem to have some kind of spanning tree issue that causes the heartbeats to time out while certain ports are being remapped. The ideal solution would be to get our own private heartbeat switch, but that's not going to happen just yet. So are there any repercussions to increasing the GAB stable timeout to, say, 60 seconds? Also, I tried adding -t 60000 to /etc/gabtab, and the timeout didn't get set when I ran /etc/init.d/gab start/stop. Is there another place where this should be set?

Anyway, we're running VCS 5.0 on Solaris 10. Any answers would be greatly appreciated.


Level 2
In our in-house testing we have observed cluster membership problems with switches that have spanning tree algorithms turned on.
As you mentioned, the ideal setup is dedicated switches with spanning tree algorithms turned off for LLT heartbeats.

GAB stable timeout is the time GAB waits before computing a new membership, for example when the cluster starts up or when a node fails.
The default is 5 seconds, so after 5 seconds the master node computes the new membership and sends it out to all nodes in the cluster.
There is no stable membership in the cluster until the GAB stable timeout expires.

What this means is that no traffic can be sent over the network through GAB and LLT for this period. For example, if you are running Oracle, or CFS and CVM, and gab_stable_timeout is 60 seconds, then during a reconfiguration there will be no membership and no communication between cluster nodes for 60 seconds, and your application will be affected.
So increasing the GAB stable timeout is not advised.

If you do plan to increase it, please make sure it is a temporary arrangement until you get a dedicated switch set up for LLT heartbeats or turn off spanning tree algorithms.

As you mentioned, adding the -t option to /etc/gabtab should change the GAB stable timeout. I am not sure why it didn't get set for you.
Anyway, here are the steps to change it:

Step 1: Add -t 60000 to the gabconfig line in the /etc/gabtab file.
The file below is from a two-node cluster with a seed of 2.
[gab04]#cat /etc/gabtab
/sbin/gabconfig -c -n2 -t 60000
Step 2: Check the status of GAB.
If GAB is configured, go to step 3; otherwise go to step 4.
[gab04]#/etc/init.d/gab status
GAB: module is configured
Step 3: Stop GAB.
[gab04]#/etc/init.d/gab stop
Stopping GAB: 
Step 4: Start GAB.
[gab04]#/etc/init.d/gab start
Starting GAB...
Starting GAB done.
Step 5: Check gabconfig for the stable timeout.
[gab04]#gabconfig -l | grep Stable
Stable timeout (ms) : 60000
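
If the value still doesn't seem to take effect, it may help to compare what /etc/gabtab asks for with what the running GAB reports. The following is only a rough sketch, assuming a POSIX shell with sed available and assuming the gabtab line and the `gabconfig -l` output look like the examples above. The function names (gabtab_timeout, running_timeout, check_timeout) are made up for illustration and are not part of VCS.

```shell
#!/bin/sh
# Hypothetical helpers; they only parse text on stdin and do not touch GAB.

# Print the value that follows -t on the first uncommented gabconfig line
# of a gabtab file read from stdin (prints nothing if -t is absent).
gabtab_timeout() {
    sed -n 's/^[^#]*gabconfig.* -t *\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Print the stable timeout in ms from `gabconfig -l` output read from stdin.
# Assumes a line shaped like: "Stable timeout (ms) : 60000"
running_timeout() {
    sed -n 's/^[[:space:]]*Stable timeout (ms)[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Compare the configured value ($1) with the running value ($2).
check_timeout() {
    if [ -z "$1" ]; then
        echo "no -t in gabtab"
    elif [ "$1" = "$2" ]; then
        echo "ok: $1 ms"
    else
        echo "mismatch: gabtab=$1 running=${2:-unknown}"
    fi
}

# Intended use on a cluster node (left commented out on purpose):
#   want=$(gabtab_timeout < /etc/gabtab)
#   have=$(gabconfig -l | running_timeout)
#   check_timeout "$want" "$have"
```

A "mismatch" result after a stop and start would suggest that GAB was not actually restarted with the new gabtab line, which is worth ruling out before looking elsewhere.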