Solved: VCS Cluster not starting.

br1 · 10-23-2013

Hello All,
I am having difficulty getting VCS started on this system.
I have attached what I have so far. I appreciate any comments or suggestions on where to go from here.
Thank you

The hostnames in main.cf correspond to those of the servers.

hastatus -sum
VCS ERROR V-16-1-10600 Cannot connect to VCS engine
VCS WARNING V-16-1-11046 Local system not available

hasys -state
VCS ERROR V-16-1-10600 Cannot connect to VCS engine

hastop -all -force
VCS ERROR V-16-1-10600 Cannot connect to VCS engine

hastart   / hastart -onenode
dmesg: Exiting: Another copy of VCS may be running

engine_A.log
2013/10/22 15:16:43 VCS NOTICE V-16-1-11051 VCS engine join version=4.1000
2013/10/22 15:16:43 VCS NOTICE V-16-1-11052 VCS engine pstamp=4.1 03/03/05-14:58:00
2013/10/22 15:16:43 VCS NOTICE V-16-1-10114 Opening GAB library
2013/10/22 15:16:43 VCS NOTICE V-16-1-10619 'HAD' starting on: db1
2013/10/22 15:16:45 VCS INFO V-16-1-10125 GAB timeout set to 15000 ms
2013/10/22 15:17:00 VCS CRITICAL V-16-1-11306 Did not receive cluster membership, manual intervention may be needed for seeding

#gabconfig -a
GAB Port Memberships
===============================================================

#lltstat -nvv
LLT node information:
    Node                 State    Link  Status  Address
   * 0 db1               OPEN
                                  bge1   UP      00:03:BA:15
                                  bge2   UP      00:03:BA:15
     1 db2               CONNWAIT
                                  bge1   DOWN
                                  bge2   DOWN
bash-2.05$ lltconfig
LLT is running

ps -ef | grep had
root 826 1 0 15:16:43 ? 0:00 /opt/VRTSvcs/bin/had
root 836 1 0 15:16:45 ? 0:00 /opt/VRTSvcs/bin/hashadow

AHerr · 10-23-2013

If only one of the two nodes can connect through LLT (see your lltstat -nvv output, where one node is OPEN and the other is in CONNWAIT with both links DOWN), the cluster will attempt to start but will wait for both nodes to be available.

This ensures that a heartbeat disconnection (split-brain condition) does not leave you with 2 separate clusters running.

If this is a known condition, you can run the command

# gabconfig -c -x

This overrides the number of nodes needed to seed the cluster, so run it only if you are certain the other node does not already have a running cluster. You should also diagnose why the other node's heartbeat links are not visible from LLT.
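
As a rough way to see which peers LLT cannot reach before deciding to seed manually, the lltstat output can be filtered for nodes that are not OPEN. This is only a sketch: llt_peers_not_open is a hypothetical helper name, not a VCS command, and the parsing assumes the lltstat -nvv layout pasted earlier in this thread.

```shell
#!/bin/sh
# Sketch only: llt_peers_not_open is a hypothetical helper, not a VCS command.
# It reads `lltstat -nvv` output on stdin and prints "<node> <state>" for
# every named node whose state is not OPEN.
llt_peers_not_open() {
    awk '
        {
            # Node lines look like "* 0 db1 OPEN" (local) or "1 db2 CONNWAIT".
            if ($1 == "*") { id = $2; name = $3; state = $4 }
            else           { id = $1; name = $2; state = $3 }
            # Header and link lines have no numeric node ID and are skipped.
            if (id ~ /^[0-9]+$/ && state != "" && state != "OPEN")
                print name, state
        }
    '
}

# Example use on a cluster node (assumes lltstat is in PATH):
#   lltstat -nvv | llt_peers_not_open
```

Any node it lists still has to be confirmed down by other means (console, power state) before running gabconfig -c -x, since LLT only shows that the heartbeat links are not connected.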

br1 · 10-23-2013

That command worked, and the VCS server came up online.
Currently there are network issues with the secondary server.

Out of curiosity, why didn't hastart -onenode work?

Thank you very much.

g_lee · 10-23-2013

From the error you posted:

hastart / hastart -onenode
dmesg: Exiting: Another copy of VCS may be running

... unless you killed the previous instance of had/hastart (which would have started automatically on boot and was probably still running, waiting for GAB membership from the other node(s)), it looks like hastart -onenode failed because it found the other copy of had already running.

Note the following from the man page:

-onenode

Use this option only to start VCS on a single system where LLT and GAB are not required. Do not use this option to start VCS on a node in a multisystem cluster.

So you shouldn't be using this option to start had in a cluster with more than one node. Use the gabconfig -c -x procedure provided by AHerr if there are known LLT/network issues (note that you need to ensure the other node is definitely down, or you may end up with issues such as a split-brain cluster).
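
To illustrate the two conditions described above, here is a minimal sketch that checks whether had is already running and whether GAB has been seeded. The function names had_running and gab_seeded are hypothetical helpers, not VCS commands; the parsing assumes ps -ef output like that in the original post and the usual "Port a gen ... membership ..." lines from gabconfig -a.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical and read command output on stdin.

# Succeeds if any field of the `ps -ef` output is the had binary itself.
# "hashadow" does not match, because the field must end in "/had" or be "had".
had_running() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /(^|\/)had$/) found = 1 }
         END { exit !found }'
}

# Succeeds if `gabconfig -a` output shows a membership line for port a,
# i.e. GAB itself has seeded. An empty port list means it has not.
gab_seeded() {
    awk '$1 == "Port" && $2 == "a" && $5 == "membership" { found = 1 }
         END { exit !found }'
}

# Example use on a cluster node (assumes ps and gabconfig are in PATH):
#   if ps -ef | had_running; then
#       echo "had is already running; another hastart will exit"
#   fi
#   if ! gabconfig -a | gab_seeded; then
#       echo "GAB is not seeded; had will wait for cluster membership"
#   fi
```

In the situation from the original post, both messages would print: had was started at boot and was still waiting, and the gabconfig -a port list was empty.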
