06-29-2011 02:36 AM
Hi All,
I have installed Solaris with Veritas Cluster on a blade server. The cluster works fine, but when I reboot both systems it does not start on its own; I have to start it with hastart on every cluster node. If anybody knows a solution, that would be really helpful.
Thanks & Regards,
06-30-2011 06:56 AM
Hi Sourabhcredible,
I'm more of a Windows guy, but it sounds like VCS is not set to start automatically at the run level you are booting to. I know that Solaris uses startup scripts to determine which services to start and which to stop when changing run levels. It sounds like the VCS startup scripts are not there.
Again, I'm a Windows guy, not a Unix guy, so I'm not sure exactly how VCS services are started on Solaris. I'm hoping that someone else familiar with VCS on Solaris will speak up and correct me if I'm wrong here.
Thanks,
Wally
09-23-2011 11:03 AM
Sourabhcredible,
You will want to check that your gabtab has the appropriate number of nodes in the settings. If you have two nodes, you should see a "-n 2". In my example below, I have 5 nodes in my cluster.
~ $ cat /etc/gabtab
/sbin/gabconfig -c -n5
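A rough sketch of how that check could be scripted, in case it helps. The function names are made up for illustration, and it assumes /etc/llthosts holds one "id hostname" line per node, which is the usual layout but worth confirming on your systems:

```shell
# seed_count [FILE]: print the N from the "gabconfig -c -nN" line.
# Defaults to /etc/gabtab; accepts both "-n5" and "-n 5".
seed_count() {
    sed -n 's/.*gabconfig.*-n *\([0-9][0-9]*\).*/\1/p' "${1:-/etc/gabtab}"
}

# node_count [FILE]: count the lines that start with a node ID.
# Defaults to /etc/llthosts (assumed format: "0 hsweb1").
node_count() {
    grep -c '^[0-9]' "${1:-/etc/llthosts}"
}
```

With both defined, something like `[ "$(seed_count)" = "$(node_count)" ] || echo "gabtab seed count differs from llthosts"` would flag a mismatch on each node.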
If you're running the latest VCS, you will want to check that the VCS services are enabled. If any are disabled, you'll want to enable them with 'svcadm enable'. In the example below, three services are not running: gab is disabled, which leaves vcs and vxfen offline because they depend on it:
~ $ svcs -xv llt gab vcs vxfen
svc:/system/llt:default (Veritas Low Latency Transport (LLT) Init service)
State: online since Thu Sep 22 16:40:50 2011
See: man -M /opt/VRTSllt/man/man1m/ -s 1M lltconfig
See: /var/svc/log/system-llt:default.log
Impact: None.
svc:/system/gab:default (Veritas Group Membership and Atomic Broadcast (GAB) Init service)
State: disabled since Thu Sep 22 16:37:38 2011
Reason: Disabled by an administrator.
See: http://sun.com/msg/SMF-8000-05
See: man -M /opt/VRTS/man/man1m/ -s 1M gabconfig
Impact: 2 dependent services are not running:
svc:/system/vcs:default
svc:/system/vxfen:default
svc:/system/vcs:default (Veritas Cluster Server (VCS) Init service)
State: offline since Thu Sep 22 16:37:38 2011
Reason: Service svc:/system/gab:default is disabled.
See: http://sun.com/msg/SMF-8000-GE
Path: svc:/system/vcs:default
svc:/system/gab:default
See: man -M /opt/VRTS/man/man1m/ -s 1M vcsconfig
Impact: This service is not running.
svc:/system/vxfen:default (Veritas I/O Fencing (VXFEN) Init service)
State: offline since Thu Sep 22 16:37:38 2011
Reason: Service svc:/system/gab:default is disabled.
See: http://sun.com/msg/SMF-8000-GE
Path: svc:/system/vxfen:default
svc:/system/gab:default
See: man -M /opt/VRTS/man/man1m/ -s 1M vxfenconfig
Impact: 1 dependent service is not running:
svc:/system/vcs:default
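As a small sketch of the enable step, the helper below (a hypothetical name, not part of VCS or SMF) reads "STATE FMRI" pairs and prints the 'svcadm enable' command for each disabled service. It only prints the commands, so you can review them before running anything:

```shell
# suggest_enable: read "STATE FMRI" lines on stdin and print an
# enable command for every service whose state is "disabled".
# Services that are merely offline are skipped, since they usually
# come up once the disabled service they depend on is enabled.
suggest_enable() {
    awk '$1 == "disabled" { print "svcadm enable " $2 }'
}
```

It could be fed with `svcs -H -o state,fmri llt gab vcs vxfen | suggest_enable`. For the output above it should print only the line for svc:/system/gab:default.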
If you're running an older version of VCS (5.0), you will want to check your startup scripts in the /etc/rc*.d directories (start scripts in rc2.d and rc3.d, kill scripts in rc0.d):
~ $ ls -al /etc/rc*.d/*vcs /etc/rc*.d/*llt /etc/rc*.d/*vxfen /etc/rc*.d/*gab
-rwxr--r-- 3 root sys 2414 Sep 26 2006 /etc/rc0.d/K10vcs
-rwxr--r-- 3 root sys 4731 Sep 18 2006 /etc/rc0.d/K15vxfen
-rwxr--r-- 3 root sys 1979 Nov 10 2005 /etc/rc0.d/K49gab
-rwxr--r-- 2 root sys 1539 Sep 29 2005 /etc/rc2.d/S70llt
-rwxr--r-- 3 root sys 1979 Nov 10 2005 /etc/rc2.d/S92gab
-rwxr--r-- 3 root sys 4731 Sep 18 2006 /etc/rc2.d/S97vxfen
-rwxr--r-- 3 root sys 2414 Sep 26 2006 /etc/rc3.d/S99vcs
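If you want to script that check, here is a minimal sketch. The function name is made up, and the script names and sequence numbers are simply copied from my listing above, so they may differ on other VCS versions:

```shell
# missing_start_scripts [ROOT]: print every expected start script
# that is missing or not executable under ROOT (default /etc).
# The names below are taken from one VCS 5.0 system and are an
# assumption for any other installation.
missing_start_scripts() {
    root="${1:-/etc}"
    for s in rc2.d/S70llt rc2.d/S92gab rc2.d/S97vxfen rc3.d/S99vcs; do
        [ -x "$root/$s" ] || echo "$root/$s"
    done
}
```

Running `missing_start_scripts` with no argument checks /etc; no output means all four start scripts are in place.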
If all that checks OK, you will want to check your LLT and GAB statuses:
/tmp $ lltstat -vvn | head -15
LLT node information:
Node State Link Status Address
0 hsweb1 OPEN
nxge2 UP 00:14:4F:6D:79:EA
e1000g2 UP 00:14:4F:86:4A:68
* 1 hsweb2 OPEN
nxge2 UP 00:14:4F:6D:D6:2A
e1000g2 UP 00:14:4F:81:B9:58
2 CONNWAIT
nxge2 DOWN
e1000g2 DOWN
3 CONNWAIT
nxge2 DOWN
e1000g2 DOWN
4 CONNWAIT
landuca@hsweb2 11:00:49
/tmp $ sudo gabconfig -a
GAB Port Memberships
===============================================================
Port a gen 8cdb05 membership 01
Port b gen 8cdb09 membership 01
Port h gen 8cdb79 membership 01
landuca@hsweb2 11:00:53
/tmp $ sudo lltconfig -a list
Link 0 (nxge2):
Node 0 hsweb1 : 00:14:4F:6D:79:EA
Node 1 hsweb2 : 00:14:4F:6D:D6:2A permanent
Link 1 (e1000g2):
Node 0 hsweb1 : 00:14:4F:86:4A:68
Node 1 hsweb2 : 00:14:4F:81:B9:58 permanent
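For reading the gabconfig -a output: port a is GAB itself, port b is I/O fencing and port h is the VCS engine (had). Here is a rough sketch that summarises those three ports; the function name is hypothetical, and port b will only show up if fencing is configured:

```shell
# check_gab_ports: read `gabconfig -a` output on stdin and report,
# for ports a, b and h, the membership string or "no membership".
check_gab_ports() {
    awk '
        # Lines look like: Port a gen 8cdb05 membership 01
        $1 == "Port" && $5 == "membership" { seen[$2] = $6 }
        END {
            n = split("a b h", want, " ")
            for (i = 1; i <= n; i++) {
                p = want[i]
                if (seen[p] != "")
                    print "port " p ": membership " seen[p]
                else
                    print "port " p ": no membership"
            }
        }'
}
```

Usage would be `gabconfig -a | check_gab_ports`. If port a has membership after a reboot but port h does not, GAB has seeded and the engine simply was not started.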
Check console messages for any failures within the startup procedure.
09-23-2011 02:50 PM
If you can start VCS using hastart, then I can't see this being a GAB issue, as running hastart would not resolve GAB seeding. So the issue is probably that hastart is not being run at boot, so check the settings as AlanTLR says above.
The other thing you can check is the engine log: if hastart is being run, it will show up in the engine log regardless of whether had fails immediately or waits for fencing or GAB.
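A minimal sketch for pulling the entries from the day of the reboot out of the engine log. The function name is made up, the default path is the usual location but may differ on your install, and it assumes each log line starts with a YYYY/MM/DD timestamp:

```shell
# engine_log_on DATE [FILE]: print engine log lines whose timestamp
# starts with DATE, e.g. 2011/09/23 (timestamp format is an assumption).
# FILE defaults to the usual engine log location.
engine_log_on() {
    grep "^$1" "${2:-/var/VRTSvcs/log/engine_A.log}"
}
```

For example, `engine_log_on 2011/09/23 | head` right after a reboot; no output at all for that date would suggest had was never started at boot.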
Mike