Forum Discussion

Cian_McCarthy
13 years ago

Having trouble with getting VCS 'had' to start.

 

I installed VCS 6.0.2 on RHEL 6.1 and chose not to start LLT and GAB during configuration, so that it would be configured as a standalone single-node cluster.
However, 'had' failed to start, and the error message at the end of installation was as follows : 
"It is likely that the startup failure issues will be resolved after rebooting the system. If issues persist after reboot, contact Symantec technical support or refer to installation guide for further troubleshooting."
 
I restarted the system, but unfortunately 'had' still would not start after running 'hastart'.
The file /var/VRTSvcs/log/engine_A.log does indicate that 'had' started :
 
2012/11/13 21:24:27 VCS NOTICE V-16-1-11022 VCS engine (had) started
2012/11/13 21:24:27 VCS NOTICE V-16-1-11050 VCS engine version=6.0
2012/11/13 21:24:27 VCS NOTICE V-16-1-11051 VCS engine join version=6.0.10.0
2012/11/13 21:24:27 VCS NOTICE V-16-1-11052 VCS engine pstamp=6.0.100.000-GA-2012-07-20-16.30.01
2012/11/13 21:24:27 VCS NOTICE V-16-1-10114 Opening GAB library
2012/11/13 21:24:27 VCS INFO V-16-1-10196 Cluster logger started
 
but 'ps -ef | grep had' returns no entry.
 
The file /var/VRTSvcs/log/hastart.log has this entry for the startup attempt :
 
File was modified after system boot,
boottime = <1352828351>, modifytime = <1352828394>
[root@localhost log]# view engine_A.log
 
  • If you have not configured LLT and GAB, then to start VCS you must use:

    hastart -onenode

    Running "hastart" on its own, without the "-onenode" flag, will not work because GAB is not configured.

    This should work on bootup as the RC scripts have something like:

    if [ "$gabconfigured" = "Configured" ]; then
        $HASTART
    else
        $HASTART -onenode
    fi

    One additional thing to check is that you have "ONENODE=yes" inside the file /etc/sysconfig/vcs.

    Mike
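
As a rough illustration of the boot-time choice described above, the start command could be picked from the ONENODE setting. This is only a sketch, not the actual RC script: `choose_start_cmd` is a made-up helper, and keying on ONENODE (rather than on the GAB check in the fragment above) is an assumption.

```shell
# Sketch only: choose_start_cmd is a hypothetical helper, not part of VCS.
# It reads a sysconfig-style file (default: /etc/sysconfig/vcs, the path
# named in the reply above) and prints the start command that matches it.
choose_start_cmd() {
    conf="${1:-/etc/sysconfig/vcs}"
    # An uncommented ONENODE=yes line is taken to mean "single-node mode".
    if [ -r "$conf" ] && grep -Eq '^[[:space:]]*ONENODE=("yes"|yes)[[:space:]]*$' "$conf"; then
        echo "hastart -onenode"
    else
        echo "hastart"
    fi
}
```

Calling `choose_start_cmd` with no argument inspects /etc/sysconfig/vcs; passing another path makes it easy to try against a copy of the file.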

  • Thanks very much Tushar and mikebounds for your suggestions.

     

    I tried 'hastart -onenode', but unfortunately 'had' is still not in the list of processes :

     

     

    [root@localhost log]# ps -ef | grep had
    root     11547  4049  0 10:20 pts/1    00:00:00 grep had
     
    [root@localhost log]# tail -n 4 hastart.log
    boottime = <1352828351>, modifytime = <1352828394>
    Wed Nov 14 10:15:20 GMT 2012
    File was modified after system boot,
    boottime = <1352828351>, modifytime = <1352841867>
     
    [root@localhost log]# tail -n 10 engine_A.log
    2012/11/13 21:24:27 VCS NOTICE V-16-1-10114 Opening GAB library
    2012/11/13 21:24:27 VCS INFO V-16-1-10196 Cluster logger started
    2012/11/14 10:15:20 VCS NOTICE V-16-1-11022 VCS engine (had) started
    2012/11/14 10:15:20 VCS NOTICE V-16-1-11027 VCS engine startup arguments=-onenode 
    2012/11/14 10:15:20 VCS NOTICE V-16-1-11050 VCS engine version=6.0
    2012/11/14 10:15:20 VCS NOTICE V-16-1-11051 VCS engine join version=6.0.10.0
    2012/11/14 10:15:20 VCS NOTICE V-16-1-11052 VCS engine pstamp=6.0.100.000-GA-2012-07-20-16.30.01
    2012/11/14 10:15:20 VCS NOTICE V-16-1-10115 Using GABSIM
    2012/11/14 10:15:20 VCS NOTICE V-16-1-14032 Forming a one node cluster
    2012/11/14 10:15:20 VCS INFO V-16-1-10196 Cluster logger started
     

    I wonder if the problem is license-related. I am using 60-day trialware, which I originally downloaded and installed on the same machine back in Jan/Feb 2012. Shortly after that, I was tasked with something else, and I've only recently been reassigned to investigate VCS use in a one-node cluster. I did a fresh install of RHEL6 and of VCS 6.0.2, but perhaps the original installation is flagged as license-expired based on the MAC address of my NIC?

    Then again, maybe I wouldn't have gotten that far with the install if there were a licensing issue?

  • I don't think the issue is licensing, as that would show in the engine log (something like "VCS is not licensed"). 6.0 comes with its own 60-day eval, and even after 60 days VCS will still work; you will just get warnings in the log. You can check whether you have the built-in keyless 60-day eval by using:

     

    vxkeyless -v display

    Can you provide the following:

    Output from command "gabconfig -a" (expecting it to say nothing)

    Contents of main.cf (in /etc/VRTSvcs/conf/config)

    Usually, if "had" can't start, the engine log says why, so it looks like "had" is just dying rather than exiting.

    Mike
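
Since the engine log is where a failed start would normally explain itself, it can help to look only at the most recent start attempt. The sketch below assumes the message texts seen in the engine_A.log excerpts in this thread ("VCS engine (had) started" and a state change "to RUNNING"); `last_start_attempt` and `reached_running` are hypothetical helper names.

```shell
# Sketch only: both functions are hypothetical helpers, not VCS tools.

# Print every log line from the most recent "VCS engine (had) started" onward.
last_start_attempt() {
    awk '/VCS engine \(had\) started/ { n = 0 }   # new attempt: discard older lines
         { buf[n++] = $0 }
         END { for (i = 0; i < n; i++) print buf[i] }' "$1"
}

# Succeed only if that attempt logged a transition to the RUNNING state.
reached_running() {
    last_start_attempt "$1" | grep -q 'changed state from .* to RUNNING'
}
```

For example, `reached_running /var/VRTSvcs/log/engine_A.log || last_start_attempt /var/VRTSvcs/log/engine_A.log` would print the lines of the latest attempt only when it never reached RUNNING.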

  •  

    Thanks Mike, see the output of the commands below :
     
     
    [root@localhost bin]# /opt/VRTSvlic/bin/vxkeyless -v display
    Product  Description
    VCS      Cluster Server
     
    [root@localhost bin]# ./vxlicrep
     
    Symantec License Manager vxlicrep utility version 3.02.61.004
    Copyright (C) 1996-2012 Symantec Corporation. All rights reserved.
     
    Creating a report on all VERITAS products installed on this system
     
     -----------------***********************-----------------
     
       License Key                         = AJZE-RC8Z-XACN-2KXT-U4PP-PPPP-PPPP-RNPD-P
       Product Name                        = VERITAS Cluster Server
       Serial Number                       = 35044
       License Type                        = PERMANENT
       OEM ID                              = 2006
       Site License                        = YES
       Editions Product                    = YES
     
     Features := 
       Platform                            = Unused                             
       Version                             = 6.0                                
       Tier                                = Unused                             
       Reserved                            = 0 
     
       Mode                                = VCS                                
       CPU_TIER                            = 0
       VXKEYLESS                           = Enabled
     
    [root@localhost bin]# gabconfig -a
    GAB gabconfig ERROR V-15-2-25022 unknown error
     
    [root@localhost bin]# cat /etc/VRTSvcs/conf/config/main.cf
    include "types.cf"
    include "OracleTypes.cf"
    include "OracleASMTypes.cf"
    include "Db2udbTypes.cf"
    include "SybaseTypes.cf"
     
    cluster cork-vcs-node1 (
            UserNames = { "admin" = fopHojOlpKppNxpJom, "openet" = JiiQjcHhiPihIg }
            Administrators = { "admin", "openet" }
            CounterInterval = 5
            )
     
    system localhost (
            )
     
  • Hi Mike,

    Thanks again for your prompt response, and insightful troubleshooting.

    I changed hostname from localhost to vcs-node1 in files 'main.cf', '/etc/sysconfig/network' and '/etc/hosts', and rebooted the server.

    It appears to have resolved the issue :

     

     

    [root@vcs-node1 ~]# hostname
    vcs-node1
    [root@vcs-node1 ~]# gabconfig -a
    GAB gabconfig ERROR V-15-2-25022 unknown error
    [root@vcs-node1 ~]# hastart -onenode
    [root@vcs-node1 ~]# ps -ef | grep had
    root      1776     1  0 16:47 ?        00:00:01 /opt/VRTSvcs/bin/had -onenode
    root      1942     1  0 16:47 ?        00:00:00 /opt/VRTSvcs/bin/hashadow
    root      2223  2109  0 17:03 pts/0    00:00:00 grep had
     
    [root@vcs-node1 ~]# cd /var/VRTSvcs/log/
    [root@vcs-node1 log]# ls -lrt
    total 36
    drwxr-x---. 2 root root 4096 Nov 13 15:35 vxfen
    -rw-r--r--. 1 root root  654 Nov 13 17:21 uuidconfig.log
    -rw-r--r--. 1 root root 1480 Nov 14 16:47 CmdServer-log_A.log
    drwxr-xr-x. 2 root root 4096 Nov 14 16:47 tmp
    -rw-r--r--. 1 root root 9221 Nov 14 16:47 engine_A.log
    -rw-r--r--. 1 root root  477 Nov 14 16:47 HostMonitor_A.log
    -rw-r--r--. 1 root root 1592 Nov 14 17:03 hastart.log
     
    [root@vcs-node1 log]# tail -n 51 engine_A.log
    2012/11/14 14:48:12 VCS INFO V-16-1-10196 Cluster logger started
    2012/11/14 16:47:02 VCS NOTICE V-16-1-11022 VCS engine (had) started
    2012/11/14 16:47:02 VCS NOTICE V-16-1-11027 VCS engine startup arguments=-onenode 
    2012/11/14 16:47:02 VCS NOTICE V-16-1-11050 VCS engine version=6.0
    2012/11/14 16:47:02 VCS NOTICE V-16-1-11051 VCS engine join version=6.0.10.0
    2012/11/14 16:47:02 VCS NOTICE V-16-1-11052 VCS engine pstamp=6.0.100.000-GA-2012-07-20-16.30.01
    2012/11/14 16:47:02 VCS NOTICE V-16-1-10115 Using GABSIM
    2012/11/14 16:47:02 VCS NOTICE V-16-1-14032 Forming a one node cluster
    2012/11/14 16:47:02 VCS INFO V-16-1-10196 Cluster logger started
    2012/11/14 16:47:06 VCS NOTICE V-16-1-10619 'HAD' starting on: vcs-node1
    2012/11/14 16:47:07 VCS INFO V-16-1-10125 GAB timeout set to 30000 ms
    2012/11/14 16:47:07 VCS INFO V-16-1-10077 Received new cluster membership
    2012/11/14 16:47:07 VCS NOTICE V-16-1-10112 System (vcs-node1) - Membership: 0x1, DDNA: 0x0
    2012/11/14 16:47:07 VCS NOTICE V-16-1-10086 System vcs-node1 (Node '0') is in Regular Membership - Membership: 0x1
    2012/11/14 16:47:07 VCS NOTICE V-16-1-10322 System vcs-node1 (Node '0') changed state from CURRENT_DISCOVER_WAIT to LOCAL_BUILD
    2012/11/14 16:47:11 VCS WARNING V-16-1-10030 UseFence=NONE. Hence do not need fencing
    2012/11/14 16:47:11 VCS NOTICE V-16-1-10322 System vcs-node1 (Node '0') changed state from LOCAL_BUILD to RUNNING
    2012/11/14 16:47:11 VCS NOTICE V-16-1-10016 Agent /opt/VRTSvcs/bin/HostMonitor for resource type HostMonitor successfully started at Wed Nov 14 16:47:11 2012
     
    2012/11/14 16:47:13 VCS INFO V-16-1-10151 Invoking trigger to dump environment variables
    2012/11/14 16:47:14 VCS INFO V-16-6-15015 (vcs-node1) hatrigger:/opt/VRTSvcs/bin/triggers/sysup is not a trigger scripts directory or can not be executed
    2012/11/14 16:47:15 VCS INFO V-16-6-15023 (vcs-node1) dump_tunables:
    ########## VCS Environment Variables ##########
    VCS_CONF=/etc/VRTSvcs
    VCS_DIAG=/var/VRTSvcs
    VCS_HOME=/opt/VRTSvcs
    VCS_LOG_AGENT_NAME=
    VCS_LOG_CATEGORY=6
    VCS_LOG_SCRIPT_NAME=hatrigger
    VCS_LOG=/var/VRTSvcs
    ########## Other Environment Variables ##########
    CONSOLETYPE=vt
    LANG=en_US.UTF-8
    LANGSH_SOURCED=1
    LD_LIBRARY_PATH=/opt/VRTSvcs/lib:
    PATH=/usr/bin:/bin:/usr/sbin:/sbin:/usr/bin:/bin:/usr/sbin:/sbin:/sbin:/usr/sbin:/bin:/usr/bin:/opt/VRTSvcs/bin
    previous=N
    PREVLEVEL=N
    PWD=/var/VRTSvcs/diag/had
    runlevel=5
    RUNLEVEL=5
    SHLVL=5
    TERM=linux
    UPSTART_EVENTS=runlevel
    UPSTART_INSTANCE=
    UPSTART_JOB=rc
    _=/usr/bin/env
     
    2012/11/14 16:47:15 VCS INFO V-16-6-15002 (vcs-node1) hatrigger:hatrigger executed /opt/VRTSvcs/bin/internal_triggers/dump_tunables vcs-node1 1   successfully
    2012/11/14 16:47:17 VCS NOTICE V-16-1-10438 Group VCShmg has been probed on system vcs-node1
    2012/11/14 16:47:17 VCS NOTICE V-16-1-10435 Group VCShmg will not start automatically on System vcs-node1 as the system is not a part of AutoStartList attribute of the group.
     
    [root@vcs-node1 log]# tail -n2 CmdServer-log_A.log
    2012/11/14 16:45:46 VCS WARNING V-16-1-10604 Unsuccessful open of service
    2012/11/14 16:47:02 VCS INFO V-16-1-11237 CmdServer[security off] is Up
     
    [root@vcs-node1 log]# cat HostMonitor_A.log
    #
    # Log Name:     HostMonitor
    # System:       vcs-node1
    # SysInfo:      Linux:vcs-node1,#1 SMP Tue May 10 15:42:40 EDT 2011,2.6.32-131.0.15.el6.x86_64,x86_64
    # Created:      2012/11/14 16:47:17 
    #
     
    2012/11/14 16:47:17 VCS INFO V-16-10061-14001 HostMonitor:VCShm:monitor:Updating System attribute with CPU usage = 11% and Swap usage = 0%.
    2012/11/14 16:47:47 VCS INFO V-16-10061-14001 HostMonitor:VCShm:monitor:Updating System attribute with CPU usage = 1% and Swap usage = 0%.
     
    [root@vcs-node1 log]# cat tmp/HostMonitor-0 
    ALL
     
    [root@vcs-node1 log]# tail -n4 hastart.log
    boottime = <1352911587>, modifytime = <1352904492>
    Wed Nov 14 17:03:04 GMT 2012
    File was modified after system boot,
    boottime = <1352911587>, modifytime = <1352911622>
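
The change that appears to have fixed the start-up was renaming the host from localhost and updating the `system` entry in main.cf to match. A small check along those lines might look like the sketch below. It assumes the `system <name> (` layout shown in the main.cf posted earlier; `cf_systems` and `check_sysname` are hypothetical helpers, and the thread itself does not establish why the localhost name caused 'had' to die.

```shell
# Sketch only: cf_systems and check_sysname are hypothetical helpers.

# Print each name declared by a "system <name> (" line in a main.cf-style file.
cf_systems() {
    awk '$1 == "system" && $3 == "(" { print $2 }' "$1"
}

# Warn when the given host name is "localhost" or is not declared in main.cf.
check_sysname() {
    cf="$1"
    host="$2"
    if [ "$host" = "localhost" ]; then
        echo "WARN: host name is localhost"
        return 1
    fi
    if cf_systems "$cf" | grep -qxF -- "$host"; then
        echo "OK: $host is declared in $cf"
    else
        echo "WARN: $host is not declared in $cf"
        return 1
    fi
}
```

A typical call would be `check_sysname /etc/VRTSvcs/conf/config/main.cf "$(hostname)"`.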