VCS Cluster not starting.
Hello All, I am having difficulty getting VCS started on this system. I have attached what I have so far, and I would appreciate any comments or suggestions on where to go from here. Thank you. The hostnames in the main.cf correspond to those of the servers.

hastatus -sum
VCS ERROR V-16-1-10600 Cannot connect to VCS engine
VCS WARNING V-16-1-11046 Local system not available

hasys -state
VCS ERROR V-16-1-10600 Cannot connect to VCS engine

hastop -all -force
VCS ERROR V-16-1-10600 Cannot connect to VCS engine

hastart / hastart -onenode
dmesg: Exiting: Another copy of VCS may be running

engine_A.log:
2013/10/22 15:16:43 VCS NOTICE V-16-1-11051 VCS engine join version=4.1000
2013/10/22 15:16:43 VCS NOTICE V-16-1-11052 VCS engine pstamp=4.1 03/03/05-14:58:00
2013/10/22 15:16:43 VCS NOTICE V-16-1-10114 Opening GAB library
2013/10/22 15:16:43 VCS NOTICE V-16-1-10619 'HAD' starting on: db1
2013/10/22 15:16:45 VCS INFO V-16-1-10125 GAB timeout set to 15000 ms
2013/10/22 15:17:00 VCS CRITICAL V-16-1-11306 Did not receive cluster membership, manual intervention may be needed for seeding

# gabconfig -a
GAB Port Memberships
===============================================================

# lltstat -nvv
LLT node information:
Node          State     Link  Status  Address
* 0 db1       OPEN      bge1  UP      00:03:BA:15
                        bge2  UP      00:03:BA:15
  1 db2       CONNWAIT  bge1  DOWN
                        bge2  DOWN

bash-2.05$ lltconfig
LLT is running

ps -ef | grep had
root  826  1  0 15:16:43 ?  0:00 /opt/VRTSvcs/bin/had
root  836  1  0 15:16:45 ?  0:00 /opt/VRTSvcs/bin/hashadow
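The empty "GAB Port Memberships" output together with V-16-1-11306 says GAB never seeded: the peer db2 is stuck in CONNWAIT with both links DOWN, so the membership quorum in /etc/gabtab was never reached. A minimal sketch of manually seeding this node, hedged: only force a seed once you are certain the other node is not running VCS, otherwise you risk split-brain.

# Confirm GAB is unseeded and the peer really never joined:
gabconfig -a            # empty "Port Memberships" = not seeded
lltstat -nvv            # peer in CONNWAIT / links DOWN = never joined

# Manually seed GAB on this node, overriding the -n seed count
# configured in /etc/gabtab. Do this ONLY if the peer is known down,
# otherwise you risk split-brain:
gabconfig -x

# HAD should then register; verify Port a and Port h appear:
gabconfig -a
hastatus -sum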
VCS Cluster not starting.
Hi, I am facing a problem while trying to start VCS.

From the log:
==============================================================
tail /var/VRTSvcs/log/engine_A.log
2014/01/13 21:39:14 VCS NOTICE V-16-1-11050 VCS engine version=5.1
2014/01/13 21:39:14 VCS NOTICE V-16-1-11051 VCS engine join version=5.1.00.0
2014/01/13 21:39:14 VCS NOTICE V-16-1-11052 VCS engine pstamp=Veritas-5.1-10/06/09-14:37:00
2014/01/13 21:39:14 VCS INFO V-16-1-10196 Cluster logger started
2014/01/13 21:39:14 VCS NOTICE V-16-1-10114 Opening GAB library
2014/01/13 21:39:14 VCS NOTICE V-16-1-10619 'HAD' starting on: nsscls01
2014/01/13 21:39:16 VCS INFO V-16-1-10125 GAB timeout set to 30000 ms
2014/01/13 21:39:16 VCS NOTICE V-16-1-11057 GAB registration monitoring timeout set to 200000 ms
2014/01/13 21:39:16 VCS NOTICE V-16-1-11059 GAB registration monitoring action set to log system message
2014/01/13 21:39:31 VCS CRITICAL V-16-1-11306 Did not receive cluster membership, manual intervention may be needed for seeding
=============================================================================================
root@nsscls01# hastatus -sum
VCS ERROR V-16-1-10600 Cannot connect to VCS engine
VCS WARNING V-16-1-11046 Local system not available

Please advise how I can start VCS.
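This is the same V-16-1-11306 seeding symptom as the thread above. Before forcing a seed, it is worth checking the configured seed count and the interconnect, since seeding hides rather than fixes a broken LLT link. A sketch of the checks, under the assumption of a standard two-node setup:

# /etc/gabtab holds the seed count; with "-n2" GAB waits for both
# nodes before seeding, so a single node will sit at V-16-1-11306:
cat /etc/gabtab          # typically: /sbin/gabconfig -c -n2

# Check whether LLT can see the peer; CONNWAIT or DOWN links mean
# the interconnect (or the peer itself) needs fixing first:
lltstat -nvv

# If the peer is legitimately down (e.g., in maintenance) and this
# node must come up alone, seed manually -- at the risk of
# split-brain if the peer is actually alive:
gabconfig -x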
solution needed for vxfen issue
There is a two-node cluster, which we split for an upgrade. The isolated node is not coming up because vxfen is not starting:

2013/02/01 11:52:05 VCS CRITICAL V-16-1-10037 VxFEN driver not configured. Retrying...
2013/02/01 11:52:20 VCS CRITICAL V-16-1-10037 VxFEN driver not configured. Retrying...
2013/02/01 11:52:35 VCS CRITICAL V-16-1-10037 VxFEN driver not configured. Retrying...
2013/02/01 11:52:50 VCS CRITICAL V-16-1-10037 VxFEN driver not configured. Retrying...
2013/02/01 11:53:05 VCS CRITICAL V-16-1-10037 VxFEN driver not configured. Retrying...
2013/02/01 11:53:20 VCS CRITICAL V-16-1-10031 VxFEN driver not configured. VCS Stopping. Manually restart VCS after configuring fencing
^C

The I/O fencing configuration seems okay on node 2: /etc/vxfentab has the entries for the coordinator disks, /etc/vxfendg has the diskgroup entry vxfendg2, and these disks and the diskgroup are visible too.

DEVICE                   TYPE         DISK  GROUP       STATUS
c0t5006016047201339d0s2  auto:sliced  -     (ossdg)     online
c0t5006016047201339d1s2  auto:sliced  -     (sybasedg)  online
c0t5006016047201339d2s2  auto:sliced  -     (vxfendg1)  online
c0t5006016047201339d3s2  auto:sliced  -     (vxfendg1)  online
c0t5006016047201339d4s2  auto:sliced  -     (vxfendg1)  online
c0t5006016047201339d5s2  auto:sliced  -     (vxfendg2)  online
c0t5006016047201339d6s2  auto:sliced  -     (vxfendg2)  online
c0t5006016047201339d7s2  auto:sliced  -     -           online
c2t0d0s2                 auto:SVM     -     -           SVM
c2t1d0s2                 auto:SVM     -     -           SVM

On checking vxfen.log:

Invoked S97vxfen. Starting
Fri Feb 1 11:50:37 CET 2013 starting vxfen..
Fri Feb 1 11:50:37 CET 2013 calling start_fun
Fri Feb 1 11:50:38 CET 2013 found vxfenmode file
Fri Feb 1 11:50:38 CET 2013 calling generate_vxfentab
Fri Feb 1 11:50:38 CET 2013 checking for /etc/vxfendg
Fri Feb 1 11:50:38 CET 2013 found /etc/vxfendg.
Fri Feb 1 11:50:38 CET 2013 calling generate_disklist
Fri Feb 1 11:50:38 CET 2013 Starting vxfen.. Done.
Fri Feb 1 11:50:38 CET 2013 starting in vxfen-startup
Fri Feb 1 11:50:38 CET 2013 calling regular vxfenconfig
Fri Feb 1 11:50:38 CET 2013 return value from above operation is 1
Fri Feb 1 11:50:38 CET 2013 output was VXFEN vxfenconfig ERROR V-11-2-1003 At least three coordinator disks must be defined
Log Buffer: 0xfffffffff4041090

refadm2-oss1{root} # cat /etc/vxfendg
vxfendg2

And there are the two disks mentioned below in vxfendg2:

c0t5006016047201339d5s2  auto:sliced  -     (vxfendg2)  online
c0t5006016047201339d6s2  auto:sliced  -     (vxfendg2)  online

Is this due to there being only two disks in the coordinator diskgroup? Is it a known issue?
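The vxfenconfig error is explicit: SCSI-3 fencing requires an odd number of at least three coordinator disks, and vxfendg2 only contains two. Note that vxfendg1 in the listing has three disks, so one option is simply pointing /etc/vxfendg at vxfendg1 if that is the intended coordinator group. The alternative is adding a third disk to vxfendg2; a sketch follows, with the caveat that the spare device c0t5006016047201339d7 is taken from the "no group" line in the listing and is only an assumption (it must be free and SCSI-3 PR capable -- vxfentsthdw can verify that):

# Initialize the spare disk for VxVM use:
/etc/vx/bin/vxdisksetup -i c0t5006016047201339d7

# Temporarily import the coordinator DG (-t), drop the coordinator
# protection while editing, add the disk, then protect and deport:
vxdg -tfC import vxfendg2
vxdg -g vxfendg2 set coordinator=off
vxdg -g vxfendg2 adddisk vxfencoord3=c0t5006016047201339d7
vxdg -g vxfendg2 set coordinator=on
vxdg deport vxfendg2

# Restart fencing, then VCS:
/etc/init.d/vxfen start
hastart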
Veritas Cluster Server Heartbeat link down, jeopardy state
Hello Everyone, I am having a problem with the VCS heartbeat links. VCS runs on a Solaris v440 machine: VCS 4.0 on Solaris 9 -- I know it's old and EOL. I'm just hoping to pinpoint the solution to this problem. The VCS heartbeat links run on two separate VLANs, and this is a two-node cluster. Recently the old switch was taken out and a new Cisco 3750 switch was added. The switch shows the cables are connected, and I can see link up from the switch side. The ce4 links on both servers are not coming up in LLT. Any ideas besides a faulty VLAN? How do I test communications on that particular VLAN? Here are the results of various commands; any help is appreciated. Thank you!

From node2:

# lltstat -n
LLT node information:
Node          State  Links
  0 node1     OPEN   1
* 1 node2     OPEN   2

# lltstat -nvv | head
LLT node information:
Node          State  Link  Status  Address
  0 node1     OPEN   ce4   DOWN
                     ce6   UP      00:03:BA:94:F8:61
* 1 node2     OPEN   ce4   UP      00:03:BA:94:A4:6F
                     ce6   UP      00:03:BA:94:A4:71
  2           CONNWAIT
                     ce4   DOWN

From node1:

# lltstat -n
LLT node information:
Node          State  Links
* 0 node1     OPEN   2
  1 node2     OPEN   1

# lltstat -nvv | head
LLT node information:
Node          State  Link  Status  Address
* 0 node1     OPEN   ce4   UP      00:03:BA:94:F8:5F
                     ce6   UP      00:03:BA:94:F8:61
  1 node2     OPEN   ce4   DOWN
                     ce6   UP      00:03:BA:94:A4:71
  2           CONNWAIT
                     ce4   DOWN

# gabconfig -a
GAB Port Memberships
===============================================================
Port a gen 49c917 membership 01
Port a gen 49c917 jeopardy   ;1
Port h gen 49c91e membership 01
Port h gen 49c91e jeopardy   ;
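Each node reports its own ce4 as UP but the peer's ce4 as DOWN, which usually means the NICs have physical link but LLT frames are not crossing the VLAN. LLT heartbeats are raw ethernet (DLPI) frames with their own ethertype, not IP, so an ordinary ping proves nothing. A sketch using the dlpiping utility shipped with LLT to test that specific path, hedged: the path and device syntax follow the usual Solaris VCS layout and may vary by release.

# On node2, start a DLPI listener on the suspect interface:
/opt/VRTSllt/dlpiping -s /dev/ce:4

# On node1, send test frames to node2's ce4 MAC (from lltstat -nvv):
/opt/VRTSllt/dlpiping -c /dev/ce:4 00:03:BA:94:A4:6F

# No response here while the same test works over ce6 points at the
# switch/VLAN path for ce4 (e.g., wrong VLAN assignment on the ports,
# or the LLT ethertype being filtered).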
VCS Error Codes for all platforms
Hello Gents, do we have a list of all error codes for VCS? Also, can you confirm whether the error codes are generic and common to all platforms (including Linux, Solaris, Windows, and AIX)? I need this confirmation urgently; we are planning to design a common monitoring agent. Best Regards, Nimish
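For a monitoring agent, the useful property is that every VCS message carries a Unique Message Identifier (UMI) of the form V-<originator>-<category>-<message>, e.g. V-16-1-10600, where originator 16 identifies VCS; the UMI stays stable even where surrounding text differs. A minimal sketch of extracting and counting UMIs from an engine log (the log path is the usual default and an assumption):

# Pull every UMI out of the engine log and rank by frequency:
grep -o 'V-[0-9]*-[0-9]*-[0-9]*' /var/VRTSvcs/log/engine_A.log \
  | sort | uniq -c | sort -rn | head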
Service group does not fail over on another node on force power down.
VCS 6.0.1. Hi, I have configured a two-node cluster with local storage, running two service groups. Both run fine and I can switch them over to either node in the cluster, but when I forcibly power down the node where both service groups are active, only one service group fails over to the other node; the one running the Apache resource gets faulted and does not fail over. Below are the contents of the main.cf file.

==========================================
cat /etc/VRTSvcs/conf/config/main.cf
include "OracleASMTypes.cf"
include "types.cf"
include "Db2udbTypes.cf"
include "OracleTypes.cf"
include "SybaseTypes.cf"

cluster mycluster (
    UserNames = { admin = IJKcJEjGKfKKiSKeJH, root = ejkEjiIhjKjeJh }
    ClusterAddress = "192.168.25.101"
    Administrators = { admin, root }
    )

system server3 (
    )

system server4 (
    )

group ClusterService (
    SystemList = { server3 = 0, server4 = 1 }
    AutoStartList = { server3, server4 }
    OnlineRetryLimit = 3
    OnlineRetryInterval = 120
    )

    IP webip (
        Device = eth0
        Address = "192.168.25.101"
        NetMask = "255.255.255.0"
        )

    NIC csgnic (
        Device = eth0
        )

    webip requires csgnic

    // resource dependency tree
    //
    // group ClusterService
    // {
    // IP webip
    //     {
    //     NIC csgnic
    //     }
    // }

group httpsg (
    SystemList = { server3 = 0, server4 = 1 }
    AutoStartList = { server3, server4 }
    OnlineRetryLimit = 3
    OnlineRetryInterval = 15
    )

    Apache apachenew (
        httpdDir = "/usr/sbin"
        ConfigFile = "/etc/httpd/conf/httpd.conf"
        )

    IP ipresource (
        Device = eth0
        Address = "192.168.25.102"
        NetMask = "255.255.255.0"
        )

    apachenew requires ipresource

    // resource dependency tree
    //
    // group httpsg
    // {
    // Apache apachenew
    //     {
    //     IP ipresource
    //     }
    // }
# =====================

The engine log while the power-down occurs says:

2013/03/12 16:33:02 VCS INFO V-16-1-10077 Received new cluster membership
2013/03/12 16:33:02 VCS NOTICE V-16-1-10112 System (server3) - Membership: 0x1, DDNA: 0x0
2013/03/12 16:33:02 VCS ERROR V-16-1-10079 System server4 (Node '1') is in Down State - Membership: 0x1
2013/03/12 16:33:02 VCS ERROR V-16-1-10322 System server4 (Node '1') changed state from RUNNING to FAULTED
2013/03/12 16:33:02 VCS NOTICE V-16-1-10449 Group httpsg autodisabled on node server4 until it is probed
2013/03/12 16:33:02 VCS NOTICE V-16-1-10449 Group VCShmg autodisabled on node server4 until it is probed
2013/03/12 16:33:02 VCS NOTICE V-16-1-10446 Group ClusterService is offline on system server4
2013/03/12 16:33:02 VCS NOTICE V-16-1-10446 Group httpsg is offline on system server4
2013/03/12 16:33:02 VCS ERROR V-16-1-10205 Group ClusterService is faulted on system server4
2013/03/12 16:33:02 VCS NOTICE V-16-1-10446 Group ClusterService is offline on system server4
2013/03/12 16:33:02 VCS INFO V-16-1-10493 Evaluating server3 as potential target node for group ClusterService
2013/03/12 16:33:02 VCS INFO V-16-1-10493 Evaluating server4 as potential target node for group ClusterService
2013/03/12 16:33:02 VCS INFO V-16-1-10494 System server4 not in RUNNING state
2013/03/12 16:33:02 VCS NOTICE V-16-1-10301 Initiating Online of Resource webip (Owner: Unspecified, Group: ClusterService) on System server3
2013/03/12 16:33:02 VCS WARNING V-16-1-11141 LLT heartbeat link status changed. Previous status =eth1, UP; Current status =eth1, DOWN.
2013/03/12 16:33:02 VCS INFO V-16-6-15015 (server3) hatrigger:/opt/VRTSvcs/bin/triggers/sysoffline is not a trigger scripts directory or can not be executed
2013/03/12 16:33:14 VCS INFO V-16-1-10298 Resource webip (Owner: Unspecified, Group: ClusterService) is online on server3 (VCS initiated)
2013/03/12 16:33:14 VCS NOTICE V-16-1-10447 Group ClusterService is online on system server3

As per the above logs, the default ClusterService SG failed over to the other node, but the httpsg SG faulted and did not. Please suggest. Thanks.
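What stands out in the log is that after the membership change VCS only emits "Evaluating ... as potential target node" lines for ClusterService; there is no corresponding evaluation for httpsg. A hedged diagnostic sketch: inspect why httpsg is not a valid failover target on the surviving node, clear any recorded fault, and consider that, unlike webip, ipresource has nothing monitoring the NIC beneath it -- a Proxy to csgnic may help (the resource name nicproxy below is hypothetical):

# Check httpsg's state and any fault on the surviving node:
hastatus -sum
hagrp -display httpsg -attribute State AutoDisabled

# Clear a recorded fault so the group becomes onlineable again:
hagrp -clear httpsg -sys server3
hagrp -online httpsg -sys server3

# Mirror the ClusterService layout with a Proxy under ipresource
# (name "nicproxy" is hypothetical):
haconf -makerw
hares -add nicproxy Proxy httpsg
hares -modify nicproxy TargetResName csgnic
hares -modify nicproxy Enabled 1
hares -link ipresource nicproxy
haconf -dump -makero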
How to recover from failed attempt to switch to a different node in cluster
Hello everyone. I have a two-node cluster serving as an Oracle database server. The Oracle binaries are installed on disks local to each node (so they are outside the control of the Cluster Manager). I have a diskgroup made of three 1 TB LUNs from my SAN, six volumes on the diskgroup (u02 through u07), six mount points (/u02 through /u07), a database listener, and the actual Oracle database. I was able to bring up these individual components manually and confirmed that the database was up and running. I then tried a "Switch To" operation to see if everything would move to the other node of the cluster. It turns out this was a bad idea. In the Cluster Manager GUI, the diskgroup has a state of Online, an IState of "Waiting to go offline propagate", and the flag "Unable to offline". The volumes show as "Offline on all systems", but the mounts still show as online with "Status Unknown". When I try to take the mount points offline, I get the message:

VCS ERROR V-16-1-10277 The Service Group i1025prd to which Resource Mnt_scratch belongs has failed or switch, online, and offline operations are prohibited.

Can anyone tell me how I can fix this? Ken
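The "Waiting to go offline propagate" IState means the group is stuck mid-transition, and V-16-1-10277 blocks further operations until that pending transition is cleared; flushing the group is the usual remedy. A sketch, with <node> as a placeholder since the node names are not given in the post:

# Flush the pending online/offline operations that left the group
# stuck in "Waiting to go offline propagate":
hagrp -flush i1025prd -sys <node>

# Then clear any recorded fault and re-check states:
hagrp -clear i1025prd -sys <node>
hastatus -sum

# If a filesystem is genuinely still mounted outside VCS control,
# probe the resource so VCS re-learns its real state:
hares -probe Mnt_scratch -sys <node>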
Resource faults issue
Hi Team, we often receive resource-fault alerts for our AIX server, but when we check, everything is running fine. When we check the logs, the alerts show "(Not initiated by VCS)". Our main concern is to troubleshoot why we get these alerts and, if we do get them, why the cluster resource does not show as faulted. Below are the logs.

-- SYSTEM STATE
-- System             State        Frozen
A  xxxibm012          RUNNING      0
A  xxxibm014          RUNNING      0

-- GROUP STATE
-- Group              System       Probed  AutoDisabled  State
B  ClusterService     xxxibm012    Y       N             ONLINE
B  ClusterService     xxxibm014    Y       N             OFFLINE
B  DB_INSIGHT_STAGE   xxxibm012    Y       N             ONLINE
B  DB_INSIGHT_STAGE   xxxibm014    Y       N             OFFLINE

=============================================================
2015/04/21 10:14:53 VCS INFO V-16-1-53504 VCS Engine Alive message!!
2015/04/21 12:57:32 VCS WARNING V-16-10011-5611 (clnibm014) NIC:csgnic:monitor:Second PingTest failed for Virtual Interface en4. Resource is OFFLINE
2015/04/21 12:57:32 VCS INFO V-16-1-50135 User root fired command: hares -modify csgnic ConfidenceMsg from localhost
2015/04/21 12:57:33 VCS ERROR V-16-1-54031 Resource csgnic (Owner: Unspecified, Group: ClusterService) is FAULTED on sys clnibm014
2015/04/21 12:57:33 VCS INFO V-16-6-0 (clnibm014) resfault:(resfault) Invoked with arg0=clnibm014, arg1=csgnic, arg2=ONLINE
2015/04/21 12:57:49 VCS INFO V-16-6-15002 (clnibm014) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault clnibm014 csgnic ONLINE successfully
2015/04/21 12:58:18 VCS ERROR V-16-1-54031 Resource proxy_DB_INSPRD (Owner: Unspecified, Group: DB_INSIGHT_STAGE) is FAULTED on sys clnibm014
2015/04/21 12:58:18 VCS INFO V-16-6-0 (clnibm014) resfault:(resfault) Invoked with arg0=clnibm014, arg1=proxy_DB_INSPRD, arg2=ONLINE
2015/04/21 12:58:29 VCS INFO V-16-6-15002 (clnibm014) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault clnibm014 proxy_DB_INSPRD ONLINE successfully
2015/04/21 12:58:33 VCS INFO V-16-1-50135 User root fired command: hares -modify csgnic ConfidenceMsg Primary test to confirm Online status succeeded. from localhost
2015/04/21 12:58:34 VCS INFO V-16-1-10299 Resource csgnic (Owner: Unspecified, Group: ClusterService) is online on clnibm014 (Not initiated by VCS)
2015/04/21 12:58:34 VCS NOTICE V-16-1-10233 Clearing Restart attribute for group ClusterService on all nodes
2015/04/21 12:58:34 VCS NOTICE V-16-1-51034 Failover group ClusterService is already active. Ignoring Restart
2015/04/21 12:59:18 VCS INFO V-16-1-10299 Resource proxy_DB_INSPRD (Owner: Unspecified, Group: DB_INSIGHT_STAGE) is online on clnibm014 (Not initiated by VCS)
2015/04/21 12:59:18 VCS NOTICE V-16-1-10233 Clearing Restart attribute for group DB_INSIGHT_STAGE on all nodes
2015/04/21 12:59:18 VCS NOTICE V-16-1-51034 Failover group DB_INSIGHT_STAGE is already active. Ignoring Restart
2015/04/21 12:59:34 VCS INFO V-16-1-50135 User root fired command: hares -modify csgnic ConfidenceMsg Primary test to confirm Online status succeeded. from localhost
2015/04/21 13:18:53 VCS INFO V-16-1-50135 User root fired command: hares -modify csgnic ConfidenceMsg Relying on secondary test to confirm Online status. from localhost
2015/04/21 13:19:34 VCS INFO V-16-1-50135 User root fired command: hares -modify csgnic ConfidenceMsg Primary test to confirm Online status succeeded. from localhost
2015/04/21 13:44:49 VCS INFO V-16-1-50135 User root fired command: hares -modify csgnic ConfidenceMsg Relying on secondary test to confirm Online status. from localhost
2015/04/21 13:45:34 VCS INFO V-16-1-50135 User root fired command: hares -modify csgnic ConfidenceMsg Primary test to confirm Online status succeeded. from localhost
2015/04/21 14:14:54 VCS INFO V-16-1-53504 VCS Engine Alive message!!
2015/04/21 16:48:59 VCS INFO V-16-1-50135 User root fired command: hares -modify csgnic ConfidenceMsg Relying on secondary test to confirm Online status. from localhost
2015/04/21 16:49:34 VCS INFO V-16-1-50135 User root fired command: hares -modify csgnic ConfidenceMsg Primary test to confirm Online status succeeded.
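The V-16-10011-5611 warnings show the NIC agent's ping test for en4 intermittently failing; the resource flaps to FAULTED and then comes back "online (Not initiated by VCS)", which is why it always looks healthy by the time anyone checks. Pointing the NIC monitor at hosts that reliably answer ping usually stops the false faults. A sketch, with 10.0.0.1 as a placeholder for your default gateway or another always-reachable address:

# Make the NIC monitor deterministic instead of relying on its
# broadcast-ping heuristic (10.0.0.1 is a placeholder address):
haconf -makerw
hares -modify csgnic NetworkHosts 10.0.0.1
haconf -dump -makero

# Then watch whether the PingTest warnings stop:
tail -f /var/VRTSvcs/log/engine_A.log | grep csgnic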
VCS Setup - "Install A Clustered Master Server" Greyed Out
I originally posted this in the NetBackup forum, but it seems a VCS issue is the root of the problem: https://www-secure.symantec.com/connect/forums/install-clustered-master-server-greyed-out

I am trying to set up a test NBU 7.6.0.1 master server cluster with VCS on Windows Server 2008 R2, but the option to "install a clustered master server" is greyed out. I have built two new servers, installed SFHA / VCS 6.0.1, and configured disks, volumes, disk groups, and VVR replication. The VCS installation has not been modified or customised in any way, since from my understanding the NBU installation will create its own service group. I have carried out the initial configuration and added the remote cluster node. Within Cluster Explorer, the default ClusterService service group is up and running, the global cluster option is enabled, and the remote cluster status also shows that the other cluster is up.

Marianne has already asked me to run two commands, with the following output:

gabconfig -a
GAB gabconfig ERROR V-15-2-25022

lltconfig -nvv
LLT lltconfig ERROR V-14-2-15000 open \\.\llt failed: error number: 0x2

I've probably just made a silly mistake somewhere, but I can't figure out which piece of the puzzle I have missed - any ideas?
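On Windows, error 0x2 ("file not found") when opening the device \\.\llt means the LLT kernel driver is not loaded, and GAB fails for the same reason; with the cluster stack effectively down, the NetBackup installer would treat the host as non-clustered and grey out the clustered option. A hedged sketch for checking and starting the drivers -- the service names "llt" and "gab" are the usual ones for SFW HA, but verify them on your install:

:: Check whether the LLT and GAB driver services exist and are running
:: (service names assumed; verify with "sc query type= driver"):
sc query llt
sc query gab

:: If stopped, start LLT first, then GAB, then re-check membership:
net start llt
net start gab
gabconfig -a
lltconfig -nvv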