11-26-2013 10:27 AM
Hello,
We had a hardware failure. After restarting the server we could not reach our mount points, and starting the cluster with hastart did nothing; we keep getting the error in the title above.
Kindly assist in resolving this issue.
11-26-2013 07:28 PM
Hi,
Is this a CFS server? And what was the hardware failure exactly? A local hard disk failure could lead to data loss that impacts the VCS configuration.
Please also paste the contents of the following files:
/etc/VRTSvcs/conf/sysname
/etc/llttab
And the output of the following commands:
lltstat -nvv active
gabconfig -a
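For reference (a rough sketch, assuming a standard two-node cluster; the generation numbers and node IDs below are only placeholders), a healthy node should show all LLT links UP in "lltstat -nvv active", and "gabconfig -a" should list the GAB ports with both nodes in membership, something like:
# gabconfig -a
GAB Port Memberships
===============================================================
Port a gen   a36e0003 membership 01
Port f gen   a36e000d membership 01
Port h gen   a36e0006 membership 01
Port a is GAB itself, port f is CFS, and port h is the VCS engine (had); a fencing-enabled cluster would also show port b, and CVM adds ports v and w.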
11-27-2013 02:07 AM
Hello stinsong,
Thanks for your response.
The hardware failure caused a shared mount point to become unavailable. Yes, it is a CFS server. Currently we can see the filesystem on one node, but it is not coming up on the other node; when we try to start it, it gives a new error:
VCS ERROR V-16-1-10600 Cannot connect to VCS engine.
11-27-2013 02:25 AM
Hello,
"Cannot connect to VCS engine" means your "had" process has not started or is not running.
For VCS to run, you need to ensure that components like LLT, GAB & Fencing (if configured) are running. Please paste the output of
# lltconfig
# lltstat -vvn | head -10
# gabconfig -a
# modinfo | egrep 'gab|llt|vxfen'
# had -version
# uname -a
When you say that nothing was started: assuming it is a Unix system, are your rc scripts all in place? i.e.
/etc/rc2.d/S70llt
/etc/rc2.d/S92gab
/etc/rc3.d/S99vcs
If services are configured under SMF, are the SMF services in online state ?
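As a quick way to check each layer (a sketch assuming the stock Solaris 10 script/service names; yours may differ):
# lltconfig                                  (should report that LLT is running)
# gabconfig -a                               (should show at least port a membership)
# svcs -a | egrep 'llt|gab|vxfen|vcs'        (if the stack is under SMF control)
# ps -ef | grep had                          (is the VCS engine process there at all?)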
G
11-27-2013 03:38 AM
Hello,
These are the results:
/etc/rc2.d/S70llt
/etc/rc2.d/S92gab
/etc/rc3.d/S99vcs
11-27-2013 06:42 AM
Well 'had' is not running for some reason, but most everything else seems to be...
You may not see those 'rc'-scripts because llt, gab, and vcs may be under Solaris' SMF control on your system. Check your SMF configuration and see when 'had' (vcs) should have been started.
What run-level is your system in?:
# who -r
You may be in a run-level whereby SMF is not configured to run VCS ('had'), and then you would get an error like: 'Cluster Server not running on local node'
Either manually start VCS (via 'hastart') or transition your host to the appropriate run-level.
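If LLT and GAB are already up, 'hastart' on its own is normally enough. As a rough sketch of the full manual start order (the "-n 2" assumes a two-node cluster; only seed GAB manually if it is not already seeded, and note that if UseFence = SCSI3 is set in main.cf the vxfen driver must also be running before 'had' will stay up):
# lltconfig -c          (start LLT from /etc/llttab)
# gabconfig -c -n 2     (configure GAB and seed it for a 2-node cluster)
# hastart               (start the VCS engine)
# hastatus -sum         (confirm the engine and service groups come up)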
-HTH
11-27-2013 07:04 AM
Hello,
It is on run-level 3.
11-27-2013 10:08 AM
Have you tried to 'hastart' it yet?
Make sure to report back to us any error messages that go into VCS' message log (/opt/VRTSvcs/log/engine_A.log) and the Solaris messages log (/var/adm/messages) after you run 'hastart'...
Do you have a valid VCS license? -- if it has expired then you will get a message in those logs.
Run 'vxlicrep -s' and provide the output..., as well as the relevant output from the various messages files mentioned above...
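For example (the engine log may live under /opt/VRTSvcs/log or /var/VRTSvcs/log depending on the release, so use whichever path exists on your system):
# vxlicrep -s
# hastart
# tail -100 /opt/VRTSvcs/log/engine_A.log   (or /var/VRTSvcs/log/engine_A.log)
# tail -100 /var/adm/messages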
-kjb
11-27-2013 11:34 PM
Also, make sure that "had -version" reports the same version on both nodes; above you have only pasted output from one node, so we can't confirm.
Also, as suggested above, try hastart and let us know the output from engine_A.log.
G
11-28-2013 08:55 AM
Hello
It looks like you have configured your cluster to use I/O fencing; however, fencing is not configured correctly.
Refer to the log line below:
2013/11/28 11:12:16 VCS CRITICAL V-16-1-10037 VxFEN driver not configured. Retrying..
Check whether your main.cf contains the line below:
UseFence = SCSI3
# cat /etc/VRTSvcs/conf/config/main.cf |grep -i usefence
If the above line exists in main.cf, that means the cluster is intended to use fencing, which is not configured correctly.
I/O fencing provides data protection in cluster split-brain situations.
Refer to the VCS admin guide and the article on how to configure I/O fencing. If you do not intend to use I/O fencing (which is not recommended), you can remove the entry from main.cf after stopping the cluster, then start the cluster again.
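As a rough sketch of that second option (disabling fencing; back up main.cf first, and only do this if you are sure you do not want fencing):
# hastop -all -force                        (stop had cluster-wide, leave applications up)
# vi /etc/VRTSvcs/conf/config/main.cf       (delete the "UseFence = SCSI3" line on the node you will start first)
# hacf -verify /etc/VRTSvcs/conf/config     (syntax-check the edited configuration)
# hastart                                   (start VCS again on each node)
Also check /etc/vxfenmode: if vxfen_mode is set to scsi3 but the coordinator disks were never set up, either configure fencing properly per the admin guide or switch it to disabled mode before restarting. "vxfenadm -d" shows the current fencing state.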
Link to documentation
https://sort.symantec.com/documents
I/O fencing chapter for VCS 5.1 on Solaris
https://sort.symantec.com/public/documents/sf/5.1/solaris/html/vcs_admin/ch_admin_fencing.html#760094
G
12-02-2013 07:21 PM
Please check your SMF services for any issues.
# svcs -a | egrep 'vxfen|vcs|llt|gab'
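If any of those services show up as disabled or in maintenance, something along these lines should recover them (the FMRIs below assume the stock system/llt, system/gab, system/vxfen and system/vcs names; use whatever names svcs actually lists on your system):
# svcs -xv vxfen vcs                         (explain why a service is not online)
# svcadm clear svc:/system/vxfen:default     (clear a maintenance state, if any)
# svcadm enable svc:/system/vcs:default      (enable and start the VCS service)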