V-35-410: Cluster server not running on local node on solaris

Question

Hello,
We had a hardware failure. After restarting the server we could not reach our mount points, and running hastart did not start anything. We keep getting the error in the title above.
Kindly assist in resolving this issue.

gaurav_s · Accepted Answer

Hello
It looks like your cluster is set to use I/O fencing, but fencing is not configured correctly.
Refer to the logs below:
2013/11/28 11:11:46 VCS NOTICE V-16-1-52006 UseFence=SCSI3. Fencing is enabled
2013/11/28 11:11:46 VCS CRITICAL V-16-1-10037 VxFEN driver not configured. Retrying...
2013/11/28 11:12:01 VCS CRITICAL V-16-1-10037 VxFEN driver not configured. Retrying...
2013/11/28 11:12:16 VCS CRITICAL V-16-1-10037 VxFEN driver not configured. Retrying..
Check whether your main.cf contains the line below:
UseFence = SCSI3
# cat /etc/VRTSvcs/conf/config/main.cf |grep -i usefence
If the above line exists in main.cf, the cluster is intended to use fencing, and fencing is not configured correctly.
I/O fencing protects data in cluster split-brain situations.
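A rough way to check the first half of that, as a sketch only: the function below looks in a main.cf-style file for a UseFence = SCSI3 line. The function name and the idea of passing the path as an argument are mine, not part of VCS. On the node itself you could follow it with vxfenadm -d and a look at /etc/vxfenmode and /etc/vxfendg to see whether the fencing driver has actually been configured; please verify those against the admin guide for your version.

```sh
#!/bin/sh
# Sketch only: report whether a main.cf-style file requests SCSI-3 fencing.
# fencing_requested is a hypothetical helper name, not a VCS command.
fencing_requested() {
    # Case-insensitive match on "UseFence ... SCSI3".
    # Output is sent to /dev/null instead of using grep -q, which the
    # default /usr/bin/grep on Solaris 10 may not accept.
    grep -i 'usefence' "$1" 2>/dev/null | grep -i 'scsi3' >/dev/null
}

# Default path is the one quoted in this thread; pass another path as $1.
MAIN_CF="${1:-/etc/VRTSvcs/conf/config/main.cf}"
if fencing_requested "$MAIN_CF"; then
    echo "UseFence = SCSI3 is set in $MAIN_CF: had expects the VxFEN driver to be configured"
else
    echo "UseFence = SCSI3 not found in $MAIN_CF"
fi
```

If the first message is printed while the engine log keeps showing "VxFEN driver not configured", the two halves disagree, which matches the log lines above.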
Refer to the VCS admin guide and see the article on how to configure I/O fencing. If you do not intend to use I/O fencing (which is not recommended), you can stop the cluster, remove the entry from main.cf, and start the cluster again.
Link to documentation:
https://sort.symantec.com/documents
I/O fencing link for VCS 5.1 on Solaris:
https://sort.symantec.com/public/documents/sf/5.1/solaris/html/vcs_admin/ch_admin_fencing.html#760094
G

stinsong · Answer

Hi,
Is this a CFS server? And what exactly was the hardware failure? A local hard disk failure could lead to data loss that affects the VCS configuration.
Please paste the content of the files below:
/etc/VRTSvcs/conf/sysname
/etc/llttab
And the output of the commands below:
lltstat -nvv active
gabconfig -a

it-sysmike · Answer

Hello stinsong,
Thanks for your response.
The hardware failure caused a shared mount point to become unavailable. Yes, it is a CFS server. Currently we can see the filesystem on one node, but it is not coming up on the other node. When we try to start it, we get a new error:
VCS ERROR V-16-1-10600 Cannot connect to VCS engine.

gaurav_s · Answer

Hello,
"Cannot connect to VCS engine" means your "had" process has not started or is not running.
For VCS to run, you need to ensure that components like LLT, GAB & fencing (if configured) are running. Please paste the output of:
# lltconfig
# lltstat -vvn | head -10
# gabconfig -a
# modinfo | egrep 'gab|llt|vxfen'
# had -version
# uname -a
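One way to read the gabconfig -a output requested above, as a sketch: each GAB port letter corresponds to a component, with port a being GAB itself, port b I/O fencing, and port h the VCS engine (had). The helper below (a made-up name, not a VCS tool) reads gabconfig -a output on standard input and prints which of those three ports are missing.

```sh
#!/bin/sh
# Sketch only: list which of GAB ports a, b, h are absent from
# `gabconfig -a` output read on stdin.
# missing_gab_ports is a hypothetical helper name.
# Note: on Solaris 10, /usr/bin/awk is the old awk; nawk or
# /usr/xpg4/bin/awk may be needed in place of awk here.
missing_gab_ports() {
    awk '
        # Membership lines look like: Port a gen 68f501 membership 01
        $1 == "Port" && $3 == "gen" { seen[$2] = 1 }
        END {
            n = split("a b h", want, " ")
            out = ""
            for (i = 1; i <= n; i++)
                if (seen[want[i]] != 1)
                    out = out " " want[i]
            sub(/^ /, "", out)      # drop the leading separator
            print out
        }
    '
}

# Usage on a cluster node:
#   gabconfig -a | missing_gab_ports
```

An empty line means all three ports have membership. Fed output that lists only ports a and d, it prints "b h", meaning neither fencing nor had has joined.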
When you say that nothing was started: assuming it's a Unix system, are your rc scripts all OK? i.e.
/etc/rc2.d/S70llt
/etc/rc2.d/S92gab
/etc/rc3.d/S99vcs
If the services are configured under SMF, are the SMF services in the online state?
G

it-sysmike · Answer

Hello,
These are the results:
root@ap1.gf.net # lltconfig
LLT is running
root@ap1.gf.net # lltstat -vvn | head -10
LLT node information:
    Node                 State    Link  Status  Address
   * 0 ap1               OPEN
                                  igb2   UP      00:21:28:BB:40:3C
                                  igb3   UP      00:21:28:BB:40:3D
     1 ap2               OPEN
                                  igb2   UP      00:21:28:BB:0F:04
                                  igb3   UP      00:21:28:BB:0F:05
     2                   CONNWAIT
                                  igb2   DOWN
root@ap1.gf.net # gabconfig -a
GAB Port Memberships
===============================================================
Port a gen   68f501 membership 01
Port d gen   68f506 membership 01
root@ap1.gf.net # df -ah
Filesystem             size   used  avail capacity  Mounted on
rpool/ROOT/s10s_u9wos_14a
                       274G    26G   238G    10%    /
/devices                 0K     0K     0K     0%    /devices
ctfs                     0K     0K     0K     0%    /system/contract
proc                     0K     0K     0K     0%    /proc
mnttab                   0K     0K     0K     0%    /etc/mnttab
swap                    53G   504K    53G     1%    /etc/svc/volatile
objfs                    0K     0K     0K     0%    /system/object
sharefs                  0K     0K     0K     0%    /etc/dfs/sharetab
/platform/sun4v/lib/libc_psr/libc_psr_hwcap2.so.1
                       264G    26G   238G    10%    /platform/sun4v/lib/libc_psr.so.1
/platform/sun4v/lib/sparcv9/libc_psr/libc_psr_hwcap2.so.1
                       264G    26G   238G    10%    /platform/sun4v/lib/sparcv9/libc_psr.so.1
fd                       0K     0K     0K     0%    /dev/fd
swap                    53G    72K    53G     1%    /tmp
swap                    53G    72K    53G     1%    /var/run
swap                    53G     0K    53G     0%    /dev/vx/dmp
swap                    53G     0K    53G     0%    /dev/vx/rdmp
applprod1              150G    40G   109G    27%    /applprod1
applprod2               98G    39G    59G    40%    /applprod2
rpool/export           274G    23K   238G     1%    /export
rpool/export/home      274G   3.6G   238G     2%    /export/home
rpool                  274G    97K   238G     1%    /rpool
-hosts                   0K     0K     0K     0%    /net
auto_home                0K     0K     0K     0%    /home
ap1.gf.net:vold(pid2375)
                         0K     0K     0K     0%    /vol
/dev/odm                 0K     0K     0K     0%    /dev/odm
root@ap1.gf.net # cfsmount all
  Error: V-35-410: Cluster Server not running on local node: to
root@ap1.gf.net # modinfo | egrep 'gab|llt|vxfen'
234 7aaea000  2cf88 331   1  llt (LLT 5.1SP1)
235 7ab0e000  5a338 332   1  gab (GAB device 5.1SP1)
236 7ab4c000  6a0c8 333   1  vxfen (VRTS Fence 5.1SP1)
root@ap1.gf.net # had -version
Engine Version    5.1
Join Version      5.1.10.0
Build Date        Fri Oct 01 07:30:00 2010
PSTAMP            5.1.100.000-5.1SP1-2010-09-30_23.30.00
root@ap1.gf.net # uname -a
SunOS ap1.gf.net 5.10 Generic_147440-19 sun4v sparc sun4v
The scripts below do not exist on our server:
/etc/rc2.d/S70llt
/etc/rc2.d/S92gab
/etc/rc3.d/S99vcs

kjbss · Answer

Well, 'had' is not running for some reason, but almost everything else seems to be...
You may not see those 'rc' scripts because llt, gab, and vcs may be under Solaris SMF control on your system. Check your SMF configuration and see when 'had' (vcs) should have been started.
What run level is your system in?
# who -r 
You may be in a run level in which SMF is not configured to run VCS ('had'), and then you would get an error like: 'Cluster Server not running on local node'
Either manually start VCS (via 'hastart') or transition your host to the appropriate run level.
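The SMF check could look something like this sketch. It assumes the VCS stack is registered under service names ending in llt, gab, vxfen and vcs (for example svc:/system/llt:default); that naming is my assumption, so confirm the actual names with svcs -a on your system. The helper name is made up; it reads svcs output on stdin and prints any matching service that is not online.

```sh
#!/bin/sh
# Sketch only: print VCS-related SMF services whose state is not "online".
# Reads `svcs -a` style output (columns: STATE STIME FMRI) on stdin.
# vcs_smf_not_online is a hypothetical helper name.
# Note: on Solaris 10, /usr/bin/awk is the old awk; nawk or
# /usr/xpg4/bin/awk may be needed in place of awk here.
vcs_smf_not_online() {
    awk '
        $1 == "STATE" { next }      # skip the header line
        $3 ~ /\/(llt|gab|vxfen|vcs):/ && $1 != "online" { print $1, $3 }
    '
}

# Usage on the node (service names are an assumption; check `svcs -a`):
#   who -r                          # current run level
#   svcs -a | vcs_smf_not_online    # which of the stack is not online
#   svcs -xv                        # SMF explanation for services that are not running
```

No output means every matching service is online; in that case the run level is probably not the problem and the engine log is the next place to look.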
-HTH
