Issues with failover in VCS - Bonding
Hello All
I wonder if you kind people can help me again.
I have a 2-node cluster on Sun x4240 servers (Red Hat 5.4 x86), on which I have installed VCS 5.0.
There are only about 9 service groups on them, spread out between the two nodes, and they have been running fine for the last few months.
I have bonded the two network interfaces eth4 and eth8 into one called bond0, which is also used as a low-priority heartbeat link, and I have 2 separate dedicated heartbeats as well.
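For reference, the bonding is the standard Red Hat setup and bond0 is also defined as a low-priority LLT link; roughly like this (addresses and the bonding mode are illustrative, not my exact files):

/etc/modprobe.conf:
    alias bond0 bonding
    options bond0 mode=1 miimon=100

/etc/sysconfig/network-scripts/ifcfg-bond0:
    DEVICE=bond0
    BOOTPROTO=none
    ONBOOT=yes
    IPADDR=10.10.10.11
    NETMASK=255.255.255.0

/etc/sysconfig/network-scripts/ifcfg-eth4 (and the same for eth8):
    DEVICE=eth4
    MASTER=bond0
    SLAVE=yes
    ONBOOT=yes
    BOOTPROTO=none

/etc/llttab (link lines only):
    link eth1 eth1 - ether - -
    link eth3 eth3 - ether - -
    link-lowpri bond0 bond0 - ether - -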
We recently had an issue with the network cards (Sun 10G NIU), which we think may be a driver problem, as they failed, reported NETDEV timeouts and fell off the network.
The issue is that none of the service groups failed over, because for some reason VCS thinks bond0 is still up.
I've replicated the error by unplugging the two network interfaces eth4 and eth8, which mimics the issue.
So there seem to be two issues here: the first is with the Sun 10G cards and the standard Red Hat driver, and the second is the service groups not failing over as a result.
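To show what I mean, with both cables pulled the checks looked something like this (a sketch, not exact output):

    # bonding driver view: both slaves report down, but the bond0 master stays up
    cat /proc/net/bonding/bond0
    ip link show bond0
    # VCS view: the NIC resource for bond0 still reports ONLINE
    hares -state | grep -i bond0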
Has anyone seen this before?
My cluster setup is:
NIC resource in each service group (see the config sketch after this list)
Red Hat 5.4 x86
VCS 5.0 MP3
Heartbeats on eth1 and eth3 (100Mb/s full duplex)
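For what it's worth, the NIC resource in each group just monitors the bond device, roughly like this (resource name is illustrative):

    NIC bond0_nic (
        Device = bond0
        )

I see the bundled agents guide lists an optional NetworkHosts attribute for the NIC agent that makes the monitor ping an address instead of only checking the interface. Would setting it (e.g. hares -modify bond0_nic NetworkHosts 10.10.10.1, address illustrative) catch this case, given that the bond master never actually goes down?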
Here's the output from had -version:
Engine Version=5.0
PSTAMP: Veritas-5.0MP3-07/16/08-02:01:00
And the output from rpm -qa | grep VRTSvcs:
VRTSvcs-5.0.30.00-MP3_GENERIC
VRTSvcsvr-5.0.30.00-MP3_GENERIC
VRTSvcsag-5.0.30.00-MP3_RHEL5
VRTSvcsor-5.0.30.00-MP3_RHEL5
VRTSvcs-5.0.30.00-MP3_RHEL5
VRTSvcsdr-5.0.30.00-MP3_RHEL5
VRTSvcsmn-5.0.30.00-MP3_GENERIC
Have a look at the VCS Bundled Agents Reference Guide; you can get full details of the MultiNICA and MultiNICB resources there.
MultiNICA is basically used for redundancy of NIC cards (somewhat similar to what Solaris IPMP does), i.e. even if a NIC card fails, the services are unaffected.
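A rough main.cf sketch of the idea (interface names, addresses and resource names are only examples; check the guide for the exact Linux syntax):

    MultiNICA mnic (
        Device @nodea = { eth4 = "10.10.10.11", eth8 = "10.10.10.11" }
        Device @nodeb = { eth4 = "10.10.10.12", eth8 = "10.10.10.12" }
        NetMask = "255.255.255.0"
        NetworkHosts = { "10.10.10.1" }
        )

    IPMultiNIC app_ip (
        Address = "10.10.10.20"
        NetMask = "255.255.255.0"
        MultiNICResName = mnic
        )

    app_ip requires mnic

Because MultiNICA actively pings the NetworkHosts targets, it can notice a dead link even when the interface itself still claims to be up, which is exactly the gap you are hitting with the bond.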
Guide can be found here:
http://sfdoccentral.symantec.com
Gaurav