01-29-2014 02:39 AM
Hi,
I got this error when I restarted a box and it rejoined the cluster:
2014/01/28 21:26:37 VCS ERROR V-16-10001-6505 MultiNICB:MultiNICB_Pub:monitor:The mpathd process (/usr/lib/inet/in.mpathd) does not exist
2014/01/28 21:26:37 VCS WARNING V-16-10001-6506 MultiNICB:MultiNICB_Pub:monitor:Will try to restart mpathd with (/usr/lib/inet/in.mpathd)
I just want to check whether this error is relevant and what could cause it.
The node in question is in the cluster, but currently no service group is running on it, only a parallel "nic" SG that contains the MultiNICB resource.
In main.cf it looks like this:
group nic (
SystemList = { DP-node4 = 0, DP-node5 = 1, DP-node6 = 2, DP-node8 = 3,
dp-node9 = 4 }
Parallel = 1
)
MultiNICB MultiNICB_Pub (
UseMpathd = 1
ConfigCheck = 0
Device @DP-node4 = { nxge0 = 0, nxge4 = 1 }
Device @DP-node5 = { nxge0 = 0, nxge4 = 1 }
Device @DP-node6 = { nxge0 = 0, nxge4 = 1 }
Device @DP-node8 = { nxge0 = 0, bge0 = 0 }
Device @dp-node9 = { igb0 = 0 }
IgnoreLinkStatus = 0
NetworkTimeout = 300
GroupName = Public_Network
)
Phantom nic_phantom (
Critical = 0
)
Currently it is online on the node with the error (dp-node9).
Any suggestions?
Thanks,
Joao
01-29-2014 02:53 AM
Does mpathd exist at /usr/lib/inet/in.mpathd, or is it already running under a different path name?
If so, you can use the "MpathdCommand" attribute on the MultiNICB_Pub resource to set the correct path.
Mike
01-29-2014 04:35 AM
It is running and it is in the path:
# ls -la /usr/lib/inet/in.mpathd
-r-xr-xr-x 1 root bin 87832 Nov 23 2010 /usr/lib/inet/in.mpathd
# ps -ef | grep in.mpathd
root 11088 11069 0 10:31:18 pts/1 0:00 grep in.mpathd
root 1865 1 0 21:26:38 ? 0:01 /usr/lib/inet/in.mpathd
In types.cf I have this:
str MpathdCommand = "/usr/lib/inet/in.mpathd"
I don't have it in main.cf.
On another node of the cluster, where I don't get this error, mpathd was started like this:
# ps -ef | grep in.mpathd
root 488 1 0 Jun 05 ? 88:46 /usr/lib/inet/in.mpathd -a
What does -a mean?
01-29-2014 04:55 AM
I'm not sure what "-a" means, but the process should look the same on each node. Since Solaris starts mpathd, this suggests the two nodes are configured differently, so you should try to find that difference (have a look at /etc/default/mpathd as a starting point).
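To make that comparison concrete, you could capture the exact in.mpathd command line on each node and compare the results. This is only a minimal sketch: the function name mpathd_cmdline is made up for illustration, and it assumes `ps -ef` output in the column layout shown earlier in this thread (TIME as mm:ss immediately before the command). It only reports what ps shows, so it cannot tell you how the daemon was originally launched. On Solaris, nawk may be a safer choice than awk.

```shell
#!/bin/sh
# Hypothetical helper (not part of VCS or Solaris): read `ps -ef` output on
# stdin and print the command line of every in.mpathd process.
# Assumption: the command starts right after the TIME column (mm:ss).
mpathd_cmdline() {
    awk '{
        for (i = 1; i < NF; i++) {
            # STIME such as 10:31:18 does not match; TIME such as 0:01 does.
            if ($i ~ /^[0-9]+:[0-9][0-9]$/) {
                cmd = i + 1
                # Only report lines whose command is in.mpathd itself,
                # which skips the "grep in.mpathd" line.
                if ($cmd ~ /in\.mpathd$/) {
                    line = $cmd
                    for (j = cmd + 1; j <= NF; j++) line = line " " $j
                    print line
                }
                break
            }
        }
    }'
}

# Usage on a live node (run on each node and compare the output):
#   ps -ef | mpathd_cmdline
```

If the output differs between nodes (for example, one shows the -a argument and the other does not), that is the difference to chase down.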
The VCS bundled agents guide gives some checks:
Checklist to ensure the proper operation of MultiNICB
For the MultiNICB agent to function properly, you must satisfy each item in the following list:
■ Each interface must have a unique MAC address.
■ A MultiNICB resource controls all the interfaces on one IP subnet.
■ At boot time, you must configure and connect all the interfaces that are under the MultiNICB resource and give them base IP addresses.
■ All base IP addresses for the MultiNICB resource must belong to the same subnet as the virtual IP address.
■ Reserve the base IP addresses, which the agent uses to test the link status, for use by the agent. These IP addresses do not get failed over.
■ The IgnoreLinkStatus attribute is set to 1 (default) when using trunked interfaces.
■ If you specify the NetworkHosts attribute, then that host must be on the same subnet as the base IP addresses for the MultiNICB resource.
■ Test IP addresses have "nofailover" and "deprecated" flags set at boot time.
■ /etc/default/mpathd has TRACK_INTERFACES_ONLY_WITH_GROUPS=yes.
■ If you are not using Solaris in.mpathd, all MultiNICB resources on the system have the UseMpathd attribute set to 0 (default). You cannot run in.mpathd on this system.
■ If you are using Solaris in.mpathd, all MultiNICB resources on the system have the UseMpathd attribute set to 1.
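One item on that checklist, the TRACK_INTERFACES_ONLY_WITH_GROUPS setting, is easy to check with a small script. The following is a rough sketch, not an official tool: the function name mpathd_tracking_status is hypothetical, and it assumes the setting appears as a plain NAME=value line at the start of a line, as in a stock /etc/default/mpathd.

```shell
#!/bin/sh
# Hypothetical helper: report whether an mpathd defaults file sets
# TRACK_INTERFACES_ONLY_WITH_GROUPS=yes, as the checklist requires.
# Prints "ok" and returns 0 if the last uncommented assignment is "yes";
# otherwise (including a missing file) prints "not ok" and returns 1.
mpathd_tracking_status() {
    conf=${1:-/etc/default/mpathd}
    # Commented-out lines start with '#', so the anchored pattern skips them.
    value=$(sed -n 's/^TRACK_INTERFACES_ONLY_WITH_GROUPS=//p' "$conf" 2>/dev/null | tail -n 1)
    if [ "$value" = "yes" ]; then
        echo "ok"
    else
        echo "not ok"
        return 1
    fi
}

# Usage on a node:
#   mpathd_tracking_status                      # checks /etc/default/mpathd
#   mpathd_tracking_status /path/to/other/file  # checks a copy from another node
```

Running it against the file from both the failing node and a healthy node would show quickly whether this particular setting is where they differ.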
Mike
01-29-2014 06:53 AM
Can I kill the process and start it again with -a as a workaround?
# ps -ef | grep in.mpathd
root 11088 11069 0 10:31:18 pts/1 0:00 grep in.mpathd
root 1865 1 0 21:26:38 ? 0:01 /usr/lib/inet/in.mpathd
Something like kill -9 1865
And then:
/usr/lib/inet/in.mpathd -a
01-29-2014 08:26 AM
Hi,
Yes, you can do that. Just to be on the safe side, I would suggest freezing the service groups before you run it.
-a is a switch commonly used with in.mpathd; many Symantec articles show it, e.g.:
http://www.symantec.com/docs/TECH171008
http://www.symantec.com/docs/TECH137947
G
01-29-2014 08:34 AM
I tried but no luck:
root@dp-node9 # ps -ef | grep mpath
root 4115 2887 0 14:31:14 pts/2 0:00 grep mpath
root 1914 1 0 13:50:10 ? 0:00 /usr/lib/inet/in.mpathd
root@dp-node9 #
root@dp-node9 #
root@dp-node9 #
root@dp-node9 # kill -9 1914
root@dp-node9 # /usr/lib/inet/in.mpathd -a
root@dp-node9 # ps -ef | grep mpath
root 4125 2887 0 14:31:31 pts/2 0:00 grep mpath
root 4124 1 0 14:31:29 ? 0:00 /usr/lib/inet/in.mpathd
=====
Do you think this can cause problems when, for example, starting this resource?
IPMultiNICB mdm_PubIP (
BaseResName = MultiNICB_Pub
Address = "10.129.68.23"
NetMask = "255.255.255.192"
)
01-29-2014 11:28 PM
Are you seeing that error repeatedly?
In my opinion, that error must have occurred only once, because the in.mpathd daemon was not yet running at that point. The MultiNICB agent detected this and tried to restart the daemon because the UseMpathd attribute was set to 1.
Ideally, that should not cause any problem while bringing the corresponding IPMultiNICB resource (in this case, mdm_PubIP) online.
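To answer the "repeatedly?" question with a number, you could count how often the message ID appears in the log. This is a minimal sketch under a couple of assumptions: the function name count_mpathd_errors is invented here, and the default path is the usual VCS engine log location, which may differ on your installation (the agent may also log to its own file).

```shell
#!/bin/sh
# Hypothetical helper: count how many lines in a VCS log contain the
# "mpathd process does not exist" message ID (V-16-10001-6505).
# Prints the count; prints nothing if the log cannot be read.
count_mpathd_errors() {
    log=${1:-/var/VRTSvcs/log/engine_A.log}
    # grep -c exits non-zero when the count is 0, so mask the exit status.
    grep -c 'V-16-10001-6505' "$log" 2>/dev/null || true
}

# Usage on the node that reported the error:
#   count_mpathd_errors
```

A count of 1 right after the reboot would fit the one-time explanation above; a count that keeps growing would suggest in.mpathd keeps dying or is not being found by the agent.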