
VCS - Error with the MultiNIC resource

joaotelles
Level 4

Hi,

I got this error when I restarted a box and it rejoined the cluster:

2014/01/28 21:26:37 VCS ERROR V-16-10001-6505 MultiNICB:MultiNICB_Pub:monitor:The mpathd process (/usr/lib/inet/in.mpathd) does not exist
2014/01/28 21:26:37 VCS WARNING V-16-10001-6506 MultiNICB:MultiNICB_Pub:monitor:Will try to restart mpathd with (/usr/lib/inet/in.mpathd)

I just want to check whether this error is relevant and what could be causing it.

The node in question is in the cluster, but currently no SG is running on it, only a parallel "nic" SG that contains the MultiNICB resource.

In main.cf it looks like this:

group nic (
        SystemList = { DP-node4 = 0, DP-node5 = 1, DP-node6 = 2, DP-node8 = 3,
                 dp-node9 = 4 }
        Parallel = 1
        )

        MultiNICB MultiNICB_Pub (
                UseMpathd = 1
                ConfigCheck = 0
                Device @DP-node4 = { nxge0 = 0, nxge4 = 1 }
                Device @DP-node5 = { nxge0 = 0, nxge4 = 1 }
                Device @DP-node6 = { nxge0 = 0, nxge4 = 1 }
                Device @DP-node8 = { nxge0 = 0, bge0 = 0 }
                Device @dp-node9 = { igb0 = 0 }
                IgnoreLinkStatus = 0
                NetworkTimeout = 300
                GroupName = Public_Network
                )

        Phantom nic_phantom (
                Critical = 0
                )
 

Currently it is online on the node with the error (dp-node9).

Any suggestion?

Thanks,

Joao

 

 

 

7 REPLIES 7

mikebounds
Level 6
Partner Accredited

Does mpathd exist at /usr/lib/inet/in.mpathd, or is it already running under a different path name?

If it is running from a different path, you can use the "MpathdCommand" attribute on the MultiNICB_Pub resource to set the correct path.

Mike 

joaotelles
Level 4

It is running, and it is in that path:

# ls -la /usr/lib/inet/in.mpathd
-r-xr-xr-x   1 root     bin        87832 Nov 23  2010 /usr/lib/inet/in.mpathd
# ps -ef | grep in.mpathd
    root 11088 11069   0 10:31:18 pts/1       0:00 grep in.mpathd
    root  1865     1   0 21:26:38 ?           0:01 /usr/lib/inet/in.mpathd

In types.cf I have this:

str MpathdCommand = "/usr/lib/inet/in.mpathd"
 

I don't have it in main.cf.

On another node of the cluster, where I don't see this error, mpathd was started like this:

# ps -ef | grep in.mpathd
    root   488     1   0   Jun 05 ?          88:46 /usr/lib/inet/in.mpathd -a
 

What does -a mean?

 

mikebounds
Level 6
Partner Accredited

I'm not sure what "-a" means, but the process should look the same on each node. Since Solaris starts mpathd, this suggests the two nodes are configured differently, so you should try to find the difference (have a look at /etc/default/mpathd as a starting point).
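To make the per-node comparison less error-prone, here is a minimal sketch of a helper that pulls only the in.mpathd command line out of `ps -ef` output. The function name `mpathd_cmdline` is made up for illustration, and the sed pattern assumes the daemon is started with an absolute path, as in the listings in this thread.

```shell
# Hypothetical helper (not part of VCS or Solaris): read `ps -ef` output on
# stdin and print the command line of any in.mpathd process.
# It keeps only text that starts with an absolute path ending in in.mpathd,
# so the "grep in.mpathd" line from the pipeline itself is ignored.
# Matching on the path avoids counting columns, which shift when STIME is
# shown as a date such as "Jun 05".
mpathd_cmdline() {
    sed -n 's|.* \(/[^ ]*in\.mpathd.*\)$|\1|p'
}

# Usage, to be run on each node and compared by eye:
#   ps -ef | mpathd_cmdline
```

If the output differs between nodes (for example, one shows a trailing -a and the other does not), that is the difference to chase in the startup configuration.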

The VCS bundled agents guide gives some checks:

Checklist to ensure the proper operation of MultiNICB
For the MultiNICB agent to function properly, you must satisfy each item in the
following list:
■ Each interface must have a unique MAC address.
■ A MultiNICB resource controls all the interfaces on one IP subnet.
■ At boot time, you must configure and connect all the interfaces that are under
the MultiNICB resource and give them base IP addresses.
■ All base IP addresses for the MultiNICB resource must belong to the same
subnet as the virtual IP address.
■ Reserve the base IP addresses, which the agent uses to test the link status, for
use by the agent. These IP addresses do not get failed over.
■ The IgnoreLinkStatus attribute is set to 1 (default) when using trunked
interfaces.
■ If you specify the NetworkHosts attribute, then that host must be on the same
subnet as the base IP addresses for the MultiNICB resource.
■ Test IP addresses have "nofailover" and "deprecated" flags set at boot time.
■ /etc/default/mpathd has TRACK_INTERFACES_ONLY_WITH_GROUPS=yes.
■ If you are not using Solaris in.mpathd, all MultiNICB resources on the system
have the UseMpathd attribute set to 0 (default). You cannot run in.mpathd on
this system.
■ If you are using Solaris in.mpathd, all MultiNICB resources on the system have
the UseMpathd attribute set to 1.
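One item in the checklist above, the TRACK_INTERFACES_ONLY_WITH_GROUPS setting, is easy to verify with a script. The following is a minimal sketch; the function name `check_mpathd_defaults` is hypothetical, and it assumes the file uses plain `NAME=value` lines with no spaces around the equals sign.

```shell
# Hypothetical helper: check that an mpathd defaults file sets
# TRACK_INTERFACES_ONLY_WITH_GROUPS=yes.
# Returns 0 if it does, 1 if it is missing or set to something else,
# and 2 if the file cannot be read.
check_mpathd_defaults() {
    file=${1:-/etc/default/mpathd}
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 2
    fi
    # Commented-out lines do not match because the pattern is anchored at
    # the start of the line; if the setting appears twice, the last one wins.
    value=$(awk -F= '/^TRACK_INTERFACES_ONLY_WITH_GROUPS=/ { v = $2 } END { print v }' "$file")
    if [ "$value" = "yes" ]; then
        echo "OK: TRACK_INTERFACES_ONLY_WITH_GROUPS=yes"
        return 0
    fi
    echo "CHECK: TRACK_INTERFACES_ONLY_WITH_GROUPS is '${value}' in $file"
    return 1
}

# Usage on each node:
#   check_mpathd_defaults
```

Running it on both the failing node and a healthy node gives a quick way to spot a difference in this one setting; the other checklist items still need to be checked by hand.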

Mike

joaotelles
Level 4

Can I kill the process and start it again with -a as a workaround?

# ps -ef | grep in.mpathd
    root 11088 11069   0 10:31:18 pts/1       0:00 grep in.mpathd
    root  1865     1   0 21:26:38 ?           0:01 /usr/lib/inet/in.mpathd

Something like kill -9 1865

And then:

/usr/lib/inet/in.mpathd -a

Gaurav_S
Moderator
VIP Certified

Hi,

Yes, you can do that. Just to be on the safe side, I would suggest freezing the service groups before running it.

-a is a switch commonly used with in.mpathd; many Symantec articles show it, e.g.:

http://www.symantec.com/docs/TECH171008

http://www.symantec.com/docs/TECH137947

 

G

joaotelles
Level 4

I tried, but no luck:

root@dp-node9 # ps -ef | grep mpath
    root  4115  2887   0 14:31:14 pts/2       0:00 grep mpath
    root  1914     1   0 13:50:10 ?           0:00 /usr/lib/inet/in.mpathd
root@dp-node9 #
root@dp-node9 #
root@dp-node9 #
root@dp-node9 # kill -9 1914
root@dp-node9 # /usr/lib/inet/in.mpathd -a
root@dp-node9 # ps -ef | grep mpath
    root  4125  2887   0 14:31:31 pts/2       0:00 grep mpath
    root  4124     1   0 14:31:29 ?           0:00 /usr/lib/inet/in.mpathd
 

=====

Do you think this can cause problems when, for example, starting this resource?

        IPMultiNICB mdm_PubIP (
                BaseResName = MultiNICB_Pub
                Address = "10.129.68.23"
                NetMask = "255.255.255.192"
                )
 

 


 

 

Setu_Gupta
Level 3
Accredited

Are you seeing that error repeatedly?

In my opinion, that error must have occurred only once, because the in.mpathd daemon was not yet running at that point. The MultiNICB agent detected this and tried to restart the daemon because the UseMpathd attribute was set to 1.

Ideally, that should not cause any problem while bringing the corresponding IPMultiNICB resource (in this case, mdm_PubIP) online.
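To answer the "repeatedly?" question with a number, one option is to count how often message ID V-16-10001-6505 shows up in the log. This is only a sketch: the function name `count_mpathd_errors` is made up, and the default log path is an assumption about a typical VCS install, so pass the real path if yours differs.

```shell
# Hypothetical helper: print how many times the "mpathd process does not
# exist" message (V-16-10001-6505) appears in a VCS log file.
count_mpathd_errors() {
    # Assumed default location of the engine log; override with $1.
    log=${1:-/var/VRTSvcs/log/engine_A.log}
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 2
    fi
    # grep -c prints 0 and exits non-zero when nothing matches;
    # mask that status so a clean log is not reported as a failure.
    grep -c 'V-16-10001-6505' "$log" || true
}

# Usage:
#   count_mpathd_errors
#   count_mpathd_errors /path/to/other.log
```

A count of 1 right after the reboot would fit the explanation above; a count that keeps growing on every monitor cycle would suggest the daemon is not staying up.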