Forum Discussion

janet_marshal
19 years ago

VCS IPmultiNICB Errors

Hi,

We are running a VCS 4.1 symmetric cluster on two nodes.

I installed the MP1 patch for Veritas Foundation Suite 4.1. After that, the Veritas license showed as expired and my cluster was unable to come online.

According to Veritas, it is a known bug and they are working on it. Meanwhile, they gave me a temporary license, which I applied to the cluster with # vxlicinst -k.

Now the cluster is up, but it is unable to load the service groups, and it seems to be because of the IPMultiNICB resource.

I would appreciate it if anyone could help me with this. Here are the error messages:

2006/01/03 18:05:08 VCS WARNING V-16-1-10023 Agent IPMultiNICB not sending alive messages since Tue Jan 3 18:02:55 2006

2006/01/03 18:05:08 VCS WARNING V-16-1-53025 Agent IPMultiNICB has faulted; ipm connection was lost; restarting the agent

2006/01/03 18:05:08 VCS NOTICE V-16-1-10016 Agent /opt/VRTSvcs/bin/IPMultiNICB/IPMultiNICBAgent for resource type IPMultiNICB successfully started at Tue Jan 3 18:05:08 2006

2006/01/03 18:07:20 VCS WARNING V-16-1-10023 Agent IPMultiNICB not sending alive messages since Tue Jan 3 18:05:08 2006

2006/01/03 18:07:20 VCS WARNING V-16-1-53025 Agent IPMultiNICB has faulted; ipm connection was lost; restarting the agent

2006/01/03 18:07:20 VCS NOTICE V-16-1-10016 Agent /opt/VRTSvcs/bin/IPMultiNICB/IPMultiNICBAgent for resource type IPMultiNICB successfully started at Tue Jan 3 18:07:20 2006

2006/01/03 18:09:32 VCS WARNING V-16-1-10023 Agent IPMultiNICB not sending alive messages since Tue Jan 3 18:07:20 2006

2006/01/03 18:09:32 VCS WARNING V-16-1-53025 Agent IPMultiNICB has faulted; ipm connection was lost; restarting the agent

2006/01/03 18:09:32 VCS NOTICE V-16-1-10016 Agent /opt/VRTSvcs/bin/IPMultiNICB/IPMultiNICBAgent for resource type IPMultiNICB successfully started at Tue Jan 3 18:09:32 2006

2006/01/03 18:11:44 VCS WARNING V-16-1-10023 Agent IPMultiNICB not sending alive messages since Tue Jan 3 18:09:32 2006

2006/01/03 18:11:44 VCS WARNING V-16-1-53025 Agent IPMultiNICB has faulted; ipm connection was lost; restarting the agent

2006/01/03 18:11:44 VCS NOTICE V-16-1-10016 Agent /opt/VRTSvcs/bin/IPMultiNICB/IPMultiNICBAgent for resource type IPMultiNICB successfully started at Tue Jan 3 18:11:44 2006

2006/01/03 18:13:56 VCS WARNING V-16-1-10023 Agent IPMultiNICB not sending alive messages since Tue Jan 3 18:11:44 2006

2006/01/03 18:13:56 VCS WARNING V-16-1-53025 Agent IPMultiNICB has faulted; ipm connection was lost; restarting the agent

2006/01/03 18:13:56 VCS NOTICE V-16-1-10016 Agent /opt/VRTSvcs/bin/IPMultiNICB/IPMultiNICBAgent for resource type IPMultiNICB successfully started at Tue Jan 3 18:13:56 2006

2006/01/03 18:15:25 VCS INFO V-16-6-15004 (EBS) hatrigger:Failed to send trigger for unable_to_restart_agent; script doesn't exist

2006/01/03 18:16:08 VCS WARNING V-16-1-10023 Agent IPMultiNICB not sending alive messages since Tue Jan 3 18:13:56 2006


2006/01/03 18:16:08 VCS WARNING V-16-1-53025 Agent IPMultiNICB has faulted; ipm connection was lost; restarting the agent

2006/01/03 18:16:08 VCS ERROR V-16-1-10008 Agent IPMultiNICB has faulted 6 times since Tue Jan 3 18:05:08 2006

2006/01/03 18:16:08 VCS ERROR V-16-1-10009 Agent IPMultiNICB has faulted 6 times in less than 950 seconds -- Will not attempt to restart. Correct the problem and use haagent -start to start the agent

2006/01/03 18:16:08 VCS INFO V-16-6-15004 (DB) hatrigger:Failed to send trigger for unable_to_restart_agent; script doesn't exist
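To make the restart pattern in the log above easier to see, here is a minimal Python sketch that extracts the V-16-1-53025 fault entries and measures the gaps between them. The helper names (`parse`, `fault_intervals`) and the line format assumed by the regular expression are illustrative assumptions based only on the lines shown in this post, not a documented VCS log schema.

```python
import re
from datetime import datetime

# The six fault lines, copied from the log above.
LOG = """\
2006/01/03 18:05:08 VCS WARNING V-16-1-53025 Agent IPMultiNICB has faulted; ipm connection was lost; restarting the agent
2006/01/03 18:07:20 VCS WARNING V-16-1-53025 Agent IPMultiNICB has faulted; ipm connection was lost; restarting the agent
2006/01/03 18:09:32 VCS WARNING V-16-1-53025 Agent IPMultiNICB has faulted; ipm connection was lost; restarting the agent
2006/01/03 18:11:44 VCS WARNING V-16-1-53025 Agent IPMultiNICB has faulted; ipm connection was lost; restarting the agent
2006/01/03 18:13:56 VCS WARNING V-16-1-53025 Agent IPMultiNICB has faulted; ipm connection was lost; restarting the agent
2006/01/03 18:16:08 VCS WARNING V-16-1-53025 Agent IPMultiNICB has faulted; ipm connection was lost; restarting the agent
"""

# Assumed layout: "<date> <time> VCS <SEVERITY> <message id> <text>".
LINE_RE = re.compile(
    r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) VCS (\w+) (V-\d+-\d+-\d+) (.*)$"
)


def parse(log_text):
    """Return (timestamp, severity, message_id, text) for each line that matches."""
    entries = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m.group(1), "%Y/%m/%d %H:%M:%S")
            entries.append((ts, m.group(2), m.group(3), m.group(4)))
    return entries


def fault_intervals(entries, fault_id="V-16-1-53025"):
    """Seconds between consecutive entries carrying the given message id."""
    times = [ts for ts, _, mid, _ in entries if mid == fault_id]
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


intervals = fault_intervals(parse(LOG))
print(intervals)       # [132, 132, 132, 132, 132]
print(sum(intervals))  # 660
```

On these lines the agent faults every 132 seconds, and the six faults span 660 seconds, which is inside the 950 second window named in V-16-1-10009.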
  • Exact same problem for me on a fresh install of VCS 4.1 with MP1 on Solaris 10. Here's the fix:

    hastatus -sum
    haconf -dump
    haconf -makerw
    haattr -add IPMultiNICB ContainerName
    hatype -modify IPMultiNICB ArgList -add ContainerName
    haconf -dump -makero
    haagent -start IPMultiNICB -sys "node1"
    haagent -start IPMultiNICB -sys "node2"
    ...

    It seems something broke in MP1 regarding zones.
    Hope this helps.
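For clusters with more than two nodes, the sequence in the post above can be generated per node. The following Python sketch only builds and prints the command lines and does not execute anything. The function name `containername_fix_commands` is hypothetical, the commands and their order are reproduced as posted, and the node names are placeholders taken from the post.

```python
def containername_fix_commands(nodes, agent_type="IPMultiNICB", attr="ContainerName"):
    """Build the command sequence from the post, one haagent -start per node."""
    cmds = [
        ["hastatus", "-sum"],
        ["haconf", "-dump"],
        ["haconf", "-makerw"],                       # open the configuration for writing
        ["haattr", "-add", agent_type, attr],        # add the attribute to the type
        ["hatype", "-modify", agent_type, "ArgList", "-add", attr],
        ["haconf", "-dump", "-makero"],              # save and close the configuration
    ]
    cmds += [["haagent", "-start", agent_type, "-sys", node] for node in nodes]
    return cmds


# Dry run: print the commands for review instead of running them.
for cmd in containername_fix_commands(["node1", "node2"]):
    print(" ".join(cmd))
```

Reviewing the printed list before running anything on a live cluster keeps the change deliberate, since the sequence modifies the cluster configuration.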
  • Hi Janet,

    Please compare your types.cf with the types.cf from MP1 (installed in /etc/VRTSvcs/conf). There might be changes to the IPMultiNICB agent type that are necessary for the agent to come up.

    Regards,
    Carsten
  • Hi Carsten Hennig,

    Thanks for the reply, I appreciate your help. I tried to find out what the difference is between these two files. It seems only the IfconfigTwice parameter is set to 0 in the MP1 file (in /etc/VRTSvcs/conf), while in types.cf it has no value.

    If you want to have a look at the two files, I am including the differences below. Your response is highly appreciated.
    For each type, the first entry is from types.cf and the second is from types.cf (MP1).
    type IP (
    static str ArgList[] = { Device, Address, NetMask, Options, ArpDelay, IfconfigTwice }
    str Device
    str Address
    str NetMask
    str Options
    int ArpDelay = 1
    int IfconfigTwice


    )

    type IP (
    static keylist SupportedActions = { "device.vfd", "route.vfd" }
    static str ArgList[] = { Device, Address, NetMask, Options, ArpDelay, IfconfigTwice, ContainerName }
    str Device
    str Address
    str NetMask
    str Options
    int ArpDelay = 1
    int IfconfigTwice = 0
    str ContainerName
    )





    type IPMultiNIC (
    static int MonitorTimeout = 120
    static str ArgList[] = { "MultiNICResName:Device", Address, NetMask, "MultiNICResName:ArpDelay", Options, "MultiNICResName:Probed", MultiNICResName, IfconfigTwice }
    str Address
    str NetMask
    str Options
    str MultiNICResName
    int IfconfigTwice
    )


    type IPMultiNIC (
    static str ArgList[] = { "MultiNICResName:Device", Address, NetMask, "MultiNICResName:ArpDelay", Options, "MultiNICResName:Probed", MultiNICResName, IfconfigTwice, ContainerName }
    static int MonitorTimeout = 120
    str Address
    str NetMask
    str Options
    str MultiNICResName
    int IfconfigTwice = 0
    str ContainerName
    )






    type IPMultiNICB (
    static str ArgList[] = { BaseResName, Address, NetMask, DeviceChoice }
    str BaseResName
    str Address
    str NetMask
    str DeviceChoice = 0
    )



    type IPMultiNICB (
    static str ArgList[] = {BaseResName, Address, NetMask, DeviceChoice, ContainerName}

    str BaseResName
    str Address
    str NetMask
    str DeviceChoice = "0"
    str ContainerName
    )




    type MultiNICA (
    static int MonitorTimeout = 300
    static int OfflineMonitorInterval = 60
    static str ArgList[] = { Device, NetMask, ArpDelay, RetestInterval, Options, RouteOptions, PingOptimize, MonitorOnly, IfconfigTwice, HandshakeInterval, NetworkHosts }
    static str Operations = None
    str Device{}
    str NetMask
    int ArpDelay = 1
    int RetestInterval = 5
    str Options
    str RouteOptions
    int PingOptimize = 1
    int IfconfigTwice
    int HandshakeInterval = 20
    str NetworkHosts[]
    )



    type MultiNICA (
    static str ArgList[] = { Device, NetMask, ArpDelay, RetestInterval, Options, RouteOptions, PingOptimize, MonitorOnly, IfconfigTwice, HandshakeInterval, NetworkHosts}
    static int OfflineMonitorInterval = 60
    static int MonitorTimeout = 300
    static str Operations = None
    str Device{}
    str NetMask
    int ArpDelay = 1
    int RetestInterval = 5
    str Options
    str RouteOptions
    int PingOptimize = 1
    int IfconfigTwice = 0
    int HandshakeInterval = 20
    str NetworkHosts[]
    )



    type MultiNICB (
    static int MonitorInterval = 10
    static int OfflineMonitorInterval = 60
    static str ArgList[] = { UseMpathd, MpathdCommand, ConfigCheck, MpathdRestart, Device, NetworkHosts, LinkTestRatio, IgnoreLinkStatus, NetworkTimeout, OnlineTestRepeatCount, OfflineTestRepeatCount, NoBroadcast, DefaultRouter, Failback }
    static str Operations = None
    int UseMpathd
    str MpathdCommand = "/sbin/in.mpathd"
    int ConfigCheck = 1
    int MpathdRestart = 1
    str Device{}
    str NetworkHosts[]
    int LinkTestRatio = 1
    int IgnoreLinkStatus = 1
    int NetworkTimeout = 100
    int OnlineTestRepeatCount = 3
    int OfflineTestRepeatCount = 3
    int NoBroadcast
    str DefaultRouter = "0.0.0.0"
    int Failback
    )


    type MultiNICB (

    static int MonitorInterval = 10
    static int OfflineMonitorInterval = 60
    static int MonitorTimeout = 60
    static str Operations = None

    static str ArgList[] = { UseMpathd, MpathdCommand, ConfigCheck, MpathdRestart, Device, NetworkHosts, LinkTestRatio, IgnoreLinkStatus, NetworkTimeout, OnlineTestRepeatCount, OfflineTestRepeatCount, NoBroadcast, DefaultRouter, Failback}

    int UseMpathd = 0
    str MpathdCommand = "/sbin/in.mpathd"

    int ConfigCheck = 1
    int MpathdRestart = 1

    str Device{}
    str NetworkHosts[]

    int LinkTestRatio = 1
    int IgnoreLinkStatus = 1
    int NetworkTimeout = 100

    int OnlineTestRepeatCount = 3
    int OfflineTestRepeatCount = 3

    int NoBroadcast = 0

    str DefaultRouter = "0.0.0.0"

    int Failback = 0
    )





    type NIC (
    static int OfflineMonitorInterval = 60
    static str ArgList[] = { Device, NetworkType, PingOptimize, NetworkHosts }
    static str Operations = None
    str Device
    str NetworkType
    int PingOptimize = 1
    str NetworkHosts[]
    )


    type NIC (
    static keylist SupportedActions = { "device.vfd" }
    static str ArgList[] = { Device, NetworkType, PingOptimize, NetworkHosts}
    static int OfflineMonitorInterval = 60
    static str Operations = None
    str Device
    str NetworkType
    int PingOptimize = 1
    str NetworkHosts[]
    )
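The ArgList differences in the listings above can also be pulled out mechanically. Here is a minimal Python sketch that compares the ArgList of each type between two types.cf texts. The parser is an assumption-laden simplification: it expects each `type Name (` header and each ArgList on a single line, as in the listings in this post, and the function names `arglists` and `added_args` are hypothetical. The sample data is the IPMultiNICB pair copied from above.

```python
import re

OLD = """\
type IPMultiNICB (
static str ArgList[] = { BaseResName, Address, NetMask, DeviceChoice }
str BaseResName
str Address
str NetMask
str DeviceChoice = 0
)
"""

NEW = """\
type IPMultiNICB (
static str ArgList[] = {BaseResName, Address, NetMask, DeviceChoice, ContainerName}
str BaseResName
str Address
str NetMask
str DeviceChoice = "0"
str ContainerName
)
"""


def arglists(types_cf_text):
    """Map each type name to its ArgList entries (single-line ArgList assumed)."""
    result = {}
    current = None
    for raw in types_cf_text.splitlines():
        line = raw.strip()
        header = re.match(r"type\s+(\w+)\s*\(", line)
        if header:
            current = header.group(1)
            continue
        args = re.search(r"ArgList\[\]\s*=\s*\{(.*)\}", line)
        if args and current:
            result[current] = [
                a.strip().strip('"') for a in args.group(1).split(",") if a.strip()
            ]
    return result


def added_args(old_text, new_text):
    """Return, per type, the ArgList entries present in new_text but not in old_text."""
    old, new = arglists(old_text), arglists(new_text)
    added = {}
    for type_name, args in new.items():
        extra = [a for a in args if a not in old.get(type_name, [])]
        if extra:
            added[type_name] = extra
    return added


print(added_args(OLD, NEW))  # {'IPMultiNICB': ['ContainerName']}
```

In the listings above, the IP, IPMultiNIC, and IPMultiNICB types all gain ContainerName in their MP1 ArgList, in addition to the IfconfigTwice default.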
  • Hi,

    if these are all the differences between the two types.cf files, you should stop your cluster, replace the types.cf with the new one from MP1 and start over again.

    Regards,
    Carsten
  • I am having the exact same problem. After upgrading to VCS 4.1 MP1, my IPMultiNICB stopped working. I have asked Veritas support how to back out of MP1 so that I can have redundancy on my network interfaces.