Forum Discussion

symsonu
Level 6
9 years ago

MultiNICB resource faulted and not getting cleared

Hi Team,

 

I am seeing a MultiNICB resource fault, as shown below:

D  Ossfs           Proxy                ossfs_p1             et-coreg-admin2

D  PubLan          MultiNICB            pub_mnic             et-coreg-admin2

D  Sybase1         Proxy                syb1_p1              et-coreg-admin2

 

 

 

pub_mnic is faulted, and so in turn are the Proxy resources that mirror the status of the MultiNICB resource.

 

 

The following errors were logged on 3 June:

 

Jun  3 10:39:17 et-coreg-admin2 in.mpathd[6604]: [ID 168056 daemon.error] All Interfaces in group pub_mnic have failed

Jun  3 10:39:18 et-coreg-admin2 Had[6102]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource pub_mnic (Owner: Unspecified, Group: PubLan) is FAULTED (timed out) on sys et-coreg-admin2

 

 

 

 

As of now, the interfaces seem OK and the network is OK.

 

I want to clear this resource. Since it is a persistent resource, it should have recovered on its own once the network issue was resolved, but the fault is not clearing.

 

# ifconfig -a
lo0: flags=1001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8131 index 1
        inet 117.0.0.1 netmask ff000000
bnxe0: flags=19040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,FAILED> mtu 1500 index 1
        inet 10.106.111.66 netmask ffffff80 broadcast 10.106.111.117
        groupname pub_mnic
        ether 14:58:d0:54:18:18
bnxe0:1: flags=11000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FAILED> mtu 1500 index 1
        inet 10.106.111.70 netmask ffffff80 broadcast 10.106.111.117
bnxe1: flags=19040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,FAILED> mtu 1500 index 3
        inet 10.106.111.68 netmask ffffff80 broadcast 10.106.111.117
        groupname pub_mnic
        ether 14:58:d0:54:18:1c

 

 

# hares -display pub_mnic
#Resource    Attribute              System          Value
pub_mnic     Group                  global          PubLan
pub_mnic     Type                   global          MultiNICB
pub_mnic     AutoStart              global          1
pub_mnic     Critical               global          1
pub_mnic     Enabled                global          1
pub_mnic     LastOnline             global          admin1
pub_mnic     MonitorOnly            global          0
pub_mnic     ResourceOwner          global
pub_mnic     TriggerEvent           global          0
pub_mnic     ArgListValues          admin1 UseMpathd   1       1       MpathdCommand   1       /usr/lib/inet/in.mpathd ConfigCheck     1       1       MpathdRestart   1       1       Device  4       bnxe0   0       bnxe1   1       NetworkHosts    1       10.106.111.51   LinkTestRatio   1       1       IgnoreLinkStatus        1       1       NetworkTimeout  1       100     OnlineTestRepeatCount   1       3       OfflineTestRepeatCount  1       3       NoBroadcast     1       0       DefaultRouter   1       0.0.0.0 Failback        1       0       GroupName       1       ""      Protocol        1       IPv4
pub_mnic     ArgListValues          admin1 UseMpathd   1       1       MpathdCommand   1       /usr/lib/inet/in.mpathd ConfigCheck     1       1       MpathdRestart   1       1       Device  4       bnxe0   0       bnxe1   1       NetworkHosts    1       10.106.111.51   LinkTestRatio   1       1       IgnoreLinkStatus        1       1       NetworkTimeout  1       100     OnlineTestRepeatCount   1       3       OfflineTestRepeatCount  1       3       NoBroadcast     1       0       DefaultRouter   1       0.0.0.0 Failback        1       0       GroupName       1       ""      Protocol        1       IPv4
pub_mnic     ConfidenceLevel        admin1 0
pub_mnic     ConfidenceLevel        admin1 0
pub_mnic     ConfidenceMsg          admin1
pub_mnic     ConfidenceMsg          admin1
pub_mnic     Flags                  admin1
pub_mnic     Flags                  admin1
pub_mnic     IState                 admin1 not waiting
pub_mnic     IState                 admin1 not waiting
pub_mnic     MonitorMethod          admin1 Traditional
pub_mnic     MonitorMethod          admin1 Traditional
pub_mnic     Probed                 admin1 1
pub_mnic     Probed                 admin1 1
pub_mnic     Start                  admin1 0
pub_mnic     Start                  admin1 0
pub_mnic     State                  admin1 ONLINE
pub_mnic     State                  admin1 FAULTED
pub_mnic     ComputeStats           global          0
pub_mnic     ConfigCheck            global          1
pub_mnic     DefaultRouter          global          0.0.0.0
pub_mnic     Failback               global          0
pub_mnic     GroupName              global
pub_mnic     IgnoreLinkStatus       global          1
pub_mnic     LinkTestRatio          global          1
pub_mnic     MpathdCommand          global          /usr/lib/inet/in.mpathd
pub_mnic     MpathdRestart          global          1
pub_mnic     NetworkHosts           global          10.106.111.51
pub_mnic     NetworkTimeout         global          100
pub_mnic     NoBroadcast            global          0
pub_mnic     OfflineTestRepeatCount global          3
pub_mnic     OnlineTestRepeatCount  global          3
pub_mnic     Protocol               global          IPv4
pub_mnic     TriggerResStateChange  global          0
pub_mnic     UseMpathd              global          1
pub_mnic     ContainerInfo          admin1 Type                Name            Enabled
pub_mnic     ContainerInfo          admin1 Type                Name            Enabled
pub_mnic     Device                 admin1 bnxe0       0       bnxe1   1
pub_mnic     Device                 admin1 bnxe0       0       bnxe1   1
pub_mnic     MonitorTimeStats       admin1 Avg 0       TS
pub_mnic     MonitorTimeStats       admin1 Avg 0       TS
pub_mnic     ResourceInfo           admin1 State       Valid   Msg             TS
pub_mnic     ResourceInfo           admin1 State       Stale   Msg             TS

 

 

Please help me resolve this as soon as possible.

 

  • The FAILED flag is still present on the interfaces in your ifconfig output (quoted below). Clear the FAILED flag at the OS level by detaching and then reattaching each interface (if_mpadm -d <interface>, then if_mpadm -r <interface>), and then try clearing the fault again.

     

    # ifconfig -a
    lo0: flags=1001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8131 index 1
            inet 117.0.0.1 netmask ff000000
    bnxe0: flags=19040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,FAILED> mtu 1500 index 1
            inet 10.106.111.66 netmask ffffff80 broadcast 10.106.111.117
            groupname pub_mnic
            ether 14:58:d0:54:18:18
    bnxe0:1: flags=11000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FAILED> mtu 1500 index 1
            inet 10.106.111.70 netmask ffffff80 broadcast 10.106.111.117
    bnxe1: flags=19040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,FAILED> mtu 1500 index 3
            inet 10.106.111.68 netmask ffffff80 broadcast 10.106.111.117
            groupname pub_mnic
            ether 14:58:d0:54:18:1c
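
To check which interfaces still carry the FAILED flag before and after the detach/reattach, the ifconfig output can be filtered. This is only a minimal sketch: the function name `failed_ifaces` is made up for illustration, and it assumes the Solaris-style `name: flags=N<FLAG,FLAG,...>` header lines shown above.

```shell
# Hypothetical helper: read `ifconfig -a` output on stdin and print the
# name of every interface whose flag list contains the exact flag FAILED.
# NOFAILOVER is not matched, because flags are compared as whole tokens.
failed_ifaces() {
    awk '
        $2 ~ /^flags=/ {
            name = $1
            sub(/:$/, "", name)          # "bnxe0:1:" becomes "bnxe0:1"
            flags = $0
            sub(/^[^<]*</, "", flags)    # drop everything up to "<"
            sub(/>.*$/, "", flags)       # drop ">" and everything after it
            n = split(flags, f, ",")
            for (i = 1; i <= n; i++) {
                if (f[i] == "FAILED") {
                    print name
                    break
                }
            }
        }
    '
}
```

Run it as `ifconfig -a | failed_ifaces`. If it prints nothing, no interface is flagged FAILED any more.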

  • Hello Dev,

     

    Is there any impact from the detach and reattach?

    Can we run it directly, or do we need to take precautions before executing it?

     

     

    Regards

    S

  • The IP addresses associated with the interface will be unavailable during this process. Afterwards you will need to clear the fault and bring the resources online to bring the IPs back.
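
The whole sequence discussed in this thread can be written out as a plan to review before touching the cluster. This is a rough sketch, not a tested procedure: the function name `mnicb_recovery_plan` is hypothetical, it only prints commands and never runs them, and the `hares -probe` step is an addition on the assumption that a persistent resource may need a fresh probe to pick up the recovered state. Note that `if_mpadm -d` may refuse to detach an interface when no other functioning interface is left in the IPMP group.

```shell
# Hypothetical helper: print (do NOT execute) the recovery steps for a
# faulted MultiNICB resource so they can be reviewed first.
# Usage: mnicb_recovery_plan RESOURCE SYSTEM INTERFACE...
mnicb_recovery_plan() {
    if [ "$#" -lt 3 ]; then
        echo "usage: mnicb_recovery_plan RESOURCE SYSTEM INTERFACE..." >&2
        return 1
    fi
    res=$1
    sys=$2
    shift 2
    # Detach and reattach each physical interface to reset the FAILED flag.
    for ifc in "$@"; do
        echo "if_mpadm -d $ifc"
        echo "if_mpadm -r $ifc"
    done
    # Confirm the flag is gone, then clear and re-probe the VCS resource.
    echo "ifconfig -a"
    echo "hares -clear $res -sys $sys"
    echo "hares -probe $res -sys $sys"
    echo "hares -state $res"
}
```

For the names in this thread the call would be `mnicb_recovery_plan pub_mnic et-coreg-admin2 bnxe0 bnxe1`. Pass the physical interfaces only, not logical ones such as bnxe0:1. Any service groups that went offline would still have to be brought back afterwards, for example with `hagrp -online <group> -sys <system>`. Those group names are not shown in this thread.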