VCS Warning errors in IPMultiNICB

Mark_Davies
Level 3

Hi,

I am using VCS 5.1 SP1 on two Solaris 10 servers, with a Solaris 9 container as a service. I have three networks on the physical hosts using link-based IP, and VCS is configured to assign an IP address on each of the three networks to the Solaris 9 container.

I am receiving the following errors every 30 seconds in the IPMultiNICB logs:

2012/02/08 15:09:03 VCS WARNING V-16-10001-6559 IPMultiNICB:ipb_backup:monitor:Unknown Protocol () type. Set Protocol to default (IPv4).
2012/02/08 15:09:04 VCS WARNING V-16-10001-6559 IPMultiNICB:ipb_ukcsn:monitor:Unknown Protocol () type. Set Protocol to default (IPv4).
2012/02/08 15:09:04 VCS WARNING V-16-10001-6559 IPMultiNICB:ipb_prod:monitor:Unknown Protocol () type. Set Protocol to default (IPv4).
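For anyone wanting to confirm which resources are logging the warning and how often, here is a minimal Python sketch that splits these lines into fields and counts warnings per resource. The regular expression is inferred only from the three sample lines above, so other VCS messages may not match it; the function names are made up for illustration.

```python
import re
from collections import Counter

# Layout inferred from the sample lines in this post:
# date time VCS severity message-id agent:resource:entry_point:text
LINE_RE = re.compile(
    r"^(?P<date>\d{4}/\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"VCS (?P<severity>\w+) (?P<msgid>V-\d+-\d+-\d+) "
    r"(?P<agent>[^:]+):(?P<resource>[^:]+):(?P<entry_point>[^:]+):(?P<text>.*)$"
)


def parse_line(line):
    """Return a dict of fields for one agent log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def warnings_per_resource(lines):
    """Count WARNING lines per resource name."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record and record["severity"] == "WARNING":
            counts[record["resource"]] += 1
    return counts


SAMPLE = [
    "2012/02/08 15:09:03 VCS WARNING V-16-10001-6559 IPMultiNICB:ipb_backup:monitor:Unknown Protocol () type. Set Protocol to default (IPv4).",
    "2012/02/08 15:09:04 VCS WARNING V-16-10001-6559 IPMultiNICB:ipb_ukcsn:monitor:Unknown Protocol () type. Set Protocol to default (IPv4).",
    "2012/02/08 15:09:04 VCS WARNING V-16-10001-6559 IPMultiNICB:ipb_prod:monitor:Unknown Protocol () type. Set Protocol to default (IPv4).",
]

print(dict(warnings_per_resource(SAMPLE)))
```

The 30-second repeat matches the IPMultiNICB MonitorInterval of 30 in the types file, which fits the warning coming from the monitor entry point on every cycle.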

Here is what is set in the main.cf file for each MultiNICB and IPMultiNICB resource, followed by the types file entries:

IPMultiNICB ipb_backup (
        Critical = 0
        BaseResName = mnb_backup
        Address = "10.6.241.83"
        NetMask = "255.255.255.0"
        )

IPMultiNICB ipb_prod (
        Critical = 0
        BaseResName = mnb_prod
        Address = "10.6.8.66"
        NetMask = "255.255.255.0"
        )

IPMultiNICB ipb_ukcsn (
        Critical = 0
        BaseResName = mnb_ukcsn
        Address @ecsclapmanu001 = "10.101.14.65"
        Address @ecscscomanu001 = "10.101.142.128"
        NetMask = "255.255.255.0"
        )

MultiNICB mnb_backup (
        Critical = 0
        UseMpathd = 1
        ConfigCheck = 0
        Device = { nxge2 = 0, nxge6 = 0 }
        GroupName = backup
        )

MultiNICB mnb_prod (
        Critical = 0
        UseMpathd = 1
        ConfigCheck = 0
        Device = { nxge0 = 0, nxge4 = 0 }
        GroupName = prod
        )

MultiNICB mnb_ukcsn (
        Critical = 0
        UseMpathd = 1
        ConfigCheck = 0
        Device = { nxge1 = 0, nxge5 = 0 }
        GroupName = ukcsn
        )

type IPMultiNICB (
        static int MonitorInterval = 30
        static int OnlineRetryLimit = 1
        static int ToleranceLimit = 1
        static str ArgList[] = { BaseResName, Address, NetMask, DeviceChoice, RouteOptions, PrefixLen, IgnoreMultiNICBFailure, "BaseResName:Protocol", Options }
        static int ContainerOpts{} = { RunInContainer=0, PassCInfo=1 }
        str BaseResName
        str Address
        str NetMask
        str DeviceChoice = 0
        str RouteOptions
        int PrefixLen
        int IgnoreMultiNICBFailure
        str Options
)

type MultiNICB (
        static int MonitorInterval = 10
        static int OfflineMonitorInterval = 60
        static str ArgList[] = { UseMpathd, MpathdCommand, ConfigCheck, MpathdRestart, Device, NetworkHosts, LinkTestRatio, IgnoreLinkStatus, NetworkTimeout, OnlineTestRepeatCount, OfflineTestRepeatCount, NoBroadcast, DefaultRouter, Failback, GroupName, Protocol }
        static str Operations = None
        int UseMpathd
        str MpathdCommand = "/usr/lib/inet/in.mpathd"
        int ConfigCheck = 1
        int MpathdRestart = 1
        str Device{}
        str NetworkHosts[]
        int LinkTestRatio = 1
        int IgnoreLinkStatus = 1
        int NetworkTimeout = 100
        int OnlineTestRepeatCount = 3
        int OfflineTestRepeatCount = 3
        int NoBroadcast
        str DefaultRouter = "0.0.0.0"
        int Failback
        str GroupName
        str Protocol = IPv4
)

Can anyone recommend a solution? The IP addresses are up.

Regards,

Mark Davies


mikebounds
Level 6
Partner Accredited

Hi Mark,

Hope you are well.

You did not upgrade your types files when you installed SP1. Your types file should contain a "Protocol" attribute.

So you need to extract any tuned settings from the types.cf file (custom restarts, retries, timeouts, etc.). Then copy the SP1 types files from /etc/VRTSvcs/conf to /etc/VRTSvcs/conf/config and reapply the tuned settings you extracted.

You will then need to bounce VCS.

Mike
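As a rough illustration of the "extract any tuned settings" step, the Python sketch below compares the scalar static attributes of two types.cf texts and lists the ones that differ. It is only a sketch under stated assumptions: the parsing is based on the layout of the type definitions quoted in this thread, it ignores array attributes such as ArgList, and the tuned MonitorInterval value of 45 in the sample is invented for the example.

```python
import re

# A type block starts with "type Name (" and ends at a line beginning with ")".
TYPE_RE = re.compile(r"^type\s+(\w+)\s*\((.*?)^\)", re.S | re.M)
# Scalar static attributes only, e.g. "static int MonitorInterval = 30".
STATIC_RE = re.compile(r"^\s*static\s+(?:int|str)\s+(\w+)\s*=\s*(\S+)\s*$", re.M)


def static_scalars(types_cf_text):
    """Map type name -> {static attribute: value} for scalar static attributes."""
    result = {}
    for name, body in TYPE_RE.findall(types_cf_text):
        result[name] = dict(STATIC_RE.findall(body))
    return result


def tuned_settings(current_text, shipped_text):
    """List (type, attribute, current value, shipped value) where the files differ."""
    current = static_scalars(current_text)
    shipped = static_scalars(shipped_text)
    diffs = []
    for type_name, attrs in current.items():
        for attr, value in attrs.items():
            shipped_value = shipped.get(type_name, {}).get(attr)
            if shipped_value != value:
                diffs.append((type_name, attr, value, shipped_value))
    return diffs


# Hypothetical "in use" file with one tuned value (45 is made up).
CURRENT = """type IPMultiNICB (
        static int MonitorInterval = 45
        static int ToleranceLimit = 1
        str Address
)
"""

# Hypothetical "as shipped" file with the default quoted in this thread.
SHIPPED = """type IPMultiNICB (
        static int MonitorInterval = 30
        static int ToleranceLimit = 1
        str Address
)
"""

print(tuned_settings(CURRENT, SHIPPED))
```

In practice the two texts would be read from the copy under /etc/VRTSvcs/conf/config and the shipped copy under /etc/VRTSvcs/conf, and any differences reviewed by hand before being reapplied.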

Gaurav_S
Moderator
VIP, Certified

Hi Mark,

I trust the types.cf content above is from the /etc/VRTSvcs/conf/config directory. This seems a little weird, as types.cf clearly sets the string to IPv4, so a protocol error doesn't make sense.

I checked the resource state definitions in the guide bundled with 5.1 SP1, and everything looks to be in shape.

I would like to know whether any manual changes have been made to the monitor script of the IPMultiNICB agent. Also, are you noticing any error messages in engine_A.log or the system messages?

 

Gaurav

Mark_Davies
Level 3

Hi Gaurav,

I have not made any changes to the monitoring scripts from what was installed.

There are no errors in either engine_A.log or the system messages; the only errors are in IPMultiNICB_A.log.

Regards,

Mark

Mark_Davies
Level 3

Here is what is in the zone's types.cf file:

type IPMultiNICB (
        static int ToleranceLimit = 1
        static int MonitorInterval = 30
        static int OnlineRetryLimit=1
        static str ArgList[] = { BaseResName, Address, NetMask, DeviceChoice, RouteOptions, PrefixLen, IgnoreMultiNICBFailure, "BaseResName:Protocol" }
        static int ContainerOpts{} = { RunInContainer=0, PassCInfo=1 }
        str BaseResName
        str Address
        str NetMask
        str DeviceChoice = 0
        str RouteOptions
        int PrefixLen
        int IgnoreMultiNICBFailure = 0
)

type MultiNICB (
        static int MonitorInterval = 10
        static int OfflineMonitorInterval = 60
        static str Operations = None
        static str ArgList[] = { UseMpathd, MpathdCommand, ConfigCheck, MpathdRestart, Device, NetworkHosts, LinkTestRatio, IgnoreLinkStatus, NetworkTimeout, OnlineTestRepeatCount, OfflineTestRepeatCount, NoBroadcast, DefaultRouter, Failback, GroupName, Protocol }
        int UseMpathd
        str MpathdCommand = "/usr/lib/inet/in.mpathd"
        int ConfigCheck = 1
        int MpathdRestart = 1
        str Device{}
        str NetworkHosts[]
        int LinkTestRatio = 1
        int IgnoreLinkStatus = 1
        int NetworkTimeout = 100
        int OnlineTestRepeatCount = 3
        int OfflineTestRepeatCount = 3
        int NoBroadcast
        str DefaultRouter = "0.0.0.0"
        int Failback
        str GroupName
        str Protocol
)

 

Does this need to change in the Solaris 9 zone?

mikebounds
Level 6
Partner Accredited

Sorry Mark, I was looking at the IPMultiNICB type in your post when looking for the Protocol attribute (as the error is shown against the IPMultiNICB resource, not the MultiNICB resource), but the Protocol attribute should be in the MultiNICB type, so I see you have upgraded the types file.

Can you check the output of "hares -display mnb_prod" to confirm the resource shows "IPv4" for the Protocol attribute? This should have been brought through from the types file.

Mike

mikebounds
Level 6
Partner Accredited

I would create an IPMultiNICB resource that runs in the global zone to see if you get the same error. If you do, then you know it has nothing to do with zones. If you don't, then quite possibly it is a bug or an undocumented step you need to do, as there is very little info on branded zones: there is just a short section in the SFHA Virtualisation Guide, which I presume you have read to get as far as you have.

Mike

Mark_Davies
Level 3

Hi Mike,

I am fine, thanks, and still working with EC!

I hope you are well.

Here is the output. As you can see, the MultiNICB resource is in the commonsg group, which is not part of the zone, and the IPMultiNICB resource is part of the zone, as it has to be:


bash-3.2# hares -display mnb_prod
#Resource    Attribute              System         Value
mnb_prod     Group                  global         commonsg
mnb_prod     Type                   global         MultiNICB
mnb_prod     AutoStart              global         1
mnb_prod     Critical               global         0
mnb_prod     Enabled                global         1
mnb_prod     LastOnline             global         ecsclapmanu001
mnb_prod     MonitorOnly            global         0
mnb_prod     ResourceOwner          global
mnb_prod     TriggerEvent           global         0
mnb_prod     ArgListValues          ecsclapmanu001 UseMpathd    1       1       MpathdCommand   1       /usr/lib/inet/in.mpathd ConfigCheck     1       0       MpathdRestart   1       1       Device  4       nxge0   0       nxge4   0       NetworkHosts    0       LinkTestRatio   1       1       IgnoreLinkStatus        1       1       NetworkTimeout  1       100     OnlineTestRepeatCount   1       3       OfflineTestRepeatCount  1       3       NoBroadcast     1       0       DefaultRouter   1       0.0.0.0 Failback        1       0       GroupName       1       prod    Protocol        1       IPv4
mnb_prod     ArgListValues          ecscscomanu001 UseMpathd    1       1       MpathdCommand   1       /usr/lib/inet/in.mpathd ConfigCheck     1       0       MpathdRestart   1       1       Device  4       nxge0   0       nxge4   0       NetworkHosts    0       LinkTestRatio   1       1       IgnoreLinkStatus        1       1       NetworkTimeout  1       100     OnlineTestRepeatCount   1       3       OfflineTestRepeatCount  1       3       NoBroadcast     1       0       DefaultRouter   1       0.0.0.0 Failback        1       0       GroupName       1       prod    Protocol        1       IPv4
mnb_prod     ConfidenceLevel        ecsclapmanu001 0
mnb_prod     ConfidenceLevel        ecscscomanu001 0
mnb_prod     ConfidenceMsg          ecsclapmanu001
mnb_prod     ConfidenceMsg          ecscscomanu001
mnb_prod     Flags                  ecsclapmanu001
mnb_prod     Flags                  ecscscomanu001
mnb_prod     IState                 ecsclapmanu001 not waiting
mnb_prod     IState                 ecscscomanu001 not waiting
mnb_prod     MonitorMethod          ecsclapmanu001 Traditional
mnb_prod     MonitorMethod          ecscscomanu001 Traditional
mnb_prod     Probed                 ecsclapmanu001 1
mnb_prod     Probed                 ecscscomanu001 1
mnb_prod     Start                  ecsclapmanu001 0
mnb_prod     Start                  ecscscomanu001 0
mnb_prod     State                  ecsclapmanu001 ONLINE
mnb_prod     State                  ecscscomanu001 ONLINE
mnb_prod     ComputeStats           global         0
mnb_prod     ConfigCheck            global         0
mnb_prod     DefaultRouter          global         0.0.0.0
mnb_prod     Device                 global         nxge0        0       nxge4   0
mnb_prod     Failback               global         0
mnb_prod     GroupName              global         prod
mnb_prod     IgnoreLinkStatus       global         1
mnb_prod     LinkTestRatio          global         1
mnb_prod     MpathdCommand          global         /usr/lib/inet/in.mpathd
mnb_prod     MpathdRestart          global         1
mnb_prod     NetworkHosts           global
mnb_prod     NetworkTimeout         global         100
mnb_prod     NoBroadcast            global         0
mnb_prod     OfflineTestRepeatCount global         3
mnb_prod     OnlineTestRepeatCount  global         3
mnb_prod     Protocol               global         IPv4
mnb_prod     TriggerResStateChange  global         0
mnb_prod     UseMpathd              global         1
mnb_prod     ContainerInfo          ecsclapmanu001 Type         Name            Enabled
mnb_prod     ContainerInfo          ecscscomanu001 Type         Name            Enabled
mnb_prod     MonitorTimeStats       ecsclapmanu001 Avg  0       TS
mnb_prod     MonitorTimeStats       ecscscomanu001 Avg  0       TS
mnb_prod     ResourceInfo           ecsclapmanu001 State        Valid   Msg             TS
mnb_prod     ResourceInfo           ecscscomanu001 State        Valid   Msg             TS

bash-3.2# hares -display ipb_prod
#Resource    Attribute              System         Value
ipb_prod     Group                  global         MGX1P
ipb_prod     Type                   global         IPMultiNICB
ipb_prod     AutoStart              global         1
ipb_prod     Critical               global         0
ipb_prod     Enabled                global         1
ipb_prod     LastOnline             global         ecsclapmanu001
ipb_prod     MonitorOnly            global         0
ipb_prod     ResourceOwner          global
ipb_prod     TriggerEvent           global         0
ipb_prod     ArgListValues          ecsclapmanu001 BaseResName  1       mnb_prod        Address 1       10.6.8.66       NetMask 1       255.255.255.0   DeviceChoice    1       0       RouteOptions    1       ""      PrefixLen       1       0       IgnoreMultiNICBFailure  1       0       BaseResName:Protocol    1       ""      Options 1       ""
ipb_prod     ArgListValues          ecscscomanu001 BaseResName  1       mnb_prod        Address 1       10.6.8.66       NetMask 1       255.255.255.0   DeviceChoice    1       0       RouteOptions    1       ""      PrefixLen       1       0       IgnoreMultiNICBFailure  1       0       BaseResName:Protocol    1       ""      Options 1       ""
ipb_prod     ConfidenceLevel        ecsclapmanu001 0
ipb_prod     ConfidenceLevel        ecscscomanu001 0
ipb_prod     ConfidenceMsg          ecsclapmanu001
ipb_prod     ConfidenceMsg          ecscscomanu001
ipb_prod     Flags                  ecsclapmanu001
ipb_prod     Flags                  ecscscomanu001
ipb_prod     IState                 ecsclapmanu001 not waiting
ipb_prod     IState                 ecscscomanu001 not waiting
ipb_prod     MonitorMethod          ecsclapmanu001 Traditional
ipb_prod     MonitorMethod          ecscscomanu001 Traditional
ipb_prod     Probed                 ecsclapmanu001 1
ipb_prod     Probed                 ecscscomanu001 1
ipb_prod     Start                  ecsclapmanu001 1
ipb_prod     Start                  ecscscomanu001 0
ipb_prod     State                  ecsclapmanu001 ONLINE
ipb_prod     State                  ecscscomanu001 OFFLINE
ipb_prod     Address                global         10.6.8.66
ipb_prod     BaseResName            global         mnb_prod
ipb_prod     ComputeStats           global         0
ipb_prod     DeviceChoice           global         0
ipb_prod     IgnoreMultiNICBFailure global         0
ipb_prod     NetMask                global         255.255.255.0
ipb_prod     Options                global
ipb_prod     PrefixLen              global         0
ipb_prod     ResourceInfo           global         State        Stale   Msg             TS
ipb_prod     RouteOptions           global
ipb_prod     TriggerResStateChange  global         0
ipb_prod     ContainerInfo          ecsclapmanu001 Type Zone    Name    ebs-manu51prod  Enabled 1
ipb_prod     ContainerInfo          ecscscomanu001 Type Zone    Name    ebs-manu51prod  Enabled 1
ipb_prod     MonitorTimeStats       ecsclapmanu001 Avg  0       TS
ipb_prod     MonitorTimeStats       ecscscomanu001 Avg  0       TS
bash-3.2#

Mark_Davies
Level 3

Hi Mike,

I have created an IPMultiNICB resource in the global zone, and it doesn't report any errors.

So it must be something to do with running the IPMultiNICB resource from within the zone.

Mark

mikebounds
Level 6
Partner Accredited

Hi Mark,

Could you just check that, when you zlogin into the local zone, there is a file called .vcspwd in root's home directory, and that you can run ha commands from the local zone (this will only work if .vcspwd exists and is correct)?

I may have identified why the issue is occurring. Normally the ArgListValues of a resource pass just the attributes of that resource. They don't have to, and I have seen agents that pass type attributes (OfflineTimeout etc.). Although this works, I have seen issues with it: for instance, you can't copy this type of resource properly in the 5.0 GUI, because the copy bombs out, as it seems to use the ArgListValues to create the resource. Your IPMultiNICB resource has this value in its ArgListValues:

BaseResName:Protocol    1       ""

So this presumably SHOULD take the resource named in BaseResName, which is the MultiNICB resource, and extract the Protocol attribute from it, but the value is "", where I think it should be IPv4. This might not be the case, so check your global IPMultiNICB resource to see if IPv4 is set there. If it is not set, then maybe the agent generates this on the fly, which could be why it works in the global zone and not for the agent running in the local zone; that is why I asked you to check that ha commands work in the local zone.

Mike
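To make the ArgListValues comparison easier to eyeball, here is a small Python sketch that turns an ArgListValues string into a dictionary. It assumes the layout seen in the "hares -display" output in this thread (attribute name, then a count, then that many values, with "" standing for an empty string); this is inferred from the output rather than taken from documentation, and it would break on values containing whitespace.

```python
def parse_arglist_values(text):
    """Parse 'Name count value...' groups into {name: [values]}."""
    tokens = text.split()
    parsed = {}
    i = 0
    while i < len(tokens):
        name = tokens[i]
        count = int(tokens[i + 1])          # number of values that follow
        values = tokens[i + 2 : i + 2 + count]
        # The display prints an empty string as a pair of double quotes.
        parsed[name] = ["" if value == '""' else value for value in values]
        i += 2 + count
    return parsed


# ArgListValues for ipb_prod as posted while the warnings were being logged.
FAILING = (
    'BaseResName 1 mnb_prod Address 1 10.6.8.66 NetMask 1 255.255.255.0 '
    'DeviceChoice 1 0 RouteOptions 1 "" PrefixLen 1 0 '
    'IgnoreMultiNICBFailure 1 0 BaseResName:Protocol 1 "" Options 1 ""'
)

args = parse_arglist_values(FAILING)
print(args["BaseResName:Protocol"])
```

An empty value for BaseResName:Protocol is consistent with the "Unknown Protocol ()" text in the warning, where nothing appears between the parentheses.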

 

Mark_Davies
Level 3

Hi Mike,

There is no .vcspwd file in the /root of the zone.

Regards,

Mark

Mark_Davies
Level 3

Mike,

How strange! I have restarted VCS and I am no longer seeing the errors, and hares -display is now showing the correct IPv4 value!

bash-3.2# hares -display ipb_prod
#Resource    Attribute              System         Value
ipb_prod     Group                  global         MGX1P
ipb_prod     Type                   global         IPMultiNICB
ipb_prod     AutoStart              global         1
ipb_prod     Critical               global         0
ipb_prod     Enabled                global         1
ipb_prod     LastOnline             global         ecsclapmanu001
ipb_prod     MonitorOnly            global         0
ipb_prod     ResourceOwner          global
ipb_prod     TriggerEvent           global         0
ipb_prod     ArgListValues          ecsclapmanu001 BaseResName  1       mnb_prod        Address 1       10.6.8.66       NetMask 1       255.255.255.0   DeviceChoice    1       0       RouteOptions    1       ""      PrefixLen       1       0       IgnoreMultiNICBFailure  1       0       BaseResName:Protocol    1       IPv4    Options 1       ""
ipb_prod     ArgListValues          ecscscomanu001 BaseResName  1       mnb_prod        Address 1       10.6.8.66       NetMask 1       255.255.255.0   DeviceChoice    1       0       RouteOptions    1       ""      PrefixLen       1       0       IgnoreMultiNICBFailure  1       0       BaseResName:Protocol    1       IPv4    Options 1       ""
ipb_prod     ConfidenceLevel        ecsclapmanu001 0
ipb_prod     ConfidenceLevel        ecscscomanu001 0
ipb_prod     ConfidenceMsg          ecsclapmanu001
ipb_prod     ConfidenceMsg          ecscscomanu001
ipb_prod     Flags                  ecsclapmanu001
ipb_prod     Flags                  ecscscomanu001
ipb_prod     IState                 ecsclapmanu001 not waiting
ipb_prod     IState                 ecscscomanu001 not waiting
ipb_prod     MonitorMethod          ecsclapmanu001 Traditional
ipb_prod     MonitorMethod          ecscscomanu001 Traditional
ipb_prod     Probed                 ecsclapmanu001 1
ipb_prod     Probed                 ecscscomanu001 1
ipb_prod     Start                  ecsclapmanu001 1
ipb_prod     Start                  ecscscomanu001 0
ipb_prod     State                  ecsclapmanu001 ONLINE
ipb_prod     State                  ecscscomanu001 OFFLINE
ipb_prod     Address                global         10.6.8.66
ipb_prod     BaseResName            global         mnb_prod
ipb_prod     ComputeStats           global         0
ipb_prod     DeviceChoice           global         0
ipb_prod     IgnoreMultiNICBFailure global         0
ipb_prod     NetMask                global         255.255.255.0
ipb_prod     Options                global
ipb_prod     PrefixLen              global         0
ipb_prod     ResourceInfo           global         State        Valid   Msg             TS
ipb_prod     RouteOptions           global
ipb_prod     TriggerResStateChange  global         0
ipb_prod     ContainerInfo          ecsclapmanu001 Type Zone    Name    ebs-manu51prod  Enabled 1
ipb_prod     ContainerInfo          ecscscomanu001 Type Zone    Name    ebs-manu51prod  Enabled 1
ipb_prod     MonitorTimeStats       ecsclapmanu001 Avg  0       TS
ipb_prod     MonitorTimeStats       ecscscomanu001 Avg  0       TS

mikebounds
Level 6
Partner Accredited

Mark,

So presumably this means ha commands don't work from the local zone.

To fix:

First, make sure the local zone can resolve the global zone by name. If it can't at the moment, this is why you won't have a .vcspwd file.

Then bounce the zone resource in VCS. When the zone resource comes online it should create the .vcspwd file, and you should be able to run ha commands from the local zone.

Mike
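The name-resolution check in the first step could be scripted along these lines. This is a hedged Python sketch, assuming a Python 3 interpreter is available wherever it is run (which may well not be true inside a Solaris 9 branded zone) and assuming the global zones' hostnames are the VCS system names shown earlier in the thread.

```python
import socket


def unresolved_names(names, resolver=socket.gethostbyname):
    """Return the names that the resolver fails to resolve."""
    failed = []
    for name in names:
        try:
            resolver(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            failed.append(name)
    return failed
```

Run from inside the local zone, a call such as unresolved_names(["ecsclapmanu001", "ecscscomanu001"]) returning an empty list would suggest both names resolve. Being able to ping an IP address does not prove this, since ping by address bypasses name resolution.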

mikebounds
Level 6
Partner Accredited

Do you have a .vcspwd file now, and are ha commands working from the local zone?

Mike

Mark_Davies
Level 3

Hi Mike,

I have put the servers' IP addresses into the local zone and proved I can ping the addresses. I have managed to bounce the zone, but I am still not getting that file.

Regards,

Mark

mikebounds
Level 6
Partner Accredited

Are you able to run "ha" commands from the local zone without being prompted for a password? This file is normally needed so that you have permission to run VCS commands from what is essentially an unauthorised node.

Also, just making sure you noticed that the file name starts with a dot, so you obviously need to use "ls -a" to list it, and that it should be in root's home directory (normally /root).

I believe the permissions work differently in VCS 6.0, so maybe Symantec brought that change in early and it already works differently as of 5.1 SP1.

Mike
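For completeness, the dot-file check can also be done without relying on "ls -a". The Python sketch below looks for .vcspwd directly by path; the function name is made up, and the demonstration uses a throwaway temporary directory rather than a real zone.

```python
import os
import tempfile


def find_vcspwd(home_dir):
    """Return the path of .vcspwd in home_dir if it exists as a file, else None."""
    path = os.path.join(home_dir, ".vcspwd")
    return path if os.path.isfile(path) else None


# Demonstration against a temporary directory standing in for root's home.
with tempfile.TemporaryDirectory() as demo_home:
    print(find_vcspwd(demo_home))                      # before the file exists
    open(os.path.join(demo_home, ".vcspwd"), "w").close()
    print(find_vcspwd(demo_home) is not None)          # after creating it
```

In the zone itself the argument would be root's home directory, which the post above says is normally /root.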