
VCS: after adding a new IPMultiNIC agent, service group can't switch from site B to site A

isaac_tsang
Level 3

Software:Veritas Storage Foundation Cluster File System 5.0MP3

Hardware:Sun Blade X6270

System:Sun Solaris 10

 

Description

The cluster has two groups, one Oracle agent and one Tomcat agent; every group has a local VIP and an external VIP.

After hastart, all groups, disk groups, agents, and resources come up without problems the first time.

NodeA can switch Group1 to NodeB (hagrp -switch Group1 -to NodeB).

But NodeB can't switch Group1 to NodeA (hagrp -switch Group1 -to NodeA); the process stalls while taking the external VIP offline.

hastatus:

NodeA:/# hastatus -sum

-- SYSTEM STATE
-- System               State                Frozen             

A  NodeA         RUNNING              0                   
A  NodeB         RUNNING              0                   

-- GROUP STATE
-- Group           System       Probed     AutoDisabled    State         

B  Group1          NodeA         Y          N               OFFLINE       
B  Group1          NodeB         Y          N               STOPPING|PARTIAL
B  Network         NodeA         Y          N               ONLINE        
B  Network         NodeB         Y          N               ONLINE        
B  Network_OM      NodeA         Y          N               ONLINE        
B  Network_OM      NodeB         Y          N               ONLINE        
B  Group2          NodeA         Y          N               ONLINE        
B  Group2          NodeB         Y          N               OFFLINE           
B  SNMPMasterAgent NodeA         Y          N               ONLINE        
B  SNMPMasterAgent NodeB         Y          N               ONLINE               
B  cvm             NodeA         Y          N               ONLINE        
B  cvm             NodeB         Y          N               ONLINE              
B  ora_DG          NodeA         Y          N               ONLINE        
B  ora_DG          NodeB         Y          N               ONLINE        

-- RESOURCES OFFLINING
-- Group           Type            Resource             System               IState

F  Group1          IPMultiNIC      Group1_OM_IP         NodeB         W_OFFLINE_PROPAGATE

 

 

Log file:

NodeB:/var/VRTSvcs/log# cat engine_A.log |grep Group1_OM_IP
2011/08/29 18:16:21 VCS NOTICE V-16-1-10300 Initiating Offline of Resource Group1_OM_IP (Owner: unknown, Group: FMMgrp) on System NodeB
2011/08/29 18:16:24 VCS ERROR V-16-2-13064 (NodeB) Agent is calling clean for resource(Group1_OM_IP) because the resource is up even after offline completed.
2011/08/29 18:16:26 VCS INFO V-16-2-13001 (NodeB) Resource(Group1_OM_IP): Output of the completed operation (clean)
2011/08/29 18:16:26 VCS INFO V-16-2-13068 (NodeB) Resource(Group1_OM_IP) -clean completed successfully.
2011/08/29 18:16:27 VCS ERROR V-16-2-13077 (NodeB) Agent is unable to offline resource(Group1_OM_IP). Administrative intervention may be required.
NodeB:/var/VRTSvcs/log# cat IPMultiNIC_A.log
2011/08/29 18:16:24 VCS ERROR V-16-2-13064 Thread(5) Agent is calling clean for resource(Group1_OM_IP) because the resource is up even after offline completed.
2011/08/29 18:16:26 VCS ERROR V-16-2-13068 Thread(5) Resource(Group1_OM_IP) -clean completed successfully.
2011/08/29 18:16:27 VCS ERROR V-16-2-13077 Thread(4) Agent is unable to offline resource(Group1_OM_IP). Administrative intervention may be required.
2011/08/29 18:16:28 VCS ERROR V-16-2-13067 Thread(4) Agent is calling clean for resource(Group1_IP) because the resource became OFFLINE unexpectedly, on its own.
2011/08/29 18:16:29 VCS ERROR

 

Group1_OM_IP is the external VIP.

Group1_IP is the local VIP.
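
To pull only the agent errors for one resource out of logs like the ones above, a small filter along these lines may help. It is a sketch only: `vcs_errors_for` is a made-up helper name, not a VCS command, and it assumes the line layout shown above (date, time, `VCS`, severity, message ID). On Solaris 10 the default `/usr/bin/awk` may not accept `-v`, so `nawk` or `/usr/xpg4/bin/awk` would likely be needed there.

```shell
# Print "time message-id" for every ERROR line that mentions the given resource.
# vcs_errors_for is a hypothetical helper, not part of VCS.
# Usage: vcs_errors_for RESOURCE LOGFILE
vcs_errors_for() {
    res=$1
    log=$2
    # Field 4 is the severity and field 5 the message ID in the layout above.
    # Matching on "(RESOURCE)" keeps Group1_IP from also matching Group1_OM_IP.
    awk -v res="$res" \
        '$4 == "ERROR" && index($0, "(" res ")") { print $2, $5 }' "$log"
}
```

For example, `vcs_errors_for Group1_OM_IP /var/VRTSvcs/log/engine_A.log` would list the times and message IDs of the offline and clean failures for the external VIP.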

 

Service Group1 and Network configuration in main.cf :

group Group1 (
        SystemList = { NodeA = 2, NodeB = 1 }
        AutoStartList = { NodeB }
        )

        IPMultiNIC Group1_IP (
                Address = "192.168.100.71"
                NetMask = "255.255.255.192"
                MultiNICResName = MultiNICA
                IfconfigTwice = 1
                )

        IPMultiNIC Group1_OM_IP (
                Address = "10.10.10.1"
                NetMask = "255.255.255.0"
                MultiNICResName = MultiNICA_OM
                IfconfigTwice = 1
                )

        ORACLE G1 (
                )

        Proxy Group1_NIC_PROXY (
                TargetResName = MultiNICA
                )

        Proxy Group1_OM_NIC_PROXY (
                TargetResName = MultiNICA_OM
                )

        Tomcat G1web (
                )


        requires group Group4 online global soft
        Group1_IP requires Group1_NIC_PROXY
        Group1_OM_IP requires Group1_OM_NIC_PROXY
        G1 requires Group1_IP
        G1web requires G1

 

        // resource dependency tree
        //
        //      group Group1
        //      {
        //      IPMultiNIC Group1_OM_IP
        //          {
        //          Proxy Group1_OM_NIC_PROXY
        //          }
        //      Tomcat G1web
        //          {
        //          ORACLE G1
        //              {
        //              IPMultiNIC Group1_IP
        //                  {
        //                  Proxy Group1_NIC_PROXY
        //                  }
        //              }
        //          }
        //      }

 

group Network (
        SystemList = { NodeA = 1, NodeB = 2 }
        Parallel = 1
        AutoStartList = { NodeA, NodeB }
        )

        MultiNICA MultiNICA (
                Device @NodeA = { e1000g2 = "192.168.100.66" }
                Device @NodeB = { e1000g2 = "192.168.100.67" }
                NetMask = "255.255.255.192"
                ArpDelay = 5
                RouteOptions = "192.168.100.65"
                IfconfigTwice = 1
                NetworkHosts = { "192.168.100.65", "192.168.100.126" }
                )

        Phantom Phantom (
                )

 

        // resource dependency tree
        //
        //      group Network
        //      {
        //      MultiNICA MultiNICA
        //      Phantom Phantom
        //      }

 

group Network_OM (
        SystemList = { NodeA = 1, NodeB = 2 }
        Parallel = 1
        AutoStartList = { NodeA, NodeB }
        )

        MultiNICA MultiNICA_OM (
                Device @NodeA = { e1000g4 = "10.10.10.2" }
                Device @NodeB = { e1000g4 = "10.10.10.3" }
                NetMask = "255.255.255.0"
                ArpDelay = 5
                RouteOptions = "10.10.10.10"
                IfconfigTwice = 1
                NetworkHosts = { "10.10.10.10", "10.10.10.4" }
                )

        Phantom Phantom_OM (
                )

 

        // resource dependency tree
        //
        //      group Network_OM
        //      {
        //      MultiNICA MultiNICA_OM
        //      Phantom Phantom_OM
        //      }

 

 

If I unplumb the e1000g4:1 interface, Group1 on NodeB can switch to NodeA:

e1000g4:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        inet 10.10.10.1 netmask fffffff0 broadcast 10.10.10.255

 

hastatus:

NodeA:/# hastatus -sum

-- SYSTEM STATE
-- System               State                Frozen             

A  NodeA         RUNNING              0                   
A  NodeB         RUNNING              0                   

-- GROUP STATE
-- Group           System       Probed     AutoDisabled    State         

B  Group1          NodeA         Y          N               ONLINE       
B  Group1          NodeB         Y          N               OFFLINE
B  Network         NodeA         Y          N               ONLINE        
B  Network         NodeB         Y          N               ONLINE        
B  Network_OM      NodeA         Y          N               ONLINE        
B  Network_OM      NodeB         Y          N               ONLINE        
B  Group2          NodeA         Y          N               ONLINE        
B  Group2          NodeB         Y          N               OFFLINE           
B  SNMPMasterAgent NodeA         Y          N               ONLINE        
B  SNMPMasterAgent NodeB         Y          N               ONLINE               
B  cvm             NodeA         Y          N               ONLINE        
B  cvm             NodeB         Y          N               ONLINE              
B  ora_DG          NodeA         Y          N               ONLINE        
B  ora_DG          NodeB         Y          N               ONLINE 

 

I don't know why this happens. Please give me a solution to fix it, thanks.


mikebounds
Level 6
Partner Accredited

The problem could be that the netmask shown in ifconfig is ffffffe0 (255.255.255.224), but in VCS you have this defined as ffffff00 (NetMask = "255.255.255.0" for resource Group1_OM_IP), and you also have it defined as something different again, ffffffc0, on the base MultiNICA resource (NetMask = "255.255.255.192" for the MultiNICA resource).
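
To compare the hex netmask that ifconfig prints with the dotted-decimal NetMask values in main.cf, a conversion like the following can be used. This is a minimal sketch: `hex_to_dotted` is a hypothetical helper name, and it assumes an 8-digit hex mask with no `0x` prefix, as in the ifconfig output in this thread.

```shell
# Convert an 8-digit hex netmask (as printed by Solaris ifconfig)
# to dotted decimal. hex_to_dotted is a hypothetical helper name.
hex_to_dotted() {
    hex=$1
    # Split the mask into four two-digit octets.
    o1=$(printf '%s' "$hex" | cut -c1-2)
    o2=$(printf '%s' "$hex" | cut -c3-4)
    o3=$(printf '%s' "$hex" | cut -c5-6)
    o4=$(printf '%s' "$hex" | cut -c7-8)
    # printf accepts 0x-prefixed constants for %d.
    printf '%d.%d.%d.%d\n' "0x$o1" "0x$o2" "0x$o3" "0x$o4"
}

hex_to_dotted ffffffe0   # 255.255.255.224
hex_to_dotted ffffff00   # 255.255.255.0
hex_to_dotted ffffffc0   # 255.255.255.192
```

If the converted value differs from the NetMask attribute of the IPMultiNIC resource, the plumbed address and the VCS configuration disagree about the netmask.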

A few other points:

  1. I would only create one parallel group called "networks" and put both NIC resources in there with one Phantom, all resources non-crit, no dependencies - there is no real point having 2 parallel network service groups.
     
  2. I don't know why you are using MultiNICA, as only one interface is defined, so you could use a NIC resource.  If you do have 2 NICs and you haven't added the second one yet, then I would use IPMP (using MultiNICB), as IPMP gives much quicker NIC failovers than the VCS MultiNIC agents.
     
  3. You have called one resource MultiNICA, the same name as the type.  I have never seen this done and although it may not cause problems, I would avoid doing this - call resource names something more meaningful like e1000g2_nic or public_nic.

Mike

 

isaac_tsang
Level 3

I made a mistake when I edited the log for posting, because this system is not public. I have just fixed it.

Thank you.

I will think about your points.

AHerr
Level 5
Employee Accredited Certified

The other question I would have is whether these interfaces are brought up when the system starts.

There are several things that could be going wrong.  If you have a support contract, I would work with Symantec Technical Support, as they have the tools to determine where the error is.  The reason the IP cannot be offlined, even after the clean process has run, is not evident, though MikeBounds does bring up a good point.

 

Regards,
Anthony

isaac_tsang
Level 3

This configuration works fine on other nodes at a different site, with the same software, hardware, and operating system.

When the system starts, only e1000g2 (local IP) is brought up; e1000g4 (external IP) is brought up when HA starts.