Storage Foundation 5 HA Cluster Oracle RAC 10g, VIP connection loss problem

aamir
Level 3
Hi,
Hi,
I have a two-node cluster on the Sun SPARC platform: Veritas Storage Foundation 5 HA for Oracle RAC 10g, a Sun Storage 6140 array with Cluster File System for Oracle, and a NetBackup 6.5 environment. Oracle RAC provides HA with a different VIP on each node, and the application uses a single VIP that fails over whenever one node goes down. My problem: the Oracle instances and services are running, but at any time I can lose connectivity from each node to the other node's Oracle VIP, i.e. neither node can ping the other's Oracle VIP.
MP2 and all updates are already installed, and at the OS level the systems are 100% fine and in ideal condition.

Need an expert opinion.
Regards,
aamir
8 REPLIES

Gaurav_S
Moderator
VIP Certified
Hello Aamir,

I'm a little confused about your setup here...

1. So you have two different Oracle VIPs (for RAC use), and another service group with another VIP used for failover? Is the RAC service group not a parallel service?

2. When you say you lose the connection at any time, which connection are you talking about? Does the node go down, or are you just unable to ping that server? What is the state of the physical interface where this VIP is plumbed?


Gaurav

Eric_Gao
Level 4
Hi Aamir,

Regarding your wording "services are running but any time i lost connection from each node to the other for Oracle VIPs":

When the connection is lost, from where to where? Does the client lose its connection to the Oracle server?

As for "i.e. each node unable to ping other's Oracle VIP", how about pinging each VIP from another machine on the same subnet?

For the RAC VIPs, you may try not putting them under VCS control. Let them live on each node independently, then configure RAC TAF (Transparent Application Failover) on the server side, on the client side, or on both.
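For illustration only, a client-side TAF entry in tnsnames.ora might look roughly like the following. The VIP hostnames come from your /etc/hosts; the alias RACDB, the service name racdb and port 1521 are placeholders I am assuming, so adjust them to your own listener and service:

RACDB =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = test-SP1-vip)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = test-SP2-vip)(PORT = 1521))
    (LOAD_BALANCE = yes)
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = racdb)
      (FAILOVER_MODE =
        (TYPE = SELECT)
        (METHOD = BASIC)
        (RETRIES = 20)
        (DELAY = 5)
      )
    )
  )

With METHOD=BASIC and TYPE=SELECT, open SELECT statements can resume on the surviving instance after a failover.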

aamir
Level 3
Dear Friends,

Sorry for the confusion. My setup is:
- Two T5220 servers, one Sun Storage 6140 SAN array
- Solaris 10
- Oracle RAC 10g R2
- Storage Foundation 5 HA
- Veritas Cluster Server used only to fail over the application IP (.183)
- Oracle VIPs defined on both nodes: .181 and .182

Problem: the Oracle RAC VIP 172.16.14.182 does not answer pings from SP1, although SP1 can ping the application IP .183. The same happens on SP2: it cannot ping the Oracle VIP .181, while the application IP .183 is reachable from both SP1 and SP2.
xyz-test-SP2 # more defaultrouter
172.16.14.129
xyz-test-SP1 # ping 172.16.14.181
172.16.14.181 is alive
xyz-test-SP1 # ping 172.16.14.182
no answer from 172.16.14.182
xyz-test-SP1 # ping 172.16.14.183
172.16.14.183 is alive  # this is the application IP that fails over between the nodes
The IP addresses and the /etc/hosts file:
xyz-test-SP1 # more /etc/hosts

#
# Internet host table
#
#::1 localhost
127.0.0.1 localhost
172.16.14.141 test-SP1 test-SP1.test-SP1.com loghost
172.16.14.143 test-SP2 test-SP2.test-SP2.com
192.168.10.1 test-SP1-e1000g2
192.168.10.2 test-SP2-e1000g2
10.1.1.1 test-SP1-e1000g3
10.1.1.2 test-SP2-e1000g3
172.16.14.181 test-SP1-vip
172.16.14.182 test-SP2-vip
172.16.14.183 istswitch
172.16.14.150 sunx4150
xyz@test-SP1 # cd etc
xyz@test-SP1 # pwd
/etc
xyz@test-SP1 # bash
The IP addresses defined in /etc:
xyz@test-SP1 # more hostname.e1000g*
::::::::::::::
hostname.e1000g0
::::::::::::::
test-SP1
::::::::::::::
hostname.e1000g0:1
::::::::::::::
172.16.14.181 # Oracle VIP
::::::::::::::
hostname.e1000g2
::::::::::::::
192.168.10.1
::::::::::::::
hostname.e1000g3
::::::::::::::
10.1.1.1
The same configuration is on both nodes, SP1 and SP2.
xyz@test-SP1 #ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
e1000g0: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 2
inet 172.16.14.141 netmask ffffffc0 broadcast 172.16.14.191
ether 0:14:4f:eb:60:6a
e1000g0:2: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 2
inet 172.16.14.160 netmask ffffff00 broadcast 172.16.14.255
e1000g0:3: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 2
inet 172.16.14.183 netmask ffffff00 broadcast 172.16.14.255
e1000g2: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 3
inet 192.168.10.1 netmask ffffff00 broadcast 192.168.10.255
ether 0:14:4f:eb:60:6c
e1000g2:1: flags=201040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,CoS> mtu 1500 index 3
inet 172.16.14.181 netmask ffffff00 broadcast 172.16.14.255
e1000g3: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 4
inet 10.1.1.1 netmask ff000000 broadcast 10.255.255.255
ether 0:14:4f:eb:60:6d

I hope my problem is clearer now.
My observation: the virtual IP 172.16.14.181 is defined in /etc in the file hostname.e1000g0:1, while in the ifconfig output it is plumbed on e1000g2:1 instead. I discussed this with the implementation team, but they say that when the system reboots, it decides by itself where to assign the VIP.
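If CRS is what decides where the VIP lands after a reboot, then I suppose something like the following (assuming a standard 10gR2 CRS install) would show which interface it has registered for the VIP; the output line is only my guess at the format:

xyz@test-SP1 # srvctl config nodeapps -n test-SP1 -a
VIP exists.: /test-SP1-vip/172.16.14.181/255.255.255.0/e1000g2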

Best Regards,
aamir

aamir
Level 3

Dear friend,

I checked with my DBA on the status of Oracle on both nodes; he says the Oracle services and instances are running on both nodes. By "connection loss" I mean my problem: why can the Oracle VIPs on the two nodes (.181 and .182) not ping each other, while all other IPs defined on all interfaces of both nodes are reachable and answer pings?

Best Regards,

aamir

Eric_Gao
Level 4
This sounds like a network problem or a Solaris configuration problem, rather than anything to do with VCS or Oracle.

My next question: is the VIP of each instance configured within the same subnet as the public network? Are there any other connectivity problems on that public network?

Basically, while RAC is running, each instance configures a dedicated VIP on its node as the target that clients connect to. This is just a standard network configuration. Say you shut down the whole RAC stack so no resources are running, then manually configure the VIP on a network interface (ifconfig addif). Is there any connectivity problem in that case?
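As a sketch of that test on SP1 (addresses taken from your earlier output; the /24 netmask is an assumption, so use whatever your public subnet really has):

# with CRS/RAC completely stopped, plumb the VIP by hand on the public interface
xyz@test-SP1 # ifconfig e1000g0 addif 172.16.14.181 netmask 255.255.255.0 up
# then check reachability from the other node
xyz-test-SP2 # ping 172.16.14.181
# remove the test address again when finished
xyz@test-SP1 # ifconfig e1000g0 removeif 172.16.14.181

If the address answers ping in this bare configuration, the problem is on the CRS/VIP side rather than in the network itself.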

aamir
Level 3
Dear All,

I tried one idea to resolve the issue.
I moved the VIPs from interface e1000g2:1 to e1000g0:4, i.e. I unplumbed the existing .181 and .182 addresses on both nodes and plumbed new virtual interfaces on top of the physical interface e1000g0.
After that, both nodes could ping each other's VIP.
But after some time, with no record in the /var/adm/messages file, the VIPs reverted back to the previous setting, on virtual interface e1000g2:1, with 172.16.14.181 and .182 respectively on the two nodes.

There is no log entry in any of the system log files.

If anyone has tried this kind of exercise before, please share your ideas.

Best Regards,

Aamir Zahoor

Eric_Gao
Level 4
Hi Aamir,

If the VIP interface reverted back to e1000g2:1, the VIP resource must have been restarted. As we know, the RAC VIP is controlled by Oracle CRS, not by Veritas HA, so go and check your CRS VIP log under $CRS_HOME/log/{nodename}/racg.

Basically, CRS issues ifconfig removeif/addif to restart the VIP resource. If you want the VIPs to stay on e1000g2:4, please ensure that e1000g2:1, e1000g2:2 and e1000g2:3 are in use all the time; just keeping them plumbed should be fine.

Regarding your problem when the VIPs are plumbed on e1000g2:1 and are not pingable: have you checked whether the UP flag is present? (Check with the ifconfig command.)

Also check the resource status at the CRS level (crs_stat -t).
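For reference, those checks could look roughly like this; the $CRS_HOME path and the log location follow the usual 10gR2 layout, so treat them as assumptions about your install:

# overall CRS resource status, including the ora.<node>.vip resources
xyz@test-SP1 # crs_stat -t
# recent racgvip activity (VIP start/stop/check) on this node
xyz@test-SP1 # ls -lrt $CRS_HOME/log/test-SP1/racg
# UP flag and netmask of the logical interface currently carrying the VIP
xyz@test-SP1 # ifconfig e1000g2:1

And if it turns out CRS has the wrong public interface registered for the VIP, my understanding is that on 10gR2 you change that (with the nodeapps stopped) via srvctl modify nodeapps -n <node> -A <address>/<netmask>/<interface>, rather than by editing the hostname.* files.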

johnwarner
Not applicable
I have Oracle 10.2 installed on OpenSolaris 64-bit (AMD64) without any problem. It even runs in a Solaris Container on my Acer AMD64 laptop (having Solaris zones for database app development is great: you only need to boot the zone you really need for the current project, so I have MySQL, PostgreSQL, DB2, Oracle and lots of other things in zones). You may find your solution in a book on OLAP; the book is available on Amazon.