SF HA 5.1 with Oracle 11g failover support

Ulysses_Leo_Lee
Level 3

I am writing to ask for help with Storage Foundation HA 5.1 and Oracle 11g failover support. We have a two-node Sun Cluster running on M3000 servers, also connected to shared NetApp LUNs through FC. The DB is designed to run in a Solaris container.

So far we have finished the container failover testing and the Oracle listener failover testing, and both passed. The VCS Oracle agent has also been installed.

My question is: why does the Oracle 11g failover test fail, even though the listener failover test passed? We cannot switch this resource from one node to the other, and we cannot bring the database online using Cluster Manager (Java Console). The DB can be started manually in sqlplus.

Looking forward to your reply. Thanks a lot.

Leo

1 ACCEPTED SOLUTION

Accepted Solutions

Ulysses_Leo_Lee
Level 3

Hi all, thanks for your great help. It turns out we need to install SF 5.1 SP1 to support the database, because it is Oracle 11gR2.

Thanks again for everyone's great help; I will close this ticket now.

17 REPLIES

mikebounds
Level 6
Partner Accredited

engine_A.log or Oracle_A.log (in /var/VRTSvcs/log) should tell you why Oracle fails to start. If you can't find the issue from the logs, then please post your main.cf.
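
When the logs are long, it can help to filter them down to the problem entries for one resource. The following is a minimal sketch, not a Symantec tool: the helper name `problem_entries` is made up for illustration, and the line layout (timestamp, `VCS`, severity, message ID, text) is assumed from the log excerpts posted in this thread.

```python
import re
from typing import Iterable, Iterator, NamedTuple

# Layout assumed from the excerpts in this thread:
# "YYYY/MM/DD HH:MM:SS VCS <SEVERITY> <V-n-n-n> <message text>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) VCS "
    r"(?P<severity>[A-Z]+) (?P<msg_id>V-\d+-\d+-\d+) (?P<text>.*)$"
)


class LogEntry(NamedTuple):
    ts: str
    severity: str
    msg_id: str
    text: str


def problem_entries(lines: Iterable[str], resource: str) -> Iterator[LogEntry]:
    """Yield WARNING/ERROR/CRITICAL entries whose text mentions the resource."""
    wanted = {"WARNING", "ERROR", "CRITICAL"}
    needle = resource.lower()  # case-insensitive match on the resource name
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m["severity"] in wanted and needle in m["text"].lower():
            yield LogEntry(**m.groupdict())


# Sample lines copied from the engine log excerpts in this thread.
SAMPLE = [
    "2011/08/24 18:59:23 VCS INFO V-16-1-10304 Resource testzone_oracle_testdb "
    "(Owner: unknown, Group: testzone_SG) is offline on node2 (First probe)",
    "2011/08/24 19:01:36 VCS ERROR V-16-2-13066 (node1) Agent is calling clean for "
    "resource(testzone_oracle_testdb) because the resource is not up even after "
    "online completed.",
    "2011/08/24 19:01:37 VCS ERROR V-16-2-13069 (node1) "
    "Resource(testzone_oracle_testdb) - clean failed.",
]

for entry in problem_entries(SAMPLE, "testzone_oracle_testdb"):
    print(entry.ts, entry.msg_id, entry.text)
```

To use it on a real log, pass an open file object instead of `SAMPLE`.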

Mike 

Ulysses_Leo_Lee
Level 3

Here is Oracle_A.log:

2011/08/24 18:48:45 VCS ERROR V-16-2-13066 Thread(3) Agent is calling clean for resource(testzone_Oracle_testdb) because the resource is not up even after online completed.
2011/08/24 18:48:46 VCS ERROR V-16-2-13069 Thread(3) Resource(testzone_Oracle_testdb) - clean failed.
 

Engine_A.log:

==============================================
VCS WARNING V-16-1-52529 Login Incorrect, Invalid username/password
==============================================

2011/08/24 18:52:19 VCS INFO V-16-1-10298 Resource testzone_testzonedb_listener (Owner: unknown, Group: testzone_SG) is online on node1 (VCS initiated)
2011/08/24 18:52:19 VCS NOTICE V-16-1-10447 Group testzone_SG is online on system node1
2011/08/24 18:59:18 VCS NOTICE V-16-1-10016 Agent /opt/VRTSagents/ha/bin/Oracle/OracleAgent for resource type Oracle successfully started at Wed Aug 24 18:59:18 2011
2011/08/24 18:59:23 VCS INFO V-16-1-10304 Resource testzone_oracle_testdb (Owner: unknown, Group: testzone_SG) is offline on node2 (First probe)
2011/08/24 18:59:24 VCS INFO V-16-1-10304 Resource testzone_oracle_testdb (Owner: unknown, Group: testzone_SG) is offline on node1 (First probe)
2011/08/24 18:59:33 VCS NOTICE V-16-1-10233 Clearing Restart attribute for group testzone_SG on all nodes
2011/08/24 18:59:33 VCS NOTICE V-16-1-10301 Initiating Online of Resource testzone_oracle_testdb (Owner: unknown, Group: testzone_SG) on System node1
2011/08/24 19:01:36 VCS ERROR V-16-2-13066 (node1) Agent is calling clean for resource(testzone_oracle_testdb) because the resource is not up even after online completed.
2011/08/24 19:01:37 VCS ERROR V-16-2-13069 (node1) Resource(testzone_oracle_testdb) - clean failed.
2011/08/24 19:08:28 VCS NOTICE V-16-1-10233 Clearing Restart attribute for group testzone_SG on all nodes
2011/08/24 19:11:04 VCS ERROR V-16-2-13079 (node1) Resource(testzone_oracle_testdb): The last 10 invocations of the clean procedure have failed.
2011/08/24 19:12:59 VCS NOTICE V-16-1-10022 Agent Oracle stopped
2011/08/24 19:12:59 VCS NOTICE V-16-1-10447 Group testzone_SG is online on system node1
2011/08/25 12:17:09 VCS NOTICE V-16-1-10208 Initiating switch of group testzone_SG from system node1 to system node2
2011/08/25 12:17:09 VCS NOTICE V-16-1-10300 Initiating Offline of Resource testzone_testzonedb_listener (Owner: unknown, Group: testzone_SG) on System node1
2011/08/25 12:17:15 VCS INFO V-16-2-13716 (node1) Resource(testzone_testzonedb_listener): Output of the completed operation (offline)

TonyGriffiths
Level 6
Employee Accredited Certified

Hi

So at a high level the logs are saying:

Resource testzone_oracle_testdb was probed on both systems. The probes (monitors) completed and found the resource to be offline on both nodes.

Resource testzone_oracle_testdb attempted to go online on node1.

The online routine for the resource completed, but the monitor routine did not report the resource as online. *Possibly* there is a configuration issue with the resource.

Can you post the main.cf extract for the resource/service group?

Are there any other dependencies that the resource/SG requires?
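
To put a number on that sequence, one can measure how long the agent waited between starting the online operation and giving up. This is only a rough sketch: the helper name `seconds_between` is made up for illustration, and it assumes each log line begins with a 19-character `YYYY/MM/DD HH:MM:SS` timestamp, as in the excerpts posted above.

```python
from datetime import datetime

TS_FORMAT = "%Y/%m/%d %H:%M:%S"
TS_LENGTH = 19  # length of "YYYY/MM/DD HH:MM:SS"


def seconds_between(start_line: str, end_line: str) -> int:
    """Return the whole seconds elapsed between two timestamped log lines."""
    start = datetime.strptime(start_line[:TS_LENGTH], TS_FORMAT)
    end = datetime.strptime(end_line[:TS_LENGTH], TS_FORMAT)
    return int((end - start).total_seconds())


# Both lines are copied from the Engine_A.log excerpt in this thread.
online_started = (
    "2011/08/24 18:59:33 VCS NOTICE V-16-1-10301 Initiating Online of Resource "
    "testzone_oracle_testdb (Owner: unknown, Group: testzone_SG) on System node1"
)
clean_called = (
    "2011/08/24 19:01:36 VCS ERROR V-16-2-13066 (node1) Agent is calling clean for "
    "resource(testzone_oracle_testdb) because the resource is not up even after "
    "online completed."
)

print(seconds_between(online_started, clean_called), "seconds")
```

The gap is roughly two minutes here, which may be worth comparing against whatever online timeout and monitor settings the Oracle resource type uses in this cluster.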

Ulysses_Leo_Lee
Level 3

Thanks. I will post the main.cf in another comment.

Just now I switched the service group to node2, started the Oracle DB manually, and set up the dependency, but the DB could not be taken offline! It printed the errors below:

2011/08/25 12:18:38 VCS NOTICE V-16-1-10447 Group testzone_SG is online on system node2
2011/08/25 15:05:32 VCS NOTICE V-16-1-10016 Agent /opt/VRTSagents/ha/bin/Oracle/OracleAgent for resource type Oracle successfully started at Thu Aug 25 15:05:32 2011
2011/08/25 15:05:37 VCS INFO V-16-1-10304 Resource testzone_oracle_testdb (Owner: unknown, Group: testzone_SG) is offline on node1 (First probe)
2011/08/25 15:05:38 VCS INFO V-16-1-10297 Resource testzone_oracle_testdb (Owner: unknown, Group: testzone_SG) is online on node2 (First probe)
2011/08/25 15:05:38 VCS NOTICE V-16-1-10233 Clearing Restart attribute for group testzone_SG on all nodes
2011/08/25 15:06:17 VCS NOTICE V-16-1-10208 Initiating switch of group testzone_SG from system node2 to system node1
2011/08/25 15:06:17 VCS NOTICE V-16-1-10300 Initiating Offline of Resource testzone_oracle_testdb (Owner: unknown, Group: testzone_SG) on System node2
2011/08/25 15:06:19 VCS ERROR V-16-2-13064 (node2) Agent is calling clean for resource(testzone_oracle_testdb) because the resource is up even after offline completed.
2011/08/25 15:06:21 VCS ERROR V-16-2-13069 (node2) Resource(testzone_oracle_testdb) - clean failed.
2011/08/25 15:07:22 VCS ERROR V-16-2-13077 (node2) Agent is unable to offline resource(testzone_oracle_testdb). Administrative intervention may be required.
2011/08/25 15:15:39 VCS ERROR V-16-2-13079 (node2) Resource(testzone_oracle_testdb): The last 10 invocations of the clean procedure have failed.
2011/08/25 15:25:59 VCS ERROR V-16-2-13079 (node2) Resource(testzone_oracle_testdb): The last 20 invocations of the clean procedure have failed.
2011/08/25 15:36:19 VCS ERROR V-16-2-13079 (node2) Resource(testzone_oracle_testdb): The last 30 invocations of the clean procedure have failed.
2011/08/25 15:46:46 VCS ERROR V-16-2-13079 (node2) Resource(testzone_oracle_testdb): The last 40 invocations of the clean procedure have failed.
2011/08/25 15:57:17 VCS ERROR V-16-2-13079 (node2) Resource(testzone_oracle_testdb): The last 50 invocations of the clean procedure have failed.
2011/08/25 16:07:38 VCS ERROR V-16-2-13079 (node2) Resource(testzone_oracle_testdb): The last 60 invocations of the clean procedure have failed.
2011/08/25 16:17:58 VCS ERROR V-16-2-13079 (node2) Resource(testzone_oracle_testdb): The last 70 invocations of the clean procedure have failed.

Ulysses_Leo_Lee
Level 3

include "OracleASMTypes.cf"
include "types.cf"
include "Db2udbTypes.cf"
include "OracleTypes.cf"
include "SybaseTypes.cf"

cluster PROD (
        UserNames = { admin = aLJnJTiULlJTiOLsJQiULoJUiULrKMiULnJTiR,
                 1 = bkiHknJtlFieHmkSkrJrkIhf,
                 2 = INOnNMmTNjNWmUNl,
                 3 = aLKhMJlHLnMLkKJeIGhHIh,
                 z_zoners_node1 = eHHeGCdKHpGChDHeHK,
                 z_zoners_node2 = bopPovMioUprOroPpl }
        ClusterAddress = "X.X.X.X"
        Administrators = { admin, 1, 2, 3 }
        )

system node1 (
        )

system node2 (
        )

group ClusterService (
        SystemList = { node1 = 0, node2 = 1 }
        AutoStartList = { node1 }
        )

        IP Cluster_IP_address (
                Device = aggr110001
                Address = "10.71.5.203"
                )

        NIC NIC (
                Device @node1 = aggr110001
                Device @node2 = aggr110001
                )

        NotifierMngr Notifier (
                SmtpServer = "X.X.X.X"
                SmtpReturnPath = "X.X.X.X"
                SmtpRecipients = { "X.X.X.X" = Information }
                )

        Cluster_IP_address requires NIC
        Notifier requires NIC


        // resource dependency tree
        //
        //      group ClusterService
        //      {
        //      IP Cluster_IP_address
        //          {
        //          NIC NIC
        //          }
        //      NotifierMngr Notifier
        //          {
        //          NIC NIC
        //          }
        //      }


group testzone_SG (
        SystemList = { node1 = 0, node2 = 1 }
        ContainerInfo @node1 = { Name = testzone, Type = Zone, Enabled = 1 }
        ContainerInfo @node2 = { Name = testzone, Type = Zone, Enabled = 1 }
        AutoStartList = { node1, node2 }
        Administrators = { z_zoners_node1, z_zoners_node2 }
        )

        DiskGroup testzone_testzoneBARdg (
                DiskGroup = testzoneBARdg
                )

        DiskGroup testzone_testzoneDATAdg (
                DiskGroup = testzoneDATAdg
                )

        DiskGroup testzone_testzonedg (
                DiskGroup = testzonedg
                )

        Mount testzone_container_mount (
                MountPoint = "/export/zones/testzone"
                BlockDevice = "/dev/vx/dsk/testzonedg/testzonedg_V01"
                FSType = vxfs
                FsckOpt = "-n"
                MntPtPermission = 700
                MntPtOwner = 0
                MntPtGroup = 0
                )

        Mount testzone_oraarch_mount (
                MountPoint = "/export/zones/testzone/oraarch"
                BlockDevice = "/dev/vx/dsk/testzoneDATAdg/testzoneDATAdg_V02"
                FSType = vxfs
                FsckOpt = "-n"
                )

        Mount testzone_oracle_mount (
                MountPoint = "/export/zones/testzone/oracle"
                BlockDevice = "/dev/vx/dsk/testzoneBARdg/testzoneBARdg_V01"
                FSType = vxfs
                FsckOpt = "-n"
                )

        Mount testzone_oradata_mount (
                MountPoint = "/export/zones/testzone/oradata"
                BlockDevice = "/dev/vx/dsk/testzoneDATAdg/testzoneDATAdg_V01"
                FSType = vxfs
                FsckOpt = "-n"
                )

        Netlsnr testzone_testzonedb_listener (
                Owner = oracle
                Home = "/oracle/product/11.2.0/dbhome_1"
                Listener = LISTENER
                EnvFile = "/oracle/.profile"
                )

        Oracle testzone_oracle_testdb (
                Sid = testdb
                Owner = oracle
                Home = "/oracle/product/11.2.0/dbhome_1"
                Pfile = "/oracle/product/11.2.0/dbhome_1/dbs/spfiletestdb.ora"
                EnvFile = "/oracle/.profile"
                )

        Volume testzone_container_volume (
                Volume = testzonedg_V01
                DiskGroup = testzonedg
                )

        Volume testzone_oraBAR_V01 (
                Volume = testzoneBARdg_V01
                DiskGroup = testzoneBARdg
                )

        Volume testzone_oraDATA_V01 (
                Volume = testzoneDATAdg_V01
                DiskGroup = testzoneDATAdg
                )

        Volume testzone_oraDATA_V02 (
                Volume = testzoneDATAdg_V02
                DiskGroup = testzoneDATAdg
                )

        Zone zoners (
                )

        testzone_container_mount requires testzone_container_volume
        testzone_container_volume requires testzone_testzonedg
        testzone_oraBAR_V01 requires testzone_testzoneBARdg
        testzone_oraDATA_V01 requires testzone_testzoneDATAdg
        testzone_oraDATA_V02 requires testzone_testzoneDATAdg
        testzone_oraarch_mount requires testzone_container_mount
        testzone_oraarch_mount requires testzone_oraDATA_V02
        testzone_oracle_mount requires testzone_container_mount
        testzone_oracle_mount requires testzone_oraBAR_V01
        testzone_oracle_testdb requires testzone_testzonedb_listener
        testzone_oradata_mount requires testzone_container_mount
        testzone_oradata_mount requires testzone_oraDATA_V01
        testzone_testzonedb_listener requires zoners
        zoners requires testzone_container_mount
        zoners requires testzone_oraarch_mount
        zoners requires testzone_oracle_mount
        zoners requires testzone_oradata_mount


        // resource dependency tree
        //
        //      group testzone_SG
        //      {
        //      Oracle testzone_oracle_testdb
        //          {
        //          Netlsnr testzone_testzonedb_listener
        //              {
        //              Zone zoners
        //                  {
        //                  Mount testzone_oracle_mount
        //                      {
        //                      Mount testzone_container_mount
        //                          {
        //                          Volume testzone_container_volume
        //                              {
        //                              DiskGroup testzone_testzonedg
        //                              }
        //                          }
        //                      Volume testzone_oraBAR_V01
        //                          {
        //                          DiskGroup testzone_testzoneBARdg
        //                          }
        //                      }
        //                  Mount testzone_oraarch_mount
        //                      {
        //                      Volume testzone_oraDATA_V02
        //                          {
        //                          DiskGroup testzone_testzoneDATAdg
        //                          }
        //                      Mount testzone_container_mount
        //                          {
        //                          Volume testzone_container_volume
        //                              {
        //                              DiskGroup testzone_testzonedg
        //                              }
        //                          }
        //                      }
        //                  Mount testzone_oradata_mount
        //                      {
        //                      Volume testzone_oraDATA_V01
        //                          {
        //                          DiskGroup testzone_testzoneDATAdg
        //                          }
        //                      Mount testzone_container_mount
        //                          {
        //                          Volume testzone_container_volume
        //                              {
        //                              DiskGroup testzone_testzonedg
        //                              }
        //                          }
        //                      }
        //                  Mount testzone_container_mount
        //                      {
        //                      Volume testzone_container_volume
        //                          {
        //                          DiskGroup testzone_testzonedg
        //                          }
        //                      }
        //                  }
        //              }
        //          }
        //      }

mikebounds
Level 6
Partner Accredited

Sorry, I forgot to mention: you need to get Oracle_A.log from inside the local zone ("testzone"), not the global zone. This should give information about why Oracle won't start under VCS control.

There is no point switching until you can online and offline Oracle using VCS, so you should just try these two actions. As VCS is detecting that Oracle is online, it is working to a certain extent, so I suspect something is wrong with /oracle/.profile. Do you need this profile? Are there any specific settings you need? Even if there are, I would try without setting EnvFile, just to see if Oracle starts under VCS.

Mike

Ulysses_Leo_Lee
Level 3

Hi, Mike

There is no Oracle_A.log in the local zone.

I have also tried removing the EnvFile attribute, bringing the DB online manually in sqlplus, and then using the Java console to switch this resource... still no use. It hangs while shutting down the DB.

Thanks,

Leo

mikebounds
Level 6
Partner Accredited

You could try removing the Pfile attribute, as you don't need it. If you do use it, you need to specify a pfile, and it looks as though you have specified an spfile.
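
A quick way to spot this kind of mix-up across several resources is a filename check. This is only a heuristic sketch: it assumes the conventional `spfile<SID>.ora` naming, the helper name is made up, and the `inittestdb.ora` path below is a hypothetical counter-example that does not appear in the posted main.cf.

```python
import posixpath


def pfile_looks_like_spfile(pfile_value: str) -> bool:
    """Heuristic: flag a Pfile value whose file name starts with 'spfile'."""
    name = posixpath.basename(pfile_value).lower()
    return name.startswith("spfile")


# Value taken from the Oracle resource in the posted main.cf.
posted = "/oracle/product/11.2.0/dbhome_1/dbs/spfiletestdb.ora"
# Hypothetical text-pfile path, for contrast only.
hypothetical = "/oracle/product/11.2.0/dbhome_1/dbs/inittestdb.ora"

for value in (posted, hypothetical):
    print(value, "-> suspicious" if pfile_looks_like_spfile(value) else "-> ok")
```

A name check cannot prove what the file actually contains, so treat a hit as a prompt to look at the file, nothing more.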

Mike

Ulysses_Leo_Lee
Level 3

The Pfile attribute has been removed, but it still fails...

Ulysses_Leo_Lee
Level 3

Can anyone look at this and let me know how to proceed?

Thanks a lot in advance.

Ulysses_Leo_Lee
Level 3

Can anyone help me with this?

mikebounds
Level 6
Partner Accredited

You should open a Symantec Support case. The only other thing I can think of to check is that you have the right types file. Some agents are shipped with both a 5.0 and a 5.1 types file, so you should check that you have a ContainerOpts attribute and not a ContainerName attribute.
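
That check can be done by eye, or with a small script along these lines. This is a sketch under assumptions: it only looks for the two attribute names as plain text, the function name and the returned labels are made up, and the sample strings are placeholders rather than excerpts from a real OracleTypes.cf.

```python
def container_attr_style(types_cf_text: str) -> str:
    """Classify a types file by which container attribute name it mentions."""
    has_opts = "ContainerOpts" in types_cf_text
    has_name = "ContainerName" in types_cf_text
    if has_opts and not has_name:
        return "ContainerOpts only (what you want on 5.1)"
    if has_name and not has_opts:
        return "ContainerName only (looks like a 5.0 types file)"
    if has_opts and has_name:
        return "both present (check by hand)"
    return "neither present (check by hand)"


# Placeholder text, NOT real types-file content.
SAMPLE_NEW = "type Oracle (\n    ... ContainerOpts ...\n)\n"
SAMPLE_OLD = "type Oracle (\n    ... ContainerName ...\n)\n"

print(container_attr_style(SAMPLE_NEW))
print(container_attr_style(SAMPLE_OLD))
```

On a real system you would read the text of the OracleTypes.cf that main.cf includes and pass it to the function.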

Mike

Ulysses_Leo_Lee
Level 3

Thank you very much, Mike.

I will open a ticket with Symantec to ask them for help and will let you know when it's closed.

Thanks again and have a nice day.

Leo

joseph_dangelo
Level 6
Employee Accredited

I noticed the following entry in the engine log:

==============================================
VCS WARNING V-16-1-52529 Login Incorrect, Invalid username/password
==============================================

This can result from an improperly formed .vcspwd/halogin configuration. HAD may very well be unable to determine the status of the Oracle instance. One would think that if the listener is working, then the database should follow suit.

You may want to try running the hazonesetup command or manually try the following:

globalzone> hauser -update  z_zoners_node1

set it to "password"

globalzone> hauser -update  z_zoners_node2

set it to "password"

localzone> VCS_HOST=PROD_CLUSTER_IP (you showed X.X.X.X in your main.cf)

localzone> export VCS_HOST

localzone> halogin z_zoners_node1  password

localzone> halogin z_zoners_node2  password

localzone> more /.vcspwd

It should look something like this:

100 PROD_Cluster_IP z_zoners_node1  nkInnNmnKnoMmghu
100 PROD_Cluster_IP z_zoners_node2  fopHojOlpKppNxpJom
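
To confirm that both zone users actually landed in the file, a small check like the one below can help. It is a sketch only: it assumes each line has four whitespace-separated fields in the order shown in the example above (a number, the host, the user, the encrypted password), and the function names are made up for illustration.

```python
from typing import Dict, Iterable, List, Set


def vcspwd_users(text: str) -> Dict[str, Set[str]]:
    """Map each host in a .vcspwd-style text to the set of users listed for it."""
    users: Dict[str, Set[str]] = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or unexpected lines
        _number, host, user, _encrypted = fields
        users.setdefault(host, set()).add(user)
    return users


def missing_users(text: str, host: str, expected: Iterable[str]) -> List[str]:
    """Return the expected users that have no entry for the given host."""
    return sorted(set(expected) - vcspwd_users(text).get(host, set()))


# The two example lines from the post above.
SAMPLE = (
    "100 PROD_Cluster_IP z_zoners_node1  nkInnNmnKnoMmghu\n"
    "100 PROD_Cluster_IP z_zoners_node2  fopHojOlpKppNxpJom\n"
)
EXPECTED = ["z_zoners_node1", "z_zoners_node2"]

print(missing_users(SAMPLE, "PROD_Cluster_IP", EXPECTED))
```

An empty list means every expected user has an entry for that host. It says nothing about whether the stored password is correct.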

This is also a good reference.

http://www.symantec.com/business/support/index?page=content&id=TECH159144

Give the database agent another try and tail the engine logs.

Hope this helps.

Joe D

Ulysses_Leo_Lee
Level 3

Hi all, thanks for your great help. It turns out we need to install SF 5.1 SP1 to support the database, because it is Oracle 11gR2.

Thanks again for everyone's great help; I will close this ticket now.

joseph_dangelo
Level 6
Employee Accredited

It's very interesting that upgrading to 5.1 SP1 resolved the issue. 5.1 natively supports Oracle 11gR2 from a VCS perspective (assuming 64-bit Solaris 10). The SFDB tools, however, do not. Here is the latest DB support matrix.

http://www.symantec.com/business/support/index?page=content&id=DOC4039

Take a look at page 8.

Joe D

Marianne
Level 6
Partner    VIP    Accredited Certified

Please remember to mention ALL relevant info in future....

If we had known a week ago that this was 11gR2, we would have checked compatibility for you....