Forum Discussion

bhawk1s
Level 2
13 years ago

VCS Oracle cluster failover without using Oracle Agent

  We would like VCS to control server failover (active/passive cluster) without registering our Oracle databases with VCS. Instead, the databases should be restarted by the standard Oracle restart scripts from the Oracle installation (which read the oratab file). Is there a writeup on how to accomplish this?

  • You can also treat Oracle as an application if you like; then you can define your own start/stop/monitor/clean script or program.

    But in my opinion, it is better to use the Oracle agent to manage your database and listener.  Do you have any concerns about the agent reading your oratab file?   That is a very standard Oracle setup.

     

    Regards,

    Eric

  • Do you want VCS to take no action when the Oracle database is gracefully shut down by a DBA? In that case you could use the IntentionalOffline feature of the Oracle agent. You can read more about the feature and how to use it in the Oracle agent documentation, e.g. https://sort.symantec.com/public/documents/vcs/6.0/solaris/productguides/pdf/vcs_oracle_agent_60_sol.pdf

    VCS will still be able to failover the Oracle resource in case of an abnormal shutdown of the Database.

  • Can you provide some details on why you don't want the Oracle agent controlling the database in the cluster?

  •    Primarily to avoid having to register instances with VCS. We have had occurrences where the registration was not completed in VCS and certain DB's were not restarted because of this.  Also we have had DB's registered in VCS which no longer exist.

        I am from the DBA side and not familiar with all VCS capabilities, but what we would like is for the oratab to control which Oracle instances start when a failover occurs, as it does on all our other non-clustered servers. We would like not to be required to register the DBs with VCS when they are created or dropped, and would like VCS to ignore whether a DB is started or stopped. We are only failing over at the server level, not for any particular DB (active/passive server).

        We have implemented this successfully on virtual nodes on our clusters. How can we accomplish this on a non-virtualized clustered server (or the global zone on a virtualized Solaris server)? We understand that just having the Oracle scripts run as usual in a VCS-controlled failover on a non-virtual node would not work, as VCS would not yet have mounted the disk Oracle needs when the scripts are invoked.  Is there a documented way of accomplishing what we would like to do?

  • Well first off, there's no concept in VCS of an app "registering" with the cluster. An app is either under VCS control or it's not, whether that control is accomplished via a purpose-built agent (as is the case with Oracle), or through our generic Application agent.

    That being said, an app (Oracle instance) under VCS control can be set not to start up automatically, and cluster permissions can be granted to non-sysadmin staff to manually start up and shut down the instance.

    Does that help?

  • VCS DB startup info is stored in main.cf at our shop. Is there an option to have VCS read the DB instance info from the oratab file?

  • Some. By "register" I mean the DBA communicating the new or old instance info to the Unix admins, who then need to add or remove the instance info to/from the main.cf file (this is what we are being asked to do by our Unix admins).

    We would like to stop VCS from doing anything with the Oracle instances, and hence not add them to main.cf, and instead treat Oracle as an app, with a script that calls the standard Oracle startup scripts after VCS has allocated all disk and applications are ready to be started.

    From the replies so far, it sounds like this is supported and easy to do if we just treat Oracle as an application to VCS? Can you point me at any doc on implementing application startup scripts?

    I am sure VCS can do more related to other Oracle issues/failures, but we are not using those features and just want simple failover. Other than making us update the main.cf file whenever we add an instance, it seems to add no value in our situation. Is there another way to avoid all this besides a custom application startup script that calls the standard Oracle startup scripts?

    Also, are we missing anything? What value does VCS add in a simple failover scenario where we don't want it to do anything other than start up the DBs on the other node when the active node fails?

  • VCS can react not only to node failure but also to application failure, so it can take action when an Oracle database or listener (or any other process) fails.  This is configurable per resource: if an application component fails, VCS can restart the process, fail over to the other node, or a combination of the two.  For instance, most DBAs configure the listener to restart 1 to 3 times and, if that does not work, fail over the whole group (database, listener, storage, etc.) to the other node; if the database itself fails, VCS doesn't restart it and fails over straight away.  For detecting database failure, from 5.1SP1 you have 3 methods:

    1. Check processes (pmon etc.)
    2. Use the Oracle health check API
    3. Use an SQL script (an example is provided which I think checks that you can write a row to a table, but you can use your own if you want).

    In terms of adding Oracle to VCS, there are at least 3 options:

    1. Add as an Oracle resource per instance
    2. Add as an Application resource per instance
    3. Add as a single application resource (per service group with independent storage) controlling many Oracle instances

    If you don't want to change VCS when you add/remove/change an Oracle instance name, then option 3 should achieve this.  The Application resource has attributes:

     

      StartProgram

      StopProgram

      MonitorProcesses (list of processes to monitor) and/or MonitorProgram and/or PidFiles

    Optionally you can set:

      CleanProgram: script to forcibly stop the application

      User: if not root

      From 5.1SP1 you also have "UseSUDash" and EnvFile attributes.  

    I would write your script something like:

     

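    #!/bin/sh
    # $1 = action (start|stop|monitor), $2 = path to the oratab file
    oratab_path=$2

    case "$1" in
    start)
        # Start all databases listed in "$oratab_path"
        ;;
    stop)
        # Stop all databases listed in "$oratab_path"
        ;;
    monitor)
        # Check that all databases listed in "$oratab_path" are running;
        # set db_down=1 if any database that should be running is down
        db_down=0
        if [ "$db_down" -eq 1 ]
        then
            exit 100    # tells VCS the resource is offline
        else
            exit 110    # tells VCS the resource is online
        fi
        ;;
    *)
        echo "Usage: $0 {start|stop|monitor} oratab_path" >&2
        exit 1
        ;;
    esac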
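
As a rough sketch of how the placeholder steps in the skeleton above might be filled in, the helpers below parse an oratab file and check for each instance's pmon process. This is not tested against a real Oracle installation. The function names are made up for illustration, the oratab layout is assumed to be the usual SID:ORACLE_HOME:Y|N, and the pmon process is assumed to be named ora_pmon_<SID> (ASM instances use a different prefix and are not handled here).

```shell
#!/bin/sh
# Sketch only: helper functions for an oratab-driven VCS wrapper script.
# All function names are hypothetical. Assumes oratab lines look like
# SID:ORACLE_HOME:Y|N and that lines starting with '#' are comments.

# Print the SIDs flagged Y (start automatically) in the given oratab file.
# The wildcard entry "*" is skipped because it does not name an instance.
list_autostart_sids() {
    awk -F: '!/^[[:space:]]*#/ && NF >= 3 && $3 == "Y" && $1 != "*" { print $1 }' "$1"
}

# Succeed if a process whose command line is exactly ora_pmon_<SID> exists.
sid_is_running() {
    ps -eo args | grep "^ora_pmon_$1\$" >/dev/null 2>&1
}

# Return status follows the VCS MonitorProgram convention:
# 110 = online, 100 = offline (here: at least one flagged SID is down).
monitor_oratab() {
    for sid in $(list_autostart_sids "$1"); do
        sid_is_running "$sid" || return 100
    done
    return 110
}
```

The monitor branch of the wrapper would then call monitor_oratab "$oratab_path" and exit with its return status. For start and stop, the wrapper could probably just call Oracle's own dbstart and dbshut scripts, which read oratab themselves.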
     

    Then configure in VCS (if script above is called /opt/VRTSvcs/bin/oragrp):

     
     Application OraGrp (
        StartProgram = "/opt/VRTSvcs/bin/oragrp start /etc/oratab"
        StopProgram = "/opt/VRTSvcs/bin/oragrp stop /etc/oratab"
        MonitorProgram = "/opt/VRTSvcs/bin/oragrp monitor /etc/oratab"
    ) 
     
    Note that the MonitorProgram can report offline (return code 100) if any one database is down. Alternatively, if you have critical and non-critical databases, it could report offline only when one of the critical databases is down, and hence fail over only in that case (or you could have 2 Application resources, one representing the critical databases and the other the non-critical ones).  Or, if you want to completely ignore the state of the Oracle databases, you could have Start/Stop create/remove a lockfile and have MonitorProgram just check that the lockfile exists.
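
The lockfile variant might look something like the sketch below. The function name and the default lockfile path are assumptions for illustration only, and the calls to the actual Oracle start/stop scripts are left as comments.

```shell
#!/bin/sh
# Sketch only: Start/Stop create/remove a lock file; Monitor checks for it.
# The function name and the default LOCKFILE path are hypothetical;
# override LOCKFILE via the environment if needed.
LOCKFILE=${LOCKFILE:-/var/run/oragrp.lock}

oragrp_lock() {
    case "$1" in
    start)
        # Call the standard Oracle startup scripts here, then record state.
        touch "$LOCKFILE"
        ;;
    stop)
        # Call the standard Oracle shutdown scripts here, then clear state.
        rm -f "$LOCKFILE"
        ;;
    monitor)
        # VCS MonitorProgram convention: 110 = online, 100 = offline.
        if [ -f "$LOCKFILE" ]; then return 110; else return 100; fi
        ;;
    *)
        echo "Usage: oragrp_lock {start|stop|monitor}" >&2
        return 1
        ;;
    esac
}

# Dispatch only when an action was passed on the command line.
if [ $# -gt 0 ]; then
    oragrp_lock "$1"
    exit $?
fi
```

One thing to watch: the lockfile should live somewhere that is cleared at reboot. Otherwise a node that crashed while online could report the resource as online again when it comes back up.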
     
    Another alternative is to create a script that creates/removes an Oracle resource, and to set up a VCS Oracle user in VCS so DBAs can run this script.  The script could ask them for the SID, owner, Oracle home and listener name and then create the appropriate Oracle and listener resources.  This is a very trivial script to write.
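
A minimal sketch of such a script is below. It only prints the commands it would run, so they can be reviewed before being executed as a user with the right VCS privileges. The function name and the ora_<SID> / lsnr_<name> resource naming are assumptions for illustration. The service group is assumed to exist already, and dependencies on the storage resources (mounts, disk groups) are left out.

```shell
#!/bin/sh
# Sketch only: print the VCS commands that would add one Oracle resource
# and one Netlsnr (listener) resource to an existing service group.
# The function name and the ora_<SID> / lsnr_<name> resource names are a
# hypothetical convention; links to storage resources are not generated.
gen_oracle_resource_cmds() {
    group=$1
    sid=$2
    owner=$3
    home=$4
    listener=$5
    cat <<EOF
haconf -makerw
hares -add ora_$sid Oracle $group
hares -modify ora_$sid Sid $sid
hares -modify ora_$sid Owner $owner
hares -modify ora_$sid Home $home
hares -modify ora_$sid Enabled 1
hares -add lsnr_$listener Netlsnr $group
hares -modify lsnr_$listener Owner $owner
hares -modify lsnr_$listener Home $home
hares -modify lsnr_$listener Listener $listener
hares -modify lsnr_$listener Enabled 1
hares -link lsnr_$listener ora_$sid
haconf -dump -makero
EOF
}
```

For example, gen_oracle_resource_cmds oragrp_sg ORCL oracle /u01/app/oracle/product/11.2.0 LISTENER prints the command list for one instance; the group name and paths here are placeholders. A matching remove script would unlink and delete the two resources in the same way.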
     
    Mike
  • Somehow I'm missing one step: how is /etc/oratab updated in case of host failover?

    You want to start only the databases entered in /etc/oratab. In the normal case, half of your databases are on host1 and half are on host2. Host1 fails and VCS takes over. How are the databases that had been on host1 supposed to be started now? Must somebody go and manually update /etc/oratab? Second, how do you know which databases had been up and running (and need a restart after failover) and which had been shut down (and do not need a restart after failover)?

  • Another advantage of using VCS for monitoring is the quick detection of database failure.

    Since 5.1SP1, VCS Oracle agent can also detect any process failure instantaneously with IMF (intelligent monitoring framework), thus reducing the application downtime.