Oracle agent: "online procedure did not complete within the expected time" but Oracle instance is UP

Robur
Level 3

When I try to bring an Oracle resource online via Cluster Manager, the resource does NOT come online even though the instance is actually running (its processes are up). VCS logs the messages below, runs the 'clean' entry point and SHUTS DOWN the active instance. It does not appear to be a system resource problem (CPU, disk speed, ...). Version: SFHA 5.1 SP1 RP2P2. Any ideas?

 

2012/02/29 19:39:03 VCS NOTICE V-16-1-10301 Initiating Online of Resource ora_pprod10 (Owner: Unspecified, Group: pprod10) on System caspio
2012/02/29 19:44:04 VCS WARNING V-16-2-13012 (caspio) Resource(ora_pprod10): online procedure did not complete within the expected time.
2012/02/29 19:44:04 VCS ERROR V-16-2-13065 (caspio) Agent is calling clean for resource(ora_pprod10) because online did not complete within the expected time.
2012/02/29 19:44:06 VCS NOTICE V-16-20002-25 (caspio) Oracle:ora_pprod10:clean:Oracle shutdown returned the output
+--------------------------------------------------------------------+
LD_LIBRARY_PATH - /database/product/10.2.0.4/lib:/usr/lib:
ORACLE instance shut down.
+====================================================================+
2012/02/29 19:44:06 VCS INFO V-16-2-13068 (caspio) Resource(ora_pprod10) - clean completed successfully.
2012/02/29 19:44:06 VCS INFO V-16-2-13071 (caspio) Resource(ora_pprod10): reached OnlineRetryLimit(0).
2012/02/29 19:44:06 VCS ERROR V-16-1-10303 Resource ora_pprod10 (Owner: Unspecified, Group: pprod10) is FAULTED (timed out) on sys caspio
2012/02/29 19:44:37 VCS INFO V-16-1-10307 Resource ora_pprod10 (Owner: Unspecified, Group: pprod10) is offline on caspio (Not initiated by VCS)

 

Thanks in advance.

8 REPLIES

Marianne
Moderator
Partner    VIP    Accredited Certified

Could something be wrong with the Oracle resource definition?

Please post the ora_pprod10 resource definition from main.cf.

Robur
Level 3

From main.cf:

        Oracle ora_pprod10 (
                Pfile = "/database/product/10.2.0.4/dbs/initpprod10.ora"
                Owner = oracle10
                User = vcs
                Table = sys_alive
                Home = "/database/product/10.2.0.4"
                StartUpOpt = STARTUP
                Sid = pprod10
                Pword = HVNtKVkNInJNkTUvKTk
                AgentDebug = 1
                LevelTwoMonitorFreq = 1
                )

 

From OracleTypes.cf:

type Oracle (
        static boolean IntentionalOffline = 0
        static str AgentDirectory = "/opt/VRTSagents/ha/bin/Oracle"
        static keylist SupportedActions = { VRTS_GetInstanceName, VRTS_GetRunningServices, DBRestrict, DBUndoRestrict, DBResume, DBSuspend, DBTbspBackup, "home.vfd", "owner.vfd", getid, "pfile.vfd" }
        static int RestartLimit = 2
        static str ArgList[] = { Sid, Owner, Home, Pfile, StartUpOpt, ShutDownOpt, DBAUser, DBAPword, EnvFile, AutoEndBkup, DetailMonitor, User, Pword, Table, MonScript, AgentDebug, Encoding, MonitorOption }
        static int ContainerOpts{} = { RunInContainer=1, PassCInfo=0 }
        static str IMFRegList[] = { Home, Owner, Sid, MonitorOption }
        str Pfile
        str Owner
        str User
        str Table
        str EnvFile
        int DetailMonitor
        str Home
        str StartUpOpt = STARTUP_FORCE
        int MonitorOption
        str MonScript = "./bin/Oracle/SqlTest.pl"
        boolean AutoEndBkup = 1
        str Encoding
        str Sid
        str DBAPword
        str Pword
        str DBAUser
        str ShutDownOpt = IMMEDIATE
        boolean AgentDebug = 0
)

 

mikebounds
Level 6
Partner Accredited

The problem is probably that you specified Pfile. Oracle usually uses an SPfile, and if that is the case here you should leave the Pfile attribute blank. If this is not the issue, then I would try unsetting the optional attributes:

                User = vcs
                Table = sys_alive
                Pword = HVNtKVkNInJNkTUvKTk

                LevelTwoMonitorFreq = 1

If it works without these set, then the problem is with second-level monitoring.

If you still have problems, then please also post the output of "ps -ef | grep pmon"
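A plain "grep pmon" can also match the grep command itself or any other command line containing that string. A slightly tighter check, sketched below, looks only for the ora_pmon_<SID> background process. The function name has_pmon is made up for this example, and it reads "ps -ef" output on standard input so that it can also be tried against a saved listing.

```shell
#!/bin/sh
# has_pmon SID (hypothetical helper, not part of VCS or Oracle):
# reads `ps -ef` output on stdin and succeeds only if the last field
# of some line is exactly ora_pmon_<SID>.
has_pmon() {
    awk -v want="ora_pmon_$1" '
        $NF == want { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}
```

Typical use would be: ps -ef | has_pmon pprod10 && echo "pmon is up for pprod10"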

Mike

Robur
Level 3

The processes are running after the 'startup' entry point is executed, but the monitor (with DetailMonitor=1 and LevelTwoMonitorFreq=1) doesn't seem to detect this.

At the moment, the Oracle resource "ora_pprod10" works fine with DetailMonitor=0.

The "Pfile" attribute is defined in the same way, with DetailMonitor=1, on the other 18 Oracle resources, and they all work fine.

###################################

# ps -ef | grep pmon

oracle10 29874     1   0   Feb 29 ?           0:19 /database/product/10.2.0.4/bin/tnslsnr listener_pprod10 -inherit
 

# ps -ef | grep pprod10

oracle10 27088     1   0   Feb 29 ?           0:20 ora_qmnc_pprod10
oracle10 27041     1   0   Feb 29 ?           2:40 ora_ckpt_pprod10
oracle10 27033     1   0   Feb 29 ?           0:11 ora_psp0_pprod10
oracle10 27045     1   0   Feb 29 ?           0:00 ora_reco_pprod10
oracle10 27035     1   0   Feb 29 ?           0:13 ora_mman_pprod10
oracle10 27081     1   0   Feb 29 ?           0:06 ora_arc0_pprod10
oracle10 27037     1   0   Feb 29 ?           1:11 ora_dbw0_pprod10
oracle10 27051     1   0   Feb 29 ?           8:07 ora_mmnl_pprod10
oracle10 27043     1   0   Feb 29 ?           0:32 ora_smon_pprod10
oracle10 27371     1   0   Feb 29 ?           0:00 ora_q001_pprod10
oracle10 27039     1   0   Feb 29 ?           3:46 ora_lgwr_pprod10
oracle10 27031     1   0   Feb 29 ?           3:21 ora_pmon_pprod10
oracle10 27049     1   0   Feb 29 ?           0:52 ora_mmon_pprod10
oracle10 27047     1   0   Feb 29 ?           6:11 ora_cjq0_pprod10
oracle10 15361     1   0 11:17:42 ?           0:00 ora_q000_pprod10
oracle10  1518  1517   0   Mar 01 ?          23:47 oraclepprod10 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle10 27083     1   0   Feb 29 ?           0:11 ora_arc1_pprod10
oracle10  3680     1   0 05:06:07 ?           0:01 ora_q004_pprod10
oracle10 21923     1   0 10:50:20 ?           0:00 ora_q002_pprod10
oracle10 12072     1   0   Mar 01 ?           0:09 oraclepprod10 (LOCAL=NO)
...

# ps -ef | grep -i oracle|grep -i agent
    root  8829     1   0   Feb 29 ?          31:41 /opt/VRTSagents/ha/bin/Oracle/OracleAgent -type Oracle -agdir /opt/VRTSagents/h

# ptree 8829
8829  /opt/VRTSagents/ha/bin/Oracle/OracleAgent -type Oracle -agdir /opt/VRTSagents/h
 


 

Marianne
Moderator
Partner    VIP    Accredited Certified

Additional parameters are needed for level-two monitoring.

See "Setting up detail monitoring for VCS agents for Oracle" in https://sort.symantec.com/public/documents/sfha/5.1sp1/solaris/productguides/pdf/vcs_oracle_agent_51sp1_sol.pdf

 

mikebounds
Level 6
Partner Accredited

Looking at the Oracle VCS agent for 5.1SP1, it's pretty confusing, as Symantec has introduced a new MonitorOption attribute, so now there is:

Primary or basic monitoring consisting of:

  Process check

  OR (not AND) health check using the Oracle Health Check API

  Which of the two is used is determined by the new MonitorOption attribute

AND you can optionally use detail (second-level) monitoring, which is controlled by the DetailMonitor attribute. This has changed since 5.1 (without SP1): there, DetailMonitor could be set greater than 1 to run the second-level monitor every nth monitor interval, but now you need to use the LevelTwoMonitorFreq attribute for this.

The DetailMonitor attribute was not present in the main.cf you initially posted, so detail monitoring was set to 0. You say in your last post that the Oracle monitor works with DetailMonitor=0 and does not work when it is set to 1, yet you seem to have the right additional parameters set:
 

                User = vcs
                Table = sys_alive
                Pword = HVNtKVkNInJNkTUvKTk

                LevelTwoMonitorFreq = 1

It may be that you double-encrypted the password, i.e. if you add the password in the GUI you do not need to encrypt it, as the GUI encrypts it for you. If this is not the issue, then it is probably an Oracle issue, such as the Oracle user "vcs" not having the right privileges to update the table "sys_alive". You can have a look at the Perl script the agent uses, /opt/VRTSagents/ha/bin/Oracle/SqlTest.pl, and try the SQL statement manually in Oracle (I think it just updates a row in the table).
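A rough way to run that manual test is sketched below. It is not taken from SqlTest.pl: the helper name probe_sql and the column name tstamp are assumptions, so check the real statement in the script and the actual definition of sys_alive before relying on it. The function only prints a small SQL*Plus script that attempts one update and exits with a failure status on any SQL error.

```shell
#!/bin/sh
# probe_sql TABLE (hypothetical helper): print a SQL*Plus script that
# tries a single update on the detail-monitoring table.
# Assumption: the table has a DATE column named tstamp; adjust the
# column name to match the real table.
probe_sql() {
    cat <<EOF
whenever sqlerror exit failure
update $1 set tstamp = SYSDATE;
commit;
exit
EOF
}
```

One way to use it: save the output with "probe_sql sys_alive > /tmp/probe.sql", then, as the Oracle owner with ORACLE_HOME and ORACLE_SID set for pprod10, run "sqlplus vcs @/tmp/probe.sql" and enter the password in clear text. An ORA- error at the update step would point at privileges on the table rather than at VCS.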

Mike

 

Kimberley
Level 6
Partner

As an update, we've forwarded this thread along to our documentation team and the group responsible for rolling upgrades, so that they can make any necessary changes. Thanks for pointing this out, Mike!

Best,

Kimberley

hshinde
Level 2
Employee

The online entry point timing out is a completely different issue from detail monitoring not working.

The error below suggests that Oracle takes a long time (5+ minutes) to start this particular database instance.

"2012/02/29 19:44:04 VCS WARNING V-16-2-13012 (caspio) Resource(ora_pprod10): online procedure did not complete within the expected time."


The VCS online operation makes a blocking call to the sqlplus utility. The call returns immediately in case of any failure. The timeout means that the call to sqlplus did not return.
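The 5-minute figure can be read straight off the log timestamps in the original post (online initiated at 19:39:03, timeout reported at 19:44:04). The small sketch below does the arithmetic. The helper names are made up for this example, and it assumes both timestamps fall on the same day.

```shell
#!/bin/sh
# to_seconds HH:MM:SS: seconds since midnight. awk is used so that
# fields with a leading zero such as "09" are read as decimal.
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# elapsed START END: seconds between two timestamps on the same day.
elapsed() {
    echo $(( $(to_seconds "$2") - $(to_seconds "$1") ))
}

elapsed 19:39:03 19:44:04   # prints 301
```

301 seconds is consistent with an OnlineTimeout of 300 seconds, which as far as I know is the VCS default. The value in effect should be visible with "hatype -display Oracle -attribute OnlineTimeout".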

***

The issue with detail monitoring not working seems to be due to access permissions. The user needs to have read and write permissions on the table. Have you granted those permissions to the user?