04-23-2013 01:29 AM
Environment
Two Node Cluster
OS = Solaris 10
VCS (HA only, no VxVM) = 6.0
The Application agent is used for my application
Error
VCS ERROR V-16-2-13064 Thread(5) Agent is calling clean for resource(APPLICATION) because the resource is up even after offline completed.
Just for INFO
https://sort.symantec.com/public/documents/sf/5.0MP3/solaris/html/vcs_users/ch_about_vcs21.html
Extract from the above TN:
Clean—Cleans up after a resource fails to come online, fails to go offline, or fails while in an online state. The clean entry point is designed to clean up after an application. The function ensures that the host system is returned to a valid state. For example, the clean function may remove shared memory segments or IPC resources that are left behind by a database
Query
I have an application that consists of multiple processes. My application's stop/offline function can stop all of these processes. To monitor the application under VCS, I monitor only one process, set in the monitor attribute through the Java Console. That monitored process went offline successfully when I took my Service Group offline. However, in my application's logs (the application's own logging, not the VCS logs) I saw that a few processes were not able to shut down. Can the error highlighted above appear in this situation?
04-23-2013 01:41 AM
Can you share the main.cf configuration and the list of processes started by your application? Also share the Application_A.log file content from when the message is logged.
Application agent stops the processes configured in PidFile/MonitorProcesses and/or the processes stopped by the StopProgram configured.
In your case, what are the monitoring methods configured for Application resource? Is it only MonitorProcesses?
Regards,
Venkat.
04-23-2013 02:16 AM
Will share the Application logs shortly
............................
Application agent stops the processes configured in PidFile/MonitorProcesses and/or the processes stopped by the StopProgram configured.
AGREE WITH YOU
(I agree, but what about the processes that StopProgram failed to stop and that are also not being monitored? Are those processes the real cause of the error in my case?)
............................
In your case, what are the monitoring methods configured for Application resource? Is it only MonitorProcesses?
YES
....................
04-23-2013 02:16 AM
If you only monitor one process, there are a couple of issues:
Neither of these explains your issue, but since there are a lot of processes to shut down, the offline routine may exit before all of them are down, so the process VCS monitors may still be running (see https://www-secure.symantec.com/connect/forums/can-we-put-any-time-delay-between-resources-offline for a solution to this).
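The idea of not letting the offline routine return until the monitored process is really gone could be sketched roughly as below. This is only an illustration under assumptions: `wait_until_stopped` and the `is_running` callable are hypothetical names, not part of VCS or the Application agent, and a real StopProgram would supply its own check for the process.

```python
import time


def wait_until_stopped(is_running, timeout=60.0, interval=1.0,
                       clock=time.monotonic, sleep=time.sleep):
    """Poll is_running() until it returns False or the timeout expires.

    is_running is a hypothetical callable supplied by the caller; it should
    return True while the monitored process still exists.
    Returns True if the process stopped in time, False otherwise.
    """
    deadline = clock() + timeout
    while is_running():
        if clock() >= deadline:
            return False  # still running when the timeout expired
        sleep(interval)
    return True
```

A stop wrapper would call the application's own stop command first and then call this function, returning only once it reports True or the timeout is reached.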
Mike
04-23-2013 02:42 AM
@ Mike
I agree with everything you said. What I meant above is this: when I took the Service Group offline and it reached the Application resource, StopProgram was triggered, and in the application logs (the application's built-in logs, not the VCS logs) I can see that all the processes started stopping, but a few did not finish stopping.
That is the basis of my first question.
04-23-2013 02:49 AM
If one of the processes that has not stopped is the one VCS is monitoring, then VCS will call clean.
If none of the processes that have not stopped is the one VCS is monitoring, then VCS will not know these processes are still running, so clean will not be called.
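Put as a tiny sketch, the rule described here looks like the following. The function name and the process names are made up for illustration; this models only the reasoning in this post, not the agent's actual code.

```python
def clean_is_called(monitored, still_running):
    """Return True if any monitored process is still running after offline.

    monitored: names of the processes VCS is configured to monitor.
    still_running: names of the processes that survived the offline.
    """
    return bool(set(monitored) & set(still_running))


# Hypothetical process names, for illustration only.
print(clean_is_called(["app_main"], ["app_main", "app_worker"]))
print(clean_is_called(["app_main"], ["app_worker"]))
```

In the second call an unmonitored process is left behind, so VCS has no way to notice it.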
Mike
04-23-2013 03:10 AM
Is it possible that the process you are monitoring is started by another daemon that the application starts?
Also, do you have any processes that look similar or identical to the process you are monitoring for the application? In that case the Application agent may be monitoring the first matching process listed in ps.
Any further information from the Application agent log, engine log, and ps output will give a better idea of what is happening on your system.
Regards,
Venkat
04-26-2013 12:41 AM
Apologies. I think I was in a hurry and missed the last line of the logs. Please see the logs below, in which the VCS agent tried to clean twice and the clean procedure completed successfully on the third attempt, within the same second.
04-26-2013 01:10 AM
This indicates the issue may be as I posted earlier:
as there are a lot of processes to shutdown, the offline routine may exit before all the processes are shutdown so the process VCS monitors may still be running
Then clean gets called, and maybe the process is still running, and by the third attempt it is down. Unless you have specified your own CleanProgram, clean will kill only the process you have specified in MonitorProcesses.
Your original query was:
Does the above highlighted ERROR can appear in my condition ?
If you mean "Can monitoring only one process of a multi-process application cause the offline to fail?", I think the answer is no. If the offline is failing, it would still fail if you monitored all processes, because this error just means that the one process you are monitoring is still running after the offline completes. If VCS were checking all processes, it would still find this single process (and maybe others) still running and so would still call clean.
One exception to all of the above is when any of the processes are identical in "/usr/ucb/ps -ww" output. If this is the case, you should write your own MonitorProgram, as identical processes do not work well with the MonitorProcesses attribute.
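A quick way to check for identical entries could be sketched like this. It assumes you have already captured the command column of the ps output as a list of strings (PIDs and other columns stripped); the function name and the sample paths are hypothetical.

```python
from collections import Counter


def duplicate_commands(command_lines):
    """Return the command lines that appear more than once, with counts."""
    counts = Counter(line.strip() for line in command_lines if line.strip())
    return {cmd: n for cmd, n in counts.items() if n > 1}


# Hypothetical command lines, for illustration only.
sample = [
    "/opt/app/bin/worker -c a.conf",
    "/opt/app/bin/worker -c a.conf",
    "/opt/app/bin/main",
]
print(duplicate_commands(sample))
```

Any command line reported here cannot be told apart from its twin by its process listing entry alone.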
Mike
04-26-2013 02:17 AM
04-26-2013 06:28 AM
Thanks all for kind replies.
========
21 01:09:16 SEC-NODE AgentFramework[482]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13068 Thread(5) Resource(APPLICATION) - clean completed successfully.
04-26-2013 06:48 AM
From Moby Thesaurus II by Grady Ward, 1.0 (moby-thes)
23 Moby Thesaurus words for "amazing": astonishing, astounding, awesome, breathtaking, confounding, dazzling, extraordinary, eye-opening, fabulous, mind-boggling, miraculous, overwhelming, prodigious, remarkable, spectacular, staggering, startling, strange, stunning, stupendous, surprising, wonderful, wondrous
.... but I digress.
"ERROR" refers to the severity level, not to whether the clean was successful or not. So if you were setting notification levels and said you wanted messages of WARNING or higher, you would get WARNING, ERROR, and CRITICAL messages.
The ERROR severity level is probably used because the clean was called due to an error, so it makes sense for the associated high-level events to log at that severity. That way you can see the result of the clean operation without changing the severity or notification level just to see it.
See the Agent Developer's Guide for further info about agent logging messages and severity levels.
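The "WARNING or higher" behaviour can be pictured with a small sketch. The ordering of the levels below INFO through CRITICAL is an assumption made for illustration (only WARNING, ERROR, and CRITICAL are named in this post), and the names here are not a VCS API.

```python
# Assumed ordering, lowest to highest severity (illustrative only).
SEVERITY_ORDER = ["INFO", "NOTICE", "WARNING", "ERROR", "CRITICAL"]


def passes_threshold(message_severity, threshold):
    """Return True if a message is at or above the chosen severity level."""
    return (SEVERITY_ORDER.index(message_severity)
            >= SEVERITY_ORDER.index(threshold))


# Which levels get through when the threshold is WARNING.
print([s for s in SEVERITY_ORDER if passes_threshold(s, "WARNING")])
```

So a "clean completed successfully" message tagged ERROR would still be delivered at a WARNING threshold, even though it reports a success.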