07-29-2015 07:33 AM
Just some quick questions. We use Veritas Clustering to provide an HA solution for database servers. We have set up the VCS monitoring to try to connect to the database every 3 minutes. After 4 failed attempts, VCS kills the primary pid and fails over to the other side. The box itself is fine; VCS was just unable to connect for whatever reason. The theory is that if the database is unresponsive for 12 minutes it must be in a hung state. Whether that is correct or not, there are many reasons other than a hang that a database might be unresponsive.
My question is this: should we be checking the host (hardware) rather than the database to decide whether to fail over? What are other companies that use VCS for HA doing to ensure that failovers do not happen for unnecessary or unintended reasons? Is VCS intended for checking application availability?
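To make the 12-minute figure concrete, here is a rough Python sketch of the arithmetic. It is not VCS code: the function name and parameters are hypothetical, and it assumes the four failed connection attempts must be consecutive, which the post does not state explicitly.

```python
from typing import Iterable, Optional


def minutes_until_failover(probe_results: Iterable[bool],
                           interval_min: int = 3,
                           max_failures: int = 4) -> Optional[int]:
    """Return the minute at which a failover would be triggered, or None.

    probe_results holds one entry per connection attempt (True = connected).
    Assumption: a successful attempt resets the failure count.
    """
    consecutive = 0
    for attempt, ok in enumerate(probe_results, start=1):
        consecutive = 0 if ok else consecutive + 1
        if consecutive >= max_failures:
            # The Nth attempt happens N intervals after monitoring started.
            return attempt * interval_min
    return None


# Four failed attempts in a row, 3 minutes apart.
print(minutes_until_failover([False, False, False, False]))
```

With these assumptions, a single successful connection inside the window resets the count, so only a database that is unreachable for the whole 12 minutes triggers the failover.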
Jim
07-31-2015 01:31 AM
Hi,
VCS is supposed to check all your resources: disk, IP, mounts, and the database (Oracle, Sybase, etc.).
It is not required to enable second level monitoring; that is purely up to you and your organization. The thing is, as you stated, a database might be "online" (for instance, the pmon process is running in the case of Oracle) but not OPEN and ready for transactions.
The point of the second level monitoring is to inform you of such instances.
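As a rough illustration of the difference between the two levels, the Python sketch below classifies a database from two separate observations. The names `DbState` and `classify` are hypothetical and are not part of any VCS agent; the sketch only assumes that the first level sees whether the process exists and the second level sees whether a test connection or query succeeds.

```python
from enum import Enum


class DbState(Enum):
    DOWN = "process not running"
    RUNNING_NOT_READY = "process running, but not ready for transactions"
    READY = "process running and ready for transactions"


def classify(process_running: bool, test_query_ok: bool) -> DbState:
    """Combine a first-level (process) and second-level (query) check."""
    if not process_running:
        return DbState.DOWN
    if not test_query_ok:
        # Only a second-level check can detect this case.
        return DbState.RUNNING_NOT_READY
    return DbState.READY


print(classify(process_running=True, test_query_ok=False).value)
```

A process-only check cannot tell the last two states apart, which is the gap second level monitoring is meant to close.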
Why would your database not be available for the 2nd level monitor to work?
You could set the restart / tolerance limits so that it can handle these situations, but as you said, why is your database not able to respond for 12 minutes?
Is it really "hung" or is the monitor failing for some reason?
08-03-2015 04:30 AM
Agree totally with Riann: 2nd level monitoring is optional and the default is a process check only, so if you don't want this behaviour, set it back to the default (MonitorOption = 0).
Just to give a little more info:
If Oracle cannot update a row for 12 minutes, many applications cannot survive such a scenario because the application will time out. But if this is OK in your environment then, as stated earlier, disable this check.
Mike
08-05-2015 09:14 PM
Please reply or mark the solution if it helped you.