We have a two-node active-passive DB cluster. Due to a hardware issue the active DB node went down, but the service group did not fail over to the other node.
We manually started the SG on the other node, but the database went into recovery mode and the SG did not come fully online. It starts the DB resource and after some time goes to a faulted state.
After trying many things, management decided that we will manually start the file system and DB VIP, and the database and listener will also be started manually.
Now I want to disable only the Oracle and listener resources in that service group. (What I did on my end: I set the Critical attribute to 0 for the DB and listener resources, but with this the service group only goes to a partial state.)
Also, in future, if the server goes down for any reason, the service group should automatically switch over to the other node, with no manual failover.
Can you help me?
Let us first try to see what happened during the failover attempt.
My guess is that, because of the time it took to recover the database during startup, the OnlineTimeout kicked in and the resources were marked as Faulted.
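If the log does confirm an online timeout, one possible remedy is to override OnlineTimeout for the Oracle resource only. The following is a minimal sketch, not a tested procedure: oracle_res is a placeholder resource name and 1800 seconds is an arbitrary example value, so pick a figure that covers your worst-case database recovery time. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Assumptions: oracle_res is a hypothetical resource name, 1800 is an example value.
RES=${RES:-oracle_res}
TIMEOUT=${TIMEOUT:-1800}

# Print each command by default; set DRY_RUN=0 to execute on a cluster node.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

raise_online_timeout() {
    run haconf -makerw                                 # open the configuration for writing
    run hares -override "$RES" OnlineTimeout           # make the type-level attribute settable per resource
    run hares -modify "$RES" OnlineTimeout "$TIMEOUT"  # allow more time for the online operation
    run haconf -dump -makero                           # save and close the configuration
}

raise_online_timeout
```

Overriding at the resource level leaves the timeout of any other Oracle resources in the cluster unchanged.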
Changing a resource's Critical attribute to 0 should not prevent a service group from coming online completely.
If the SG was partially online on one node, then a failover (switch) will bring the SG online on the other node in the same partial state.
Please copy all entries in the Engine_A log from the start of the problems to engine.txt and upload it as a file attachment.
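One way to cut the relevant window out of the engine log is sketched below. It assumes every log line starts with a "YYYY/MM/DD HH:MM:SS" timestamp and that the log is in the usual default location, /var/VRTSvcs/log/engine_A.log; the function name extract_engine_log and the example times are made up for illustration.

```shell
#!/bin/sh
# extract_engine_log START END [LOGFILE]
# Prints every line whose leading timestamp falls between START and END (inclusive).
# Assumption: lines begin with "YYYY/MM/DD HH:MM:SS", so plain string comparison
# orders them chronologically.
extract_engine_log() {
    start=$1
    end=$2
    log=${3:-/var/VRTSvcs/log/engine_A.log}
    awk -v s="$start" -v e="$end" '
        { ts = $1 " " $2 }      # first two fields form the timestamp
        ts >= s && ts <= e      # print lines inside the window
    ' "$log"
}

# Example (times are placeholders):
# extract_engine_log "2013/05/14 10:00:00" "2013/05/14 11:00:00" > engine.txt
```

Continuation lines that do not start with a timestamp would be dropped by this filter, so if in doubt, widen the window or attach the whole file.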
There are various options that we can suggest to prevent automatic failover, but let's first try to determine what happened.
Failover options have also changed across VCS versions, so please tell us which version you are running.
If you want to disable VCS monitoring for the Oracle and listener resources:
First, take the Oracle and listener resources offline.
haconf -makerw
hares -modify oracle_resource_name Enabled 0
hares -modify lsnr_name Enabled 0
haconf -dump -makero
Then start Oracle and the listener manually.
After the service group fails over, you will still need to start Oracle and the listener manually on the new node.
Here is an example from the lab:
[root@server101 log]# haconf -makerw
[root@server101 log]# hares -modify mount_lvol Enabled 0
[root@server101 log]# haconf -dump -makero
[root@server101 log]# hares -display mount_lvol
#Resource Attribute System Value
mount_lvol Group global saaApp1
mount_lvol Type global Mount
mount_lvol AutoStart global 1
mount_lvol Critical global 0
mount_lvol Enabled global 0 <<<<<
mount_lvol LastOnline global server101