DB failures in VCS node

allaboutunix
Level 6

Hi Team,

 

We have two servers in a VCS cluster (Planet and Space). The databases on Planet are experiencing failures, and this has been reported to the Unix team to investigate from their side.

I am not able to paste the error output, but could you please suggest the best steps to take to resolve the issue?

3 ACCEPTED SOLUTIONS

Sunil_Yadav
Level 4
Employee

Hi Allaboutunix,

 

The information provided is too little to suggest anything. We need more details to narrow down the issue and suggest steps to resolve it.

 

Thanks & Regards,
Sunil Y

sudhir_h
Level 4
Employee

Kindly check whether the system is heavily loaded. If the Oracle monitors are continuously timing out and the number of monitor timeouts equals the value set in the FaultOnMonitorTimeouts attribute, VCS may mark the resource as faulted and bring it down. Also check whether SecondLevelMonitor is enabled for the Oracle resource; if it is, the output of the SQL script it runs will be seen in the agent/engine log. Enable the appropriate debug log level.
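
To make the timeout behaviour above concrete, here is a rough Python sketch of how a FaultOnMonitorTimeouts-style counter could work. It is a toy model, not VCS code: the class and function names are made up for illustration, and the threshold of 4, the "0 disables it" rule and the reset on a successful monitor are assumptions you should confirm against the attribute description for your VCS version.

```python
from dataclasses import dataclass


@dataclass
class MonitorTimeoutTracker:
    """Toy model of a FaultOnMonitorTimeouts-style counter (not VCS code)."""

    # Assumed threshold; in this model 0 means timeouts never cause a fault.
    fault_on_monitor_timeouts: int = 4
    consecutive_timeouts: int = 0

    def record(self, monitor_timed_out: bool) -> bool:
        """Record one monitor cycle; return True if the model would fault."""
        if not monitor_timed_out:
            # Assumption: a monitor that completes in time resets the count.
            self.consecutive_timeouts = 0
            return False
        self.consecutive_timeouts += 1
        if self.fault_on_monitor_timeouts == 0:
            return False
        return self.consecutive_timeouts >= self.fault_on_monitor_timeouts


def first_fault_cycle(outcomes, threshold=4):
    """Return the 1-based monitor cycle at which the model faults, or None."""
    tracker = MonitorTimeoutTracker(fault_on_monitor_timeouts=threshold)
    for cycle, timed_out in enumerate(outcomes, start=1):
        if tracker.record(timed_out):
            return cycle
    return None
```

In this model a heavily loaded system that produces an unbroken run of timeouts reaches the threshold quickly, while occasional isolated timeouts never do, which is why checking the load at the time of the fault is a sensible first step.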

Also, if a dependent child resource is faulted, it may cause all of its parent resources to be taken offline.
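
As an illustration of that dependency effect, the following Python sketch walks a resource dependency map upwards from a faulted child to find every parent that could be affected. The resource names and the `REQUIRES` map are hypothetical; on a real cluster the dependencies come from your own configuration, and what VCS actually does with the parents depends on how the group and resources are configured.

```python
from collections import deque

# Hypothetical dependency map: parent resource -> child resources it requires.
REQUIRES = {
    "ora_db": ["ora_mount", "ora_listener"],
    "ora_listener": ["ora_ip"],
    "ora_mount": ["ora_dg"],
    "ora_ip": ["ora_nic"],
}


def affected_parents(faulted, requires):
    """Return every resource that directly or indirectly depends on `faulted`."""
    # Invert the map so we can walk from a child up to its parents.
    parents_of = {}
    for parent, children in requires.items():
        for child in children:
            parents_of.setdefault(child, []).append(parent)

    seen = set()
    queue = deque([faulted])
    while queue:
        current = queue.popleft()
        for parent in parents_of.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return sorted(seen)
```

With this sample map, a fault on the disk group `ora_dg` reaches `ora_mount` and then `ora_db`, so the database can go offline even though nothing is wrong with Oracle itself.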

These are some of the possible causes. To find the exact issue, you will need to check the VCS engine/agent logs and the Oracle database alert log. Kindly provide the log files if you need any help locating the issue.

 

Regards,

Sudhir

Gaurav_S
Moderator
   VIP    Certified

Hi,

Are you seeing the failures in VCS, or are the failures at the DB layer with VCS just reacting to them?

I would suggest collaborating with the DB team to find out why the DBs are failing. VCS will react to failures as expected or as defined.

If you see that the DBs are failing outside of VCS, then get the DBs validated outside of VCS (freeze the VCS groups) and ensure the DBs are set up appropriately.
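
For the freeze step, a small Python helper like the one below can list the commands involved before anyone runs them. This is only a sketch: the group name `oracle_sg` is hypothetical, the helper builds the command lines without executing anything, and the `hagrp` and `haconf` options shown should be checked against the manual pages for your VCS release.

```python
def freeze_commands(group, persistent=False):
    """Build (but do not run) the commands to freeze a VCS service group."""
    if not persistent:
        # A temporary freeze does not change the saved configuration.
        return [["hagrp", "-freeze", group]]
    # A persistent freeze needs the configuration opened read-write,
    # then saved and closed again afterwards.
    return [
        ["haconf", "-makerw"],
        ["hagrp", "-freeze", group, "-persistent"],
        ["haconf", "-dump", "-makero"],
    ]


def unfreeze_commands(group, persistent=False):
    """Build (but do not run) the matching commands to unfreeze the group."""
    if not persistent:
        return [["hagrp", "-unfreeze", group]]
    return [
        ["haconf", "-makerw"],
        ["hagrp", "-unfreeze", group, "-persistent"],
        ["haconf", "-dump", "-makero"],
    ]


if __name__ == "__main__":
    # "oracle_sg" is a hypothetical group name used for illustration.
    for cmd in freeze_commands("oracle_sg") + unfreeze_commands("oracle_sg"):
        print(" ".join(cmd))
```

While the group is frozen, VCS should not take the group offline or fail it over, so the DB team can stop, start and validate the databases by hand. Remember to unfreeze the group once the validation is finished.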

I wouldn't suggest touching any VCS config unless you are sure about the cause of the failure.


G
