
Cannot take a core dump because the system hangs

auzelgencer
Level 2

Hi,

We are using these products:

VRTS STORAGE FOUNDATION ENTERPRISE HA/DR FOR ORACLE RAC 5.1 UNX PER CPU TIER 2 STD LIC EXPRESS BAND S
VRTS VOLUME REPLICATOR OPTION 5.1 UNX PER CPU TIER 2 STD LIC EXPRESS BAND S
VRTS CLUSTER SERVER 5.1 UNX PER CPU TIER 2 STD LIC EXPRESS BAND S

When the system freezes, it hangs completely: there is no response from the mouse, the keyboard, or anything else, and we cannot even take a core dump to diagnose and fix the problem.

Is there a way to produce a dump immediately at the time of the hang? For example, could we schedule a dump every minute to catch the problem?
 

4 REPLIES

Gaurav_S
Moderator
   VIP    Certified

What OS are you using? What is the server hardware?

Are you talking about a process core or a system crash dump?

Finally, do you believe this is an issue with a Veritas product? If so, why?

 

 

G

auzelgencer
Level 2

OS: Solaris 10 Update 9 ais patch January 2011

Server Hardware: Sun M9000, 8 CPUs, 128 GB RAM, x 2 (Oracle RAC)

When the hang occurs, even the "send-break" and "halt -d" commands take approximately 40 minutes to respond.

Gaurav_S
Moderator
   VIP    Certified

Well, there could be many reasons for this, not necessarily Veritas products alone.

To rule Veritas out, I would disable the entire Veritas stack, restart the server, and then try to generate a crash dump while no Veritas product is running. If you still see the same issue, it would be best to check with Oracle.

 

G

 

Gaurav_S
Moderator
   VIP    Certified

I forgot to mention above: you can disable cluster startup by disabling the cluster startup scripts (S70llt, S92gab, S99vcs).
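One common way to disable an rc script is to rename it so that its name no longer starts with "S". Here is a rough sketch of that, not a tested procedure. Only the script names come from this thread; the directories (/etc/rc2.d for S70llt and S92gab, /etc/rc3.d for S99vcs), the "_" prefix, and the function names are my assumptions, so check with ls where the scripts actually live on your release before relying on it. If a script is not found, the sketch skips it.

```shell
#!/bin/sh
# Hypothetical helper: rename an rc start script so its name no longer
# begins with "S", which keeps the rc framework from running it at boot.
# Rename it back (drop the "_") to re-enable it.
disable_rc_script() {
    script=$1
    if [ -f "$script" ]; then
        dir=`dirname "$script"`
        base=`basename "$script"`
        mv "$script" "$dir/_$base" && echo "disabled: $script"
    else
        echo "not found, skipped: $script"
    fi
}

# Assumed locations of the three scripts named in this thread.
# The optional first argument is a root directory (default /etc), which
# allows a dry run against a copy of the rc directories.
disable_vcs_startup() {
    rc_root=${1:-/etc}
    for s in rc2.d/S70llt rc2.d/S92gab rc3.d/S99vcs; do
        disable_rc_script "$rc_root/$s"
    done
}

# Usage (as root, before the reboot):
#   disable_vcs_startup
```

Backticks are used instead of $(...) so the sketch also works in the older Bourne shell.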

To disable the VxVM stack, you need to touch a file (# touch /etc/vx/reconfig.d/state.d/install-db). Once this file exists and you restart the server, VxVM will not start.
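Since the install-db file has to be removed again after the test, here is a small sketch that wraps both directions. The path is the one given above; the function name and the optional root-prefix argument are hypothetical and exist only so it can be tried against a scratch directory first. I would not use this if the boot disk itself is under VxVM control, since the system may then depend on VxVM to boot.

```shell
#!/bin/sh
# Hypothetical wrapper around the install-db marker file.
#   off -> create the marker, so VxVM is not started at the next boot
#   on  -> remove the marker, so VxVM starts again at the next boot
# The optional second argument is a root prefix for dry runs (default: none).
vxvm_autostart() {
    action=$1
    root=${2:-}
    marker="$root/etc/vx/reconfig.d/state.d/install-db"
    case "$action" in
        off) touch "$marker" && echo "marker created: $marker" ;;
        on)  rm -f "$marker" && echo "marker removed: $marker" ;;
        *)   echo "usage: vxvm_autostart on|off [root-prefix]" >&2
             return 2 ;;
    esac
}

# Usage (as root):
#   vxvm_autostart off   # then reboot and run the crash dump test
#   vxvm_autostart on    # then reboot to bring VxVM back
```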

One small tip on the OS side: is the crash dump configured correctly at the OBP level? There used to be a parameter for this (I can't recall it exactly off the top of my head), I think "error-reset-recovery", and its value should be set to "sync". Have a look around.
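If it helps, the OBP variable can also be checked from the running OS with eeprom instead of dropping to the ok prompt. This is only a sketch: it assumes eeprom prints the variable as a single "name=value" line, and the function names are made up for illustration. It reads the value and only prints the command you would run to change it.

```shell
#!/bin/sh
# Returns 0 when the given "name=value" line already says sync.
is_sync() {
    case "$1" in
        error-reset-recovery=sync) return 0 ;;
        *) return 1 ;;
    esac
}

# Read-only check: report the current value and the command to change it.
check_error_reset_recovery() {
    if type eeprom >/dev/null 2>&1; then
        current=`eeprom error-reset-recovery`
        if is_sync "$current"; then
            echo "OK: $current"
        else
            echo "current: $current"
            echo "to change (as root): eeprom error-reset-recovery=sync"
        fi
    else
        echo "eeprom not found; at the ok prompt use: printenv error-reset-recovery"
    fi
}

check_error_reset_recovery
```

At the ok prompt the equivalent commands are printenv error-reset-recovery and setenv error-reset-recovery sync.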

 

G