Solved: SG is not switching to next node.

prabindr · ‎12-04-2014

Hi All,

I am new to VCS but experienced with HACMP.

In our environment we are using VCS 6.0.

On one server we found that the SG does not move from one node to the other when we try a manual failover using the command below.

hagrp -switch <SGname> -to <sysname>

We can see that the SG goes offline on the current node, but it does not come online on the secondary node.

There is no error logged in engine_A.log except the entry below:

cpus load more than 60% <Secondary node name>

Can anyone help me find a solution for this? I will provide the output of any commands if you need more info to help me troubleshoot this :)

Thanks,

Gaurav_S · ‎12-04-2014

Hi,

engine_A.log is the first place I would look. If possible, attach the engine_A.log here. There will be entries around the group being taken offline and then starting to come online on the other node ...

G

RiaanBadenhorst · ‎12-04-2014

Also post your main.cf; there might be limits set that prevent it from failing over.

prabindr · ‎12-04-2014

Hi Gaurav,

Thanks for the quick reply.

I have attached the engine_A.log as per your request; please have a look.

Thanks

prabindr · ‎12-04-2014

I have updated the post by attaching the main.cf and engine_A.log files; please have a look.

Thanks for your response.

mikebounds · ‎12-05-2014

The issue could be the preonline script. You could test whether it is being called by amending the preonline script (in /opt/VRTSvcs/bin/triggers) to create a file, so you know if it is being called. If preonline is being called, then the issue is with the preonline script. You could also just try disabling preonline (set PreOnline = 0 and remove PREONLINE from PreOnline = 1 in your group attributes).

But your main.cf does not correspond to the log: the main.cf contains group mdwapd2_sg, but the log says ora_sg_01, and the system names are different too, so please send the correct main.cf.

The log also says the mounts have to kill processes (looks like WebLogic), so any processes running in the mounts should be brought down cleanly by putting the app under VCS control.

Mike
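
The marker-file test suggested above could be sketched roughly as follows, assuming the preonline trigger on this cluster is a shell script (it may equally be Perl or a binary). The function name log_preonline_call and the MARKER_FILE path are hypothetical and are not part of VCS; the idea is only to append a line every time the trigger runs, so that an empty or missing file after a switch attempt shows the trigger was never called.

```shell
#!/bin/sh
# Hypothetical lines to add near the top of the preonline trigger script.
# MARKER_FILE is an assumed path; choose any location writable by root.
MARKER_FILE="${MARKER_FILE:-/tmp/preonline_called.log}"

# Append a timestamped line recording the arguments the trigger received.
log_preonline_call() {
    printf '%s preonline called with: %s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >> "$MARKER_FILE"
}

# Record this invocation before the rest of the trigger logic runs.
log_preonline_call "$@"
```

After the next hagrp -switch attempt, check the marker file on the target node. If a new line appears, the trigger is being called and the problem lies in the logic that follows it.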

prabindr · ‎12-05-2014

Thanks for your response :)

I will check the information and update you on the status.

Sorry for the mistake, I had uploaded the wrong main.cf file. I have now updated it; please have a look.

Thanks,

Rufu

mikebounds · ‎12-05-2014

I still see the same main.cf with systems a3dvap1006/7, not aixdev001/2.

Mike

prabindr · ‎12-05-2014

Hi Mike,

Thanks for your suggestion. The fault is due to the preonline script: the server does not have enough resources to bring the SG online, so the script is stopping the SG from coming online.

Thanks for your help, I will have more questions in the future :)

Thanks,

Rufu
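
The actual preonline script was not posted in this thread, so the following is only a guess at the kind of check that could produce the "cpus load more than 60%" entry. The function load_allows_online and the variables current_load, group and system are hypothetical; the 60 threshold is taken from the log line, and how the load figure is measured is deliberately left out.

```shell
#!/bin/sh
# Hypothetical check: succeed only if the CPU load percentage is at or
# below the threshold. Both values are whole-number percentages; the
# default of 60 mirrors the figure in the engine_A.log entry.
load_allows_online() {
    load_pct=$1
    threshold_pct=${2:-60}
    [ "$load_pct" -le "$threshold_pct" ]
}

# Possible use inside a preonline trigger (commented out because it
# needs a running cluster; current_load, group and system are assumed
# to have been set earlier in the script):
# if load_allows_online "$current_load"; then
#     hagrp -online -nopre "$group" -sys "$system"
# else
#     echo "cpus load more than 60% $system"
# fi
```

With a check like this, a manual switch takes the group offline on the source node and then stops on the target node, which matches the behaviour described in the original question.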
