Forum Discussion

Ashish_C
Level 3
11 years ago

Application Resource failing while doing SG Fail-over test

 

Hi,

 

I'm facing an issue during a failover test of an application that is configured in VCS as a service group.

The Application resource faults on the standby node during the failover.

 

For testing purposes, I took the SG offline on all nodes and brought the resources online one by one on the secondary node.

After this, all resources were detected as online in the VCS console except the Application resource, even though, according to the application team, the service that is supposed to start with the application was running on the node.

VCS did not report any error at this point. (The support team suggested that I check with the application team, as VCS is not logging any error in the engine log.)

 

When I run the same test on the active node, the Application resource is detected as online and the console reflects this.

 

Please suggest.

 

Regards,

 

Ashish C

  • The logs don't show an issue occurring on PmsProd26, but VCS only reports what your scripts are doing, so this is a script/environment issue.

    Try the following on each node:

    Bring all resources online except the application, then run:

    # sh
    # /etc/VRTSvcs/conf/HPPI/Run_OVPI.ksh
    # /etc/VRTSvcs/conf/HPPI/Monitor_OVPI.ksh
    # echo $?

    If /etc/VRTSvcs/conf/HPPI/Monitor_OVPI.ksh does not return exit code 110 (the value VCS treats as online), then you need to fix the script.

    Mike
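
The check described above can be sketched as a small helper that turns the monitor script's exit code into the state VCS would infer. This is only an illustration: the function name `classify_monitor_rc` is hypothetical, and the mapping assumes the usual Application agent convention for a MonitorProgram (100 means offline, 101 to 110 mean online with 110 as full confidence, anything else is treated as unknown).

```shell
#!/bin/sh
# Hypothetical helper, not part of VCS: map a monitor program's exit code
# to the resource state the Application agent is assumed to infer from it.
classify_monitor_rc() {
    rc=$1
    if [ "$rc" -eq 100 ]; then
        echo "offline"
    elif [ "$rc" -ge 101 ] && [ "$rc" -le 110 ]; then
        echo "online"
    else
        # Any other value is assumed to leave the state undetermined.
        echo "unknown"
    fi
}

# Usage on a cluster node (script path taken from the thread):
#   /etc/VRTSvcs/conf/HPPI/Monitor_OVPI.ksh
#   classify_monitor_rc $?
```

Running this on both PmsProd25 and PmsProd26 after starting the application by hand should make any difference in the monitor script's result visible without involving the cluster engine.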

  •  

    Hi Gaurav,

     

    I even tried copying the start and monitor scripts from PmsProd25 (active node) to PmsProd26. It didn't work.

     

    Regards,

     

    Ashish C

  •  

    Hi Mike,

    I haven't tried changing the parameter yet, as the application support team is working on that.

     

    Bringing the SG from offline to online on PmsProd25 (active node) works. But when I do the same on PmsProd26, it faults.

    My question is: if I change the parameter, it will be applied on all nodes, but in my case the resource already works on one of the nodes. Will the parameter change help?

    Below are the commands I'm going to execute for the suggested change.

    # haconf -makerw
    # hares -modify HPPI_Appl UseSUDash 1
    # haconf -dump -makero
     

     

     

    Regards,

     

    Ashish C
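
As a rough sketch, the three commands above can be wrapped so that the configuration is always saved and set back to read-only, even if the modify step fails. The function names `run` and `set_resource_attr` and the `DRY_RUN` variable are hypothetical and introduced only for illustration; the `haconf` and `hares` invocations are the ones quoted in the post. With `DRY_RUN=1` (the default in this sketch) the commands are only printed, so the sequence can be reviewed before running it on a cluster node.

```shell
#!/bin/sh
# Sketch only: DRY_RUN=1 prints the commands instead of executing them.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical wrapper: echo the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: $1 = resource, $2 = attribute, $3 = value.
set_resource_attr() {
    # Open the cluster configuration for writing.
    run haconf -makerw || return 1
    run hares -modify "$1" "$2" "$3"
    rc=$?
    # Save the configuration and make it read-only again,
    # regardless of whether the modify succeeded.
    run haconf -dump -makero
    return $rc
}

# Example (dry run): set_resource_attr HPPI_Appl UseSUDash 1
```

Because the attribute is changed on the resource definition rather than on a single node, the wrapper would only need to be run once, from any node in the cluster.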

     

     

     

     

  •  

    Hi Mike,

     

    The application team made some changes after checking their logs. After that, we were able to bring the application online manually on the second node, which was not possible before.

     

    The failover tests are yet to be completed. If they don't work, we will change the UseSUDash parameter as you suggested.

    Hopefully it will fail over, since it came online on the secondary node.

     

    Regards,

    Ashish C

  • Hi All,

     

    Thanks to everyone who guided and helped me in this discussion.

    Sorry for taking up everyone's time. It was an application-related issue; it worked after the support team made changes to the application scripts. It was a good experience to work with all of you, and I learned a lot more about VCS while discussing my issue here.

    Again, thanks to all for your support and responses.

     

    Regards,

     

    Ashish C