Forum Discussion

cmvs
13 years ago

Problem in switching over the resource group

Hi All, I am facing a problem switching over the resource group. When I use the hagrp -switch -to command to switch over a resource group, the resource group switches over absolutely fine as sho...
  • mikebounds
    13 years ago

    The only difference between a switch and an offline followed by an online should be timing, i.e. there is a delay between when the group goes offline and when it comes online when you run offline and online separately. Other than that, the routines VCS calls should be identical for the two procedures:

    1. offline routine for lister
    2. offline routine for platform 
    3. offline routine for ip
    4. online routine for ip
    5. online routine for platform
    6. online routine for lister

    This is a little simplified, as VCS also calls the monitor routine after each offline and online to check that the resource really is offline or online.

    Presumably, once the offline has faulted, you clear the fault and then run the online, and the online works. If it doesn't, what do you have to do to get the online to work?

    For the switch, the logs above show:

    2012/04/03 13:02:40 VCS NOTICE V-16-1-10301 Initiating Online of Resource cme-platform-res

    But do not show:

    Resource cme-platform-res (Owner: unknown, Group: cme-platform-sg) is online on dlcmdn1 (VCS initiated)

    How long does the resource take to come online during the switch?
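
To put a number on that, a small awk helper can subtract the two timestamps in the engine log. This is only a sketch: online_duration is a made-up name, it assumes both messages fall on the same day, and it matches on the message wording quoted above rather than on message IDs.

```shell
#!/bin/sh
# Sketch: seconds between "Initiating Online" and "is online" for one resource.
# online_duration is a hypothetical helper, not a VCS command.
# Assumes field 2 of each log line is HH:MM:SS and both lines are on the same day.
online_duration() {
    # $1 = resource name, $2 = path to the engine log
    awk -v res="$1" '
        # Convert HH:MM:SS to seconds since midnight
        function secs(hms,   t) { split(hms, t, ":"); return t[1]*3600 + t[2]*60 + t[3] }
        # Remember when the online was initiated
        index($0, "Initiating Online of Resource " res) { start = secs($2); have = 1 }
        # Print the elapsed seconds when the resource is reported online
        have && index($0, "Resource " res " ") && index($0, " is online on ") { print secs($2) - start; have = 0 }
    ' "$2"
}
```

For example, `online_duration cme-platform-res /var/VRTSvcs/log/engine_A.log` (the usual engine log location; adjust if yours differs). If nothing is printed, the "is online" message never appeared after the initiate message.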

    For the offline and online, the online of cme-platform-res seems to time out after 90 seconds. Have you changed OnlineTimeout from its default of 300 seconds to 90 seconds? If not, what is the Type of this resource? If you have changed it to 90 seconds, that may be too short.
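
Since it is the online that times out, the attribute to look at is OnlineTimeout on the resource's type. Something like the following could show the value in effect. It is a sketch: show_online_timeout and the DRY_RUN switch are hypothetical helpers, the script only prints the commands by default, and it does not cover the case where the attribute has been overridden on the resource itself.

```shell
#!/bin/sh
# Sketch: look up the resource's Type, then that type's OnlineTimeout.
# show_online_timeout and DRY_RUN are illustrative, not part of VCS.
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 on a cluster node.
show_online_timeout() {
    # $1 = resource name
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "hares -value $1 Type"
        echo "hatype -value <Type> OnlineTimeout"
    else
        # Ask VCS for the resource type, then for the timeout on that type
        res_type=$(hares -value "$1" Type) || return 1
        hatype -value "$res_type" OnlineTimeout
    fi
}

show_online_timeout cme-platform-res
```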

    In terms of debugging you could see what happens in the following scenarios:

    1. Does offline and online in quick succession work? 
      I think you may be able to do this with the following (it is a long time since I have used "-wait", so I am not sure of the syntax):
      hagrp -offline cme-platform-sg  -sys dlcmdn2 ;  hagrp -wait cme-platform-sg State OFFLINE -sys dlcmdn2 ; hagrp -online cme-platform-sg  -sys dlcmdn1
       
    2. Offline the group on dlcmdn2, then online the ip resource manually on dlcmdn1, and then online the platform resource manually, without using VCS.
      If the platform resource fails or takes longer than 90 seconds, then you will have to debug your app.
       
    3. Wait a period of time, maybe 5 minutes, 10 minutes or an hour, between offline and online. Does the online always fail, or does it depend on how long you wait between offline and online?
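
Tests 1 and 3 differ only in the pause between offline and online, so they could be scripted together roughly as below. This is a sketch under assumptions: run, offline_then_online and DRY_RUN are hypothetical helpers, the group and system names are the ones from this thread, and the "-wait" syntax is the same unverified one as in test 1. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch of tests 1 and 3: offline on one node, wait for OFFLINE,
# optionally pause, then online on the other node.
# run, offline_then_online and DRY_RUN are illustrative, not part of VCS.
GROUP="cme-platform-sg"
FROM_SYS="dlcmdn2"
TO_SYS="dlcmdn1"

# Print the command in dry-run mode (default), otherwise execute it
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

offline_then_online() {
    # $1 = seconds to pause between offline and online (0 = quick succession)
    delay="${1:-0}"
    run hagrp -offline "$GROUP" -sys "$FROM_SYS" || return 1
    run hagrp -wait "$GROUP" State OFFLINE -sys "$FROM_SYS" || return 1
    if [ "$delay" -gt 0 ]; then
        run sleep "$delay" || return 1
    fi
    run hagrp -online "$GROUP" -sys "$TO_SYS"
}

# Test 1: quick succession. For test 3, pass e.g. 300, 600 or 3600.
offline_then_online 0
```

Running it with different delays and noting which runs fail should show whether the failure depends on the gap between offline and online.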
       

    If you are still not able to solve the issue after the above, post the results of the tests and your main.cf.

    Mike