Forum Discussion

mokkan
Level 6
11 years ago

Resource Fault question

When a resource goes offline unexpectedly, the agent monitors the resource and runs the clean entry point to bring the resource offline and put it into the Faulted state. My question is: before it brings it into Faul...
  • Gaurav_S
    11 years ago

    Hi,

    You very well can, but would you be able to catch the specific time window to restart the resource? The offline detection and clean will happen within minutes (depending on how MonitorInterval, MonitorTimeout and RestartLimit are set).

    If a resource is in a transitioning state (onlining or offlining), you can flush the service group (hagrp -flush) so that the transitioning of resources stops, and then take the manual action.

     

    G
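
To put a rough number on that time window, here is a minimal sketch. It assumes a simplified model in which the agent runs monitor once every MonitorInterval seconds and notices the outage at the next cycle; it ignores how long the monitor itself runs (bounded by MonitorTimeout). The function name and the example values are hypothetical, not taken from VCS.

```python
def manual_restart_window(died_after: float, monitor_interval: float) -> float:
    """Seconds left before the next monitor cycle would notice the outage.

    died_after: seconds between the last monitor cycle and the resource
    going offline (assumed known here; in practice it is not).
    monitor_interval: value of the MonitorInterval attribute, in seconds.
    """
    if not 0 <= died_after <= monitor_interval:
        raise ValueError("died_after must lie within one monitor interval")
    return monitor_interval - died_after


if __name__ == "__main__":
    # Illustrative values only: resource dies 45 s into a 60 s interval.
    print(manual_restart_window(45, 60))
```

Under this model the window is never longer than one MonitorInterval and is usually shorter, which is why catching it by hand is unreliable.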

  • Marianne
    11 years ago

    I agree with Gaurav.

    Increase RestartLimit. The default for most resource types is 0.

    You may want to read through this section in VCS Admin Guide:

    Controlling VCS behavior at the resource level

    Extract:

    About the RestartLimit attribute
    The RestartLimit attribute defines whether VCS attempts to restart a failed
    resource before informing the engine of the fault.
    If the RestartLimit attribute is set to a non-zero value, the agent attempts to
    restart the resource before declaring the resource as faulted. When restarting a
    failed resource, the agent framework calls the Clean function before calling the
    Online function. However, setting the ManageFaults attribute to NONE prevents
    the Clean function from being called and prevents the Online function from being
    retried.
     
    (VCS Admin Guide and other manuals can be found here: http://sort.symantec.com/documents )
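
The sequence described in the extract can be sketched as a toy model. This is not VCS code: the class, the function, the `online_succeeds` callback and the "UNMANAGED" label are all hypothetical. It also assumes that, once the restart attempts are used up, Clean runs one last time before the fault is reported, as described in the original question.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ResourceModel:
    """Toy stand-in for a resource; not a VCS data structure."""
    restart_limit: int = 0        # models the RestartLimit attribute
    manage_faults: str = "ALL"    # models ManageFaults: "ALL" or "NONE"
    calls: List[str] = field(default_factory=list)  # entry points invoked


def handle_unexpected_offline(
    res: ResourceModel, online_succeeds: Callable[[int], bool]
) -> str:
    """Return the modelled outcome after monitor reports an unexpected offline.

    online_succeeds(attempt) says whether Online attempt number `attempt`
    (1-based) brings the resource back up.
    """
    if res.manage_faults == "NONE":
        # Per the extract: Clean is not called and Online is not retried.
        # "UNMANAGED" is a placeholder label, not a VCS state name.
        return "UNMANAGED"

    for attempt in range(1, res.restart_limit + 1):
        res.calls.append("clean")    # make sure nothing is left running
        res.calls.append("online")   # then try to restart
        if online_succeeds(attempt):
            return "ONLINE"

    # No restart attempts left (or RestartLimit is 0): clean up and fault.
    res.calls.append("clean")
    return "FAULTED"


if __name__ == "__main__":
    res = ResourceModel(restart_limit=2)
    outcome = handle_unexpected_offline(res, lambda attempt: attempt == 2)
    print(outcome, res.calls)
```

With RestartLimit left at 0 the loop body never runs, so the model goes straight to Clean and FAULTED, which matches the behaviour the question describes.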

     

  • mikebounds
    11 years ago

    I'm not sure if you are asking if "you" can start it or if "VCS" can restart it:

    If "you" restart the resource before VCS detects it is down, then the resource will not be marked as faulted. But if you are intentionally restarting it, you should "freeze" the service group so that VCS does not interfere with your restart.

    If RestartLimit is set to greater than zero, then VCS will restart the resource and will not mark it as faulted unless all restarts fail.

    Mike
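
For the intentional-restart case, the order of operations can be sketched as below. The helper only builds the command lines and does not run anything. The group name `app_sg` and the restart script path are hypothetical; `hagrp -freeze` and `hagrp -unfreeze` are the standard commands for a temporary freeze.

```python
from typing import List


def manual_restart_plan(group: str, restart_cmd: List[str]) -> List[List[str]]:
    """Build (but do not execute) the steps for a hands-on restart."""
    return [
        ["hagrp", "-freeze", group],    # VCS takes no action on the group
        restart_cmd,                    # your own restart procedure
        ["hagrp", "-unfreeze", group],  # hand control back to VCS
    ]


if __name__ == "__main__":
    # Hypothetical group name and restart script, for illustration only.
    for argv in manual_restart_plan("app_sg", ["/opt/app/bin/restart.sh"]):
        print(" ".join(argv))
```

Remember the unfreeze step: a group left frozen will not be failed over or restarted by VCS if something really does go wrong later.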

  • Setu_Gupta
    11 years ago

    First the agent will call the clean entry point to ensure that the resource is completely offline. After that, the agent will call the online entry point to restart the resource, as per the RestartLimit attribute.

    This is also mentioned in the description of the RestartLimit attribute pasted by Marianne above.

  • Marianne
    11 years ago

    "Thank you very much for all of your input. Sorry for asking stupid basic question."

    We don't mind basic questions - all of us were new at one stage and back then there was no Symantec Connect to ask. So, we had to read manuals.

    We do hope that you will read manuals when we point out the name of a manual and the relevant section.

    You will see that I quoted from the manual 2 days ago:

    If the RestartLimit attribute is set to a non-zero value, the agent attempts to
    restart the resource before declaring the resource as faulted. When restarting a
    failed resource, the agent framework calls the Clean function before calling the
    Online function.
     
    This means that when a resource 'goes offline unexpectedly' (normally because someone has killed or offlined the process manually outside of the cluster) and RestartLimit is non-zero, the agent will run the Clean function (to be 100% sure processes are down) and then run the Online function.
     
    Best to educate DBAs, users, etc. to use ha commands to offline resources.