Forum Discussion

ajsakthivel
16 years ago
Solved

Monitor timeout errors with Veritas HA 5.1

Hi,

What are the default monitor timeout values set in the cluster for HA 5.0 and 5.1?

My project works fine on 5.0 with the default timeout values, but after migrating to 5.1 it no longer works. My agent usually takes 20 minutes to come online.

In engine_A.txt:

2010/05/08 11:50:29 VCS ERROR V-16-2-13028 (lms-t1000-5) Resource(Agent) - the last (12) invocations of the monitor procedure did not complete within the expected time.
2010/05/08 12:03:21 VCS ERROR V-16-2-13027 (lms-t1000-5) Resource(datadg) - monitor procedure did not complete within the expected time.
2010/05/08 12:06:25 VCS ERROR V-16-2-13027 (lms-t1000-5) Resource(datadg) - monitor procedure did not complete within the expected time.
2010/05/08 12:12:02 VCS ERROR V-16-2-13027 (lms-t1000-5) Resource(ntfr) - monitor procedure did not complete within the expected time.
2010/05/08 12:13:23 VCS ERROR V-16-2-13027 (lms-t1000-5) Resource(datadg) - monitor procedure did not complete within the expected time.
2010/05/08 12:50:38 VCS ERROR V-16-2-13028 (lms-t1000-5) Resource(Agent) - the last (24) invocations of the monitor procedure did not complete within the expected time.
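To see which resources time out most often, log lines like the ones above can be tallied per resource. This is only a rough sketch: the function name `tally_monitor_timeouts` is made up for illustration, and it assumes the message format shown above, where each timeout line carries the ID V-16-2-13027 and the resource name in `Resource(...)`.

```shell
# Tally monitor-timeout messages (V-16-2-13027) per resource.
# Reads engine log text on stdin and prints "<resource> <count>",
# sorted by resource name. Hypothetical helper, not part of VCS.
tally_monitor_timeouts() {
    grep 'V-16-2-13027' \
        | sed -n 's/.*Resource(\([^)]*\)).*/\1/p' \
        | sort | uniq -c \
        | awk '{ print $2, $1 }'
}
```

Usage: `tally_monitor_timeouts < engine_A.txt`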

Can anyone please help with this? It would be very useful for me.

Thanks and regards,
Sakthivel.A

5 Replies

  • Both the Agent and datadg resources are going into the monitor timed-out state.

    Can anyone please help?

  • All Resource Types have a default MonitorInterval and MonitorTimeout of 60 seconds.
    Check what yours is set to, for example:
    hatype -display DiskGroup
    ......
    DiskGroup    MonitorInterval         60
    ....
    DiskGroup    MonitorTimeout          60
    ....

    Before adjusting these timeouts, consider the following recommendation in the VCS Admin Guide:

    For best results, Symantec recommends measuring the time it takes to bring a resource online, take it offline, and monitor before modifying the defaults. Issue an online or offline command to measure the time it takes for each action. To measure how long it takes to monitor a resource, fault the resource and issue a probe, or bring the resource online outside of VCS control and issue a probe.


    To change the timeouts, use the GUI (select the Resource Type) or the command line:
    hatype -modify <type> <attr> <value>
    e.g.
    hatype -modify DiskGroup MonitorInterval 120
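The example above changes MonitorInterval; since the errors in this thread are about the monitor not finishing in time, MonitorTimeout is probably the attribute of interest. A minimal sketch of the full sequence follows, assuming the cluster configuration has to be made writable first and saved afterwards. The function name `set_monitor_timeout`, the `VCS_RUN` variable, and the value 120 are illustrative assumptions, not part of VCS. Note that a type-level change applies to every resource of that type.

```shell
# Dry-run by default: VCS_RUN=echo prints each command instead of running it.
# Set VCS_RUN to an empty string on a cluster node to really execute them.
VCS_RUN=${VCS_RUN-echo}

# Hypothetical helper: set MonitorTimeout for a resource type.
#   $1 = resource type (e.g. DiskGroup), $2 = timeout in seconds
set_monitor_timeout() {
    type_name=$1
    seconds=$2
    # Open the cluster configuration for writing.
    $VCS_RUN haconf -makerw || return 1
    # Change the attribute at the type level.
    $VCS_RUN hatype -modify "$type_name" MonitorTimeout "$seconds"
    status=$?
    # Always save and close the configuration again, even if modify failed.
    $VCS_RUN haconf -dump -makero || return 1
    return "$status"
}
```

Usage: `set_monitor_timeout DiskGroup 120` prints the three commands for review; pick the value from measured monitor times, as the Admin Guide recommends.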


  • Can we move this post to the Cluster Server forum?
    https://www-secure.symantec.com/connect/storage-management/forums/cluster-server

  • Please move the discussion to the relevant forum.

    Can you please tell me how to measure the time taken to monitor a resource, that is, how to fault the resource and issue a probe?

    Can anyone please help with this?

    Thanks and Regards,
    Sakthivel.A
  • It appears you've misread the reply to your post. Reformatted to make it clearer:

    To measure how long it takes to monitor a resource:
    fault the resource and issue a probe,
    or
    bring the resource online outside of VCS control and issue a probe.

    To fault the resource, take it offline outside the cluster (i.e. manually).
    To issue a probe: hares -probe <resource> -sys <system where the resource is running/faulted>
    Then time how long it takes from running the probe command until the error appears in the engine log.
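The timing step can be done by hand, but a small helper makes the subtraction less error-prone. The sketch below computes the seconds between two timestamps in the engine log format (YYYY/MM/DD HH:MM:SS). It assumes GNU `date` with the `-d` option, which may not be available on every platform; the function names are made up for illustration.

```shell
# Convert an engine-log timestamp ("YYYY/MM/DD HH:MM:SS") to epoch seconds.
# Assumes GNU date; slashes are swapped for dashes before parsing.
log_ts_to_epoch() {
    date -u -d "$(printf '%s' "$1" | tr '/' '-')" +%s
}

# Print the seconds elapsed between two engine-log timestamps.
#   $1 = start (when the probe was issued), $2 = end (log entry time)
elapsed_seconds() {
    start=$(log_ts_to_epoch "$1") || return 1
    end=$(log_ts_to_epoch "$2") || return 1
    echo $(( end - start ))
}
```

Usage: note the time with `date '+%Y/%m/%d %H:%M:%S'` just before running the probe, then call `elapsed_seconds "<noted time>" "<timestamp of the engine log entry>"`.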