Forum Discussion

Nu-B
Level 2
12 years ago

Looking for root cause on why a resource is offline.

I recently noticed that I have a resource that is offline. I am fairly new to VCS and I'm looking to track down how I can determine when and why this resource is offline. VCS seems to have a ton of l...
  • mikebounds
    12 years ago

    I don't agree that these messages are normal. They are of severity ERROR, not INFO, so the agent is exiting abnormally, i.e. crashing. This indicates an error with VCS or the SRDF agent, not with SRDF itself: if there were a problem with SRDF, the SRDF agent should report it, not shut down.
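To separate ERROR messages from routine INFO ones, a small filter over the logs can help. This is only a rough sketch: `vcs_errors` is a hypothetical helper name, and it assumes the usual log locations under /var/VRTSvcs/log (engine_A.log for the engine, SRDF_A.log for the agent) and the usual line layout of date, time, "VCS", then the severity tag. Check both on your own nodes before relying on it.

```shell
# Hypothetical helper: print only ERROR and CRITICAL lines from a VCS log.
# Assumption: each line looks like
#   YYYY/MM/DD HH:MM:SS VCS <SEVERITY> V-16-... message text
# Assumption: the engine log is /var/VRTSvcs/log/engine_A.log.
vcs_errors() {
    log_file="${1:-/var/VRTSvcs/log/engine_A.log}"
    # Match the severity tag that follows the "VCS" marker.
    grep -E ' VCS (ERROR|CRITICAL) ' "$log_file"
}

# Example usage (paths are assumptions, adjust to your install):
#   vcs_errors                               # engine log
#   vcs_errors /var/VRTSvcs/log/SRDF_A.log   # SRDF agent log
```

The timestamps on the matching lines should show when the agent stopped, which you can then line up against system load or memory usage at that time.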

    Some possible causes are:

    1. The SRDF agent has a memory leak
    2. Lots of resources and/or service groups (more than 200), so the "had" daemon is busy
    3. The system is too busy, so "had" cannot get enough system resources

    It could be a one-off, so to restart the agent you can use:

    haagent -start SRDF -sys sys_name_on_which_agent_has_stopped

    But if it stops again, then you need to investigate the cause, which will probably involve logging a call with Symantec unless there are obvious system resource problems on your nodes.

    Mike