SFHA Solutions 6.0.1: Troubleshooting unprobed resources in Veritas Cluster Server

Moiz_A · 02-21-2013

Veritas Cluster Server (VCS) monitors resources when they are online and offline to ensure that they are not started on systems where they are not supposed to run.

When you configure VCS, you should convey to the VCS engine the definitions of the cluster, service groups, resources, and dependencies among service groups and resources. VCS uses the following two configuration files in a default configuration:

main.cf—Defines the cluster, including service groups and resources
types.cf—Defines the resource types
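
As a rough illustration of how the two files relate: main.cf normally pulls in the type definitions through an include "types.cf" line near its top. The sketch below checks that both files are present in a configuration directory and that main.cf references types.cf. The function name check_vcs_config_files is hypothetical (it is not a VCS command), and the default directory is the one used later in this article.

```shell
#!/bin/sh
# Hypothetical helper (not part of VCS): sanity-check a VCS configuration directory.
check_vcs_config_files() {
    conf_dir="${1:-/etc/VRTSvcs/conf/config}"
    status=0

    # Both configuration files must exist in the directory.
    for f in main.cf types.cf; do
        if [ ! -f "$conf_dir/$f" ]; then
            echo "missing: $conf_dir/$f"
            status=1
        fi
    done

    # main.cf is expected to pull in the resource type definitions.
    if [ -f "$conf_dir/main.cf" ] &&
       ! grep -q '^[[:space:]]*include[[:space:]]*"types.cf"' "$conf_dir/main.cf"; then
        echo "main.cf does not include types.cf"
        status=1
    fi

    return "$status"
}
```

Calling check_vcs_config_files with no argument inspects /etc/VRTSvcs/conf/config. The function only checks that the files are present; it does not validate their syntax.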

For more information about configuring VCS using VCS configuration files, see:

VCS typically fails to probe resources or service groups in the following scenarios:

When a new types.cf file is not copied into the /etc/VRTSvcs/conf/config/ directory during a VCS upgrade. The old types.cf file left in that directory prevents VCS from probing the resources, so the service group does not come online and is auto-disabled in the cluster.
When the definitions of the cluster, service groups, resources, dependencies, or attributes are missing or incorrect in the main.cf file. This causes configuration errors, and a STALE_ADMIN_WAIT message is displayed.
When the installation of an agent for a specific node has failed.
When a resource returns the resource state as “UNKNOWN”, which means the agent or resource is unable to monitor the configured resource.
When the resource is disabled.
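
The scenarios above map onto a handful of standard VCS commands. The sketch below only prints those commands for a given resource, service group, and system, so they can be reviewed before anything is run. The function name print_probe_checks and the example names (webip, websg, node1) are hypothetical, and the exact options should be checked against the VCS documentation for your release.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) commands for investigating an unprobed resource.
# Arguments: resource name, service group name, system name.
print_probe_checks() {
    res="$1"; grp="$2"; sys="$3"
    cat <<EOF
# Overall cluster, group, and resource state
hastatus -summary
# Syntax check of main.cf and types.cf
hacf -verify /etc/VRTSvcs/conf/config
# Is the resource enabled, and what state does the agent report?
hares -display $res -attribute Enabled
hares -state $res
# Ask the agent to probe the resource again on one system
hares -probe $res -sys $sys
# Clear the auto-disabled flag on the group once the probe succeeds
hagrp -autoenable $grp -sys $sys
EOF
}
```

For example, print_probe_checks webip websg node1 lists the commands for a resource named webip in group websg on system node1.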

For more information on probing the resources or service group, or troubleshooting service groups, see

Some of the probing issues can be resolved by copying the latest types.cf file from the /etc/VRTSvcs/conf/ directory to the /etc/VRTSvcs/conf/config/ directory as follows:

1. Stop the cluster on all nodes:
# hastop -all -force

Applications continue to run, but do not fail over.

2. Back up the original types.cf file:
# mv /etc/VRTSvcs/conf/config/types.cf /etc/VRTSvcs/conf/config/types.cf.date

3. Copy the types.cf file:
# cp /etc/VRTSvcs/conf/types.cf /etc/VRTSvcs/conf/config/types.cf

4. Verify that both types.cf files are the same size:
# ls -l /etc/VRTSvcs/conf/types.cf
# ls -l /etc/VRTSvcs/conf/config/types.cf

5. Start the cluster on the node:
# hastart
You must execute the hastart command on every node in the cluster. Afterward, verify that the types.cf file did not revert to the original version. If it did, repeat the procedure, and this time shut down the Low Latency Transport (LLT) and Global Atomic Broadcast (GAB) after you execute the hastop command.
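
Steps 2 through 4 can be sketched as a small shell function. This is an illustration under stated assumptions, not a Veritas-supplied script. The name refresh_types_cf is hypothetical, and the backup suffix is assumed to be the current date in YYYYMMDD form. The backup is made with cp -p rather than mv, so the original stays in place until the new file is copied over it. The final check uses cmp, which is stricter than comparing file sizes. Stopping and starting the cluster (steps 1 and 5) is deliberately left to the operator.

```shell
#!/bin/sh
# Hypothetical helper covering steps 2-4 of the procedure above.
# Arguments (both optional): source directory, active configuration directory.
refresh_types_cf() {
    src_dir="${1:-/etc/VRTSvcs/conf}"
    cfg_dir="${2:-/etc/VRTSvcs/conf/config}"
    backup="$cfg_dir/types.cf.$(date +%Y%m%d)"

    # Step 2: keep a dated copy of the types.cf currently in use.
    cp -p "$cfg_dir/types.cf" "$backup" || return 1

    # Step 3: copy the newer types.cf into the configuration directory.
    cp "$src_dir/types.cf" "$cfg_dir/types.cf" || return 1

    # Step 4: confirm the two files are byte-for-byte identical.
    if cmp -s "$src_dir/types.cf" "$cfg_dir/types.cf"; then
        echo "types.cf updated; backup saved as $backup"
    else
        echo "types.cf copy does not match the source" >&2
        return 1
    fi
}
```

Run with no arguments, refresh_types_cf uses the default paths from this article. Passing two scratch directories makes it possible to rehearse the steps away from a live cluster.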

You can also find information about probing resources and troubleshooting service groups in the PDF versions of the following guides:

VCS documentation for other platforms and releases can be found on the SORT website.

SFHA Solutions 6.0.1: Troubleshooting unprobed resources in Veritas Cluster Server