MultiNICB resource detects a failure and fails over the Service Group

Di_Ro
Level 4
Partner Accredited

Hi all,

Can anyone help me analyse this VCS log? One of my applications failed over and I need to find out why. It looks as though my NICs aren't working properly, but I suspect it could be something like ETrack 1961530 (article URL: http://www.symantec.com/docs/TECH125824).

This is the messages output:

Jul 22 06:36:38 node01 in.mpathd[496]: [ID 168056 daemon.error] All Interfaces in sg netMultiNICB have failed
Jul 22 06:36:52 node01 in.mpathd[496]: [ID 299542 daemon.error] NIC repair detected on e1000g0 of sg netMultiNICB
Jul 22 06:36:52 node01 in.mpathd[496]: [ID 620804 daemon.error] Successfully failed back to NIC e1000g0
Jul 22 06:36:52 node01 in.mpathd[496]: [ID 237757 daemon.error] At least 1 interface (e1000g0) of sg netMultiNICB has repaired
Jul 22 06:36:52 node01 in.mpathd[496]: [ID 299542 daemon.error] NIC repair detected on nxge0 of sg netMultiNICB
Jul 22 06:36:52 node01 in.mpathd[496]: [ID 620804 daemon.error] Successfully failed back to NIC nxge0


And here is the engine_A.log contents:

2011/07/22 06:36:38 VCS INFO V-16-1-10307 Resource netMultiNICB (Owner: unknown, sg: netSG) is offline on node01 (Not initiated by VCS)
2011/07/22 06:36:39 VCS INFO V-16-6-15004 (node01) hatrigger:Failed to send trigger for resfault; script doesn't exist
2011/07/22 06:36:39 VCS INFO V-16-6-15004 (node01) hatrigger:Failed to send trigger for multinicb_postchange; script doesn't exist
2011/07/22 06:36:39 VCS INFO V-16-6-15004 (node01) hatrigger:Failed to send trigger for multinicb_postchange; script doesn't exist
2011/07/22 06:36:41 VCS INFO V-16-6-15004 (node02) hatrigger:Failed to send trigger for multinicb_postchange; script doesn't exist
2011/07/22 06:36:41 VCS INFO V-16-6-15004 (node02) hatrigger:Failed to send trigger for multinicb_postchange; script doesn't exist
2011/07/22 06:36:41 VCS INFO V-16-1-10307 Resource netMultiNICB (Owner: unknown, sg: netSG) is offline on node02 (Not initiated by VCS)
2011/07/22 06:36:41 VCS INFO V-16-6-15004 (node02) hatrigger:Failed to send trigger for resfault; script doesn't exist
2011/07/22 06:37:08 VCS INFO V-16-1-10307 Resource aggProxy (Owner: unknown, sg: applic_ltdrAGG_sg) is offline on node01 (Not initiated by VCS)
2011/07/22 06:37:09 VCS NOTICE V-16-1-10300 Initiating Offline of Resource aggregator (Owner: unknown, sg: applic_ltdrAGG_sg) on System node01
2011/07/22 06:37:09 VCS INFO V-16-1-10307 Resource rcvrProxy (Owner: unknown, sg: applic_ltdrRCVR_sg) is offline on node01 (Not initiated by VCS)
2011/07/22 06:37:09 VCS NOTICE V-16-1-10300 Initiating Offline of Resource receiver (Owner: unknown, sg: applic_ltdrRCVR_sg) on System node01
2011/07/22 06:37:09 VCS INFO V-16-1-10307 Resource csgProxy (Owner: unknown, sg: ClusterService) is offline on node01 (Not initiated by VCS)
2011/07/22 06:37:09 VCS NOTICE V-16-1-10300 Initiating Offline of Resource ntfr (Owner: unknown, sg: ClusterService) on System node01
2011/07/22 06:37:09 VCS INFO V-16-1-10307 Resource latProxy (Owner: unknown, sg: applic_LAT_sg) is offline on node01 (Not initiated by VCS)
2011/07/22 06:37:09 VCS INFO V-16-1-10307 Resource regProxy (Owner: unknown, sg: applic_cmsREG_sg) is offline on node01 (Not initiated by VCS)
2011/07/22 06:37:09 VCS NOTICE V-16-1-10300 Initiating Offline of Resource cmsregistry (Owner: unknown, sg: applic_cmsREG_sg) on System node01
2011/07/22 06:37:09 VCS INFO V-16-6-15004 (node01) hatrigger:Failed to send trigger for resfault; script doesn't exist
2011/07/22 06:37:09 VCS INFO V-16-6-15004 (node01) hatrigger:Failed to send trigger for resfault; script doesn't exist
2011/07/22 06:37:09 VCS INFO V-16-6-15004 (node01) hatrigger:Failed to send trigger for resfault; script doesn't exist
2011/07/22 06:37:09 VCS INFO V-16-6-15004 (node01) hatrigger:Failed to send trigger for resfault; script doesn't exist
2011/07/22 06:37:09 VCS INFO V-16-6-15004 (node01) hatrigger:Failed to send trigger for resfault; script doesn't exist
2011/07/22 06:37:10 VCS INFO V-16-10001-3 (node01) Application:aggregator:offline:Executed /etc/init.d/ltdr_aggregator.sh
2011/07/22 06:37:10 VCS INFO V-16-10001-3 (node01) Application:cmsregistry:offline:Executed /etc/init.d/cms_registry.sh
2011/07/22 06:37:11 VCS INFO V-16-1-10305 Resource ntfr (Owner: unknown, sg: ClusterService) is offline on node01 (VCS initiated)
2011/07/22 06:37:11 VCS WARNING V-16-1-10183 sg Retry: no retry of sg ClusterService on System node01 due to persistent resource fault
2011/07/22 06:37:11 VCS ERROR V-16-1-10205 sg ClusterService is faulted on system node01
2011/07/22 06:37:11 VCS NOTICE V-16-1-10446 sg ClusterService is offline on system node01
2011/07/22 06:37:11 VCS INFO V-16-1-10493 Evaluating node01 as potential target node for sg ClusterService
2011/07/22 06:37:11 VCS INFO V-16-1-50010 sg ClusterService is online or faulted on system node01
2011/07/22 06:37:11 VCS INFO V-16-1-10493 Evaluating node02 as potential target node for sg ClusterService
2011/07/22 06:37:11 VCS NOTICE V-16-1-10301 Initiating Online of Resource ntfr (Owner: unknown, sg: ClusterService) on System node02
2011/07/22 06:37:11 VCS INFO V-16-10001-3 (node01) Application:receiver:offline:Executed /etc/init.d/ltdr_receiver.sh
2011/07/22 06:37:11 VCS INFO V-16-1-10298 Resource ntfr (Owner: unknown, sg: ClusterService) is online on node02 (VCS initiated)
2011/07/22 06:37:11 VCS NOTICE V-16-1-10447 sg ClusterService is online on system node02
2011/07/22 06:37:11 VCS NOTICE V-16-1-10448 sg ClusterService failed over to system node02

We're working with VCS 5.0 MP3... can anyone help me?

 

thanks!

1 ACCEPTED SOLUTION

Di_Ro
Level 4
Partner Accredited

Thank you everybody for your help. The problem is solved, and it was very simple:

When a Service Group contains a persistent resource and no non-persistent resources, we have to configure a Phantom resource just so that the SG can be brought online and taken offline.
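To make that concrete, here is a rough Python sketch (not a Symantec tool) that scans main.cf-style text for service groups made up only of persistent resources, which is the situation described above. The sample group definition, the Device values, the resource name netPhantom, and the PERSISTENT_TYPES set are all assumptions for illustration; check the Operations attribute in your own types.cf before relying on that list.

```python
import re

# Assumption: resource types treated as persistent (Operations = None).
# Verify against your own types.cf.
PERSISTENT_TYPES = {"NIC", "MultiNICB", "Proxy"}

# Matches declarations such as "group netSG (" or "MultiNICB netMultiNICB (".
DECL = re.compile(r"^\s*(\w+)\s+(\w+)\s*\(\s*$")


def groups_needing_phantom(main_cf_text):
    """Return the groups whose resources are all of persistent types."""
    groups = {}      # group name -> list of resource type names
    current = None
    for line in main_cf_text.splitlines():
        match = DECL.match(line)
        if not match:
            continue
        kind, name = match.groups()
        if kind == "group":
            current = name
            groups[current] = []
        elif kind in ("cluster", "system"):
            current = None
        elif current is not None:
            groups[current].append(kind)
    return [g for g, types in groups.items()
            if types and all(t in PERSISTENT_TYPES for t in types)]


# Hypothetical main.cf fragment, not the poster's real configuration.
BEFORE = """\
group netSG (
    SystemList = { node01 = 0, node02 = 1 }
    )

    MultiNICB netMultiNICB (
        Device = { e1000g0 = 0, nxge0 = 1 }
        )
"""

# The same group with a Phantom resource added.
AFTER = BEFORE + """
    Phantom netPhantom (
        )
"""

print(groups_needing_phantom(BEFORE))
print(groups_needing_phantom(AFTER))
```

With the sample text, only the version without the Phantom resource is reported.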

8 REPLIES

mikebounds
Level 6
Partner Accredited

Did you upgrade your types.cf file when you upgraded from 5.0MP3 to 5.1? If you have the wrong definition of MultiNICB in your config, this can cause issues.

If you didn't, check your current types.cf for any changes you made (such as timeouts and restarts), then stop VCS, copy the types file from /etc/VRTSvcs/conf to /etc/VRTSvcs/conf/config, and reapply those changes.

Mike
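One way to do the comparison suggested above is to diff only the MultiNICB definition between the two files. The following is a minimal Python sketch under stated assumptions: the two sample texts and their attribute values are made up for illustration, and on a real system you would read /etc/VRTSvcs/conf/types.cf and /etc/VRTSvcs/conf/config/types.cf instead.

```python
import difflib


def type_block(types_cf_text, type_name):
    """Return the stripped lines of 'type <type_name> ( ... )', or [] if absent."""
    block, inside = [], False
    for raw in types_cf_text.splitlines():
        line = raw.strip()
        if not inside and line.split()[:2] == ["type", type_name]:
            inside = True
        if inside:
            block.append(line)
            if line == ")":
                break
    return block


def changed_lines(shipped_text, in_use_text, type_name="MultiNICB"):
    """List the lines that differ between the two definitions of one type."""
    diff = difflib.unified_diff(type_block(shipped_text, type_name),
                                type_block(in_use_text, type_name),
                                lineterm="", n=0)
    # Keep only added/removed lines, dropping the '---' and '+++' headers.
    return [d for d in diff
            if d.startswith(("+", "-")) and not d.startswith(("+++", "---"))]


# Hypothetical excerpts; the values are illustrative only.
SHIPPED = """\
type MultiNICB (
    static int MonitorInterval = 10
    str Device{}
)
"""
IN_USE = SHIPPED.replace("MonitorInterval = 10", "MonitorInterval = 30")

for line in changed_lines(SHIPPED, IN_USE):
    print(line)
```

Any line the sketch reports is a local change you would need to carry over by hand after copying the new types file into place.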

Marianne
Level 6
Partner    VIP    Accredited Certified

If I understand correctly, you are using 5.0 MP3? The TN that you found describes a bug that was introduced with VCS 5.1.

Please share more info - is mpathd configured Probe-based or Link-based?

If Link-based, did you set the ConfigCheck attribute for MultiNICB to 0? See http://www.symantec.com/docs/TECH66744

 

If this is not relevant - please share details of your config:

Contents of /etc/hostname.* as well as MultiNICB definition in main.cf.

Di_Ro
Level 4
Partner Accredited

Marianne
Level 6
Partner    VIP    Accredited Certified

Please post the MultiNICB section of your main.cf.

The example on p. 96 of the Bundled Agents Guide (https://sort.symantec.com/public/documents/sf/5.0MP3/solaris/pdf/vcs_bundled_agents.pdf) explains how to configure MultiNICB in Base mode (instead of Multipathing mode) on node north:

In the file /etc/hostname.qfe0, add the following two lines:

north-qfe0 netmask + broadcast + deprecated -failover up \
addif north netmask + broadcast + up

Where north-qfe0 is the test IP address that the agent uses to determine the state of the qfe0 network card.

In the file /etc/hostname.qfe4, add the following line:
north-qfe4 netmask + broadcast + deprecated -failover up

Where north-qfe4 is the test IP address that the agent uses to determine the
state of the qfe4 network card.

In the example, north-qfe0 and north-qfe4 are the host names that correspond
to the test IP addresses. north is the host name that corresponds to the node's
own (non-test) IP address.

 

 

So, for the above example, 3 /etc/hosts entries are needed, for example:

192.168.10.1  north
192.168.10.2  north-qfe0
192.168.10.3  north-qfe4

In this example, the test IP address for qfe0 is 192.168.10.2, and the test IP address for qfe4 is 192.168.10.3.

The node name for this server is north, and its IP address is 192.168.10.1.

This node is only ever accessed by host name north and IP address 192.168.10.1.

A similar setup is required for the other cluster node(s), each with its own unique set of test IPs and host IP address.
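A quick sanity check for a setup like the one above is to confirm that every host name used in the /etc/hostname.* files has an /etc/hosts entry. Here is a small Python sketch of that idea; the function names are mine, the file contents are the example values from this post passed in as strings, and the parsing rule (first word plus the word after each addif) is a simplifying assumption.

```python
def parse_hosts(hosts_text):
    """Map host name -> IP address from /etc/hosts-style text."""
    names = {}
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()   # drop comments
        if len(fields) >= 2:
            for name in fields[1:]:
                names[name] = fields[0]
    return names


def names_in_hostname_file(text):
    """Host names used in an /etc/hostname.<nic> file.

    Assumption: the first word, plus the word following each 'addif'.
    """
    words = text.replace("\\\n", " ").split()    # join continued lines
    found = words[:1]
    found += [words[i + 1] for i, w in enumerate(words[:-1]) if w == "addif"]
    return found


def unresolved(hostname_files, hosts_text):
    """Return {nic: [names missing from hosts]} for each NIC with a gap."""
    known = parse_hosts(hosts_text)
    missing = {}
    for nic, text in hostname_files.items():
        absent = [n for n in names_in_hostname_file(text) if n not in known]
        if absent:
            missing[nic] = absent
    return missing


# Example values taken from the post above.
QFE0 = ("north-qfe0 netmask + broadcast + deprecated -failover up \\\n"
        "addif north netmask + broadcast + up\n")
QFE4 = "north-qfe4 netmask + broadcast + deprecated -failover up\n"
HOSTS = ("192.168.10.1  north\n"
         "192.168.10.2  north-qfe0\n"
         "192.168.10.3  north-qfe4\n")

print(unresolved({"qfe0": QFE0, "qfe4": QFE4}, HOSTS))
```

An empty result means every name resolves; a missing entry shows up keyed by the interface whose file refers to it.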

olguer
Level 2
Partner Accredited
I have a problem with a cluster: it detects that all NICs have failed and fails over. This is the log.
Please can you help me?


Jul 22 06:36:38 mexagpslog01 in.mpathd[496]: [ID 168056 daemon.error] All Interfaces in group netMultiNICB have failed
 Jul 22 06:36:52 mexagpslog01 in.mpathd[496]: [ID 299542 daemon.error] NIC repair detected on e1000g0 of group netMultiNICB
 Jul 22 06:36:52 mexagpslog01 in.mpathd[496]: [ID 620804 daemon.error] Successfully failed back to NIC e1000g0
 Jul 22 06:36:52 mexagpslog01 in.mpathd[496]: [ID 237757 daemon.error] At least 1 interface (e1000g0) of group netMultiNICB has repaired
 Jul 22 06:36:52 mexagpslog01 in.mpathd[496]: [ID 299542 daemon.error] NIC repair detected on nxge0 of group netMultiNICB
 Jul 22 06:36:52 mexagpslog01 in.mpathd[496]: [ID 620804 daemon.error] Successfully failed back to NIC nxge0
 
 And these are from VCS (engine_A.log):
 2011/07/22 06:36:38 VCS INFO V-16-1-10307 Resource netMultiNICB (Owner: unknown, Group: netSG) is offline on mexagpslog01 (Not initiated by VCS)
 2011/07/22 06:36:39 VCS INFO V-16-6-15004 (mexagpslog01) hatrigger:Failed to send trigger for resfault; script doesn't exist
 2011/07/22 06:36:39 VCS INFO V-16-6-15004 (mexagpslog01) hatrigger:Failed to send trigger for multinicb_postchange; script doesn't exist
 2011/07/22 06:36:39 VCS INFO V-16-6-15004 (mexagpslog01) hatrigger:Failed to send trigger for multinicb_postchange; script doesn't exist
 2011/07/22 06:36:41 VCS INFO V-16-6-15004 (mexagpslog02) hatrigger:Failed to send trigger for multinicb_postchange; script doesn't exist
 2011/07/22 06:36:41 VCS INFO V-16-6-15004 (mexagpslog02) hatrigger:Failed to send trigger for multinicb_postchange; script doesn't exist
 2011/07/22 06:36:41 VCS INFO V-16-1-10307 Resource netMultiNICB (Owner: unknown, Group: netSG) is offline on mexagpslog02 (Not initiated by VCS)
 2011/07/22 06:36:41 VCS INFO V-16-6-15004 (mexagpslog02) hatrigger:Failed to send trigger for resfault; script doesn't exist
 2011/07/22 06:37:08 VCS INFO V-16-1-10307 Resource aggProxy (Owner: unknown, Group: MEXAGPS_ltdrAGG_group) is offline on mexagpslog01 (Not initiated by VCS)
 2011/07/22 06:37:09 VCS NOTICE V-16-1-10300 Initiating Offline of Resource aggregator (Owner: unknown, Group: MEXAGPS_ltdrAGG_group) on System mexagpslog01
 2011/07/22 06:37:09 VCS INFO V-16-1-10307 Resource rcvrProxy (Owner: unknown, Group: MEXAGPS_ltdrRCVR_group) is offline on mexagpslog01 (Not initiated by VCS)
 2011/07/22 06:37:09 VCS NOTICE V-16-1-10300 Initiating Offline of Resource receiver (Owner: unknown, Group: MEXAGPS_ltdrRCVR_group) on System mexagpslog01
 2011/07/22 06:37:09 VCS INFO V-16-1-10307 Resource csgProxy (Owner: unknown, Group: ClusterService) is offline on mexagpslog01 (Not initiated by VCS)
 2011/07/22 06:37:09 VCS NOTICE V-16-1-10300 Initiating Offline of Resource ntfr (Owner: unknown, Group: ClusterService) on System mexagpslog01
 2011/07/22 06:37:09 VCS INFO V-16-1-10307 Resource latProxy (Owner: unknown, Group: MEXAGPS_LAT_group) is offline on mexagpslog01 (Not initiated by VCS)
 2011/07/22 06:37:09 VCS INFO V-16-1-10307 Resource regProxy (Owner: unknown, Group: MEXAGPS_cmsREG_group) is offline on mexagpslog01 (Not initiated by VCS)
 2011/07/22 06:37:09 VCS NOTICE V-16-1-10300 Initiating Offline of Resource cmsregistry (Owner: unknown, Group: MEXAGPS_cmsREG_group) on System mexagpslog01
 2011/07/22 06:37:09 VCS INFO V-16-6-15004 (mexagpslog01) hatrigger:Failed to send trigger for resfault; script doesn't exist
 2011/07/22 06:37:09 VCS INFO V-16-6-15004 (mexagpslog01) hatrigger:Failed to send trigger for resfault; script doesn't exist
 2011/07/22 06:37:09 VCS INFO V-16-6-15004 (mexagpslog01) hatrigger:Failed to send trigger for resfault; script doesn't exist
 2011/07/22 06:37:09 VCS INFO V-16-6-15004 (mexagpslog01) hatrigger:Failed to send trigger for resfault; script doesn't exist
 2011/07/22 06:37:09 VCS INFO V-16-6-15004 (mexagpslog01) hatrigger:Failed to send trigger for resfault; script doesn't exist
 2011/07/22 06:37:10 VCS INFO V-16-10001-3 (mexagpslog01) Application:aggregator:offline:Executed /etc/init.d/ltdr_aggregator.sh
 2011/07/22 06:37:10 VCS INFO V-16-10001-3 (mexagpslog01) Application:cmsregistry:offline:Executed /etc/init.d/cms_registry.sh
 2011/07/22 06:37:11 VCS INFO V-16-1-10305 Resource ntfr (Owner: unknown, Group: ClusterService) is offline on mexagpslog01 (VCS initiated)
 2011/07/22 06:37:11 VCS WARNING V-16-1-10183 Group Retry: no retry of group ClusterService on System mexagpslog01 due to persistent resource fault
 2011/07/22 06:37:11 VCS ERROR V-16-1-10205 Group ClusterService is faulted on system mexagpslog01
 2011/07/22 06:37:11 VCS NOTICE V-16-1-10446 Group ClusterService is offline on system mexagpslog01
 2011/07/22 06:37:11 VCS INFO V-16-1-10493 Evaluating mexagpslog01 as potential target node for group ClusterService
 2011/07/22 06:37:11 VCS INFO V-16-1-50010 Group ClusterService is online or faulted on system mexagpslog01
 2011/07/22 06:37:11 VCS INFO V-16-1-10493 Evaluating mexagpslog02 as potential target node for group ClusterService
 2011/07/22 06:37:11 VCS NOTICE V-16-1-10301 Initiating Online of Resource ntfr (Owner: unknown, Group: ClusterService) on System mexagpslog02
 2011/07/22 06:37:11 VCS INFO V-16-10001-3 (mexagpslog01) Application:receiver:offline:Executed /etc/init.d/ltdr_receiver.sh
 2011/07/22 06:37:11 VCS INFO V-16-1-10298 Resource ntfr (Owner: unknown, Group: ClusterService) is online on mexagpslog02 (VCS initiated)
 2011/07/22 06:37:11 VCS NOTICE V-16-1-10447 Group ClusterService is online on system mexagpslog02
 2011/07/22 06:37:11 VCS NOTICE V-16-1-10448 Group ClusterService failed over to system mexagpslog02
 2011/07/22 06:37:11 VCS INFO V-16-1-10307 Resource aggProxy (Owner: unknown, Group: MEXAGPS_ltdrAGG_group) is offline on mexagpslog02 (Not initiated by VCS)
 2011/07/22 06:37:11 VCS INFO V-16-1-10307 Resource regProxy (Owner: unknown, Group: MEXAGPS_cmsREG_group) is offline on mexagpslog02 (Not initiated by VCS)
 2011/07/22 06:37:11 VCS NOTICE V-16-1-10300 Initiating Offline of Resource cmsregistry (Owner: unknown, Group: MEXAGPS_cmsREG_group) on System mexagpslog02
 2011/07/22 06:37:11 VCS INFO V-16-1-10307 Resource csgProxy (Owner: unknown, Group: ClusterService) is offline on mexagpslog02 (Not initiated by VCS)
 2011/07/22 06:37:11 VCS NOTICE V-16-1-10300 Initiating Offline of Resource ntfr (Owner: unknown, Group: ClusterService) on System mexagpslog02
 2011/07/22 06:37:11 VCS INFO V-16-1-10307 Resource rcvrProxy (Owner: unknown, Group: MEXAGPS_ltdrRCVR_group) is offline on mexagpslog02 (Not initiated by VCS)
 2011/07/22 06:37:11 VCS NOTICE V-16-1-10300 Initiating Offline of Resource receiver (Owner: unknown, Group: MEXAGPS_ltdrRCVR_group) on System mexagpslog02
 2011/07/22 06:37:11 VCS INFO V-16-1-10307 Resource latProxy (Owner: unknown, Group: MEXAGPS_LAT_group) is offline on mexagpslog02 (Not initiated by VCS)
 2011/07/22 06:37:11 VCS NOTICE V-16-1-10300 Initiating Offline of Resource lat (Owner: unknown, Group: MEXAGPS_LAT_group) on System mexagpslog02
 2011/07/22 06:37:11 VCS INFO V-16-6-15004 (mexagpslog02) hatrigger:Failed to send trigger for resfault; script doesn't exist
 2011/07/22 06:37:11 VCS INFO V-16-6-15004 (mexagpslog02) hatrigger:Failed to send trigger for resfault; script doesn't exist
 2011/07/22 06:37:12 VCS INFO V-16-6-15002 (mexagpslog01) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/nfs_postoffline mexagpslog01 ClusterService   successfully
 2011/07/22 06:37:12 VCS INFO V-16-6-15004 (mexagpslog01) hatrigger:Failed to send trigger for postoffline; script doesn't exist
 2011/07/22 06:37:12 VCS INFO V-16-6-15004 (mexagpslog02) hatrigger:Failed to send trigger for resfault; script doesn't exist
 2011/07/22 06:37:12 VCS INFO V-16-6-15004 (mexagpslog02) hatrigger:Failed to send trigger for resfault; script doesn't exist
 2011/07/22 06:37:12 VCS INFO V-16-6-15004 (mexagpslog02) hatrigger:Failed to send trigger for resfault; script doesn't exist
 2011/07/22 06:37:12 VCS INFO V-16-1-10305 Resource aggregator (Owner: unknown, Group: MEXAGPS_ltdrAGG_group) is offline on mexagpslog01 (VCS initiated)
 2011/07/22 06:37:12 VCS WARNING V-16-1-10183 Group Retry: no retry of group MEXAGPS_ltdrAGG_group on System mexagpslog01 due to persistent resource fault
 2011/07/22 06:37:12 VCS ERROR V-16-1-10205 Group MEXAGPS_ltdrAGG_group is faulted on system mexagpslog01
 2011/07/22 06:37:12 VCS NOTICE V-16-1-10446 Group MEXAGPS_ltdrAGG_group is offline on system mexagpslog01
 2011/07/22 06:37:12 VCS NOTICE V-16-1-50061 Cannot initiate online of group MEXAGPS_ltdrAGG_group; this group may go online after its children fail over
 2011/07/22 06:37:12 VCS INFO V-16-1-10305 Resource cmsregistry (Owner: unknown, Group: MEXAGPS_cmsREG_group) is offline on mexagpslog01 (VCS initiated)
 2011/07/22 06:37:12 VCS WARNING V-16-1-10183 Group Retry: no retry of group MEXAGPS_cmsREG_group on System mexagpslog01 due to persistent resource fault
 2011/07/22 06:37:12 VCS NOTICE V-16-1-10235 Restart is set for group MEXAGPS_cmsREG_group. Group will be brought online if fault on persistent resource clears. If group is brought online anywhere else from AutoStartList or manually, then Restart will be reset
 2011/07/22 06:37:12 VCS ERROR V-16-1-10205 Group MEXAGPS_cmsREG_group is faulted on system mexagpslog01
 2011/07/22 06:37:12 VCS NOTICE V-16-1-50755 Clearing IntentOnline attribute for parallel group MEXAGPS_ltdrRCVR_group on node mexagpslog01
 2011/07/22 06:37:12 VCS NOTICE V-16-1-10446 Group MEXAGPS_cmsREG_group is offline on system mexagpslog01
 2011/07/22 06:37:13 VCS INFO V-16-6-15051 (mexagpslog02) nfs_restart:nfs_restart trigger did not do anything as there is no NFS/NFSLock/Share resource in the group
 2011/07/22 06:37:13 VCS INFO V-16-6-15002 (mexagpslog02) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/nfs_restart ClusterService    successfully
 2011/07/22 06:37:13 VCS INFO V-16-6-15004 (mexagpslog02) hatrigger:Failed to send trigger for postonline; script doesn't exist
 2011/07/22 06:37:13 VCS INFO V-16-10001-3 (mexagpslog02) Application:lat:offline:Executed /etc/init.d/lat.sh
 2011/07/22 06:37:13 VCS INFO V-16-10001-3 (mexagpslog02) Application:cmsregistry:offline:Executed /etc/init.d/cms_registry.sh
 2011/07/22 06:37:13 VCS INFO V-16-1-10305 Resource receiver (Owner: unknown, Group: MEXAGPS_ltdrRCVR_group) is offline on mexagpslog01 (VCS initiated)
 2011/07/22 06:37:13 VCS WARNING V-16-1-10183 Group Retry: no retry of group MEXAGPS_ltdrRCVR_group on System mexagpslog01 due to persistent resource fault
 2011/07/22 06:37:13 VCS NOTICE V-16-1-10235 Restart is set for group MEXAGPS_ltdrRCVR_group. Group will be brought online if fault on persistent resource clears. If group is brought online anywhere else from AutoStartList or manually, then Restart will be reset
 2011/07/22 06:37:13 VCS ERROR V-16-1-10205 Group MEXAGPS_ltdrRCVR_group is faulted on system mexagpslog01
 2011/07/22 06:37:13 VCS NOTICE V-16-1-10446 Group MEXAGPS_ltdrRCVR_group is offline on system mexagpslog01
 2011/07/22 06:37:13 VCS INFO V-16-6-15002 (mexagpslog01) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/nfs_postoffline mexagpslog01 MEXAGPS_ltdrAGG_group   successfully
 2011/07/22 06:37:13 VCS INFO V-16-6-15002 (mexagpslog01) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/nfs_postoffline mexagpslog01 MEXAGPS_cmsREG_group   successfully
 2011/07/22 06:37:13 VCS INFO V-16-6-15004 (mexagpslog01) hatrigger:Failed to send trigger for postoffline; script doesn't exist
 2011/07/22 06:37:13 VCS INFO V-16-6-15004 (mexagpslog01) hatrigger:Failed to send trigger for postoffline; script doesn't exist
 2011/07/22 06:37:13 VCS INFO V-16-2-13001 (mexagpslog02) Resource(lat): Output of the completed operation (offline) 
 pgrep: invalid user name -- -f
 pgrep: invalid user name -- -f
 2011/07/22 06:37:13 VCS INFO V-16-6-15002 (mexagpslog01) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/nfs_postoffline mexagpslog01 MEXAGPS_ltdrRCVR_group   successfully
 2011/07/22 06:37:13 VCS INFO V-16-6-15004 (mexagpslog01) hatrigger:Failed to send trigger for postoffline; script doesn't exist
 2011/07/22 06:37:14 VCS INFO V-16-10001-3 (mexagpslog02) Application:receiver:offline:Executed /etc/init.d/ltdr_receiver.sh
 2011/07/22 06:37:14 VCS INFO V-16-1-10305 Resource ntfr (Owner: unknown, Group: ClusterService) is offline on mexagpslog02 (VCS initiated)
 2011/07/22 06:37:14 VCS WARNING V-16-1-10183 Group Retry: no retry of group ClusterService on System mexagpslog02 due to persistent resource fault
 2011/07/22 06:37:14 VCS ERROR V-16-1-10205 Group ClusterService is faulted on system mexagpslog02
 2011/07/22 06:37:14 VCS NOTICE V-16-1-10446 Group ClusterService is offline on system mexagpslog02
 2011/07/22 06:37:14 VCS INFO V-16-1-10493 Evaluating mexagpslog01 as potential target node for group ClusterService
 2011/07/22 06:37:14 VCS INFO V-16-1-50010 Group ClusterService is online or faulted on system mexagpslog01
 2011/07/22 06:37:14 VCS INFO V-16-1-10493 Evaluating mexagpslog02 as potential target node for group ClusterService
 2011/07/22 06:37:14 VCS INFO V-16-1-50010 Group ClusterService is online or faulted on system mexagpslog02
 2011/07/22 06:37:14 VCS NOTICE V-16-1-10235 Restart is set for group ClusterService. Group will be brought online if fault on persistent resource clears. If group is brought online anywhere else from AutoStartList or manually, then Restart will be reset
 2011/07/22 06:37:14 VCS INFO V-16-6-15004 (mexagpslog01) hatrigger:Failed to send trigger for nofailover; script doesn't exist
 2011/07/22 06:37:14 VCS INFO V-16-1-10305 Resource lat (Owner: unknown, Group: MEXAGPS_LAT_group) is offline on mexagpslog02 (VCS initiated)
 2011/07/22 06:37:14 VCS WARNING V-16-1-10183 Group Retry: no retry of group MEXAGPS_LAT_group on System mexagpslog02 due to persistent resource fault
 2011/07/22 06:37:14 VCS ERROR V-16-1-10205 Group MEXAGPS_LAT_group is faulted on system mexagpslog02
 2011/07/22 06:37:14 VCS NOTICE V-16-1-10446 Group MEXAGPS_LAT_group is offline on system mexagpslog02
 2011/07/22 06:37:14 VCS INFO V-16-1-10493 Evaluating mexagpslog01 as potential target node for group MEXAGPS_LAT_group
 2011/07/22 06:37:14 VCS INFO V-16-1-50010 Group MEXAGPS_LAT_group is online or faulted on system mexagpslog01
 2011/07/22 06:37:14 VCS INFO V-16-1-10493 Evaluating mexagpslog02 as potential target node for group MEXAGPS_LAT_group
 2011/07/22 06:37:14 VCS INFO V-16-1-50010 Group MEXAGPS_LAT_group is online or faulted on system mexagpslog02
 2011/07/22 06:37:14 VCS NOTICE V-16-1-10235 Restart is set for group MEXAGPS_LAT_group. Group will be brought online if fault on persistent resource clears. If group is brought online anywhere else from AutoStartList or manually, then Restart will be reset
 2011/07/22 06:37:15 VCS INFO V-16-6-15004 (mexagpslog01) hatrigger:Failed to send trigger for nofailover; script doesn't exist
 2011/07/22 06:37:15 VCS INFO V-16-1-10305 Resource cmsregistry (Owner: unknown, Group: MEXAGPS_cmsREG_group) is offline on mexagpslog02 (VCS initiated)
 2011/07/22 06:37:15 VCS WARNING V-16-1-10183 Group Retry: no retry of group MEXAGPS_cmsREG_group on System mexagpslog02 due to persistent resource fault
 2011/07/22 06:37:15 VCS NOTICE V-16-1-10235 Restart is set for group MEXAGPS_cmsREG_group. Group will be brought online if fault on persistent resource clears. If group is brought online anywhere else from AutoStartList or manually, then Restart will be reset
 2011/07/22 06:37:15 VCS ERROR V-16-1-10205 Group MEXAGPS_cmsREG_group is faulted on system mexagpslog02
 2011/07/22 06:37:15 VCS NOTICE V-16-1-50755 Clearing IntentOnline attribute for parallel group MEXAGPS_ltdrRCVR_group on node mexagpslog02
 2011/07/22 06:37:15 VCS NOTICE V-16-1-10446 Group MEXAGPS_cmsREG_group is offline on system mexagpslog02
 2011/07/22 06:37:15 VCS INFO V-16-6-15002 (mexagpslog02) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/nfs_postoffline mexagpslog02 ClusterService   successfully
 2011/07/22 06:37:15 VCS INFO V-16-6-15004 (mexagpslog02) hatrigger:Failed to send trigger for postoffline; script doesn't exist
 2011/07/22 06:37:15 VCS INFO V-16-6-15002 (mexagpslog02) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/nfs_postoffline mexagpslog02 MEXAGPS_LAT_group   successfully
 2011/07/22 06:37:15 VCS INFO V-16-6-15004 (mexagpslog02) hatrigger:Failed to send trigger for postoffline; script doesn't exist
 2011/07/22 06:37:15 VCS INFO V-16-6-15002 (mexagpslog02) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/nfs_postoffline mexagpslog02 MEXAGPS_cmsREG_group   successfully
 2011/07/22 06:37:15 VCS INFO V-16-6-15004 (mexagpslog02) hatrigger:Failed to send trigger for postoffline; script doesn't exist
 2011/07/22 06:37:15 VCS INFO V-16-1-10305 Resource receiver (Owner: unknown, Group: MEXAGPS_ltdrRCVR_group) is offline on mexagpslog02 (VCS initiated)
 2011/07/22 06:37:15 VCS WARNING V-16-1-10183 Group Retry: no retry of group MEXAGPS_ltdrRCVR_group on System mexagpslog02 due to persistent resource fault
 2011/07/22 06:37:15 VCS NOTICE V-16-1-10235 Restart is set for group MEXAGPS_ltdrRCVR_group. Group will be brought online if fault on persistent resource clears. If group is brought online anywhere else from AutoStartList or manually, then Restart will be reset
 2011/07/22 06:37:15 VCS ERROR V-16-1-10205 Group MEXAGPS_ltdrRCVR_group is faulted on system mexagpslog02
 2011/07/22 06:37:15 VCS NOTICE V-16-1-10446 Group MEXAGPS_ltdrRCVR_group is offline on system mexagpslog02
 2011/07/22 06:37:16 VCS INFO V-16-6-15002 (mexagpslog02) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/nfs_postoffline mexagpslog02 MEXAGPS_ltdrRCVR_group   successfully
 2011/07/22 06:37:16 VCS INFO V-16-6-15004 (mexagpslog02) hatrigger:Failed to send trigger for postoffline; script doesn't exist
 2011/07/22 06:37:39 VCS INFO V-16-1-10299 Resource netMultiNICB (Owner: unknown, Group: netSG) is online on mexagpslog01 (Not initiated by VCS)
 2011/07/22 06:37:39 VCS INFO V-16-6-15004 (mexagpslog01) hatrigger:Failed to send trigger for multinicb_postchange; script doesn't exist
 2011/07/22 06:37:39 VCS INFO V-16-6-15004 (mexagpslog01) hatrigger:Failed to send trigger for multinicb_postchange; script doesn't exist
 2011/07/22 06:37:41 VCS INFO V-16-1-10299 Resource netMultiNICB (Owner: unknown, Group: netSG) is online on mexagpslog02 (Not initiated by VCS)
 2011/07/22 06:37:41 VCS INFO V-16-6-15004 (mexagpslog02) hatrigger:Failed to send trigger for multinicb_postchange; script doesn't exist
 2011/07/22 06:37:42 VCS INFO V-16-6-15004 (mexagpslog02) hatrigger:Failed to send trigger for multinicb_postchange; script doesn't exist
 2011/07/22 06:38:08 VCS INFO V-16-1-10299 Resource aggProxy (Owner: unknown, Group: MEXAGPS_ltdrAGG_group) is online on mexagpslog01 (Not initiated by VCS)
 2011/07/22 06:38:08 VCS INFO V-16-1-10299 Resource regProxy (Owner: unknown, Group: MEXAGPS_cmsREG_group) is online on mexagpslog01 (Not initiated by VCS)
 2011/07/22 06:38:08 VCS NOTICE V-16-1-10229 Group MEXAGPS_cmsREG_group - Trying to online resources of group that were online prior to fault on node mexagpslog01 . Persistent resource went online on node mexagpslog01
 2011/07/22 06:38:08 VCS NOTICE V-16-1-10301 Initiating Online of Resource cmsregistry (Owner: unknown, Group: MEXAGPS_cmsREG_group) on System mexagpslog01
 2011/07/22 06:38:08 VCS INFO V-16-1-10299 Resource csgProxy (Owner: unknown, Group: ClusterService) is online on mexagpslog01 (Not initiated by VCS)
 2011/07/22 06:38:08 VCS NOTICE V-16-1-10229 Group ClusterService - Trying to online resources of group that were online prior to fault on node mexagpslog02 . Persistent resource went online on node mexagpslog01
 2011/07/22 06:38:08 VCS NOTICE V-16-1-10301 Initiating Online of Resource ntfr (Owner: unknown, Group: ClusterService) on System mexagpslog01
 2011/07/22 06:38:08 VCS INFO V-16-1-10299 Resource rcvrProxy (Owner: unknown, Group: MEXAGPS_ltdrRCVR_group) is online on mexagpslog01 (Not initiated by VCS)
 2011/07/22 06:38:08 VCS NOTICE V-16-1-10227 One or more child group not okay with bringing group MEXAGPS_ltdrRCVR_group online on node mexagpslog01. Ignoring Restart
 2011/07/22 06:38:08 VCS INFO V-16-1-10299 Resource latProxy (Owner: unknown, Group: MEXAGPS_LAT_group) is online on mexagpslog01 (Not initiated by VCS)
 2011/07/22 06:38:08 VCS NOTICE V-16-1-10229 Group MEXAGPS_LAT_group - Trying to online resources of group that were online prior to fault on node mexagpslog02 . Persistent resource went online on node mexagpslog01
 2011/07/22 06:38:08 VCS NOTICE V-16-1-10301 Initiating Online of Resource lat (Owner: unknown, Group: MEXAGPS_LAT_group) on System mexagpslog01
 2011/07/22 06:38:08 VCS INFO V-16-10001-3 (mexagpslog01) Application:lat:online:Executed /etc/init.d/lat.sh
 2011/07/22 06:38:08 VCS INFO V-16-1-10298 Resource ntfr (Owner: unknown, Group: ClusterService) is online on mexagpslog01 (VCS initiated)
 2011/07/22 06:38:08 VCS NOTICE V-16-1-10447 Group ClusterService is online on system mexagpslog01
 2011/07/22 06:38:10 VCS INFO V-16-2-13001 (mexagpslog01) Resource(lat): Output of the completed operation (online) 
 pgrep: invalid user name -- -f
 pgrep: invalid user name -- -f
 2011/07/22 06:38:10 VCS INFO V-16-10001-3 (mexagpslog01) Application:cmsregistry:online:Executed /etc/init.d/cms_registry.sh
 2011/07/22 06:38:10 VCS INFO V-16-6-15051 (mexagpslog01) nfs_restart:nfs_restart trigger did not do anything as there is no NFS/NFSLock/Share resource in the group
 2011/07/22 06:38:10 VCS INFO V-16-6-15002 (mexagpslog01) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/nfs_restart ClusterService    successfully
 2011/07/22 06:38:10 VCS INFO V-16-6-15004 (mexagpslog01) hatrigger:Failed to send trigger for postonline; script doesn't exist
 2011/07/22 06:38:10 VCS INFO V-16-1-10298 Resource lat (Owner: unknown, Group: MEXAGPS_LAT_group) is online on mexagpslog01 (VCS initiated)
 2011/07/22 06:38:10 VCS NOTICE V-16-1-10447 Group MEXAGPS_LAT_group is online on system mexagpslog01
 2011/07/22 06:38:11 VCS INFO V-16-6-15051 (mexagpslog01) nfs_restart:nfs_restart trigger did not do anything as there is no NFS/NFSLock/Share resource in the group
 2011/07/22 06:38:11 VCS INFO V-16-6-15002 (mexagpslog01) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/nfs_restart MEXAGPS_LAT_group    successfully
 2011/07/22 06:38:11 VCS INFO V-16-6-15004 (mexagpslog01) hatrigger:Failed to send trigger for postonline; script doesn't exist
 2011/07/22 06:38:11 VCS INFO V-16-1-10298 Resource cmsregistry (Owner: unknown, Group: MEXAGPS_cmsREG_group) is online on mexagpslog01 (VCS initiated)
 2011/07/22 06:38:11 VCS NOTICE V-16-1-10447 Group MEXAGPS_cmsREG_group is online on system mexagpslog01
 2011/07/22 06:38:11 VCS WARNING V-16-1-50045 Initiating online of parent group MEXAGPS_ltdrAGG_group, PM will select the best node
 2011/07/22 06:38:11 VCS INFO V-16-1-10493 Evaluating mexagpslog01 as potential target node for group MEXAGPS_ltdrAGG_group
 2011/07/22 06:38:11 VCS INFO V-16-1-50002 MigrateQ for group MEXAGPS_ltdrAGG_group contains system mexagpslog01; group might be transitioning
 2011/07/22 06:38:11 VCS INFO V-16-1-10493 Evaluating mexagpslog02 as potential target node for group MEXAGPS_ltdrAGG_group
 2011/07/22 06:38:11 VCS INFO V-16-1-50010 Group MEXAGPS_ltdrAGG_group is online or faulted on system mexagpslog02
 2011/07/22 06:38:11 VCS INFO V-16-6-15004 (mexagpslog01) hatrigger:Failed to send trigger for nofailover; script doesn't exist
 2011/07/22 06:38:12 VCS INFO V-16-1-10299 Resource regProxy (Owner: unknown, Group: MEXAGPS_cmsREG_group) is online on mexagpslog02 (Not initiated by VCS)
 2011/07/22 06:38:12 VCS NOTICE V-16-1-10229 Group MEXAGPS_cmsREG_group - Trying to online resources of group that were online prior to fault on node mexagpslog02 . Persistent resource went online on node mexagpslog02
 2011/07/22 06:38:12 VCS NOTICE V-16-1-10301 Initiating Online of Resource cmsregistry (Owner: unknown, Group: MEXAGPS_cmsREG_group) on System mexagpslog02
 2011/07/22 06:38:12 VCS INFO V-16-1-10299 Resource latProxy (Owner: unknown, Group: MEXAGPS_LAT_group) is online on mexagpslog02 (Not initiated by VCS)
 2011/07/22 06:38:12 VCS INFO V-16-1-10299 Resource csgProxy (Owner: unknown, Group: ClusterService) is online on mexagpslog02 (Not initiated by VCS)
 2011/07/22 06:38:12 VCS INFO V-16-1-10299 Resource rcvrProxy (Owner: unknown, Group: MEXAGPS_ltdrRCVR_group) is online on mexagpslog02 (Not initiated by VCS)
 2011/07/22 06:38:12 VCS NOTICE V-16-1-10227 One or more child group not okay with bringing group MEXAGPS_ltdrRCVR_group online on node mexagpslog02. Ignoring Restart
 2011/07/22 06:38:12 VCS INFO V-16-1-10299 Resource aggProxy (Owner: unknown, Group: MEXAGPS_ltdrAGG_group) is online on mexagpslog02 (Not initiated by VCS)
 2011/07/22 06:38:12 VCS INFO V-16-6-15051 (mexagpslog01) nfs_restart:nfs_restart trigger did not do anything as there is no NFS/NFSLock/Share resource in the group
 2011/07/22 06:38:12 VCS INFO V-16-6-15002 (mexagpslog01) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/nfs_restart MEXAGPS_cmsREG_group    successfully
 2011/07/22 06:38:12 VCS INFO V-16-6-15004 (mexagpslog01) hatrigger:Failed to send trigger for postonline; script doesn't exist
 2011/07/22 06:38:13 VCS INFO V-16-10001-3 (mexagpslog02) Application:cmsregistry:online:Executed /etc/init.d/cms_registry.sh
 2011/07/22 06:38:15 VCS INFO V-16-1-10298 Resource cmsregistry (Owner: unknown, Group: MEXAGPS_cmsREG_group) is online on mexagpslog02 (VCS initiated)
 2011/07/22 06:38:15 VCS NOTICE V-16-1-10447 Group MEXAGPS_cmsREG_group is online on system mexagpslog02
 2011/07/22 06:38:15 VCS WARNING V-16-1-50045 Initiating online of parent group MEXAGPS_ltdrAGG_group, PM will select the best node
 2011/07/22 06:38:15 VCS INFO V-16-1-10493 Evaluating mexagpslog01 as potential target node for group MEXAGPS_ltdrAGG_group
 2011/07/22 06:38:15 VCS INFO V-16-1-10493 Evaluating mexagpslog02 as potential target node for group MEXAGPS_ltdrAGG_group
 2011/07/22 06:38:15 VCS NOTICE V-16-1-10233 Clearing Restart attribute for group MEXAGPS_ltdrAGG_group on all nodes
 2011/07/22 06:38:15 VCS NOTICE V-16-1-10301 Initiating Online of Resource aggregator (Owner: unknown, Group: MEXAGPS_ltdrAGG_group) on System mexagpslog01
 2011/07/22 06:38:16 VCS INFO V-16-6-15051 (mexagpslog02) nfs_restart:nfs_restart trigger did not do anything as there is no NFS/NFSLock/Share resource in the group
 2011/07/22 06:38:16 VCS INFO V-16-6-15002 (mexagpslog02) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/nfs_restart MEXAGPS_cmsREG_group    successfully
 2011/07/22 06:38:16 VCS INFO V-16-6-15004 (mexagpslog02) hatrigger:Failed to send trigger for postonline; script doesn't exist
 2011/07/22 06:38:17 VCS INFO V-16-10001-3 (mexagpslog01) Application:aggregator:online:Executed /etc/init.d/ltdr_aggregator.sh
 2011/07/22 06:38:18 VCS INFO V-16-1-10298 Resource aggregator (Owner: unknown, Group: MEXAGPS_ltdrAGG_group) is online on mexagpslog01 (VCS initiated)
 2011/07/22 06:38:18 VCS NOTICE V-16-1-10447 Group MEXAGPS_ltdrAGG_group is online on system mexagpslog01
 2011/07/22 06:38:19 VCS INFO V-16-6-15051 (mexagpslog01) nfs_restart:nfs_restart trigger did not do anything as there is no NFS/NFSLock/Share resource in the group
 2011/07/22 06:38:19 VCS INFO V-16-6-15002 (mexagpslog01) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/nfs_restart MEXAGPS_ltdrAGG_group    successfully
 2011/07/22 06:38:19 VCS INFO V-16-6-15004 (mexagpslog01) hatrigger:Failed to send trigger for postonline; script doesn't exist



Di_Ro
Level 4
Partner Accredited

Thank you, everybody, for your help. The problem is solved, and the fix was very simple:

When a Service Group contains only persistent resources (such as MultiNICB) and no non-persistent resource, we have to configure a Phantom resource in it, just so that VCS can determine the online/offline state of the SG.
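
For anyone hitting the same issue, here is a minimal sketch of what that change could look like in main.cf, based on the netSG group posted in this thread. The resource name netPhantom is only an illustrative assumption (any unused resource name should work), and the rest of the group is copied as posted:

```
group netSG (
    SystemList = { node01 = 0, node02 = 1 }
    Parallel = 2
    AutoStartList = { node01, node02 }
    OnlineRetryLimit = 2
    OnlineRetryInterval = 120
    )

    MultiNICB netMultiNICB (
        UseMpathd = 1
        MpathdCommand = "/usr/lib/inet/in.mpathd -a"
        Device = { e1000g0 = 0, nxge0 = 1 }
        )

    // Hypothetical resource name. The Phantom resource needs no attributes;
    // it only gives the group an online/offline state to report.
    Phantom netPhantom (
        )
```

No dependency is declared between netPhantom and netMultiNICB in this sketch; the Proxy resources in the other groups would keep pointing at netMultiNICB as before.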

Di_Ro
Level 4
Partner Accredited

This is only an example; there are 4 service groups with the same configuration (and just one netSG for MultiNICB):

 

group 1_group (
    SystemList = { node01 = 0, node02 = 1 }
    AutoStartList = { node01 }
    OnlineRetryLimit = 3
    OnlineRetryInterval = 120
    )

    Application lat (
        StartProgram = "/etc/init.d/lat.sh vcsstart"
        StopProgram = "/etc/init.d/lat.sh vcsstop"
        CleanProgram = "/etc/init.d/lat.sh vcsstop"
        MonitorProgram = "/etc/init.d/lat.sh vcsstatus"
        )

    Proxy latProxy (
        TargetResName = netMultiNICB
        )

    lat requires latProxy

group netSG (
    SystemList = { node01 = 0, node02 = 1 }
    Parallel = 2
    AutoStartList = { node01, node02 }
    OnlineRetryLimit = 2
    OnlineRetryInterval = 120
    )

    MultiNICB netMultiNICB (
        UseMpathd = 1
        MpathdCommand = "/usr/lib/inet/in.mpathd -a"
        Device = { e1000g0 = 0, nxge0 = 1 }
        )

This is the real hostname.* configuration (I changed the alias in the earlier output, but this is the original):

# cat /etc/hostname.e1000g0

node01-e1000g0 netmask + broadcast + deprecated -failover up group service
addif node01-svc netmask + broadcast + failover up

# cat /etc/hostname.nxge0

node01-nxge0 netmask + broadcast + deprecated -failover up group service

Thank you very much for your help.
