VCS ERROR V-16-1-10303 Resource XXX (Owner: Unspecified, Group: xxx-sg) is FAULTED (timed out) on sys SEC-XXX

Zahid_Haseeb
Moderator
Partner    VIP    Accredited

Environment

Solaris 9

Two Node Cluster

SFHA installed

 

Engine_A.Log

2012/08/11 07:25:59 VCS ERROR V-16-1-10303 Resource XXX (Owner: Unspecified, Group: xx-sg) is FAULTED (timed out) on sys SEC-XXX

Dmesg

Aug 11 07:24:56 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link down
Aug 11 07:25:00 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link up 100Mbps Full-Duplex
Aug 11 07:25:23 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link down
Aug 11 07:25:28 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link up 100Mbps Full-Duplex
Aug 11 07:25:59 SEC-XXX Had[3797]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource XXX (Owner: Unspecified, Group: xxx-sg) is FAULTED (timed out) on sys SEC-XXX
Aug 11 07:26:38 SEC-XXX Had[3797]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10205 Group xxx-sg is faulted on system SEC-XXX
Aug 11 07:26:55 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link down
Aug 11 07:27:00 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link up 100Mbps Full-Duplex
Aug 11 07:28:00 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link down
Aug 11 07:28:05 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link up 100Mbps Full-Duplex
Aug 11 07:28:11 SEC-XXX Had[3797]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource XXX (Owner: Unspecified, Group: xxx-sg) is FAULTED (timed out) on sys SEC-XXX
Aug 11 07:29:11 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link down
Aug 11 07:29:15 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link up 100Mbps Full-Duplex
Aug 11 07:29:17 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link down

 

Judging by the dmesg output, it seems that bge3 faulted, and that is why the service group failed over to the partner node. bge3 is not a public NIC. It is the NIC through which each node reaches a hardware device that verifies the application's queries: the device is connected to a switch, and an Ethernet cable runs from the switch to bge3 on each node, so both nodes can see the device via bge3. However, the technote below for error code "VCS ERROR V-16-1-10303" describes something different. Comments on the above logs would be appreciated.

https://sort.symantec.com/ecls/umi/V-16-1-10303
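
To put a number on the link flapping, here is a minimal Python sketch (not part of the original post) that counts the bge3 link-down events in the dmesg excerpt above and measures how long each completed outage lasted. The function names and the regular expression are assumptions made for illustration, and the year 2012 is borrowed from the Engine_A.Log timestamp because syslog lines omit it.

```python
import re
from datetime import datetime

# Lines copied from the dmesg excerpt above (all bge3 lines plus one Had line).
DMESG = """\
Aug 11 07:24:56 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link down
Aug 11 07:25:00 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link up 100Mbps Full-Duplex
Aug 11 07:25:23 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link down
Aug 11 07:25:28 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link up 100Mbps Full-Duplex
Aug 11 07:25:59 SEC-XXX Had[3797]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource XXX (Owner: Unspecified, Group: xxx-sg) is FAULTED (timed out) on sys SEC-XXX
Aug 11 07:26:55 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link down
Aug 11 07:27:00 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link up 100Mbps Full-Duplex
Aug 11 07:28:00 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link down
Aug 11 07:28:05 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link up 100Mbps Full-Duplex
Aug 11 07:29:11 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link down
Aug 11 07:29:15 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link up 100Mbps Full-Duplex
Aug 11 07:29:17 SEC-XXX bge: [ID 801593 kern.notice] NOTICE: bge3: link down
"""

# Hypothetical pattern for the bge driver's link notices; other lines are skipped.
LINK_RE = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ bge: .*NOTICE: (\w+): link (down|up)"
)


def link_events(text, year=2012):
    """Return (timestamp, nic, state) tuples for every link up/down notice."""
    events = []
    for line in text.splitlines():
        match = LINK_RE.match(line)
        if match:
            stamp = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
            events.append((stamp, match.group(2), match.group(3)))
    return events


def count_link_downs(events, nic):
    """Count how many times the given NIC reported 'link down'."""
    return sum(1 for _, name, state in events if name == nic and state == "down")


def outage_seconds(events, nic):
    """Pair each 'link down' with the next 'link up' and return the gaps in seconds."""
    gaps, down_at = [], None
    for stamp, name, state in events:
        if name != nic:
            continue
        if state == "down":
            down_at = stamp
        elif down_at is not None:
            gaps.append(int((stamp - down_at).total_seconds()))
            down_at = None
    return gaps


events = link_events(DMESG)
downs = count_link_downs(events, "bge3")
gaps = outage_seconds(events, "bge3")
```

On this excerpt the sketch finds six link-down events in under five minutes, and each of the five completed outages lasts 4 to 5 seconds. The final link-down at 07:29:17 has no matching link-up in the excerpt, so it is not counted as a completed outage.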




 

8 REPLIES

Venkata_Reddy_C
Level 4
Employee

V-16-1-10303 is a generic message that the VCS engine (had) logs for any resource that faults because of an entry point timeout. In the technote, the V-16-1-10303 ERROR message is logged for the cvm_clus resource.

 

Hope this helps!

 

Regards,

Venkat

TonyGriffiths
Level 6
Employee Accredited Certified

That VCS log entry is generic and can apply to different resource types.

In this instance, I assume that the problem resource was dependent on the network.

cheers

tony

mikebounds
Level 6
Partner Accredited

Did you have more messages before this one? I would guess that after bge3 failed, the VCS monitor timed out for the application that uses bge3. If the monitor times out 4 times in a row (the count is set by the resource type attribute FaultOnMonitorTimeouts), the resource faults. If that was the case, you should have seen messages to this effect in the VCS engine log.
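
The consecutive-timeout rule can be sketched as a small Python model. This is a simplified illustration, not VCS agent framework code: the function name and the "ok"/"timeout" result strings are hypothetical, and the sketch assumes a positive threshold, with 4 used as the default because that is the figure quoted above.

```python
def faulted_at_cycle(monitor_results, fault_on_monitor_timeouts=4):
    """Return the 1-based monitor cycle at which the resource would fault.

    monitor_results is a sequence of "ok" or "timeout" strings, one per
    monitor cycle. Only consecutive timeouts count; any successful monitor
    resets the counter. Returns None if the threshold is never reached.
    Assumes fault_on_monitor_timeouts is a positive integer.
    """
    consecutive = 0
    for cycle, result in enumerate(monitor_results, start=1):
        if result == "timeout":
            consecutive += 1
            if consecutive >= fault_on_monitor_timeouts:
                return cycle
        else:
            consecutive = 0
    return None


# Four timeouts in a row after one good monitor: faults on the fifth cycle.
steady_failure = faulted_at_cycle(["ok", "timeout", "timeout", "timeout", "timeout"])

# A successful monitor in between resets the count, so no fault is declared.
interrupted = faulted_at_cycle(["timeout", "timeout", "timeout", "ok", "timeout"])
```

The second call shows why a flapping link does not necessarily fault the resource: if a monitor cycle completes while the link is briefly up, the count starts again from zero in this model.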

Mike

Zahid_Haseeb
Moderator
Partner    VIP    Accredited

The resource that faulted is actually a NIC, but it is neither a public nor a private one.

Marianne
Level 6
Partner    VIP    Accredited Certified

If the bge3 resource is marked as Critical (the default), its failure will cause a failover.

Please review these topics in VCS Admin Guide (see https://sort.symantec.com/documentation )

Controlling VCS behavior

VCS behavior on resource faults

Zahid_Haseeb
Moderator
Partner    VIP    Accredited

Thanks, all, for the kind words.

  • More info for reference:

bge3 is configured as a resource in the service group.

====

  • So the culprit is bge3, and we need to consult the hardware vendor to troubleshoot it?

Zahid_Haseeb
Moderator
Partner    VIP    Accredited

"Did you have more messages before this one ... if the monitor times out 4 times in a row"

Logs are attached for reference.

Yes, I see more messages, and it happened three times in a row.

Marianne
Level 6
Partner    VIP    Accredited Certified

The log shows that the NIC went offline on SEC-XXX, which caused the Faulted state.

(SEC-XXX) NIC:XXX:monitor:.......: Resource is offline
Resource XXX .... is FAULTED (timed out) on sys SEC-XXX

VCS then did what it is supposed to do: take the rest of the service group offline and fail over to PRI-XXX:

Initiating Offline of Resource VirtualIP.....
Initiating Offline of Resource XXX-APP ....
Initiating Offline of Resource Mount ....
Initiating Offline of Resource VMDG .....
Group xxx-sg is faulted on system SEC-XXX
Group xxx-sg is offline on system SEC-XXX

Evaluating PRI-XXX as potential target node for group xxx-sg
...
Initiating Online of Resource VMDG .... on System PRI-XXX
Initiating Online of Resource Mount ...
Initiating Online of Resource XXX-APP ... on System PRI-XXX
Initiating Online of Resource VirtualIP ...
Group xxx-sg is online on system PRI-XXX
Group xxx-sg failed over to system PRI-XXX
 

So, VCS did what it was supposed to do.

You need to troubleshoot bge3 on SEC-XXX.
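
The sequence in the log excerpts above can be summarized with a short Python sketch. It is a deliberately simplified model, not how the VCS engine is implemented: the function names are hypothetical, the resource order is copied from the quoted "Initiating Online" lines, and real behavior also depends on the group's dependency tree and on attributes that this thread does not show.

```python
# Online order as it appears in the quoted engine log lines for PRI-XXX.
ONLINE_ORDER = ["VMDG", "Mount", "XXX-APP", "VirtualIP"]


def offline_order(online_order):
    """Resources are taken offline in the reverse of the order they come online."""
    return list(reversed(online_order))


def failover_steps(faulted_resource_is_critical, online_order, target_system):
    """List the steps this simplified model takes after a resource faults.

    In this model a fault on a non-critical resource triggers no failover,
    so the list is empty. A fault on a critical resource takes the remaining
    resources offline, then brings them online on the target system.
    """
    if not faulted_resource_is_critical:
        return []
    steps = [f"offline {name}" for name in offline_order(online_order)]
    steps += [f"online {name} on {target_system}" for name in online_order]
    return steps


steps = failover_steps(True, ONLINE_ORDER, "PRI-XXX")
```

The offline list produced here (VirtualIP, XXX-APP, Mount, VMDG) matches the order of the "Initiating Offline" lines quoted above, which is the mirror image of the online order on PRI-XXX.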