Unsuccessful cluster failover occurred because of a faulted NIC

c0derx
Level 3
Partner

Hello, 

Platform: Solaris 11

Logs: 

Jun 23 16:36:49 nodeA Had[5211]: [ID 702911 daemon.notice] VCS ERROR V-16-1-54031 Resource csgnic (Owner: Unspecified, Group: ClusterService) is FAULTED on sys nodeA

Jun 23 16:37:49 nodeA Had[5211]: [ID 702911 daemon.notice] VCS ERROR V-16-1-54031 Resource nic_proxy_aggr1 (Owner: Unspecified, Group: oracle) is FAULTED on sys nodeA

 

Question 1)

I need more detail about the problem. I checked /var/log/messages, /var/adm/messages, and the files under /var/fm/fmd/, but I can't see anything related to this error.

Which logs should be checked on a Solaris 11 system in this situation?

Question 2)

What method do you advise for investigating the NIC problem and getting more information on this platform?

Question 3) 

What configuration should I apply to handle NIC failures?

1 ACCEPTED SOLUTION

Accepted Solutions

Dev_Roy
Level 6
Accredited Certified

Did you try looking at the system logs, such as /var/adm/messages, around the time VCS complained that the resource had faulted? Typically, VCS picks up a fault when there is an actual problem with the underlying resource. Did you get in touch with the OS vendor?

I don't think a detailed RCA is possible in this forum, since it would require a lot of evidence and a thorough investigation. A better route would be to contact technical support and provide the evidence they request.
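As a rough illustration of looking "around the time" of the fault, the sketch below pulls every line logged in the same minute as each V-16-1-54031 message from a syslog-style file. The function name `fault_window` is hypothetical, and it assumes the file uses the `Mon DD HH:MM:SS` timestamp prefix seen in the quoted log lines.

```shell
#!/bin/sh
# Hypothetical helper: for each VCS fault message (ID V-16-1-54031, as in the
# quoted logs), print the other lines logged during the same minute.
# Assumes syslog-style lines that start with "Mon DD HH:MM:SS".
fault_window() {
    log_file=$1
    # The first 12 characters ("Jun 23 16:36") serve as the per-minute key.
    grep 'V-16-1-54031' "$log_file" | cut -c1-12 | sort -u |
    while IFS= read -r stamp; do
        echo "== $stamp =="
        # Show everything from that minute except the VCS fault lines themselves.
        grep "^$stamp" "$log_file" | grep -v 'V-16-1-54031' || true
    done
}
```

It could be called as `fault_window /var/adm/messages`. An empty section under a timestamp means nothing else was logged in that minute, which matches what the original poster described.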


7 REPLIES

Dev_Roy
Level 6
Accredited Certified

Run the following commands and give us the output:

ifconfig -a (on nodeA, since the NIC faulted on nodeA according to the logs above)

uname -a

pkginfo -l VRTSvcs
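If copying the output by hand is awkward, the commands can be wrapped in a small script that labels each result so it can be redirected to a file. This is only a sketch: the function names `run_labeled` and `collect_nic_info` are made up here, and the two `dladm` calls are an addition that was not requested above (they are included on the assumption that the resource name nic_proxy_aggr1 in the logs points to a link aggregation).

```shell
#!/bin/sh
# Hypothetical helper: print a header, then run the command if it is installed.
run_labeled() {
    echo "== $* =="
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || echo "(exit status $?)"
    else
        echo "(command not found: $1)"
    fi
}

# Gather the three outputs requested in the thread, plus datalink details.
collect_nic_info() {
    run_labeled ifconfig -a
    run_labeled uname -a
    run_labeled pkginfo -l VRTSvcs
    # Assumption: not asked for in the thread; dladm is the Solaris 11
    # datalink administration tool.
    run_labeled dladm show-aggr
    run_labeled dladm show-link
}
```

Running `collect_nic_info > /tmp/nic_info.txt 2>&1` on nodeA would leave everything in one file that can be attached to a post or a support case.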

c0derx
Level 3
Partner

nic name: aggr1

OS version: SunOS nodeA 5.11 11.2 sun4v sparc sun4v

VRTSvcs version: 6.2.1.0,REV=6.2.1.0

Dev_Roy
Level 6
Accredited Certified

Do you see a FAILED flag in the ifconfig output? You will need to paste the entire output for us to figure that out.
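If pasting the whole output is not practical, a filter along these lines can check for the flag instead. It is a sketch that assumes the usual `name: flags=NNN<FLAG,FLAG,...>` header lines of Solaris `ifconfig -a`; the function name `failed_ifaces` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `ifconfig -a` output on stdin and print the name of
# every interface whose flag list contains FAILED.
failed_ifaces() {
    awk '
        # Interface header lines start in column 1 and carry "flags=...<...>".
        /^[^ \t].*flags=/ {
            name = $1
            sub(/:$/, "", name)              # drop the trailing colon
            if (match($0, /<[^>]*>/)) {
                # Wrap the list in commas so only whole flag names match.
                flags = "," substr($0, RSTART + 1, RLENGTH - 2) ","
                if (index(flags, ",FAILED,") > 0) print name
            }
        }
    '
}
```

Usage would be `ifconfig -a | failed_ifaces`. No output means no interface currently carries the flag, so for an intermittent fault it would have to be run at the moment the resource faults.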

c0derx
Level 3
Partner

There is no failed state in the ifconfig -a output. I can't easily copy and paste it.

Also, the fault happens only occasionally.

 

Dev_Roy
Level 6
Accredited Certified

Is this a new setup or was it working earlier?

Did you try to clear the fault and bring the resource online?

c0derx
Level 3
Partner

It is not a new setup; it is a production system. We brought the resource online and it is working fine, but sometimes I see:

Jun 23 16:36:49 nodeA Had[5211]: [ID 702911 daemon.notice] VCS ERROR V-16-1-54031 Resource csgnic (Owner: Unspecified, Group: ClusterService) is FAULTED on sys nodeA

Jun 23 16:37:49 nodeA Had[5211]: [ID 702911 daemon.notice] VCS ERROR V-16-1-54031 Resource nic_proxy_aggr1 (Owner: Unspecified, Group: oracle) is FAULTED on sys nodeA

in the logs.
