11-14-2017 08:27 PM
Hi,
I'm having an issue: whenever NFS hangs (with the message "NFS not responding") or the network interface goes down, the VCS cluster doesn't fail over to its partner node. Instead, it keeps retrying on the same server, and ultimately the resource group fails on both nodes.
Can someone please help me with this? I would prefer that VCS fail over on its own. Failing that, is there a workaround to switch over manually if this scenario happens again?
Thanks & Regards,
Ajit
11-14-2017 08:31 PM
Please post the engine_A and NFS logs.
11-14-2017 09:00 PM
Attaching the logs.
11-14-2017 11:18 PM
Under the service group, what resource types did you use for NIC and NFS?
11-14-2017 11:32 PM
Hi,
We are using Named & Namedpc for NIC and NFS.
Thanks,
Ajit
11-15-2017 12:23 AM
I think you need to do the following:
- For the network card, you need to create a NIC resource
- For NFS, you need to create an NFSshare resource
Let me search for a detailed technote.
11-15-2017 03:49 AM
See page 255, "Configuring NFS service groups":
https://sort.symantec.com/public/documents/sfha/6.0/linux/productguides/pdf/vcs_admin_60_lin.pdf
I hope this will help you out
11-15-2017 09:39 PM
Thanks for the document. However, I already have those resources configured (I mean the NIC resource exists in the NFS service group). The problem is that even when it's configured like this, the switchover doesn't happen if the NIC is down.
gnic_p
named_ip
named_ipmp
named_mount
named_proc
namedpc_ip
namedpc_ipmp
namedpc_mount
namedpc_proc
proxy-ipmp
E.g.:
#Resource    Attribute  System  Value
namedpc_ip   Group      global  namedpc
namedpc_ip   Type       global  IPMultiNICB
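To check the Type of every resource in the list rather than one at a time, the four-column output shown above can be parsed with a short script. This is only a rough sketch: it assumes the output is whitespace-separated with a header line starting with "#", as in the excerpt, and the function name parse_hares_display is made up for illustration.

```python
def parse_hares_display(text):
    """Parse four-column resource display text into {resource: {attribute: value}}.

    Assumes columns are: Resource, Attribute, System, Value (as in the excerpt).
    The System column is ignored, so per-system values would overwrite each other.
    """
    resources = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "#Resource ..." header line.
        if not line or line.startswith("#"):
            continue
        # Split into at most 4 fields so a Value containing spaces stays whole.
        parts = line.split(None, 3)
        if len(parts) < 3:
            continue
        resource, attribute = parts[0], parts[1]
        value = parts[3] if len(parts) == 4 else ""
        resources.setdefault(resource, {})[attribute] = value
    return resources


# Sample input copied from the excerpt in this post.
SAMPLE = """\
#Resource Attribute System Value
namedpc_ip Group global namedpc
namedpc_ip Type global IPMultiNICB
"""

if __name__ == "__main__":
    for name, attrs in parse_hares_display(SAMPLE).items():
        print(name, attrs.get("Type"), attrs.get("Group"))
```

Feeding it the saved display output for all resources would give a quick view of which agent type backs each of the IP, IPMP, mount and proc resources.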
Thanks,
Ajit
11-18-2017 08:14 PM
A hung NFS resource is a very common issue, and most of the time these kinds of issues are not VCS related. If NFS is hanging (it becomes unresponsive and the issue is in the kernel), the VCS NFS agent cannot offline or clean the hung resource and is therefore unable to fail over. A system reboot is likely needed.
Are you on VCS 5.0/5.1 or later?
If the issue occurs on a regular basis, patching the OS and upgrading VCS is highly recommended.
If you are already on the latest OS/VCS patch level and still experience this issue, capture an application core dump (if it's on Solaris) or just collect a live system call. Log a support case with Veritas to troubleshoot the issue further. To find the root cause (RCA), you need a DataCollector (VRTSexplorer) report as well as an application or system core dump.