Client cannot read/write after SG failover

mrrout
Level 4

Hi All,

I am running into a problem with an NFS service group in my VCS 5.0 MP3 configuration. It is as below:

The cluster is a two-node setup on AIX 5.3 TL09; the contents of the main.cf file are as follows:

include "types.cf"

cluster HDS_CLUS (
        UserNames = { admin = fIJbIDiFJeJJhRJdIG }
        Administrators = { admin }
        UseFence = SCSI3
        )

system AIX-23 (
        )

system AIX-24 (
        )

group SG1 (
        SystemList = { AIX-23 = 0, AIX-24 = 1 }
        AutoStartList = { AIX-23, AIX-24 }
        )

        DiskGroup dg01 (
                DiskGroup = dg01
                )

        IP dNFS_IP (
                Critical = 0
                Device @AIX-23 = en2
                Device @AIX-24 = en2
                Address = "172.17.12.9"
                NetMask = "255.255.254.0"
                )

        Mount dNFS_Mount1 (
                Critical = 0
                MountPoint = "/11"
                BlockDevice = "/dev/vx/dsk/dg01/vol1"
                FSType = vxfs
                FsckOpt = "-y"
                )

        Mount dNFS_Mount2 (
                Critical = 0
                MountPoint = "/12"
                BlockDevice = "/dev/vx/dsk/dg01/vol2"
                FSType = vxfs
                FsckOpt = "-y"
                )

        NFS dNFS_NFS (
                Critical = 0
                NFSv4Root = "/"
                )

        NFSRestart NFSRestart_1 (
                NFSRes = dNFS_NFS
                )

        NIC dNFS_NIC (
                Device @AIX-23 = en2
                Device @AIX-24 = en2
                NetworkHosts = { "172.17.12.1" }
                )

        Share dNFS_Share1 (
                Critical = 0
                PathName = "/11"
                Options = "rw,vers=4"
                )

        Share dNFS_Share2 (
                Critical = 0
                PathName = "/12"
                Options = "rw,vers=4"
                )

        NFSRestart_1 requires dNFS_IP
        dNFS_IP requires dNFS_NIC
        dNFS_IP requires dNFS_Share1
        dNFS_IP requires dNFS_Share2
        dNFS_Mount1 requires dg01
        dNFS_Mount2 requires dg01
        dNFS_Share1 requires dNFS_Mount1
        dNFS_Share1 requires dNFS_NFS
        dNFS_Share2 requires dNFS_Mount2
        dNFS_Share2 requires dNFS_NFS

 

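(For anyone reproducing this, the resource layout and dependency tree above can be checked on a running node with standard VCS commands, for example:)

# hagrp -resources SG1
# hares -dep dNFS_IP
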
Whenever I fail over the SG using "hagrp -switch SG1 -to AIX-24" (assuming the SG is online on AIX-23), the client returns the following when it tries to read from or write to the filesystem after the failover:

AIX91-C2 > touch /fs0/xxx                                        [AIX91-C2 is my client]
touch: 0652-046 Cannot create /fs0/xxx

AIX91-C2 > ls -l /fs0
ls: 0653-341 The file /fs0/xyz does not exist.
total 0
drwxr-xr-x    2 root     system           96 Aug 22 2011  lost+found

I do not see any issues if I offline and online the SG on the same node with an interval of about 10 minutes, i.e. offline the SG, wait for 10 minutes, and then online it on the same node. I suspect this is related to NFS locking or the file handle, but I am not sure how to resolve it.
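
(The offline/online test above maps onto standard hagrp operations; a sketch with the group and node names from this configuration, plus a summary check afterwards:)

# hagrp -offline SG1 -sys AIX-23
(wait about 10 minutes)
# hagrp -online SG1 -sys AIX-23
# hastatus -sum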

Any help is greatly appreciated.

 

Thanks in advance.


 

3 REPLIES

mrrout
Level 4

If I mount the shared filesystem (from the VCS cluster) on a Linux client and then switch over the NFS service group (SG1), the client shows a "Stale file handle" error for the corresponding filesystem. For a standalone NFS server one would unmount the FS on the client and remount it to resolve the issue, but a cluster setup should not require this.

Please let me know if anyone has come across such an issue and knows of a resolution.
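
(For the failover to be transparent, the client has to mount through the service group's virtual IP from the main.cf above, 172.17.12.9, rather than a node's own address. A minimal Linux-client mount sketch under that assumption; vers=4 matches the Share options, hard is illustrative, and /fs0 is the mount point used earlier in the thread:)

# mount -t nfs -o vers=4,hard 172.17.12.9:/11 /fs0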

Thanks in advance.

mikebounds
Level 6
Partner Accredited

Have you done the following from the 5.0MP3 RP5 readme_first (ftp://ftp.veritas.com/pub/support/patchcentral/AIX/5.0_MP3/sfha/sfha-aix-5.0MP3RP5-patches.tar.gz_do...)?

Mandatory configuration change for the NFS and NFSRestart resources

You must perform the following instructions for VCS configurations that have NFSRestart resources. Failure to perform these instructions can result in NFS/NFSRestart resources not functioning correctly. Symantec implemented this change to prevent the invocation of NFSRestart-related triggers when no NFSRestart resources are in the VCS configuration.

To copy the nfs_preonline and nfs_postoffline files
Copy the nfs_preonline and nfs_postoffline files to the /opt/VRTSvcs/bin/triggers directory:

# cp /opt/VRTSvcs/bin/sample_triggers/nfs_preonline /opt/VRTSvcs/bin/triggers
# cp /opt/VRTSvcs/bin/sample_triggers/nfs_postoffline /opt/VRTSvcs/bin/triggers

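(A quick way to confirm the copy on each node, assuming the default trigger directory shown above; both scripts should exist and be executable:)

# ls -l /opt/VRTSvcs/bin/triggers/nfs_preonline /opt/VRTSvcs/bin/triggers/nfs_postoffline
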
You were required to do this from at least RP2.
If you only have RP1 installed, the instructions were different:
 
1469381
Fixed an issue where the Share agent was 10x slower on 5.0 MP1 with 300+ Share resources in a service group.
Note: This fix changes basic VCS functionality; it is critically important for you to implement these changes for all service groups that contain NFSRestart resources.
You must set the value of the PreOnline attribute to 1 for all service groups that contain NFSRestart resources. Failure to set the service group's PreOnline attribute to a value of 1 results in broken NFSRestart resource configurations.
The ha commands to change this attribute are:
# haconf -makerw
# hagrp -modify servicegroup_name PreOnline 1
# haconf -dump -makero
 
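(A quick way to confirm the attribute afterwards, using the group name from the original post:)

# hagrp -value SG1 PreOnline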
 
Mike
 

mrrout
Level 4
(Accepted Solution)

Mike,

Thank you very much for your input; I was missing some mandatory steps.

However, the problem I had (the client failing to read/write to the NFS mounts) was due to different major numbers for the VxVM devices on the two servers (one had 52, the other had 53). After changing one server to use the same major number with haremajor -s <#>, the issue did not reappear.
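
(For anyone hitting the same symptom, a minimal sketch of checking and aligning the major numbers. The device path comes from the main.cf above and the 52/53 values are the ones reported in this thread; the exact ls output format varies by platform. NFS file handles embed the major/minor numbers of the exported device, so they must match on both nodes for a failover to be transparent to clients.)

On each node, check the major number of an exported block device:

# ls -lL /dev/vx/dsk/dg01/vol1

If the nodes differ (for example 52 on one and 53 on the other), re-major one node to match its peer and reboot that node so the change takes effect:

# haremajor -s 53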

Thank you very much for your comments.

Manoranjan.