
Veritas Cluster - Steps for a beginner..

JJose
Level 3
Partner

Hi Guys,

I'm new to Veritas as well as to this forum.

I was playing around with Veritas SF 5.0 but was not able to finish the clustering part.
Here is the description.

I have installed Veritas Storage Foundation with Cluster on 2 nodes.
Basic clustering (IP failover) is working, so I thought about going to the next step, that is, sharing of disks,
and I'm having some trouble there.

What I did:
1. Allocated a 10 GB disk from the SAN - both systems can see the disk
2. On System A - created a diskgroup and created a volume inside it (no entry in vfstab, but did an mkfs)
3. On System A - created a cluster group by adding both System A and System B to the SystemList
4. On System A - created a resource for the disk group
5. On System A - created a resource for the mount
6. On System A - modified the mount resource by adding the volume created above
7. On System A - modified the mount resource by adding FsckOpt %-y
8. On System A - enabled the resources

But when I bring the group online, it goes into a faulted state on System B, complaining about the mount resource.

Am I taking the correct steps?
Please advise.
 

 

thanks in advance ..

jj


g_lee
Level 6
- have you created the mount point on System B?
- have you tested if you are able to import the dg and mount the filesystem on System B outside of VCS? (ie: when offline/not online on System A)
- if the above steps work, then what are the exact error messages seen in VCS?

hope that helps

JJose
Level 3
Partner


Mount point: yes, I have created it.

About mounting the file system on System B: no, I have not been successful in that yet.

1. How can we take the disk offline on System A?

#vxdisk offline EMC_CLARiiON0_1
VxVM vxdisk ERROR V-5-1-534 Device EMC_CLARiiON0_1: Device is in use

2. If I try to import the diskgroup on System B (while it is still under System A's control):
#vxdg import Diskgrp
VxVM vxdg ERROR V-5-1-10978 Disk group Diskgrp: import failed:
Disk is in use by another host


thanks,
JJ

g_lee
Level 6
1. Are any of the resources still online on System A? If so, please offline them (diskgroup and mount) so nothing is mounted/imported on SystemA
# hares -offline <resource> -sys SystemA

(if the IP is not in the same group or you don't need the IP up at this point you could also offline the group)

2. It might be a good idea to freeze the group at this point since you're manually bringing things online on SysB
# hagrp -freeze <group>

3. If the diskgroup is still imported on System A (ie: if it shows up in vxdg list), deport it. It shouldn't be if it was offlined correctly in Step 1.
SystemA# vxdg list
SystemA# vxdg deport <dg>

4. Try to import the dg on SystemB
SystemB# vxdg -t import <dg>  #### (-t means the dg won't be imported on reboot - which is what you want if the dg is to be clustered)

5. Check the volumes - they will probably be in DISABLED state. Start them with vxrecover. Check they're in ENABLED state before trying to mount any filesystems
SystemB# vxprint -htr -g <dg>
SystemB# vxrecover -sb -g <dg>
SystemB# vxprint -htr -g <dg>

6. fsck the filesystem, mount
SystemB# fsck -F <fstype> /dev/vx/rdsk/<dg>/<volume>
SystemB# mount -F <fstype> /dev/vx/dsk/<dg>/<volume>  <mountpoint>

7. If this all works, this suggests a problem with the configuration. Unmount/take things offline on SystemB
SystemB# umount <mountpoint>
SystemB# vxdg deport <dg>
SystemB# hagrp -unfreeze <group>

If you wish to investigate further, please send a copy of your configuration file (alternatively you may wish to open a support case if you have a support contract)
On either system
# haconf -dump
# cat /etc/VRTSvcs/conf/config/main.cf
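
The manual test above could be collected into a small dry-run helper, roughly as sketched here. This is only an illustration: the function name manual_test_plan and its five arguments are made up for this sketch, and it prints the commands rather than running them, so each line can be reviewed and run by hand on the right system.

```shell
#!/bin/sh
# Dry-run sketch: prints the manual import/mount test for SystemB.
# manual_test_plan and its arguments are hypothetical names for this example;
# nothing is executed against VxVM or VCS.
manual_test_plan() {
    dg=$1        # disk group name
    vol=$2       # volume name
    fstype=$3    # filesystem type passed to fsck/mount -F
    mnt=$4       # mount point (must already exist on SystemB)
    group=$5     # VCS service group to freeze during the test
    cat <<EOF
hagrp -freeze $group
vxdg -t import $dg
vxrecover -sb -g $dg
vxprint -htr -g $dg
fsck -F $fstype /dev/vx/rdsk/$dg/$vol
mount -F $fstype /dev/vx/dsk/$dg/$vol $mnt
umount $mnt
vxdg deport $dg
hagrp -unfreeze $group
EOF
}
```

Call it as manual_test_plan <dg> <volume> <fstype> <mountpoint> <group>, and make sure the resources are offline and the dg is deported on SystemA before running any of the printed lines on SystemB.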

JJose
Level 3
Partner

Hi, I tried the above steps, and I can mount the volume on System B.

here is main.cf from SystemA

iadps4 #cat /etc/VRTSvcs/conf/config/main.cf
include "types.cf"

cluster Veritas-Cluster (
        UserNames = { admin = bIJbIDiFJeJJhRJdIG }
        ClusterAddress = "153.71.79.21"
        Administrators = { admin }
        )

system iadps4 (
        )

system iadps5 (
        )

group ClusterService (
        SystemList = { iadps4 = 0, iadps5 = 1 }
        AutoStartList = { iadps4, iadps5 }
        OnlineRetryLimit = 3
        OnlineRetryInterval = 120
        )

        IP webip (
                Device = bge0
                Address = "153.71.79.21"
                NetMask = "255.255.254.0"
                )

        NIC csgnic (
                Device = bge0
                )

        VRTSWebApp VCSweb (
                Critical = 0
                AppName = cmc
                InstallDir = "/opt/VRTSweb/VERITAS"
                TimeForOnline = 5
                RestartLimit = 3
                )

        VCSweb requires webip
        webip requires csgnic


        // resource dependency tree
        //
        //      group ClusterService
        //      {
        //      VRTSWebApp VCSweb
        //          {
        //          IP webip
        //              {
        //              NIC csgnic
        //              }
        //          }
        //      }


group clustergroup (
        SystemList = { iadps4 = 0, iadps5 = 1 }
        AutoStartList = { iadps4 }
        )

        DiskGroup Disk_Resource (
                DiskGroup = Diskgrp
                )

        Mount Mount_Resource (
                MountPoint = "/VERImnt"
                BlockDevice = "/dev/vx/dsk/Diskgrp/vol1"
                FsckOpt = "-y%-n"
                )

        Mount_Resource requires Disk_Resource


        // resource dependency tree
        //
        //      group clustergroup
        //      {
        //      Mount Mount_Resource
        //          {
        //          DiskGroup Disk_Resource
        //          }
        //      }


iadps4 #


When I try to bring the group online, this is what I get:

iadps4 #hagrp -online clustergroup -sys iadps4
VCS WARNING V-16-1-10162 Group clustergroup has not been fully probed on system iadps4
iadps4 #
iadps4 #hagrp -online clustergroup -sys iadps5
VCS WARNING V-16-1-10157 Group clustergroup is faulted on iadps5; clear faults first with hares -clear



 

JJose
Level 3
Partner

Logs..
I just looked into the log file and found the following

From Mount_A.log
2009/10/08 11:12:31 VCS ERROR V-16-10001-5566 Mount:Mount_Resource:monitor:Invalid FSType () is specified.

From engine_A.log
2009/10/08 10:46:33 VCS WARNING V-16-10001-1038 (iadps4) DiskGroup:Disk_Resource:monitor:Disk attribute 'autoimport' is set to 'yes' for the disk group Diskgrp. Setting it to 'no'.
2009/10/08 10:56:34 VCS NOTICE V-16-1-10233 Clearing Restart attribute for group clustergroup on all nodes
2009/10/08 11:00:36 VCS ERROR V-16-2-13067 (iadps5) Agent is calling clean for resource(Disk_Resource) because the resource became OFFLINE unexpectedly, on its own.
2009/10/08 11:00:38 VCS ERROR V-16-1-10205 Group clustergroup is faulted on system iadps5
2009/10/08 11:00:38 VCS NOTICE V-16-1-10446 Group clustergroup is offline on system iadps5
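
For pulling lines like these out of the logs, a small filter along the following lines may help. It is a sketch only: the function name vcs_problems is made up, it assumes a grep that supports -E and -F, and the log files are assumed to be in the default /var/VRTSvcs/log directory.

```shell
#!/bin/sh
# Sketch: show only ERROR/WARNING log lines that mention a given resource.
# vcs_problems is a hypothetical helper name, not a VCS command.
# Usage: vcs_problems <resource> <logfile>...
vcs_problems() {
    res=$1
    shift
    # First keep ERROR and WARNING lines, then match the resource name literally.
    grep -E 'VCS (ERROR|WARNING)' "$@" | grep -F -- "$res"
}
```

For example, vcs_problems Mount_Resource /var/VRTSvcs/log/Mount_A.log would list the FSType error shown above, while NOTICE lines are dropped.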
 

g_lee
Level 6
The first message on iadps4 (presumably SystemA) indicates VCS is unable to probe one or more resources to determine the status.

Mount_Resource is missing FSType, which is a required attribute - this would cause it to be unable to probe, and explains why it would have faulted.

If you have VCS 5.0MP3, also check VxFSMountLock for vxfs filesystems

See the VCS bundled agents reference guide for your version for more detail on which attributes are required for resources (link to 5.0MP3 guide: http://sfdoccentral.symantec.com/sf/5.0MP3/solaris/html/vcs_bundled_agents/index.html )

Once you update the configuration to set these attributes, VCS should be able to probe the resources properly to determine their state and bring them online/offline.

To clear the fault on iadps5 (presumably SystemB):
# hares -clear <resource> [-sys <system>]
(where <resource> is each faulted resource shown in # hastatus -summ )

Note: FsckOpt may also require correcting - at the moment it looks like it's trying to be both yes and no ...!?
FsckOpt = "-y%-n"
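
As a rough sketch, the changes above could be made from the command line along these lines. The helper name fix_mount_plan is made up for this example and it only prints the commands. The filesystem type is passed in by the caller (vxfs in the usage note is an assumption), and FsckOpt is set to %-y as in the original post, where the leading % is how the command line accepts a value that starts with a dash.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands to complete a Mount resource and clear its fault.
# fix_mount_plan is a hypothetical helper name; nothing is executed against VCS.
fix_mount_plan() {
    res=$1           # Mount resource name
    fstype=$2        # filesystem type (assumption: supplied by the caller)
    faulted_sys=$3   # system on which the resource is faulted
    cat <<EOF
haconf -makerw
hares -modify $res FSType $fstype
hares -modify $res FsckOpt %-y
haconf -dump -makero
hares -clear $res -sys $faulted_sys
hares -probe $res -sys $faulted_sys
EOF
}
```

With the names from the posted main.cf this would be fix_mount_plan Mount_Resource vxfs iadps5. Repeat the hares -clear line for every other faulted resource shown in hastatus -summ.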

JJose
Level 3
Partner


Good,

 

I configured the Mount resource with FSType and now it is online.

Failover is also working perfectly for the vxfs volume.

Thanks so much for the help.

One more doubt: the volume is now Active-Passive, i.e. I can access it from only one machine.

I can make it Active-Active, right?
Is it a matter of some configuration, or ...


Thanks,

jj

 

 

g_lee
Level 6
If you want the volume to be active-active (ie: to access the volume from both/more than one machine concurrently) you need Storage Foundation Cluster File System (SFCFS) or SF for Oracle RAC (SFRAC) (if you were planning on using for Oracle RAC) - these are different products to SF-HA.

ie: you cannot make this active/active (parallel) without SFCFS - if you attempt this with standard VxVM/VCS it will corrupt the data on the volumes.

The SFCFS admin guide is a good place to start if you want more details: http://sfdoccentral.symantec.com/sf/5.0MP3/solaris/html/sfcfs_admin/index.html

If there really is a requirement for concurrent access, then you'll need to look at getting this and setting up per admin guide.

JJose
Level 3
Partner

Oh..ok.

Now I will look into Oracle RAC, since that is what I have to do.

Thanks g_lee for the great support.

Much appreciated.

jj

g_lee
Level 6
Glad to have helped you :)

As you mentioned you're unfamiliar with / new to the product, I would strongly recommend looking into training courses for SF, VCS, and SF RAC if you're going to be running/supporting Oracle RAC with these products, otherwise it may be a fairly daunting learning curve ...!

ScottK
Level 5
Employee
The documentation url referenced above is migrating to:
https://vos.symantec.com/documents/content/336