VxVM vxdg ERROR V-5-1-585 Unable to create shared disk group

brucemcgill_n1
Level 4

Hi,

To set up SFRAC, I need to configure I/O fencing on a 2-node VCS configuration. I am unable to create the shared disk group after configuring I/O fencing; without I/O fencing, I am able to create it. The error is: VxVM vxdg ERROR V-5-1-585 Disk group oradg: cannot create: Record not in disk group

The two Sun Fire V440 servers "node1" and "node2" are connected to a Sun StorageTek 6140 array. The details are as follows:

On "node1":

node1 * / # vxdisk -e list
DEVICE       TYPE           DISK        GROUP        STATUS               OS_NATIVE_NAME   ATTR
disk_0       auto:none      -            -           online invalid       c1t3d0s2         -
disk_1       auto:none      -            -           online invalid       c1t2d0s2         -
disk_2       auto:cdsdisk   disk_2       vxfencoorddg online               c2t6d0s2         -
disk_3       auto:cdsdisk   disk_3       vxfencoorddg online               c2t7d0s2         -
disk_4       auto:cdsdisk   -            -           online               c2t0d0s2         -
disk_5       auto:cdsdisk   disk_5       vxfencoorddg online               c2t5d0s2         -
disk_6       auto:cdsdisk   -            -           online               c2t3d0s2         -
disk_7       auto:none      -            -           online invalid       c2t1d0s2         -
disk_8       auto:cdsdisk   -            -           online               c2t2d0s2         -
disk_9       auto:none      -            -           online invalid       c2t4d0s2         -
disk_10      auto:none      -            -           online invalid       c1t1d0s2         -
disk_11      auto:ZFS       -            -           ZFS                  c1t0d0s2         -
 

On "node2":

node2 * / # vxdisk -e list
DEVICE       TYPE           DISK        GROUP        STATUS               OS_NATIVE_NAME   ATTR
disk_0       auto:none      -            -           online invalid       c1t3d0s2         -
disk_1       auto:none      -            -           online invalid       c1t1d0s2         -
disk_2       auto:ZFS       -            -           ZFS                  c1t0d0s2         -
disk_3       auto:cdsdisk   -            -           online               c3t6d0s2         -
disk_4       auto:cdsdisk   -            -           online               c3t7d0s2         -
disk_5       auto:cdsdisk   -            -           error                c3t0d0s2         -
disk_6       auto:cdsdisk   -            -           online               c3t5d0s2         -
disk_7       auto:cdsdisk   -            -           online               c3t3d0s2         -
disk_8       auto:none      -            -           online invalid       c3t1d0s2         -
disk_9       auto:cdsdisk   -            -           online               c3t2d0s2         -
disk_10      auto:none      -            -           online invalid       c3t4d0s2         -
disk_11      auto:none      -            -           online invalid       c1t2d0s2         -
 

On "node1":

node1 * / # vxdg -s init oradg oradg01=disk_4
VxVM vxdg ERROR V-5-1-585 Disk group oradg: cannot create: Record not in disk group

node1 * / # vxdg -s init oradg oradg01=disk_8
VxVM vxdg ERROR V-5-1-585 Disk group oradg: cannot create: Record not in disk group
 

I have installed the latest ASL libraries for the StorageTek 6140 array. Can anyone please help?

 

Best Regards,

Bruce


Neeraj_Bhatia
Level 3

This message is documented on SORT: https://sort.symantec.com/ecls/umi/V-5-1-585

Moving this post to the Storage Foundation forum for further discussion. 

jstucki
Level 4

Bruce,

Can you explain what process you went through to install and configure the product up to this point?  Did you use the "installsfrac" utility?  And did you use the "installsfrac -fencing" command (as described in the installation guide) to configure the fencing for you?  The reason I ask is that it appears that your fencing disks are imported on node 1.  They shouldn't be imported on any node.  They should be deported on both nodes. 

Did you perform the steps to test your fencing disks, first identifying them by SCSI serial number (vxfenadm -i /dev/rdsk/c1t1d0s2) on both systems, and then using the vxfentsthdw utility?  Before trying to create shared disk groups, you should have used the "installsfrac" utility to get I/O Fencing configured properly and to get the base CFS/CVM/VCS products up and running. Only then can you begin to create shared disk groups.

It doesn't appear that your configuration is ready for creating shared disk groups.  If this is your first time installing this product, read completely through the installation guide (maybe 2 or 3 times; it's a bit confusing the first time through when you're trying to understand what needs to happen when), and post any questions you have on this forum.  Get your questions answered before beginning the installation process.
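As a quick sanity check along these lines (just a sketch; the device names are taken from the listings above and will differ per node), you could run on both nodes:

# vxfenadm -d                          (fencing should report SCSI3 mode and show both nodes as members)
# vxdg list                            (vxfencoorddg should not appear, i.e. it should remain deported)
# vxfenadm -i /dev/rdsk/c2t6d0s2       (compare the serial number of each coordinator disk across nodes)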

-John

joseph_dangelo
Level 6
Employee Accredited

Bruce,

Are you able to create a private disk group using the same device?

If so, then check whether the cvm service group is online.

Can you post the output of the following command:

gabconfig -a
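For example (a sketch only; oradg01/disk_4 are just the names from your earlier attempt):

# vxdg init oradg oradg01=disk_4       (private disk group on the same device, i.e. without -s)
# hagrp -state cvm                     (the cvm group should be ONLINE on both nodes)
# gabconfig -a                         (GAB port memberships should show both nodes)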

Joe D

brucemcgill_n1
Level 4

Hi,

Steps to configure I/O Fencing. Initial disk layout on both nodes:

node1 * / # vxdisk -e list

DEVICE       TYPE           DISK        GROUP        STATUS               OS_NATIVE_NAME   ATTR
disk_0       auto:none      -            -           online invalid       c1t3d0s2         -
disk_1       auto:none      -            -           online invalid       c1t2d0s2         -
disk_2       auto:none      -            -           online invalid       c3t6d0s2         -
disk_3       auto:none      -            -           online invalid       c3t7d0s2         -
disk_4       auto:none      -            -           online invalid       c3t0d0s2         -
disk_5       auto:none      -            -           online invalid       c3t5d0s2         -
disk_6       auto:none      -            -           online invalid       c3t3d0s2         -
disk_7       auto:none      -            -           online invalid       c3t1d0s2         -
disk_8       auto:none      -            -           online invalid       c3t2d0s2         -
disk_9       auto:none      -            -           online invalid       c3t4d0s2         -
disk_10      auto:none      -            -           online invalid       c1t1d0s2         -
disk_11      auto:ZFS       -            -           ZFS                  c1t0d0s2         -

node2 * / # vxdisk -e list

DEVICE       TYPE           DISK        GROUP        STATUS               OS_NATIVE_NAME   ATTR
disk_0       auto:none      -            -           online invalid       c1t3d0s2         -
disk_1       auto:none      -            -           online invalid       c1t1d0s2         -
disk_2       auto:ZFS       -            -           ZFS                  c1t0d0s2         -
disk_3       auto:none      -            -           online invalid       c3t6d0s2         -
disk_4       auto:none      -            -           online invalid       c3t7d0s2         -
disk_5       auto:none      -            -           online invalid       c3t0d0s2         -
disk_6       auto:none      -            -           online invalid       c3t5d0s2         -
disk_7       auto:none      -            -           online invalid       c3t3d0s2         -
disk_8       auto:none      -            -           online invalid       c3t1d0s2         -
disk_9       auto:none      -            -           online invalid       c3t2d0s2         -
disk_10      auto:none      -            -           online invalid       c3t4d0s2         -
disk_11      auto:none      -            -           online invalid       c1t2d0s2         -

On node1, disk_3, disk_4 & disk_5 were used as coordinator disks:

node1 * / # vxdisk -f init disk_3
node1 * / # vxdisk -f init disk_4
node1 * / # vxdisk -f init disk_5
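(For reference, a typical sequence to place such disks into the coordinator disk group and point fencing at it would look roughly like the sketch below; the exact commands run here may have differed.)

# vxdg init vxfencoorddg vxfencoorddg01=disk_3 vxfencoorddg02=disk_4 vxfencoorddg03=disk_5
# vxdg deport vxfencoorddg             (the coordinator DG must stay deported on all nodes)
# echo "vxfencoorddg" > /etc/vxfendg   (the vxfen startup script builds /etc/vxfentab from this file)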

Verifying the coordinator disks from both nodes:

node1 * / # vxfenadm -i /dev/rdsk/c3t6d0s2
Vendor id       : SEAGATE
Product id      : ST330055FSUN300G
Revision        : 3192
Serial Number   : 2000001D3874D6D0

node2 * / # vxfenadm -i /dev/rdsk/c3t6d0s2
Vendor id       : SEAGATE
Product id      : ST330055FSUN300G
Revision        : 3192
Serial Number   : 2000001D3874D6D0

node1 * / # vxfenadm -i /dev/rdsk/c3t7d0s2
Vendor id       : SEAGATE
Product id      : ST330055FSUN300G
Revision        : 3093
Serial Number   : 2000001D38AC4B15

node2 * / # vxfenadm -i /dev/rdsk/c3t7d0s2
Vendor id       : SEAGATE
Product id      : ST330055FSUN300G
Revision        : 3093
Serial Number   : 2000001D38AC4B15

node1 * / # vxfenadm -i /dev/rdsk/c3t0d0s2
Vendor id       : SEAGATE
Product id      : ST330057FSUN300G
Revision        : 0605
Serial Number   : 2000001D38FCD64A

node2 * / # vxfenadm -i /dev/rdsk/c3t0d0s2
Vendor id       : SEAGATE
Product id      : ST330057FSUN300G
Revision        : 0605
Serial Number   : 2000001D38FCD64A


node1 * / # vxdctl -c mode
mode: enabled: cluster active - MASTER
master: node1

node1 * / # cfscluster status

  Node             :  node1
  Cluster Manager  :  running
  CVM state        :  running
  No mount point registered with cluster configuration


  Node             :  node2
  Cluster Manager  :  running
  CVM state        :  running
  No mount point registered with cluster configuration

node1 * / # hastatus -sum

-- SYSTEM STATE
-- System               State                Frozen

A  node1                RUNNING              0
A  node2                RUNNING              0

-- GROUP STATE
-- Group           System               Probed     AutoDisabled    State

B  cvm             node1                Y          N               ONLINE
B  cvm             node2                Y          N               ONLINE

node1 * / # vxdisk -e list
DEVICE       TYPE           DISK        GROUP        STATUS               OS_NATIVE_NAME   ATTR
disk_0       auto:none      -            -           online invalid       c1t3d0s2         -
disk_1       auto:none      -            -           online invalid       c1t2d0s2         -
disk_2       auto:none      -            -           online invalid       c3t6d0s2         -
disk_3       auto:cdsdisk   -            -           online               c3t7d0s2         -
disk_4       auto:cdsdisk   -            -           online               c3t0d0s2         -
disk_5       auto:cdsdisk   -            -           online               c3t5d0s2         -
disk_6       auto:cdsdisk   -            -           online               c3t3d0s2         -
disk_7       auto:none      -            -           online invalid       c3t1d0s2         -
disk_8       auto:none      -            -           online invalid       c3t2d0s2         -
disk_9       auto:none      -            -           online invalid       c3t4d0s2         -
disk_10      auto:none      -            -           online invalid       c1t1d0s2         -
disk_11      auto:ZFS       -            -           ZFS                  c1t0d0s2         -

node2 * / # vxdisk -e list
DEVICE       TYPE           DISK        GROUP        STATUS               OS_NATIVE_NAME   ATTR
disk_0       auto:none      -            -           online invalid       c1t3d0s2         -
disk_1       auto:none      -            -           online invalid       c1t1d0s2         -
disk_2       auto:ZFS       -            -           ZFS                  c1t0d0s2         -
disk_3       auto:none      -            -           online invalid       c3t6d0s2         -
disk_4       auto:cdsdisk   -            -           online               c3t7d0s2         -
disk_5       auto:cdsdisk   -            -           online               c3t0d0s2         -
disk_6       auto:cdsdisk   -            -           online               c3t5d0s2         -
disk_7       auto:cdsdisk   -            -           online               c3t3d0s2         -
disk_8       auto:none      -            -           online invalid       c3t1d0s2         -
disk_9       auto:none      -            -           online invalid       c3t2d0s2         -
disk_10      auto:none      -            -           online invalid       c3t4d0s2         -
disk_11      auto:none      -            -           online invalid       c1t2d0s2         -

node1 * / # gabconfig -a
GAB Port Memberships
===============================================================
Port a gen   30ae02 membership 01
Port b gen   30ae15 membership 01
Port d gen   30ae06 membership 01
Port f gen   30ae22 membership 01
Port h gen   30ae18 membership 01
Port o gen   30ae01 membership 01
Port u gen   30ae1f membership 01
Port v gen   30ae1b membership 01
Port w gen   30ae1d membership 01
Port y gen   30ae1a membership 01

node2 * / # gabconfig -a
GAB Port Memberships
===============================================================
Port a gen   30ae02 membership 01
Port b gen   30ae15 membership 01
Port d gen   30ae06 membership 01
Port f gen   30ae22 membership 01
Port h gen   30ae18 membership 01
Port o gen   30ae01 membership 01
Port u gen   30ae1f membership 01
Port v gen   30ae1b membership 01
Port w gen   30ae1d membership 01
Port y gen   30ae1a membership 01
 

Following is the error:

node1 * / # vxdg -s init oradg oradg01=disk_6

VxVM vxdg ERROR V-5-1-585 Disk group oradg: cannot create: Record not in disk group

 

Regards,

Bruce

brucemcgill_n1
Level 4

Hi,

Further to my previous post, the external storage is a StorageTek 6140 JBOD, and Fibre Channel connectivity exists from the Sun Fire V440 servers to the array.

I also see the following messages:

May 23 11:38:52 node1 gab: [ID 272960 kern.notice] GAB INFO V-15-1-20036 Port a[GAB_Control (refcount 1)] gen   30ae02 membership 01
May 23 11:38:52 node1 gab: [ID 828938 kern.notice] GAB INFO V-15-1-20037 Port a[GAB_Control (refcount 1)] gen   30ae02   jeopardy ;1
May 23 11:39:09 node1 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-5759 oradg: dg create with I/O fence enabled
May 23 11:39:09 node1 vxvm:vxconfigd: [ID 702911 daemon.notice]
May 23 11:39:09 node1 llt: [ID 602713 kern.notice] LLT INFO V-14-1-10023 lost 66 hb seq 39694 from 1 link 0 (ce1)
May 23 11:39:10 node1 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,700000/SUNW,qlc@2/fp@0,0/ssd@w2200001d38fcdbcf,0 (ssd10):
May 23 11:39:10 node1   reservation conflict
  jeopardy ;1
May 23 11:39:33 node1 Had[5772]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10111 System node2 (Node '1') is in Regular and Jeopardy Memberships - Membership: 0x3, Jeopardy: 0x2
May 23 11:39:33 node1 gab: [ID 272960 kern.notice] GAB INFO V-15-1-20036 Port y[GAB_LEGACY_CLIENT (refcount 0)] gen   30ae1a membership 01
May 23 11:39:33 node1 gab: [ID 828938 kern.notice] GAB INFO V-15-1-20037 Port y[GAB_LEGACY_CLIENT (refcount 0)] gen   30ae1a   jeopardy ;1
May 23 11:39:33 node1 gab: [ID 272960 kern.notice] GAB INFO V-15-1-20036 Port o[VCSMM (refcount 2)] gen   30ae01 membership 01
May 23 11:39:33 node1 gab: [ID 828938 kern.notice] GAB INFO V-15-1-20037 Port o[VCSMM (refcount 2)] gen   30ae01   jeopardy ;1
May 23 11:39:33 node1 gab: [ID 272960 kern.notice] GAB INFO V-15-1-20036 Port u[GAB_USER_CLIENT (refcount 0)] gen   30ae1f membership 01
May 23 11:39:33 node1 gab: [ID 828938 kern.notice] GAB INFO V-15-1-20037 Port u[GAB_USER_CLIENT (refcount 0)] gen   30ae1f   jeopardy ;1
May 23 11:39:56 node1 llt: [ID 602713 kern.notice] LLT INFO V-14-1-10023 lost 95 hb seq 40582 from 1 link 0 (ce1)
May 23 11:40:31 node1 llt: [ID 602713 kern.notice] LLT INFO V-14-1-10023 lost 700 hb seq 41283 from 1 link 0 (ce1)
May 23 11:40:36 node1 llt: [ID 602713 kern.notice] LLT INFO V-14-1-10023 lost 98 hb seq 41382 from 1 link 0 (ce1)
May 23 11:40:36 node1 llt: [ID 860062 kern.notice] LLT INFO V-14-1-10024 link 0 (ce1) node 1 active
May 23 11:40:38 node1 llt: [ID 140958 kern.notice] LLT INFO V-14-1-10205 link 0 (ce1) node 1 in trouble

Regards,

Bruce

brucemcgill_n1
Level 4

Hi,

main.cf file:

node1 * / # cat /etc/VRTSvcs/conf/config/main.cf
include "OracleASMTypes.cf"
include "types.cf"
include "CFSTypes.cf"
include "CRSResource.cf"
include "CVMTypes.cf"
include "Db2udbTypes.cf"
include "MultiPrivNIC.cf"
include "OracleTypes.cf"
include "PrivNIC.cf"
include "SybaseTypes.cf"

cluster vcs (
        UserNames = { admin = dlmElgLimHmmKumGlj }
        Administrators = { admin }
        UseFence = SCSI3
        HacliUserLevel = COMMANDROOT
        )

system node1 (
        )

system node2 (
        )

group cvm (
        SystemList = { node1 = 0, node2 = 1 }
        AutoFailOver = 0
        Parallel = 1
        AutoStartList = { node1, node2 }
        )

        CFSfsckd vxfsckd (
                )

        CVMCluster cvm_clus (
                CVMClustName = vcs
                CVMNodeId = { node1 = 0, node2 = 1 }
                CVMTransport = gab
                CVMTimeout = 200
                )

        CVMVxconfigd cvm_vxconfigd (
                Critical = 0
                CVMVxconfigdArgs = { syslog }
                )

        cvm_clus requires cvm_vxconfigd
        vxfsckd requires cvm_clus


        // resource dependency tree
        //
        //      group cvm
        //      {
        //      CFSfsckd vxfsckd
        //          {
        //          CVMCluster cvm_clus
        //              {
        //              CVMVxconfigd cvm_vxconfigd
        //              }
        //          }
        //      }
 

Regards,

Bruce

Gaurav_S
Moderator
VIP Certified

This is a bit strange... you are on the master node and still unable to create the disk group. Since CVM is online, I am assuming that all the licenses are intact.

By the way, what are the product version and OS version?

I can think of the following possibilities:

1. The coordinator disks have pre-existing keys. Try using "vxfenclearpre" to clear pre-existing keys.

2. Did you check with the "vxfentsthdw" utility and confirm that all the tests passed?

3. The coordinator DG should be deported on all the nodes. Not sure at this time whether you have any keys on the coordinator disks or not.

4. Did the disks presented here have a pre-existing VxVM configuration? Is this an upgraded setup?

Can you paste the following:

# cat /etc/vxfentab

# cat /etc/vxfenmode

# modinfo |egrep 'llt|gab|vxfen'

# /sbin/vxfenadm -g all -f /etc/vxfentab

Also paste the output of the vxfentsthdw utility.
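For example, to look for and clear stale registrations (a sketch; vxfenclearpre prompts before removing anything and should be run with the cluster stopped):

# vxfenadm -s all -f /etc/vxfentab     (list the keys currently registered on the coordinator disks)
# vxfenclearpre                        (clears pre-existing SCSI-3 keys from the disks)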

G

brucemcgill_n1
Level 4

Hi,

This is a fresh installation. The OS is Solaris 10 Update 10 SPARC 64-bit and the cluster software is 6.0. The vxfentsthdw utility was run and all the disks passed the test. The coordinator disk group is deported from both nodes and was not imported during shared disk group creation. The disks had an old VxVM configuration; after a fresh installation of the OS, the vxdiskunsetup -C command was run and the disks were re-initialized.

node1 * / # cat /etc/vxfentab
#
# /etc/vxfentab:
# DO NOT MODIFY this file as it is generated by the
# VXFEN rc script from the file /etc/vxfendg.
#
/dev/rdsk/c3t7d0s2
/dev/rdsk/c2t7d0s2
/dev/rdsk/c3t0d0s2
/dev/rdsk/c2t0d0s2
/dev/rdsk/c3t5d0s2
/dev/rdsk/c2t5d0s2


node1 * / # cat /etc/vxfenmode
#
# vxfen_mode determines in what mode VCS I/O Fencing should work.
#
# available options:
# scsi3      - use scsi3 persistent reservation disks
# customized - use script based customized fencing
# disabled   - run the driver but don't do any actual fencing
#
vxfen_mode=scsi3

#
# scsi3_disk_policy determines the way in which I/O Fencing communicates with
# the coordination disks.
#
# available options:
#       dmp - use dynamic multipathing
#       raw - connect to disks using the native interface
#
# Although  vxfen  supports  raw  for scsi3_disk_policy, it is now deprecated.
# It is recommended to use dmp as the scsi3_disk_policy for coordinator disks.

scsi3_disk_policy=raw
node1 * / # modinfo |egrep 'llt|gab|vxfen'
234 7af02000  31368 344   1  llt (LLT 6.0)
235 7af2a000  65e18 345   1  gab (GAB device 6.0)
236 7ae70000  79758 346   1  vxfen (VRTS Fence 6.0)


node1 * / # /sbin/vxfenadm -g all -f /etc/vxfentab
VXFEN vxfenadm WARNING V-11-2-2414 This option is deprecated and would be removed with the next release.
Please use the -s option.

Device Name: /dev/rdsk/c3t0d0s2
Total Number Of Keys: 4
key[0]:
        Key Value [Numeric Format]:  86,70,48,48,48,48,48,48
        Key Value [Character Format]: VF000000
key[1]:
        Key Value [Numeric Format]:  86,70,48,48,48,48,48,48
        Key Value [Character Format]: VF000000
key[2]:
        Key Value [Numeric Format]:  86,70,48,48,48,48,48,49
        Key Value [Character Format]: VF000001
key[3]:
        Key Value [Numeric Format]:  86,70,48,48,48,48,48,49
        Key Value [Character Format]: VF000001

Device Name: /dev/rdsk/c2t7d0s2
Total Number Of Keys: 4
key[0]:
        Key Value [Numeric Format]:  86,70,48,48,48,48,48,48
        Key Value [Character Format]: VF000000
key[1]:
        Key Value [Numeric Format]:  86,70,48,48,48,48,48,48
        Key Value [Character Format]: VF000000
key[2]:
        Key Value [Numeric Format]:  86,70,48,48,48,48,48,49
        Key Value [Character Format]: VF000001
key[3]:
        Key Value [Numeric Format]:  86,70,48,48,48,48,48,49
        Key Value [Character Format]: VF000001

Device Name: /dev/rdsk/c2t5d0s2
Total Number Of Keys: 4
key[0]:
        Key Value [Numeric Format]:  86,70,48,48,48,48,48,48
        Key Value [Character Format]: VF000000
key[1]:
        Key Value [Numeric Format]:  86,70,48,48,48,48,48,48
        Key Value [Character Format]: VF000000
key[2]:
        Key Value [Numeric Format]:  86,70,48,48,48,48,48,49
        Key Value [Character Format]: VF000001
key[3]:
        Key Value [Numeric Format]:  86,70,48,48,48,48,48,49
        Key Value [Character Format]: VF000001

Device Name: /dev/rdsk/c2t0d0s2
Total Number Of Keys: 4
key[0]:
        Key Value [Numeric Format]:  86,70,48,48,48,48,48,48
        Key Value [Character Format]: VF000000
key[1]:
        Key Value [Numeric Format]:  86,70,48,48,48,48,48,48
        Key Value [Character Format]: VF000000
key[2]:
        Key Value [Numeric Format]:  86,70,48,48,48,48,48,49
        Key Value [Character Format]: VF000001
key[3]:
        Key Value [Numeric Format]:  86,70,48,48,48,48,48,49
        Key Value [Character Format]: VF000001

Device Name: /dev/rdsk/c3t7d0s2
Total Number Of Keys: 4
key[0]:
        Key Value [Numeric Format]:  86,70,48,48,48,48,48,48
        Key Value [Character Format]: VF000000
key[1]:
        Key Value [Numeric Format]:  86,70,48,48,48,48,48,48
        Key Value [Character Format]: VF000000
key[2]:
        Key Value [Numeric Format]:  86,70,48,48,48,48,48,49
        Key Value [Character Format]: VF000001
key[3]:
        Key Value [Numeric Format]:  86,70,48,48,48,48,48,49
        Key Value [Character Format]: VF000001

Device Name: /dev/rdsk/c3t5d0s2
Total Number Of Keys: 4
key[0]:
        Key Value [Numeric Format]:  86,70,48,48,48,48,48,48
        Key Value [Character Format]: VF000000
key[1]:
        Key Value [Numeric Format]:  86,70,48,48,48,48,48,48
        Key Value [Character Format]: VF000000
key[2]:
        Key Value [Numeric Format]:  86,70,48,48,48,48,48,49
        Key Value [Character Format]: VF000001
key[3]:
        Key Value [Numeric Format]:  86,70,48,48,48,48,48,49
        Key Value [Character Format]: VF000001

Regards,

Bruce

Gaurav_S
Moderator
VIP Certified

OK, the keys look fine.

Can you paste:

# vxdisk -o alldgs list

# ps -ef  |grep -i vxconfigd

# vxdg list

# vxdg list oradg

# vxlicrep -e

Try initializing a disk group with another name, say testdg (the error is appearing for oradg). Also try creating a local disk group without the -s option and share the results.
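Something like this (a sketch; testdg is just a throwaway name):

# vxdg init testdg testdg01=disk_6     (local disk group, without -s)
# vxdg destroy testdg
# vxdg -s init testdg testdg01=disk_6  (shared disk group under a different name)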

G

mikebounds
Level 6
Partner Accredited

I would first check that you can create a private (non-shared) disk group:

vxdg init oradg oradg01=disk_6

If this works, then deport the disk group and try to import it on the second node to confirm both systems can access the disk.

If the above does not work, then the issue has nothing to do with CVM. But if it does work, then deport the disk group and try to import it (on the master node) as shared (effectively this converts a private disk group to a shared disk group):

vxdg -s import oradg
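Put together, the full test would look roughly like this (a sketch, using disk_6 as above):

node1 # vxdg init oradg oradg01=disk_6
node1 # vxdg deport oradg
node2 # vxdg import oradg              (confirms node2 can read the disk and its configuration)
node2 # vxdg deport oradg
node1 # vxdg -s import oradg           (on the CVM master; converts the private disk group to shared)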

Mike

brucemcgill_n1
Level 4

Hi,

I re-installed VCS. The steps were as follows:

node1 * / # vxdisk -f init disk_5
node1 * / # vxdg init oradg oradg01=disk_5
node1 * / # vxdg deport oradg

node2 * / # vxdg import oradg
node2 * / # vxdg deport oradg

node1 * / # vxdg -s import oradg
VxVM vxdg ERROR V-5-1-10978 Disk group oradg: import failed:
No valid disk found containing disk group

But ...


node1 * / # vxdisk -o alldgs list
DEVICE       TYPE            DISK         GROUP        STATUS
disk_0       auto:none       -            -            online invalid
disk_1       auto:none       -            -            online invalid
disk_2       auto:cdsdisk    -            (vxfencoorddg) online
disk_3       auto:cdsdisk    -            (vxfencoorddg) online
disk_4       auto:cdsdisk    -            (vxfencoorddg) online
disk_5       auto:cdsdisk    -            (oradg)      online shared
disk_6       auto:none       -            -            online invalid
disk_7       auto:none       -            -            online invalid
disk_8       auto:none       -            -            online invalid
disk_9       auto:none       -            -            online invalid
disk_10      auto:none       -            -            online invalid
disk_11      auto:ZFS        -            -            ZFS

node2 * / # vxdisk -o alldgs list
DEVICE       TYPE            DISK         GROUP        STATUS
disk_0       auto:none       -            -            online invalid
disk_1       auto:none       -            -            online invalid
disk_2       auto:ZFS        -            -            ZFS
disk_3       auto:cdsdisk    -            (vxfencoorddg) online
disk_4       auto:cdsdisk    -            (vxfencoorddg) online
disk_5       auto:cdsdisk    -            (vxfencoorddg) online
disk_6       auto:cdsdisk    -            (oradg)      online shared
disk_7       auto:none       -            -            online invalid
disk_8       auto:none       -            -            online invalid
disk_9       auto:none       -            -            online invalid
disk_10      auto:none       -            -            online invalid
disk_11      auto:none       -            -            online invalid

node1 * / # vxdg list
NAME         STATE           ID

node2 * / # vxdg list
NAME         STATE           ID

Regards,

Bruce

brucemcgill_n1
Level 4

Hi,

Further to my previous comment, the shared disk group is not listed by the vxdg command, and since the disk group is not listed, I won't be able to destroy it:

node1 * / # vxdisk -o alldgs list
DEVICE       TYPE            DISK         GROUP        STATUS
disk_0       auto:none       -            -            online invalid
disk_1       auto:none       -            -            online invalid
disk_2       auto:cdsdisk    -            (vxfencoorddg) online
disk_3       auto:cdsdisk    -            (vxfencoorddg) online
disk_4       auto:cdsdisk    -            (vxfencoorddg) online
disk_5       auto:cdsdisk    -            (oradg)      online shared
disk_6       auto:none       -            -            online invalid
disk_7       auto:none       -            -            online invalid
disk_8       auto:none       -            -            online invalid
disk_9       auto:none       -            -            online invalid
disk_10      auto:none       -            -            online invalid
disk_11      auto:ZFS        -            -            ZFS

node1 * / # vxdg -s import oradg
VxVM vxdg ERROR V-5-1-10978 Disk group oradg: import failed:
Cannot find disk on slave node

node2 * / # vxdg -s import oradg
VxVM vxdg ERROR V-5-1-10978 Disk group oradg: import failed:
Cannot find disk on slave node

node1 * / # vxdg list
NAME         STATE           ID

node2 * / # vxdg list
NAME         STATE           ID

Regards,

Bruce
 

brucemcgill_n1
Level 4

Hi,

After a reboot: 

node1 * / # vxdg import oradg
VxVM vxdg ERROR V-5-1-10978 Disk group oradg: import failed:
Disk group exists and is imported

node1 * / # vxdg list
NAME         STATE           ID
oradg        enabled,shared,cds   1337776919.42.node1
 

node1 * / # cfsdgadm add oradg all=sw
  Disk Group is being added to cluster configuration...

node1 * / # cat /etc/VRTSvcs/conf/config/main.cf
include "OracleASMTypes.cf"
include "types.cf"
include "CFSTypes.cf"
include "CRSResource.cf"
include "CVMTypes.cf"
include "Db2udbTypes.cf"
include "MultiPrivNIC.cf"
include "OracleTypes.cf"
include "PrivNIC.cf"
include "SybaseTypes.cf"

cluster vcs (
        UserNames = { admin = hlmElgLimHmmKumGlj }
        Administrators = { admin }
        UseFence = SCSI3
        HacliUserLevel = COMMANDROOT
        )

system node1 (
        )

system node2 (
        )

group cvm (
        SystemList = { node1 = 0, node2 = 1 }
        AutoFailOver = 0
        Parallel = 1
        AutoStartList = { node1, node2 }
        )

        CFSfsckd vxfsckd (
                ActivationMode @node1 = { oradg = sw }
                ActivationMode @node2 = { oradg = sw }
                )

        CVMCluster cvm_clus (
                CVMClustName = vcs
                CVMNodeId = { node1 = 0, node2 = 1 }
                CVMTransport = gab
                CVMTimeout = 200
                )

        CVMVxconfigd cvm_vxconfigd (
                Critical = 0
                CVMVxconfigdArgs = { syslog }
                )

        cvm_clus requires cvm_vxconfigd
        vxfsckd requires cvm_clus


        // resource dependency tree
        //
        //      group cvm
        //      {
        //      CFSfsckd vxfsckd
        //          {
        //          CVMCluster cvm_clus
        //              {
        //              CVMVxconfigd cvm_vxconfigd
        //              }
        //          }
        //      }
 

However, the CVM service group has faulted on "node2":

node1 * / # hastatus -sum

-- SYSTEM STATE
-- System               State                Frozen

A  node1                RUNNING              0
A  node2                RUNNING              0

-- GROUP STATE
-- Group           System               Probed     AutoDisabled    State

B  cvm             node1                Y          N               ONLINE
B  cvm             node2                Y          N               OFFLINE|FAULTED

-- RESOURCES FAILED
-- Group           Type                 Resource             System

D  cvm             CVMCluster           cvm_clus             node2
 

Regards,

Bruce

 

 

mikebounds
Level 6
Partner Accredited

This proves the storage can be accessed by both nodes, so the problem must be with communication between the 2 nodes (LLT, GAB, CVM) or with fencing.

Interestingly, the second import gives more information about the problem:

Cannot find disk on slave node

So when you initially ran "vxdg -s import" the "shared" flag was successfully set, but the import failed and the error was:

No valid disk found containing disk group

which I guess basically meant "can't find disk on slave node", but we know the 2nd node can access this disk, as you could import the disk group there when it was a private disk group.

You could try disabling fencing to see if fencing is the issue: change vxfen_mode to disabled in /etc/vxfenmode and remove the line "UseFence = SCSI3" from main.cf, then restart VCS and fencing.  This used to work in earlier versions of CVM, when fencing was not mandatory; it may still work now that fencing is mandatory, but it wouldn't be supported (this is just a test to see if fencing is the issue).
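As a sketch (assuming the standard vxfen init script is in use; the restart mechanism may differ if the service is SMF-managed on your systems):

# hastop -all -force                   (stop VCS on the cluster but leave applications running)
# /etc/init.d/vxfen stop               (on each node)
  (set vxfen_mode=disabled in /etc/vxfenmode and remove the "UseFence = SCSI3" line from main.cf)
# /etc/init.d/vxfen start              (on each node)
# hastart                              (on each node)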

Mike

mikebounds
Level 6
Partner Accredited

Sorry, I missed the statement in your initial post ("Without I/O fencing, I am able to create the shared disk group"), so it is definitely a fencing problem!

Mike

mikebounds
Level 6
Partner Accredited

Are you using DMP? If so, you should set scsi3_disk_policy=dmp in the /etc/vxfenmode file.
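That is, in /etc/vxfenmode on both nodes (a one-line change, followed by a restart of fencing):

scsi3_disk_policy=dmp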

Mike

brucemcgill_n1
Level 4

You are right about the fencing issue. I am not using DMP but raw devices when I run the vxfentsthdw utility. If I select a DMP path such as /dev/vx/rdmp/disk_3s2, I get a reservation conflict error. However, when I use a raw device path such as /dev/rdsk/c3t6d0s2, the I/O fencing test is successful, but I am not able to create the shared disk group afterwards.

mikebounds
Level 6
Partner Accredited

Bruce,

I think the root of your problem is that reservations are not working on the DMP path.  For I/O fencing arbitration you have the choice of using raw paths or DMP.  You really want to use DMP (which wasn't supported until 5.0MP3): if you use a raw path and lose that path, you lose access to the disk, whereas with a DMP path you can fail over to the alternate path (the DMP path is virtual and represents 2 physical paths).  You can get around this conflict by using raw for I/O fencing arbitration, but reservations on data disks always use the DMP paths (there is no option to use raw paths for them), and so you are seeing reservation conflicts in the log:

May 23 11:39:10 node1 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,700000/SUNW,qlc@2/fp@0,0/ssd@w2200001d38fcdbcf,0 (ssd10):
May 23 11:39:10 node1   reservation conflict
  jeopardy ;1

So you need to resolve being able to use the DMP path. I would start by setting the vxfenmode disk policy to dmp and then restarting fencing, which should auto-generate the DMP paths in /etc/vxfentab.  You should also make sure the 6140 array and DMP are set to use Active-Passive, as this is the only mode supported in the HCL (https://sort.symantec.com/documents/doc_details/sfha/6.0/Solaris/CompatibilityLists/).  If it is still not working after this, I would log a call with Symantec support and use this discussion as a reference.
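Roughly (a sketch; adjust if vxfen is managed as an SMF service on your systems):

# /etc/init.d/vxfen stop               (on each node, after stopping VCS)
# /etc/init.d/vxfen start              (regenerates /etc/vxfentab, now with DMP paths)
# cat /etc/vxfentab                    (should list /dev/vx/rdmp/... devices)
# vxdmpadm getsubpaths dmpnodename=disk_3      (confirm both paths to each coordinator disk are enabled)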

Mike

jstucki
Level 4

Bruce,

A couple more thoughts:

Did you install the RP1 (Rollup Patch) patch set on top of SFRAC 6.0?  It's always a good idea to install the latest RP patch set, as it includes fixes for bugs encountered to date.   https://sort.symantec.com/patch/detail/6119

Do you have MPxIO enabled?  Check this Technote about MPxIO and Fencing/SFRAC:  http://www.symantec.com/business/support/index?page=content&id=TECH51507

When you run your "vxdisk list" commands, all your LUN devices are showing up with names like "disk_4".  Disks from an external array would typically show up with a name like "USP6_0217", which represents a different enclosure.  The names beginning with "disk_" are typically the onboard devices.  This leads me to believe that DMP isn't recognizing your external array.

It appears that the latest ASL/APM package for SFRAC 6.0 is version 6.0.000.200 2012-04-30: https://sort.symantec.com/asl/details/557

For the STK 6140 array, your ASL library module should be libvxlsiall/2.0/SUN, and your APM library module should be dmpEngenio/1.0.   I think you can check them with these commands:  # vxdmpadm list dmpnode dmpnodename=disk_3       # vxdmpadm listapm all  (to see if the correct APM is "active")