
CVM won't start on remote node with an FSS diskgroup

mikebounds
Level 6
Partner Accredited

I am testing FSS (Flexible Shared Storage) on SF 6.1 on RHEL 5.5 in VirtualBox VMs, and when I try to start CVM on the remote node I get:

VCS ERROR V-16-20006-1005 (r55v61b) CVMCluster:cvm_clus:monitor:node - state: out of cluster reason: Disk for disk group not found: retry to add a node failed

Here is my setup:

Node A is the master, with a local disk (sdd) and a remote disk (B_sdd):

[root@r55v61a ~]# vxdctl -c mode
mode: enabled: cluster active - MASTER
master: r55v61a
[root@r55v61a ~]# vxdisk list
DEVICE       TYPE            DISK         GROUP        STATUS
B_sdd        auto:cdsdisk    -            -            online remote
sda          auto:none       -            -            online invalid
sdb          auto:none       -            -            online invalid
sdc          auto:cdsdisk    -            -            online
sdd          auto:cdsdisk    -            -            online exported

 

Node B is the slave, and sees a local disk (sdd) and a remote disk (A_sdd):

[root@r55v61b ~]# vxdisk list
DEVICE       TYPE            DISK         GROUP        STATUS
A_sdd        auto:cdsdisk    -            -            online remote
sda          auto:none       -            -            online invalid
sdb          auto:none       -            -            online invalid
sdc          auto:cdsdisk    -            -            online
sdd          auto:cdsdisk    -            -            online exported
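
(For reference, the local disks show as "online exported" because they were exported for FSS use beforehand; assuming the standard vxdisk export command was used, that step was roughly:)

[root@r55v61a ~]# vxdisk export sdd
[root@r55v61b ~]# vxdisk export sdd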

 

On node A, I create an FSS diskgroup, so on node A the disk in the group is local:

[root@r55v61a ~]# vxdg -s -o fss=on init fss-dg fd1_La=sdd
[root@r55v61a ~]# vxdisk list
DEVICE       TYPE            DISK         GROUP        STATUS
B_sdd        auto:cdsdisk    -            -            online remote
sda          auto:none       -            -            online invalid
sdb          auto:none       -            -            online invalid
sdc          auto:cdsdisk    -            -            online
sdd          auto:cdsdisk    fd1_La       fss-dg       online exported shared

 

And on node B the disk in fss-dg is remote:

[root@r55v61b ~]# vxdisk list
DEVICE       TYPE            DISK         GROUP        STATUS
A_sdd        auto:cdsdisk    fd1_La       fss-dg       online shared remote
sda          auto:none       -            -            online invalid
sdb          auto:none       -            -            online invalid
sdc          auto:cdsdisk    -            -            online
sdd          auto:cdsdisk    -            -            online exported
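
(The diskgroup itself can also be checked from the master - vxdg list fss-dg should show a flags line that includes "shared" for a CVM/FSS group:)

[root@r55v61a ~]# vxdg list fss-dg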

 

I then stop and start VCS on node B, which is when I see the issue:

2014/05/13 12:05:23 VCS INFO V-16-2-13716 (r55v61b) Resource(cvm_clus): Output of the completed operation (online) 
==============================================
ERROR: 
==============================================

2014/05/13 12:05:24 VCS ERROR V-16-20006-1005 (r55v61b) CVMCluster:cvm_clus:monitor:node - state: out of cluster
reason: Disk for disk group not found: retry to add a node failed
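
(For anyone debugging the same thing: the "state ... reason ..." text in the agent message appears to be what vxclustadm reports, so node B's CVM membership can also be checked directly with, for example:)

[root@r55v61b ~]# vxclustadm -v nodestate
[root@r55v61b ~]# vxclustadm nidmap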

 

If I destroy the fss-dg diskgroup on node A, then CVM will start on node B, so the issue is with the FSS diskgroup: it seems CVM cannot find the remote disk in the diskgroup.
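
(The destroy is just the standard command on the master:)

[root@r55v61a ~]# vxdg destroy fss-dg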

I can also get round the issue by stopping VCS on node A, after which CVM will start on node B:

[root@r55v61b ~]# hagrp -online cvm -sys r55v61b
[root@r55v61b ~]# vxdisk -o alldgs list
DEVICE       TYPE            DISK         GROUP        STATUS
sda          auto:none       -            -            online invalid
sdb          auto:none       -            -            online invalid
sdc          auto:cdsdisk    -            -            online
sdd          auto:cdsdisk    -            -            online exported

 

 

If I then start VCS on node A, node B is able to see the FSS diskgroup again:

[root@r55v61b ~]# vxdisk list
DEVICE       TYPE            DISK         GROUP        STATUS
A_sdd        auto:cdsdisk    fd1_La       fss-dg       online shared remote
sda          auto:none       -            -            online invalid
sdb          auto:none       -            -            online invalid
sdc          auto:cdsdisk    -            -            online
sdd          auto:cdsdisk    -            -            online exported
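
(So the working order is: stop VCS on node A, online the cvm group on node B, then restart VCS on node A. Assuming plain hastop/hastart are used to stop and start VCS, that is:)

[root@r55v61a ~]# hastop -local
[root@r55v61b ~]# hagrp -online cvm -sys r55v61b
[root@r55v61a ~]# hastart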

 

 

When the disks are just exported, I can stop and start VCS on each node and each node can still see the disk from the other node, but once I create the FSS diskgroup, CVM won't start on the system that has the remote disk. Does anybody have any ideas as to why?

 

Mike


ccarrero
Level 4
Employee

That is right, Mike. You can now leverage FSS to create a mirror between different hosts, provided the bandwidth and latency are acceptable for your workload.
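
(As an illustration only: with one exported disk from each host in an FSS diskgroup, a cross-host mirror can be created with the usual vxassist syntax - and with just one disk per host in the group, each plex necessarily lands on a different host:)

[root@r55v61a ~]# vxassist -g fss-dg make vol1 1g layout=mirror nmirror=2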

To clarify the disks issue: we have not done any work with VirtualBox, but we do have an ASL to recognize VMDK disks when using VMware. Therefore, FSS can be used with VMware inside the VMs.

mikebounds
Level 6
Partner Accredited

The issue in this post, which was that CVM would not start if an FSS diskgroup was present, giving the error message:

CVMCluster:cvm_clus:monitor:node - state: out of cluster reason: Disk for disk group not found: retry to add a node failed

was resolved by recreating a separate diskgroup that was purely CVM (no exported disks). The likely issue was UDID mismatches or conflicts: with non-FSS failover and CVM diskgroups, it appears all that is required is for VxVM to read the private region, but with FSS diskgroups my theory is that the UDID is needed so that an exported disk only shows as a remote disk on other systems if that same disk can NOT be seen on the SAN of the remote system, and VxVM needs the UDID to determine this.
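
(For reference, the replacement was a plain shared diskgroup - i.e. created on the master without "-o fss=on" and with no exported disks. cvm-dg and cd1 below are made-up names, and the disk would need to be one both nodes can see on shared storage:)

[root@r55v61a ~]# vxdg -s init cvm-dg cd1=<shared_disk>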

Hence in VirtualBox, the same disk will normally show as having a different UDID when viewed from different systems, and if this disk is shared I did indeed see a single disk presented on one server BOTH via the SAN and as a remote disk. But when I made the UDIDs the same, by changing the hostname of one of the nodes so that both nodes had the same hostname and hence the same constructed UDID, VxVM correctly identified that the remote disk was available via the SAN, and hence ONLY showed the disk as a SAN-attached disk and not also as a remote disk.
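
(The UDID each node constructs for a disk can be compared from the udid field in the detailed vxdisk output, e.g.:)

[root@r55v61a ~]# vxdisk list sdd | grep udid
[root@r55v61b ~]# vxdisk list sdd | grep udid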

Although in my opening post I was not exporting any shared SAN disks (only local disks), I believe the UDID checking when autoimporting the diskgroups caused the issue.

Mike