01-13-2015 08:19 AM
I have built a three-node cluster with VCS 6.0 on SLES 11 SP1. Here is the configuration:
main.cf:
include "OracleASMTypes.cf"
include "types.cf"
include "Db2udbTypes.cf"
include "OracleTypes.cf"
include "SybaseTypes.cf"
cluster vcscluster (
ClusterAddress = "192.168.4.10"
SecureClus = 1
UseFence = SCSI3
)
system vcs1 (
)
system vcs2 (
)
system vcs3 (
)
group ClusterService (
SystemList = { vcs1 = 0, vcs2 = 1, vcs3 = 2 }
AutoStartList = { vcs1, vcs2, vcs3 }
OnlineRetryLimit = 3
OnlineRetryInterval = 120
)
IP webip (
Device = eth0
Address = "192.168.4.10"
NetMask = "255.255.255.0"
)
NIC csgnic (
Device = eth0
)
webip requires csgnic
// resource dependency tree
//
// group ClusterService
// {
// IP webip
// {
// NIC csgnic
// }
// }
group apache (
SystemList = { vcs1 = 0, vcs2 = 1, vcs3 = 2 }
AutoStartList = { vcs1 }
)
DiskGroup share_dg (
DiskGroup = share_dg
)
Mount apache_fs (
MountPoint = "/srv/www/htdocs"
BlockDevice = "/dev/vx/dsk/share_dg/apache"
FSType = vxfs
FsckOpt = "-y"
)
apache_fs requires share_dg
// resource dependency tree
//
// group apache
// {
// Mount apache_fs
// {
// DiskGroup share_dg
// }
// }
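The file passes a syntax check (the path below assumes the default VCS configuration directory):
# hacf -verify /etc/VRTSvcs/conf/config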
# lltstat -l
LLT link information:

link 0 eth1 on ether hipri
        mtu 1500, sap 0xcafe, broadcast FF:FF:FF:FF:FF:FF, addrlen 6
        txpkts 129728 txbytes 14155153
        rxpkts 119866 rxbytes 7909769
        latehb 0 badcksum 0 errors 0
link 1 eth2 on ether lowpri
        mtu 1500, sap 0xcafe, broadcast FF:FF:FF:FF:FF:FF, addrlen 6
        txpkts 49369 txbytes 2400217
        rxpkts 50476 rxbytes 2480391
        latehb 0 badcksum 0 errors 0
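Both links show clean counters. The per-node link state can also be cross-checked from any node with the verbose form (output omitted here for brevity):
# lltstat -nvv | more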
# vxfenconfig -l
I/O Fencing Configuration Information:
======================================
Single Disk Flag      : 0
Count                 : 3
Disk List
Disk Name               Major   Minor   Serial Number   Policy
/dev/vx/rdmp/disk_2s3   201     67      7ae525da        dmp
/dev/vx/rdmp/disk_1s3   201     51      27cddc71        dmp
/dev/vx/rdmp/disk_0s3   201     35      132a74e8        dmp
# vxfenadm -d
I/O Fencing Cluster Information:
================================
Fencing Protocol Version: 201
Fencing Mode: SCSI3
Fencing SCSI3 Disk Policy: dmp
Cluster Members:
        * 0 (vcs1)
          1 (vcs2)
          2 (vcs3)
RFSM State Information:
        node 0 in state 8 (running)
        node 1 in state 8 (running)
        node 2 in state 8 (running)
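To confirm that each node still holds its SCSI-3 registration on the coordinator disks, the keys can be listed with vxfenadm, reading the disk list from /etc/vxfentab (which fencing populates at startup):
# vxfenadm -s all -f /etc/vxfentab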
# hastatus -summary

-- SYSTEM STATE
-- System               State                Frozen

A  vcs1                 RUNNING              0
A  vcs2                 RUNNING              0
A  vcs3                 RUNNING              0

-- GROUP STATE
-- Group           System               Probed     AutoDisabled    State

B  ClusterService  vcs1                 Y          N               ONLINE
B  ClusterService  vcs2                 Y          N               OFFLINE
B  ClusterService  vcs3                 Y          N               OFFLINE
B  apache          vcs1                 Y          N               OFFLINE
B  apache          vcs2                 Y          N               OFFLINE
B  apache          vcs3                 Y          N               ONLINE
# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda1 7218432 4356176 2495576 64% /
devtmpfs 995788 212 995576 1% /dev
tmpfs 995788 0 995788 0% /dev/shm
tmpfs 4 0 4 0% /dev/vx
/dev/vx/dsk/share_dg/apache
512000 3285 476928 1% /srv/www/htdocs
When I disconnect the network links eth1 and eth2 on vcs3, the apache group is failed over and brought online on vcs1. But when I check vcs3, the mount point still exists, and after several minutes a kernel panic occurs on vcs3.
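For reference, I take the links down on vcs3 like this (assuming that downing the interfaces is equivalent to physically unplugging both heartbeat cables):
# ip link set dev eth1 down
# ip link set dev eth2 down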
I think it is very dangerous that the same volume is mounted on two nodes of a split-brain cluster. How can I prevent this from happening?