04-10-2012 05:39 AM
As a rule, we have heard that during split brain (suppose a two-node cluster) both nodes mount the "shared disk resource" and write to it simultaneously, which causes data corruption. For reference, see below:
The VCS engine on a service group that will prevent that group from becoming on-line. Auto-disable is used as a way to prevent data corruption from occurring by avoiding a situation called split-brain. Split-brain is where data is being updated by two or more hosts simultaneously.
http://www.symantec.com/business/support/index?page=content&id=TECH8436
Question: I have never been able to forcibly mount the DiskGroup on both nodes. So how can VCS mount the Disk resource on both nodes (which leads to data corruption)? For example, I tried to import the DiskGroup on the partner/idle node of the cluster, on which the ServiceGroup is not online.
04-10-2012 06:35 AM
Split brain occurs when all cluster communications links between cluster nodes are lost. In such a case, each node thinks the other is dead and the idle node attempts to bring the service group online while the active node already has it online.
When the cluster is running normally and at least one link is active, the cluster won't allow you to use cluster services to start the service group on the idle node, for example by issuing hagrp -online. You can attempt to forcibly bring resources online via their native commands (vxdg import, mount, etc.), but VCS will detect the concurrency violation and bring the resources offline on the idle node.
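To make the difference between the two cases concrete, here is a toy Python model of the behaviour described above. It is only a sketch and not how VCS is implemented: the `Node` class, the `try_online` method, and the `links_up` counter are hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical cluster node; tracks which service groups it has online."""
    name: str
    online_groups: set = field(default_factory=set)

    def peer_seems_alive(self, links_up: int) -> bool:
        # A node can only see its peer while at least one heartbeat link works.
        return links_up > 0

    def try_online(self, group: str, peer: "Node", links_up: int) -> bool:
        # With a working link, the node knows the peer already has the group
        # online and refuses to start a second copy.
        if self.peer_seems_alive(links_up) and group in peer.online_groups:
            return False
        # With no links, the peer looks dead, so the group is started here too.
        self.online_groups.add(group)
        return True


active = Node("node1")
idle = Node("node2")
active.try_online("SG1", idle, links_up=2)

# Normal operation: one link is still up, so the idle node is refused.
refused = idle.try_online("SG1", active, links_up=1)

# Split brain: all links are lost, so the idle node brings the group online
# while the active node still has it online.
started = idle.try_online("SG1", active, links_up=0)
```

In this model the group ends up online on both nodes only after every link is lost, which is why a manual attempt on a healthy cluster does not reproduce the problem.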
04-10-2012 06:29 AM
Hi Zahid,
The SCSI reservations put on the disks in the disk group should prevent the disk group from being imported on more than one server at a time. However, if there is a problem with the SCSI reservation process (typically hardware or driver related), the disk group might be accessible by more than one node at a time.
Keep in mind that you are pointing to an SF-HA Unix article. I know you have SFW-HA Windows clusters and I'm not sure if you have SF-HA Unix clusters. SF-HA Unix and SFW-HA clusters perform disk operations in very different ways. The disk access concerns described for the SF-HA Unix product are not going to translate into concerns on the SFW-HA Windows product.
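As a rough illustration of the reservation idea, here is a toy Python model. It does not issue real SCSI commands and is not how SFW-HA or SF-HA actually work; the `SharedDisk` class, `import_on` method, and `reservations_work` flag are hypothetical names used only for this sketch.

```python
class SharedDisk:
    """Hypothetical shared disk guarded by a single-holder reservation."""

    def __init__(self, reservations_work: bool = True):
        self.reservations_work = reservations_work
        self.holder = None       # node currently holding the reservation
        self.importers = set()   # nodes that have the disk group imported

    def import_on(self, node: str) -> bool:
        if self.reservations_work:
            # A second node is rejected while another node holds the reservation.
            if self.holder not in (None, node):
                return False
            self.holder = node
        # If reservations are broken (assumed hardware or driver fault),
        # nothing stops a second import.
        self.importers.add(node)
        return True


healthy = SharedDisk()
first_ok = healthy.import_on("node1")
second_ok = healthy.import_on("node2")

broken = SharedDisk(reservations_work=False)
broken.import_on("node1")
broken.import_on("node2")
```

With working reservations the second import fails, which matches what you see in your tests. Only the broken case ends with both nodes holding the disk group.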
Thank you,
Wally
04-10-2012 10:24 PM
Thanks, all, for the kind words.
@ Wally
I know that the article is Unix-related. I just wanted to point out (and I have also read this in multiple places) that split brain can cause data corruption. I tried to mount the Disk resource on multiple nodes but it always failed, so I am confused about how split brain can cause data loss.
Is there any way to run a test in which we can mount the Disk resource on both cluster nodes?
04-11-2012 06:31 AM
Hi Zahid,
If everything is working correctly you will not be able to import the cluster disk group on more than one server at a time. However, when you are in a split brain situation, things are not working correctly.
Why do you want to test importing a disk group on more than one node and put your data at high risk of corruption? I don't have a test process for this, and I would advise against putting your data at risk of corruption.
Thanks,
Wally
04-11-2012 11:33 PM
Wally, of course I would never want to put my data at risk. I want to test this situation in a test environment because I felt that this (mounting the Disk resource on a second node) is not going to happen when one node has locked the Disk resource.
Anyway thanks for your kind words Wally :)
07-17-2012 08:02 PM
Are there any other ways to prevent split brain? I have been wondering why split brain occurs in the first place. We need to find a solution to the problem of data being updated twice.
07-17-2012 10:23 PM
See the link below; it may help you:
https://sort.symantec.com/public/documents/sf/5.0/solaris/html/sf_rac_install/sfrac_intro13.html
09-17-2012 12:23 AM
Thanks Wally
The SCSI reservations put on the disk in the disk group should prevent the disk group from being imported on more than one server at a time.
Is there any way to keep the disk safe if the client only has HA/VCS and doesn't have VxVM?