split brain

tanislavm
Level 6

Hi,

I would like your comments on this matter.

I have a two-node VCS cluster, and both private network paths suddenly failed, so I am in a split-brain scenario. One service (application) runs on node A and another application runs on node B.

 

What steps should I perform in this scenario so that all users can keep running the applications safely?

 

Would the steps be something like the following?

- Kill all user sessions on node A and node B.

- Shut down VCS on every node and leave the applications running (on node A and on node B).

- As a workaround, edit llttab so that VCS uses a link-lowpri NIC or a heartbeat disk, then start VCS on both nodes?

 

 

 

 

thanks.

 

1 ACCEPTED SOLUTION

Accepted Solutions

Gaurav_S
Moderator
   VIP    Certified

Hello,

Ideally, a low-priority link gives additional protection against split brain; however, once split brain has already happened, things are different.

1. If you have not implemented I/O fencing, there is a high chance of data corruption: node A will think that node B has gone, node B will think that node A has gone, and so both nodes will try to take full ownership.

The recommendation in this situation is to immediately and safely shut down one node and keep working on the other node with all the groups imported.

The ideal recommendation is to always use I/O fencing to protect against data corruption in split-brain situations.
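
To illustrate why fencing matters, here is a toy Python model. It is not VCS code: the function name `owners_after_partition` and its inputs are made up for illustration. Without fencing, every node that loses all heartbeats assumes its peer is gone and claims ownership. With fencing, the nodes race for an odd number of coordination points and only the node that wins a majority carries on. Real I/O fencing relies on SCSI-3 persistent reservations on coordinator disks, which this sketch does not try to reproduce.

```python
from collections import Counter


def owners_after_partition(nodes, race_winners=None):
    """Return the nodes that still claim the groups after all heartbeats are lost.

    nodes        -- node names, e.g. ["A", "B"]
    race_winners -- hypothetical fencing race result: one winning node name
                    per coordination point, or None if fencing is not
                    configured.
    """
    if race_winners is None:
        # No fencing: each node believes the other is dead, so all of them
        # claim ownership. This is the data-corruption case.
        return sorted(nodes)

    # Fencing: a node survives only if it won a majority of the
    # coordination points; in this model the loser drops out.
    wins = Counter(race_winners)
    majority = len(race_winners) // 2 + 1
    return sorted(n for n in nodes if wins[n] >= majority)


if __name__ == "__main__":
    print(owners_after_partition(["A", "B"]))                   # ['A', 'B']
    print(owners_after_partition(["A", "B"], ["A", "A", "B"]))  # ['A']
```

The first call shows the dangerous outcome (two owners); the second shows a single survivor once a fencing race decides the winner.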

You can also tune GAB parameters, but again, that is only a workaround.

 

G


3 REPLIES 3


Marianne
Moderator
Partner    VIP    Accredited Certified

The most important check of the heartbeats/interlinks is to ensure that completely different hardware/NICs/switches/network paths are used for the heartbeats.
If ANY individual component in the network path fails, it should not lead to loss of both heartbeats.
This kind of test is normally done on a new cluster before it goes into production.

Any kind of common infrastructure will be a SPOF (single point of failure). This is a bigger risk to your data than not having a cluster.
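
As a rough paper check before the physical pull-the-cable test, the heartbeat paths can be written down and compared. The Python sketch below assumes you can list every component (NIC, cable, switch) on each heartbeat path; the function name and the component labels are hypothetical, chosen only for this example.

```python
def single_points_of_failure(links):
    """Return the components that sit on EVERY heartbeat path.

    links -- dict mapping a heartbeat link name to the list of components
             (NICs, switches, ...) on its path. The labels are free-form
             strings chosen by whoever documents the cluster.
    A component returned here would take down all heartbeats if it failed.
    """
    if not links:
        return []
    paths = [set(components) for components in links.values()]
    return sorted(set.intersection(*paths))


if __name__ == "__main__":
    # Both heartbeats pass through the same switch: a SPOF.
    bad = {
        "hb1": ["nodeA:nic1", "switch1", "nodeB:nic1"],
        "hb2": ["nodeA:nic2", "switch1", "nodeB:nic2"],
    }
    print(single_points_of_failure(bad))   # ['switch1']

    # Fully separate NICs and switches: nothing in common.
    good = {
        "hb1": ["nodeA:nic1", "switch1", "nodeB:nic1"],
        "hb2": ["nodeA:nic2", "switch2", "nodeB:nic2"],
    }
    print(single_points_of_failure(good))  # []
```

An empty result only means the documented paths share nothing; it does not replace the real failure test on the cluster.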

I have seen 2 split brain scenarios at 2 different sites over the years. Not pretty....

After recovering data from tape, both customers implemented I/O fencing.

tanislavm
Level 6

Hi Marianne,

Your reply is very useful. Thank you.