jeopardy

lelu
Level 3

Hi,

In a jeopardy situation, will VCS automatically mark the node as a jeopardy member?

In a 2-node cluster, what must I do in a jeopardy case to resolve the issue? Do I have to shut down the node with the higher LLT number and then fix the broken private connection?

Thanks

5 REPLIES

CliffordB
Level 4
Employee
Good day.

For details, check your Admin Guide. There is a section on various jeopardy scenarios.

In general, one must fix the broken communications link. Once fixed, the cluster will return to normal on its own.

Cheers



_______________________________
My customers spend the weekends at home with family, not in the datacenter.

Hi,

https://www.thegeekdiary.com/vcs-cluster-101-communication-faults-jeopardy-split-brain-io-fencing/

When I reconnect the cable, a node panics.

Should I first shut down the node with the higher LLT number? What else do I need to do after I reboot it?

CliffordB
Level 4
Employee
Hi.

First, determine if you still have cluster communications. The output of “gabconfig -a” provides the answer.
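
As a rough illustration of reading that output, here is a minimal Python sketch that scans "gabconfig -a" text for jeopardy lines. The sample text only follows the general shape of the output shown in the VCS documentation; the generation numbers and node fields are made up, and classify_gab is a hypothetical helper, not part of VCS.

```python
import re

# Illustrative sample only: shaped like "gabconfig -a" output, values invented.
SAMPLE = """\
GAB Port Memberships
===============================================================
Port a gen a36e0003 membership 01
Port a gen a36e0003 jeopardy ;1
Port h gen fd570002 membership 01
Port h gen fd570002 jeopardy ;1
"""

# Matches lines such as "Port a gen a36e0003 membership 01".
PORT_LINE = re.compile(r"^Port\s+(\S+)\s+gen\s+(\S+)\s+(membership|jeopardy)\s*(.*)$")


def classify_gab(text):
    """Return 'no_gab', 'jeopardy', or 'normal' for the given output text.

    'normal' only means that no jeopardy line was found. It does not check
    that every expected node appears in the membership.
    """
    ports = {}
    for line in text.splitlines():
        m = PORT_LINE.match(line.strip())
        if m is None:
            continue  # header, separator, or unrelated line
        port, _gen, kind, nodes = m.groups()
        ports.setdefault(port, {})[kind] = nodes  # node field kept as raw text

    # Port a is GAB itself; without it there is no cluster membership at all.
    if "membership" not in ports.get("a", {}):
        return "no_gab"
    if any("jeopardy" in kinds for kinds in ports.values()):
        return "jeopardy"
    return "normal"


if __name__ == "__main__":
    print(classify_gab(SAMPLE))
```

The sketch deliberately does not decode the node field, so compare the membership line against the node IDs you expect before drawing conclusions.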

If you are in jeopardy but the last link is still up, you can simply fix the network. The cluster will then recover on its own.

If the last link has failed (no links working), shut down the node with the highest LLT node ID, repair the link, then restart the node you just shut down. Once the host rejoins the cluster, you may have to clear the AutoDisable flag on any service groups that were running on that node, then bring them online manually.

When all links fail, NEVER fix the links while the hosts are alive. Always shut them down first. If IO fencing is running, the losing nodes will already have crashed. Without IO fencing, they may or may not have crashed. Perform a shutdown.
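
For the last part of that procedure (after the repaired node has rejoined), the commands might look like the sketch below. It only builds and prints the command lines, it does not run them. The group name "appsg" and system name "node02" are hypothetical placeholders, and post_rejoin_commands is an illustrative helper; check the hagrp manual page for your VCS version before running anything.

```python
import shlex


def post_rejoin_commands(group, system):
    """Build the hagrp command lines for one service group on one system.

    Assumes the node has already rejoined the cluster and that the group
    was left autodisabled there.
    """
    return [
        # Clear the AutoDisable flag for the group on that system.
        ["hagrp", "-autoenable", group, "-sys", system],
        # Bring the group online manually on that system.
        ["hagrp", "-online", group, "-sys", system],
    ]


if __name__ == "__main__":
    # Hypothetical names, printed for review rather than executed.
    for argv in post_rejoin_commands("appsg", "node02"):
        print(shlex.join(argv))
```

Printing first and running by hand keeps a person in the loop, which seems appropriate when recovering from a total link failure.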


All clusters should be running IO fencing with coordinator disks (or CPS). I highly recommend that all production clusters and any cluster running CFS use IO fencing. IO fencing with SCSI-3 PGR will prevent data corruption when communications are down.

The geek diary link is a good resource. I highly recommend you also study the Admin Guide section on “About communications, membership, and data protection in the cluster”. It provides greater detail on exactly how to recover when these situations occur. This is one of the hardest sections in the manual for most users, so take your time. I find the table on IO fencing scenarios to be most helpful.

Cheers


Hi,

Just to clarify, the linked article says:

Recovery
To recover from jeopardy, just fix the link and GAB automatically detects the new link and the jeopardy membership is removed from node03.

That means node03 will become a normal member again.

So VCS will automatically set node03 as a jeopardy member or a normal member, right?

Thanks

CliffordB
Level 4
Employee
Correct.

Again, that means you have at least one valid link. When the second link is fixed, VCS will update membership automatically.

If there is no valid link, shut down the node, fix the links, then bring the node back online. VCS will return to normal.


Cheers
