sophisticated failover policy

bazi
Level 3

Hi Guys,

I am looking for a way to achieve the failover scenario below. Maybe someone has set up something similar and could advise.

 

I have a 5-node VCS4.1 cluster:

 

HOST1 : GroupA; HOST2: GroupB; HOST3: GroupC; HOST4: GroupD; HOST5: IDLE

In case HOST2 goes down, GroupA goes to HOST5 (same in case of HOST2 and GroupC). However, in case HOST2 and HOST3 go down one after another, if VCS sees GroupB already running on HOST5, it moves GroupC to HOST4 (and the same the other way around). We cannot achieve this by setting an offline local dependency, as it would only work one way; the other way, it would first move the parent to HOST4 and online the child group on HOST5. What I want is for the group that is failed over first onto HOST5 to take ownership of the node, so that the second one is always moved to HOST4.

This was one scenario. Another one would introduce one more dependency: if HOST1 goes down and VCS moves GroupA to HOST4, then if HOST2 and HOST3 go down, they both CAN online on HOST5.

In short, I want GroupB and GroupC never to start on HOST5 together unless GroupA is already running on HOST4. Remember that I do not want to use an 'offline local' dependency, as I do not want any group to be switched twice; rather, I want the first group to take ownership of HOST5.

 

Pre-online scripts?:)

 

Thanks,

Wojtek

 

 

1 ACCEPTED SOLUTION

Gaurav_S
Moderator
   VIP    Certified

Hello Bazi,

It's available in the VCS user's guide for 5.0 and older versions, or in the newly introduced VCS admin guide for version 5.1.

You can find the guide here:

http://sfdoccentral.symantec.com/Storage_Foundation_HA_50_Solaris.html

 

Have a look at page 391 onwards, which covers "service group workload management".

 

As an example config, here is how you would set up your main.cf:

system Host1 (
    Capacity = 200
    )

system Host2 (
    Capacity = 200
    )

system Host3 (
    Capacity = 200
    )

system Host4 (
    Capacity = 400
    )

system Host5 (
    Capacity = 200
    )

 

And your groups will have a load defined like this:

group GroupA (
    ...
    FailOverPolicy = Load
    Load = 100
    )

group GroupB (
    ...
    FailOverPolicy = Load
    Load = 200
    )

group GroupC (
    ...
    FailOverPolicy = Load
    Load = 100
    )

 

In the above example, if Host1 fails, GroupA will go to Host5 (since Host5 would be running with zero load, and assuming SystemList is set up correctly). Once GroupA is online on Host5, that node can't take GroupB because its defined capacity won't allow it, so you need to keep extra capacity available on Host4. The same applies to Host2: if Host2 fails, GroupB will move over to Host5 and occupy its full capacity. This is just to give you the idea; I hope you got it.

Have a read through the guide and you should have your query answered.

 

Gaurav

5 REPLIES

RiaanBadenhorst
Level 6
Partner    VIP    Accredited Certified

Hi,

 

You should be able to achieve this using load, capacity, limits and prerequisites.

 

Also, your description of the hosts/groups seems inconsistent. Can you check it and make sure it's correct? Or am I confused?

 

R

bazi
Level 3

Thanks Riaan. You are right, there are typos in my description that introduce a lot of confusion. It should be:

 

"In case HOST2 goes down, GroupB goes to HOST5 (same in case of HOST3 and GroupC). However, in case HOST2 and HOST3 go down one after another, if VCS sees GroupB already running on HOST5, it moves GroupC to HOST4 (and the same the other way around). We cannot achieve this by setting an offline local dependency, as it would only work one way; the other way, it would first move the parent to HOST4 and online the child group on HOST5. What I want is for the group that is failed over first onto HOST5 to take ownership of the node, so that the second one is always moved to HOST4.

This was one scenario. Another one would introduce one more dependency: if HOST1 goes down and VCS moves GroupA to HOST4, then if HOST2 and HOST3 go down, they both CAN online on HOST5."

 

I cannot really find anything detailed about capacity-based failover logic, or maybe the guides are not publicly visible. Do you happen to have anything you could share?

 

Thanks again,

Bazi

bazi
Level 3

Yeah, the load-based failover policy will do the trick. The setup I described is simplified; the actual one is much more complex, but I have already figured it all out in my head :) Thanks guys for all the great suggestions.

 

Thanks,

Bazi

RiaanBadenhorst
Level 6
Partner    VIP    Accredited Certified

Hi,

 

You'll have to combine it with limits, as the load policy alone will not prevent a system from being overloaded, i.e. ending up with a negative available capacity. Limits and prerequisites, however, will stop a service group from coming online on a system that cannot satisfy them.
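
To make that concrete, here is a minimal main.cf sketch of how limits and prerequisites might be layered on top of the load settings. It is an illustration only, not a tested configuration: the resource name FailoverSlot is a made-up label (Limits and Prerequisites accept arbitrary user-defined keys), the values are assumptions, and all other attributes of each system and group are left out.

```
// FailoverSlot is a hypothetical, user-defined limit name.
// Home nodes need the limit too, otherwise the prerequisite
// could not be met where the groups normally run.
system Host2 (
    Limits = { FailoverSlot = 1 }
    )

system Host3 (
    Limits = { FailoverSlot = 1 }
    )

// Host4 is assumed to have room for both groups.
system Host4 (
    Limits = { FailoverSlot = 2 }
    )

// Host5 has a single slot, so GroupB and GroupC
// cannot both be online there at the same time.
system Host5 (
    Limits = { FailoverSlot = 1 }
    )

group GroupB (
    Prerequisites = { FailoverSlot = 1 }
    )

group GroupC (
    Prerequisites = { FailoverSlot = 1 }
    )
```

With something like this, whichever of GroupB or GroupC reaches Host5 first consumes its only slot, and the other one has to look for a different system. Which node is chosen first still depends on the Capacity and Load values. The extra condition (both groups allowed on Host5 once GroupA is on Host4) is not something static limits can express on their own, as far as I can tell, so that part would probably need additional logic such as a PreOnline trigger.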

 

I trust this is exactly what you wanted.

 

R