shaggy62
15 years ago, Level 3
Restart clustered application without triggering failover?
Hello,
I have a number of apps set up in simple failover cluster configurations across multiple clusters. Most of these are Java application servers. Normally, if an app needs to be restarted, I can simply switch the app from one node to another (and back if need be), and that is fine. However, there are times, particularly in non-production clusters, when application developers need to restart the application service and aren't interfacing with the cluster to do so. I have handled this by disabling the service group on the secondary node(s) and then performing the restart, but I'd like to be able to restart the applications without triggering a failover operation at all. So the question is: how do I do that? Is there a way I can tell the cluster I'm performing an operation like that?
- Instead of disabling the service group, you could just freeze it. Freezing prevents a failover if a resource goes offline outside of the cluster's control.
When you are finished with your external operations, run the probe command to confirm that the resource is back online, and then unfreeze the service group.
One thing to note in this scenario: if the physical machine were to fail while the SG is frozen, the group will not automatically be brought online on a failover node.
Because the group is frozen, the cluster will not take any action on it (online/offline/restart of resources).
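The freeze, restart, probe, unfreeze sequence described above could be sketched roughly as follows. This is only an illustration, not a tested procedure: the group, resource, and node names in the usage line (`app_sg`, `app_res`, `node1`) and the restart script path are hypothetical placeholders, and the sketch assumes that `hares -state <resource> -sys <node>` prints a single state word such as `ONLINE`. It uses a temporary (non-persistent) freeze.

```shell
#!/bin/sh
# Sketch: restart an application outside of cluster control without a failover.
# All names passed in by the caller are placeholders for your own configuration.

restart_without_failover() {
    group=$1      # service group name
    resource=$2   # application resource inside that group
    node=$3       # node where the group is currently online
    shift 3       # remaining arguments: the command that restarts the app

    # Temporary freeze: the cluster takes no action on the group until unfrozen.
    hagrp -freeze "$group" || return 1

    # Restart the application outside of the cluster.
    "$@"
    restart_status=$?

    # Ask the agent to re-check the resource state on this node.
    hares -probe "$resource" -sys "$node"

    # Assumption: this prints one state word. The probe may take a moment,
    # so a real script might wait and retry before giving up.
    state=$(hares -state "$resource" -sys "$node")
    if [ "$state" != "ONLINE" ]; then
        echo "Resource $resource is $state; leaving $group frozen." >&2
        return 2
    fi

    hagrp -unfreeze "$group" || return 1
    return "$restart_status"
}

# Hypothetical usage:
#   restart_without_failover app_sg app_res node1 /opt/app/bin/restart.sh
```

Leaving the group frozen when the resource is not yet back online is a deliberate choice in this sketch: unfreezing while the resource is still offline could trigger the very failover you were trying to avoid.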