Freeze SG or Freeze Node ?

Question

Hi there,
I have a scenario where applications run under a two-node cluster. In the past, hastop -local was executed before the applications were brought down manually, which left appsg in a FAULTED state.

My question: while performing OS patch maintenance, should I freeze the service groups individually, or should I freeze the nodes? Does freezing a node serve the same purpose as freezing the SGs on that particular node?

Regards,
Saurabh

wally_heim · Answer

Hi Saurabh,
If you are stopping VCS for maintenance and you want the service groups to stay online, run "hastop -local -force". The -force option leaves all service groups online (except for the ClusterService group).
If you need to stop and start the cluster for your maintenance and you do not want the service groups to react, I would recommend persistently freezing the node. That affects all service groups on the node, so you will not need to worry about each group individually. A persistent freeze survives a restart of HAD on the node. A temporary freeze is removed automatically when HAD is restarted on the node.
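As a rough illustration of the two options above, here is a minimal shell sketch. The run wrapper, the function names, and the system name node1 are placeholders invented for this example. It only prints the commands unless DRY_RUN=0 is set, and it assumes the configuration has to be opened read-write (haconf -makerw) before a persistent freeze.

```shell
#!/bin/sh
# Dry run by default: print each command instead of executing it.
# Set DRY_RUN=0 only on a real cluster node.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Option A: stop HAD on this node only and leave the applications running.
stop_had_keep_apps() {
    run hastop -local -force
}

# Option B: persistently freeze a node so the freeze survives a HAD restart.
freeze_node_persistent() {
    run haconf -makerw                    # persistent changes need a writable config
    run hasys -freeze -persistent "$1"
    run haconf -dump -makero              # save the config and make it read-only again
}

# Undo option B once maintenance is finished.
unfreeze_node_persistent() {
    run haconf -makerw
    run hasys -unfreeze -persistent "$1"
    run haconf -dump -makero
}

freeze_node_persistent node1   # node1 is a placeholder system name
```

Dropping -persistent from the hasys commands gives the temporary freeze, which does not need the haconf steps.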
Thank you,
Wally

arangari · Answer

If a group is frozen:
VCS will not take any failover action on the group. The agents only monitor the resources in the group and report any state change, so the group may be seen as faulted if a resource faults. However, there will be no failover action for the group.
If the group is frozen persistently, all of the above applies during the current life of the cluster. In addition, if the cluster bootstraps again (for example, hastop -all followed by starting all nodes), the group's state at first probe is honored and no further action, such as AutoStart of the group, is taken.

If a system is frozen:
VCS will not choose this node as a target for bringing any group online. Groups that are already online stay online until they need to be failed over because of a user action (switch/offline) or a fault event.
A temporarily frozen system behaves this way for the current life of the cluster. A persistently frozen system keeps this behavior even if the cluster bootstraps again.
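To see which kind of freeze is currently in effect, the Frozen (persistent) and TFrozen (temporary) attributes can be queried for a group or a system. A small sketch, assuming a group called appsg and a system called node1 (both placeholders). The helper names are made up, and the commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry run by default: print each command instead of executing it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Query the persistent (Frozen) and temporary (TFrozen) flags of a service group.
show_group_freeze() {
    run hagrp -value "$1" Frozen
    run hagrp -value "$1" TFrozen
}

# Query the same two flags for a system (node).
show_system_freeze() {
    run hasys -value "$1" Frozen
    run hasys -value "$1" TFrozen
}

show_group_freeze appsg    # appsg is a placeholder group name
show_system_freeze node1   # node1 is a placeholder system name
```

On a live cluster, a non-zero value should indicate that the corresponding freeze is set.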

Coming back to the question: for OS patch maintenance, I would suggest evaluating the following:
1. If the OS maintenance does not impact the health of the applications (VCS and the related service groups), you may just apply the patch.
2. If, during maintenance, you do not want any service groups brought online on this particular node but you need VCS running, freeze the node. You may have evacuated the node by switching all online service groups to the other node. If the patch needs a reboot and the system may not yet be stable after the reboot, you would not want this node to bring groups online. In that case the node should be persistently frozen.
3. If a reboot is required, VCS will evacuate the service groups to the other node. If you do not need the service groups to be evacuated, a persistent freeze of the service groups is a good idea.
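Option 2 above (evacuate the node, then freeze it persistently) could look roughly like the sketch below. The function name and the node1, node2, and appsg names are placeholders, and the commands are only printed unless DRY_RUN=0 is set. Note that hagrp -switch returns before the switch has completed, so a real script would need to wait for each group to come online on the target before freezing.

```shell
#!/bin/sh
# Dry run by default: print each command instead of executing it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: evacuate_and_freeze <node-to-patch> <target-node> <group>...
evacuate_and_freeze() {
    node="$1"
    target="$2"
    shift 2
    # Move every listed group to the other node first.
    for grp in "$@"; do
        run hagrp -switch "$grp" -to "$target"
    done
    # Then freeze the emptied node so nothing is brought online on it.
    run haconf -makerw
    run hasys -freeze -persistent "$node"
    run haconf -dump -makero
}

evacuate_and_freeze node1 node2 appsg   # all three names are placeholders
```

hasys -freeze also accepts an -evacuate option, which asks VCS to move the active groups off the system before the freeze takes effect, so the explicit loop is only one way to do this.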

saurabh_pande · Answer

Thanks Amit and Wally for your valuable answers. That really helps.

So now, please advise whether this is correct:
1) The respective service groups are persistently frozen on both nodes, since OS maintenance is being carried out on both nodes at the same time and both nodes will require a reboot.
2) Apps will be brought down manually.
3) OS maintenance starts, for which I am going to move the rc2.d and rc3.d scripts for VCS on both nodes so that I don't have to worry about the cluster part on every reboot. I will run hastop -all before this.
4) Once the OS patches are applied in single-user mode, the scripts will be put back so that the cluster comes up with the reboot [in the same state as before hastop -all?]
5) Since the groups are persistently frozen, apps will be brought up manually, after which the SGs will be unfrozen.

Regards,
Saurabh

arangari · Answer

Use the following steps. Since you need to take all the apps offline for the OS maintenance anyway:
1. Offline all SGs one after another (hagrp -offline) in the appropriate dependency order.
2. Freeze the SGs persistently:
2.1 Open the configuration for writing (haconf -makerw).
2.2 Persistently freeze all SGs (hagrp -freeze <group> -persistent).
    This ensures that the SGs will not go online due to AutoStart/AutoStartList after the cluster is formed. [Any resources found online at first probe will still be reflected in the state of the SG as PARTIAL or ONLINE, but VCS will not have issued any online.]
2.3 Dump the configuration (haconf -dump -makero).
3. If you are okay with the VCS stack starting during the reboots, the rc scripts can stay. However, I will let you make this decision.
4. Run hastop -all on one of the nodes to ensure VCS is stopped on all nodes.
5. Apply the patches and complete the OS maintenance. Reinstate the rc scripts if they have been moved. This can be done just before the last required reboot of the maintenance window.
6. Once the nodes are up and VCS has started and formed the complete cluster, unfreeze all the SGs:
6.1 Open the configuration for writing (haconf -makerw).
6.2 Unfreeze the SGs (hagrp -unfreeze <group> -persistent).
    This will not initiate the AutoStart logic, so bring the SGs online using the 'hagrp -online' command in the appropriate order.
6.3 Dump the configuration (haconf -dump -makero).
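The steps above can be sketched as two shell functions, one for before the patching and one for after. This is only an outline under some assumptions: the function names, the node1 system name, and the appsg and dbsg group names are placeholders, and the caller is expected to pass the groups in the correct dependency order for each phase. Commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry run by default: print each command instead of executing it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Steps 1, 2 and 4. Usage: maintenance_prepare <node> <group>...
# Pass the groups in the order in which they must go offline.
maintenance_prepare() {
    node="$1"
    shift
    for grp in "$@"; do
        run hagrp -offline "$grp" -sys "$node"
    done
    run haconf -makerw
    for grp in "$@"; do
        run hagrp -freeze "$grp" -persistent
    done
    run haconf -dump -makero
    run hastop -all
}

# Step 6. Usage: maintenance_finish <node> <group>...
# Pass the groups in the order in which they must come online.
maintenance_finish() {
    node="$1"
    shift
    run haconf -makerw
    for grp in "$@"; do
        run hagrp -unfreeze "$grp" -persistent
    done
    run haconf -dump -makero
    for grp in "$@"; do
        run hagrp -online "$grp" -sys "$node"
    done
}

maintenance_prepare node1 appsg dbsg   # placeholder names, offline order
maintenance_finish node1 dbsg appsg    # placeholder names, online order
```

hagrp -offline and hagrp -online return before the group has actually changed state, so a real script would have to wait for each group to settle before moving on to the next command. Steps 3 and 5 (the rc scripts and the patching itself) are left out on purpose.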

Regards,
Amit Rangari

saurabh_pande · Answer

Amit,

What if I don't offline the SGs prior to freezing? The apps are going to be brought down manually anyway, and since the SGs are persistently frozen, onlining and offlining are disabled. So will it have any impact if I don't offline the SGs and just freeze them?
