How VCS-5 protects the cluster online configuration.

Veeraa
Not applicable

Dear All,

I'm new to VCS and am going through the VCS 4 student guide to learn it. I have created a two-node cluster in VMware using VCS-5.0.30.00-MP3.

In VCS 4, if we open the cluster configuration, VCS will create a file called ".stale" in /etc/VRTSvcs/conf/config.

But I couldn't see the same in VCS 5. Could anyone please explain how VCS 5 protects the online cluster configuration?

Also, in VCS 4 there is an option to start had with -stale, so that the second system waits for the first system to come up fully before it joins the cluster. I couldn't find this option in VCS 5. Could anyone please explain how this works in VCS 5?

 

Thanks in Advance.

Veera.

 

5 REPLIES

Marianne
Moderator
Partner    VIP    Accredited Certified

See the following section in the VCS 5.0 Release Notes:

Backup of VCS configuration files
VCS backs up all configuration files (<config>.cf) including main.cf and types.cf to <config>.cf.autobackup. The configuration is backed up only if the BackupInterval is set and the configuration is writable.
When you save a configuration, VCS saves the running configuration to the actual configuration file (i.e. <config>.cf) and removes all autobackup files. This does away with the VCS behavior of creating .stale files. If you do not configure the BackupInterval attribute, VCS does not save the running configuration automatically.
See the Veritas Cluster Server User’s Guide for more information.
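
As a rough illustration of the release note above, the following sketch turns on the periodic autobackup by setting the BackupInterval cluster attribute. It uses the standard haconf and haclus commands. The function name enable_autobackup and the RUN dry-run hook are made up for this example, and the 3-minute lower bound is an assumption based on the value used later in this thread.

```shell
#!/bin/sh
# Sketch: enable periodic autobackup of the VCS configuration.
# enable_autobackup and RUN are hypothetical names used only for illustration.
# RUN=echo prints the commands instead of executing them (dry run).
enable_autobackup() {
    minutes=${1:-5}
    # Assumption: values below 3 minutes are not useful or not accepted.
    if [ "$minutes" -lt 3 ]; then
        echo "BackupInterval should be at least 3 minutes" >&2
        return 1
    fi
    ${RUN:-} haconf -makerw || return 1                      # open the configuration read-write
    ${RUN:-} haclus -modify BackupInterval "$minutes" || return 1
    ${RUN:-} haconf -dump -makero                            # save to main.cf and close it
}
```

Calling `RUN=echo enable_autobackup 5` shows the three commands without touching the cluster. Note that backups are only taken while the configuration is writable, so the final `haconf -dump -makero` both saves the attribute and stops further autobackups until the configuration is opened again.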

Gaurav_S
Moderator
   VIP    Certified

I agree with Marianne. The .stale concept was removed from 5.x onwards and the method above was introduced. Honestly, in 5.x I have never seen a cluster enter the STALE_ADMIN_WAIT state; if there are issues, it may instead land in the ADMIN_WAIT state.

 

Gaurav

mikebounds
Level 6
Partner Accredited

You don't see STALE_ADMIN_WAIT because stale protection has been removed. For example, in 4.1, consider the following:

Scenario A: you open the config and make changes, then lose node A, then save the config on the remaining node B and stop VCS on node B. When node A boots, it would NOT start VCS, because its copy is stale (node B has a more up-to-date version).

You do not seem to get this protection in 5.x: even if BackupInterval is set (which it is NOT by default), in scenario A above VCS would start with the stale config on node A.

In addition, I have done some testing with BackupInterval set (on 5.1 RHEL) and the results are not what I expected. Can someone confirm whether this is expected behaviour or a bug? The Admin guide doesn't actually say what VCS should do at startup if an autobackup file exists. This is what I see on a one-node cluster:

[root@r5_sf51a config]# haclus -display | grep -i bac
BackupInterval       3
[root@r5_sf51a config]# hagrp -add mike
VCS NOTICE V-16-1-10136 Group added; populating SystemList and setting the Parallel attribute recommended before adding resources
[root@r5_sf51a config]# hagrp -modify mike SystemList r5_sf51a 0
[root@r5_sf51a config]# grep mike *
[root@r5_sf51a config]# grep mike *
main.cf.autobackup:group mike (
main.cf.autobackup:     //      group mike
[root@r5_sf51a config]# ps -ef | grep had
root     15540     1  0 15:21 ?        00:00:02 /opt/VRTSvcs/bin/had
root     15543     1  0 15:21 ?        00:00:00 /opt/VRTSvcs/bin/hashadow
root     19477  7587  0 15:33 pts/0    00:00:00 grep had
[root@r5_sf51a config]# kill -9 15540 15543
[root@r5_sf51a config]# grep mike *
main.cf.autobackup:group mike (
main.cf.autobackup:     //      group mike
[root@r5_sf51a config]# hastart
[root@r5_sf51a config]# grep mike *
[root@r5_sf51a config]# ls main.cf.auto*
ls: main.cf.auto*: No such file or directory
 
[root@r5_sf51a config]# hagrp -list
ClusterService          r5_sf51a
appsg                   r5_sf51a
appsg_fd                r5_sf51a
repsg                   r5_sf51a
 
So you can see here that the autobackup captures my change of adding group mike. But if VCS then dies (for example, the node crashes; in my test I just killed had), VCS deletes the backup when it starts. It doesn't even rename it to a main.cf.<date> file, so the change is lost. What is the point of the backup if it gets deleted?
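
Given the behaviour in the test above, one possible workaround is to copy any leftover autobackup files out of the config directory before running hastart. This is only a sketch: preserve_autobackups is a hypothetical helper, not a VCS command, and the destination /var/tmp/vcs-autobackup is an assumed location chosen so that had has no reason to touch the copies.

```shell
#!/bin/sh
# Sketch: save leftover *.autobackup files before had starts and removes them.
# preserve_autobackups is a hypothetical helper; the default paths are assumptions.
preserve_autobackups() {
    conf_dir=${1:-/etc/VRTSvcs/conf/config}
    dest_dir=${2:-/var/tmp/vcs-autobackup}
    # Timestamp in the same style as the main.cf.<date> files VCS creates.
    stamp=$(date +%d%b%Y.%H.%M.%S)
    saved=0
    mkdir -p "$dest_dir" || return 1
    for f in "$conf_dir"/*.autobackup; do
        [ -f "$f" ] || continue        # the glob matched nothing
        cp -p "$f" "$dest_dir/$(basename "$f").$stamp" || return 1
        saved=$((saved + 1))
    done
    echo "$saved autobackup file(s) preserved"
}
```

A typical use would be `preserve_autobackups && hastart`, followed by a diff of the saved copy against main.cf to decide whether the unsaved changes should be re-applied by hand.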

 

Mike

Anoop_Kumar1
Level 5

Mike, how many files do you see if you check with "ls -al main.cf*"?

Regards,

~Anoop

mikebounds
Level 6
Partner Accredited

20 files:

[root@r5_sf51a config]# ls -l main.cf*
-rw------- 2 root root 3249 Feb  8 15:37 main.cf
-rw------- 1 root root 3117 Feb  8 15:23 main.cf.08Feb2011.15.23.08
-rw------- 1 root root 3198 Feb  8 15:25 main.cf.08Feb2011.15.25.27
-rw------- 1 root root 3117 Feb  8 15:25 main.cf.08Feb2011.15.25.44
-rw------- 2 root root 3137 Feb  8 15:29 main.cf.08Feb2011.15.29.16
-rw------- 2 root root 3249 Feb  8 15:37 main.cf.08Feb2011.15.37.08
-rw------- 1 root root 1562 Jan 22  2010 main.cf.21Jan2010.16.01.22
-rw------- 1 root root 1825 Jan 22  2010 main.cf.22Jan2010.13.39.17
-rw------- 1 root root 2704 Jan 22  2010 main.cf.22Jan2010.14.52.58
-rw------- 1 root root 2785 Jan 22  2010 main.cf.22Jan2010.15.21.30
-rw------- 1 root root 2831 Jan 22  2010 main.cf.22Jan2010.16.11.13
-rw------- 1 root root 2818 Jan 22  2010 main.cf.22Jan2010.17.07.09
-rw------- 1 root root 2833 Jan 22  2010 main.cf.22Jan2010.17.07.13
-rw------- 1 root root 2858 Jan 22  2010 main.cf.22Jan2010.17.22.36
-rw------- 1 root root 2924 Jan 22  2010 main.cf.22Jan2010.19.41.08
-rw------- 1 root root 3077 Jan 22  2010 main.cf.22Jan2010.20.01.01
-rw------- 1 root root 2945 Jan 22  2010 main.cf.22Jan2010.20.09.12
-rw------- 1 root root 3077 Jan 22  2010 main.cf.22Jan2010.20.11.48
-rw------- 1 root root 3117 Jan 22  2010 main.cf.22Jan2010.22.44.10
-rw------- 2 root root 3137 Feb  8 15:29 main.cf.previous
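
In the listing above, four entries show a link count of 2 with matching sizes and times (main.cf with main.cf.08Feb2011.15.37.08, and main.cf.previous with main.cf.08Feb2011.15.29.16), which suggests, though the listing alone does not prove it, that they are hard links to the same data. A small sketch to check this is shown below. same_inode_as is a hypothetical helper, and it assumes a find that supports -samefile, as GNU find on RHEL does.

```shell
#!/bin/sh
# Sketch: list other files in the same directory that are hard links to FILE.
# same_inode_as is a hypothetical helper; requires find with -samefile (GNU/BSD).
same_inode_as() {
    target=$1
    dir=$(dirname "$target")
    # Exclude the target itself by name and print the remaining links, sorted.
    find "$dir" -maxdepth 1 -type f -samefile "$target" \
        ! -name "$(basename "$target")" | sort
}
```

Running `same_inode_as /etc/VRTSvcs/conf/config/main.cf` would show which dated copy is the same file as the current main.cf.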