
A question about the HTCAgent monitor action

lmiao
Level 4
Employee Accredited

I have a GCO system: the primary site is a two-node CVM cluster with the HTC agent, and the DR site is a single node with HTC. After bringing the HTC and cvm resources online, I see the following messages in the engine_A log on the primary site at one-minute intervals:

<snip>

2013/02/15 15:02:13 VCS INFO V-16-1-50135 User root fired command: hagrp -unfreeze cvm  remotecluster  from localhost
2013/02/15 15:02:14 VCS INFO V-16-2-13716 (node0) Resource(htc): Output of the completed operation (monitor)
==============================================
VCS WARNING V-16-1-40201 Group is not temporarily frozen
==============================================

2013/02/15 15:02:32 VCS INFO V-16-2-13716 (node1) Resource(htc): Output of the completed operation (monitor)
==============================================
VCS WARNING V-16-1-40201 Group is not temporarily frozen
==============================================

2013/02/15 15:03:13 VCS INFO V-16-1-50135 User root fired command: hagrp -unfreeze cvm  remotecluster  from localhost
2013/02/15 15:03:14 VCS INFO V-16-2-13716 (node0) Resource(htc): Output of the completed operation (monitor)
==============================================
VCS WARNING V-16-1-40201 Group is not temporarily frozen
==============================================

2013/02/15 15:03:32 VCS INFO V-16-2-13716 (node1) Resource(htc): Output of the completed operation (monitor)
==============================================
VCS WARNING V-16-1-40201 Group is not temporarily frozen
==============================================

2013/02/15 15:04:13 VCS INFO V-16-1-50135 User root fired command: hagrp -unfreeze cvm  remotecluster  from localhost
2013/02/15 15:04:14 VCS INFO V-16-2-13716 (node0) Resource(htc): Output of the completed operation (monitor)
==============================================
VCS WARNING V-16-1-40201 Group is not temporarily frozen
==============================================

2013/02/15 15:04:32 VCS INFO V-16-2-13716 (node1) Resource(htc): Output of the completed operation (monitor)
==============================================
VCS WARNING V-16-1-40201 Group is not temporarily frozen
==============================================

<snip>

Could anybody give me any suggestions? Thanks a lot!

5 REPLIES

Mark_Alleyne
Level 3
Employee Accredited Certified

Hello lmiao,

Can you please provide more information (VCS version, HTCAgent version, etc.)?

From a quick look at HTCAgent 5.8.0, this appears to come from unfreezetarget in HTCAgent.pm:

sub unfreezetarget {
    my ($self) = @_;
    # TargetFrozen holds "group:cluster" when the agent has frozen a group
    my ($val) = split('\n', `$vcs_home/bin/hares -value $self->{'resname'} TargetFrozen`);
    if (!$val) {
        return 0;
    }
    my ($grp, $clus) = split(':', $val);
    my $ret = system("$vcs_home/bin/hagrp -unfreeze $grp -clus $clus");
    # system() returns 0 on success; TargetFrozen is cleared only in that case
    if (!$ret) {
        VCSAG_LOG_MSG("N", "service group $grp in cluster $clus was unfrozen because the replication link has been resynchronized", 26, "$grp", "$clus");
        system("$vcs_home/bin/hares -modify $self->{'resname'} TargetFrozen \"\"");
    }
    return 0;
}

Running the same unfreeze by hand on a group that is not frozen gives the same warning:

# hagrp -unfreeze cvm 
VCS WARNING V-16-1-40201 Group is not temporarily frozen
 
Have you logged a case with support?
 
Regards
 
Mark

 

mikebounds
Level 6
Partner Accredited

Have you configured the CVMVolDg actions necessary for the HTC agent? This step is often overlooked when using replication agents with CVM. See this extract from the HTC agent install guide:

Configuring the agent in an SF for Oracle RAC environment or Storage
Foundation Cluster File System (SFCFS) environment
 
To configure the agent in a Storage Foundation for Oracle RAC or SFCFS
environment:
1 Configure the SupportedActions attribute for the CVMVolDg resource.
2 Add the following keys to the list: import, deport, vxdctlenable.
3 Use the following commands to add the entry points to the CVMVolDg
resource:
haconf -makerw
hatype -modify CVMVolDg SupportedActions import deport vxdctlenable
haconf -dump -makero
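
To confirm the change took effect, a small check along these lines could help. This is only a sketch: the function name check_actions is made up for illustration, and it assumes the `hatype -display CVMVolDg` output contains one line with SupportedActions followed by the keys separated by whitespace.

```shell
#!/bin/sh
# check_actions (hypothetical helper): reads `hatype -display CVMVolDg`
# style output on stdin and reports which of the three required keys
# are missing from the SupportedActions line.
check_actions() {
    # Keep only the SupportedActions line; turn tabs into spaces
    line=$(grep 'SupportedActions' | tr '\t' ' ')
    if [ -z "$line" ]; then
        echo "SupportedActions not found"
        return 1
    fi
    missing=""
    for key in import deport vxdctlenable; do
        case " $line " in
            *" $key "*) ;;                    # key present as a whole word
            *) missing="$missing $key" ;;     # key absent
        esac
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "ok"
}

# Example with a sample line in the assumed display format
printf 'CVMVolDg      SupportedActions       import deport vxdctlenable\n' | check_actions
```

On a live cluster the input would come from `hatype -display CVMVolDg | check_actions` instead of the sample line.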

 

Mike

lmiao
Level 4
Employee Accredited

Hi Mark and Mike,

Thank you so much for your quick reply and kind support!

1) The package version is SFCFS 5.1SP1RP2 + VRTSvcstc5.0.06 on Solaris 10.

2) The CVMVolDg attribute has already been modified, per hatype -display CVMVolDg:

CVMVolDg      SourceFile             ./CVMTypes.cf
CVMVolDg      SupportedActions       import deport vxdctlenable
CVMVolDg      ToleranceLimit         0

3) I am not sure whether it is caused by SplitTakeover = 0 (the current value); I will check it again.

It seems that running HTCAgent on CVM/CFS is not a common configuration, so few technotes or cases can be found on the web for reference.

Thanks again!

mikebounds
Level 6
Partner Accredited

If the replication link fails, you should get this message in the log:

"service group $grp in cluster $cluslist[1] was frozen because the replication link has failed; group will be unfrozen when data is resynchronized", 25, "$grp", "$cluslist[1]"

When the agent freezes the group, it sets the temporary attribute TargetFrozen on the HTC resource. When the replication link is restored and resynchronized, the agent checks the value of TargetFrozen and, if it is set, assumes the group is still frozen. The agent does not check whether the group was unfrozen manually in the meantime, so in that case its unfreeze attempt fails with the "Group is not temporarily frozen" error you see in the log. When the unfreeze fails, the agent does NOT unset TargetFrozen, so you will see this message every monitor cycle. The way to stop the message is to freeze the cvm group; the agent should then unfreeze it and unset TargetFrozen.
 
A few other points:
  1. The HTC resource (and the resources above it: CVMVolDg, CFSMount and any others) should normally be in a separate parallel service group that depends on cvm, not in the cvm service group itself.
     
  2. I THINK there are cases where resources in cvm freeze and unfreeze the cvm service group, as I seem to remember seeing log messages about the cvm group being frozen and unfrozen when I last used CVM. So it MAY be that the group was unfrozen by another resource rather than manually by a user.

Mike

lmiao
Level 4
Employee Accredited

Okay, many thanks!

I will check the configuration and test again; if I have any further questions, I will ask Support for assistance.