WARNING V-16-2-13139 ERROR V-16-2-13027

Zahid_Haseeb
Moderator
Partner    VIP    Accredited

Environment

SFHA/DR

ERROR

2014/11/25 07:22:07 VCS WARNING V-16-2-13139 Thread(4157086608) Canceling thread (4132436880)
2014/11/25 07:22:18 VCS ERROR V-16-2-13027 Thread(4146596752) Resource(DG) - monitor procedure did not complete within the expected time.

 

I created another Diskgroup to archive the debug logs (it is not a production Diskgroup), so the client now has two Diskgroups. I understand that the above ERROR/WARNING occur because multiple Diskgroups are being accessed via the same Ethernet interface card. I need to keep this new Diskgroup for a few days. My question is: can this WARNING/ERROR harm the production Diskgroup?

25 REPLIES

Wally_Heim
Level 6
Employee

Hi Zahid,

The monitor cycle timing out should cause the resource to go into an "Unknown" state.  The cluster does not know how to handle unknown states, so it cannot do anything to correct the situation except wait for a monitor cycle to complete successfully with an online or offline state for the resource.

The resource being in an Unknown state will not directly cause an issue with another resource.  However, if the resource monitor completes successfully and reports a fault, then the rest of the resources may be brought offline as a result.  If the resource is set to non-critical and no critical resources depend on it, then the group should not fault.  The bottom line is that it depends on a lot of factors to say if it will eventually cause the group to fault or not.

You should check the agent log to see if it gives you more details on why the monitor did not complete in the time expected. 

Thank you,

Wally
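For the log check Wally suggests, a small sketch along these lines could pull the relevant messages out of a log file. The path below is an assumption (a typical DiskGroup agent log location on Linux); the two message IDs and the resource name DG are taken from the log lines in the original post. The function names are made up for illustration.

```shell
#!/bin/sh
# Sketch: find monitor-timeout messages in a VCS log file.
# LOG is an assumption; point it at the agent or engine log on your node.
LOG="${LOG:-/var/VRTSvcs/log/DiskGroup_A.log}"

# Print every "Canceling thread" (V-16-2-13139) and
# "monitor procedure did not complete" (V-16-2-13027) line in file $1.
monitor_timeouts() {
    grep -E 'V-16-2-(13027|13139)' "$1"
}

# Count the V-16-2-13027 errors in file $1 for the resource named $2.
count_timeouts() {
    grep -c "V-16-2-13027 .*Resource($2)" "$1"
}

if [ -r "$LOG" ]; then
    monitor_timeouts "$LOG"
    echo "timeouts for DG: $(count_timeouts "$LOG" DG)"
else
    echo "log not readable: $LOG"
fi
```

Comparing the timestamps of the matching lines with other activity on the node (backups, log archiving, heavy I/O) may help show why the monitor ran long.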

kjbss
Level 5
Partner Accredited

As Wally says, "it depends on a lot of factors to say if it will eventually cause the group to fault or not".

Therefore, also check to see what your FaultOnMonitorTimeouts attribute is set to for the DiskGroup Type, and consider localizing and then increasing it. 

-$ hatype -display DiskGroup | grep -i fault
DiskGroup    FaultOnMonitorTimeouts 4

Usually when VCS resource monitor routines do not complete within their allotted timeout periods, it means you either have to add more physical resources to your server (RAM, CPUs, etc.) or move some workloads off your server.
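A minimal sketch of the "localize and increase" suggestion, assuming the resource is named DG as in the log above and that 8 is an acceptable threshold (a hypothetical value, not a recommendation). A static type attribute such as FaultOnMonitorTimeouts is normally made resource-specific with hares -override before it is modified. The script only prints the commands unless APPLY=1 is set, so it can be reviewed before anything touches the cluster.

```shell
#!/bin/sh
# Dry-run sketch: prints the VCS commands instead of running them.
# Set APPLY=1 to execute them (assumption: run as root on a cluster node
# with the VCS binaries on PATH).
RES="${RES:-DG}"       # resource name taken from the log line in the first post
VALUE="${VALUE:-8}"    # hypothetical new threshold, not a recommendation

run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

raise_fault_threshold() {
    run haconf -makerw                                   # open the configuration for writing
    run hares -override "$RES" FaultOnMonitorTimeouts    # make the static attribute resource-specific
    run hares -modify "$RES" FaultOnMonitorTimeouts "$VALUE"
    run haconf -dump -makero                             # save and close the configuration
}

raise_fault_threshold
```

Overriding on the single resource leaves the DiskGroup type default (4 in the output above) untouched for every other DiskGroup resource.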

 

Zahid_Haseeb
Moderator
Partner    VIP    Accredited

Thanks all for your valuable comments. Is it possible to use the new Diskgroup without VCS monitoring it?

 

As this is not a production Diskgroup, removing it from VCS monitoring will not cost us anything. Furthermore, these WARNINGS and ERRORS (WARNING V-16-2-13139, ERROR V-16-2-13027) will no longer appear on production if we can disable monitoring on the new Diskgroup.

Zahid_Haseeb
Moderator
Partner    VIP    Accredited

Disable a service group to prevent it from coming online. This process temporarily stops VCS from monitoring a service group on a system that is undergoing maintenance operations

Above is the extract from the below link

https://sort.symantec.com/public/documents/sfha/6.0/linux/productguides/html/vcs_admin/ch06s12s07.htm

 

Correct me if I am wrong.

I disabled the ServiceGroup that contains the new Diskgroup (while my DG and MOUNT resources are online). This should prevent the ServiceGroup and its resources from being monitored by VCS.

 

kjbss
Level 5
Partner Accredited

I think you are wrong in that assumption...

On my VRTSvcs-6.1.1.000-RHEL6.i686 system, if I bring a service group online, and then disable the service group:
 

hagrp -modify service_group Enabled 0

The resources contained within that service group are still being monitored.  I think this is the correct behaviour, though I might have argued that the attempt to "disable" an online service group should have been prevented by the VCS state engine (but I guess somebody had a use-case to support that).

Now, even though the service group is ONLINE (and disabled), you can still offline the service group, no problem.  However, if you then try to start the "disabled" service group, you get:

VCS WARNING V-16-1-10161 Group service_group is not enabled on system n0-rhel64

 

kjbss
Level 5
Partner Accredited

I am not sure what exactly it is you are trying to achieve, but in general, if you do not want the "non-production" Disk Group to be monitored by VCS, why don't you leave it outside of VCS management and simply import the Disk Group manually when you want to?

 

Zahid_Haseeb
Moderator
Partner    VIP    Accredited

"simply manually import the Disk Group when you want to?"

It's easy for me, but not so simple for my client :)

Zahid_Haseeb
Moderator
Partner    VIP    Accredited

Thanks for your words. What about the document, which says that when a service group is disabled it is no longer monitored by VCS?

 

I don't want to offline the service group. I just want the new DG to stay online without being monitored by VCS. If I disable the DG resource, it will not stay online, but if I disable the SG, my DG stays online. That's why I disabled the SG: to keep using the DG while it is not monitored by VCS.

 

Marianne
Moderator
Partner    VIP    Accredited Certified

I agree - simply remove the new DG resource from VCS and import manually.

Zahid_Haseeb
Moderator
Partner    VIP    Accredited

Extract from the below Symantec document

Disable a service group to prevent it from coming online. This process temporarily stops VCS from monitoring a service group on a system that is undergoing maintenance operations

https://sort.symantec.com/public/documents/sfha/6.0/linux/productguides/html/vcs_admin/ch06s12s07.htm

 

As per my understanding, the above Symantec document only applies when the Diskgroup (under the service group) is offline. If the Diskgroup is online, then the above Symantec document does not apply.

kjbss
Level 5
Partner Accredited

You can override the new DG resource's MonitorTimeout, and then you can change its MonitorInterval to a very high number, say 9999, which would cause it to run the monitor routines every 9999 seconds, or about 166.65 minutes.  That should effectively give you a "most of the time" non-monitored resource.

I am not sure how large you can set the MonitorInterval to, but 9999 should certainly be enough for what it sounds like you are looking for...
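As a rough sketch of what that might look like on the command line: the resource name archive_dg is hypothetical (the thread never names the new DG resource), and 9999 is simply the value suggested above. The script prints the commands by default and only runs them when APPLY=1 is set. It also converts the interval to minutes and seconds so the effect is easy to see.

```shell
#!/bin/sh
# Dry-run sketch: prints the VCS commands unless APPLY=1 is set.
RES="${RES:-archive_dg}"       # hypothetical name of the non-production DiskGroup resource
INTERVAL="${INTERVAL:-9999}"   # seconds between monitor runs, value from the post above

run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Convert a number of seconds to "<minutes>m <seconds>s".
to_minutes() {
    echo "$(( $1 / 60 ))m $(( $1 % 60 ))s"
}

stretch_monitor_interval() {
    run haconf -makerw                                    # open the configuration for writing
    run hares -override "$RES" MonitorInterval            # make the static attribute resource-specific
    run hares -modify "$RES" MonitorInterval "$INTERVAL"
    run haconf -dump -makero                              # save and close the configuration
}

echo "monitor would run every $(to_minutes "$INTERVAL")"
stretch_monitor_interval
```

The same pattern would apply to the Mount resource in that service group. Keep in mind that a resource monitored this rarely will also be very slow to report a real failure.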

 

Zahid_Haseeb
Moderator
Partner    VIP    Accredited

Yes, KJ. This is what we are working on right now. It seems this can resolve our problem.

Zahid_Haseeb
Moderator
Partner    VIP    Accredited

Importing a DG and mounting a resource manually are command-line tasks, and the user doesn't have the rights for that :(

Zahid_Haseeb
Moderator
Partner    VIP    Accredited

I have two resources, MOUNT and DISKGROUP, in the ServiceGroup (non-production SG). I did the same for both resources:

[Screenshot: OldLogsSG_NotMonitor_MOUNT_AND_DG.jpg]

Zahid_Haseeb
Moderator
Partner    VIP    Accredited

My problem is resolved via the workaround I mentioned above, but I have a question:

 

Why can a single node not have multiple Diskgroups (under multiple SGs), each DG for a different purpose?

kjbss
Level 5
Partner Accredited

Do you have a question or are you just showing what you've done?

BTW:  I do not recommend such large numbers, especially for MonitorTimeout, but I guess if you have attained the desired behavior and your tests have so far proven no ill side-effects, then I suppose that is fine.

It sure would be a better solution to determine why the system cannot handle these minuscule monitor routines and either add system resources (more RAM usually does the trick, and isn't that expensive these days), find non-essential services that you can remove from the server, or otherwise re-tune your server via various kernel tuning methods.

Of course, a review of the various service and application logs might reveal additional issues to resolve that might free up system resources (and make the server capable of handling more processes/tasks).

 

 

Zahid_Haseeb
Moderator
Partner    VIP    Accredited

Yes, the reason for showing the screenshots is that I changed the VCS attributes to get the desired result.

Marianne
Moderator
Partner    VIP    Accredited Certified

What makes you ask this question?

Many of our customers have multiple SGs, each with its own dg(s), running on the same node.

If there are no service group dependencies that specify 2 SGs cannot run on the same node (for various business or physical resource reasons, not VCS), then it is perfectly fine to have more than one SG with its own dg(s) on the same node.

I have never seen documentation that says a single node cannot have multiple DGs under multiple SGs.
You cannot have the same DG in multiple SGs.

In this particular environment, it seems that your cluster nodes are battling with limited physical resources, which makes more than one SG/dg on a single node undesirable.

Zahid_Haseeb
Moderator
Partner    VIP    Accredited

"If there are no service group dependencies that specify 2 SGs cannot run on the same node"

 

So we cannot have multiple DGs for multiple applications on a single cluster node?