WARNING V-16-2-13139 ERROR V-16-2-13027

Zahid_Haseeb · ‎11-25-2014

Environment

SFHA/DR

ERROR

2014/11/25 07:22:07 VCS WARNING V-16-2-13139 Thread(4157086608) Canceling thread (4132436880)
2014/11/25 07:22:18 VCS ERROR V-16-2-13027 Thread(4146596752) Resource(DG) - monitor procedure did not complete within the expected time.

I created another Diskgroup to archive the debug logs (although it's not a production Diskgroup). Now the client has two Diskgroups. I understand that the above ERROR/WARNING are occurring because multiple Diskgroups are being accessed via the same Ethernet interface card. I need to keep this new Diskgroup for a few days. My question is: can the WARNING/ERROR harm the production Diskgroup?

Wally_Heim · ‎11-25-2014

Hi Zahid,

The monitor cycle timing out should cause the resource to go into an "Unknown" state. The cluster does not know how to handle unknown states, so it cannot do anything to correct the situation except wait for a monitor cycle to complete successfully with an online or offline state for the resource.

The resource being in an Unknown state will not directly cause an issue with another resource. However, if the resource monitor completes successfully and causes a fault, then the rest of the resources may be brought offline as a result. If the resource is set to non-critical and no critical resources depend on it, then the group should not fault. The bottom line is that it depends on a lot of factors to say if it will eventually cause the group to fault or not.

You should check the agent log to see if it gives you more details on why the monitor did not complete in the time expected.
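
As a rough starting point for that log check, the sketch below counts monitor-timeout messages (V-16-2-13027) per resource in a VCS log file. The helper name `count_monitor_timeouts` is made up for illustration, and the default path assumes the usual engine log location on Linux (/var/VRTSvcs/log/engine_A.log); the DiskGroup agent normally writes its own log (DiskGroup_A.log) in the same directory, which is where more detail is likely to be found.

```shell
#!/bin/sh
# Count "monitor procedure did not complete" messages (V-16-2-13027) per resource.
# count_monitor_timeouts is a hypothetical helper name, not a VCS command.
count_monitor_timeouts() {
    log="${1:-/var/VRTSvcs/log/engine_A.log}"   # assumed default engine log path
    grep 'V-16-2-13027' "$log" |
        sed -n 's/.*Resource(\([^)]*\)).*/\1/p' |   # keep only the resource name
        sort | uniq -c | awk '{print $2, $1}'        # print "<resource> <count>"
}

# Example: count_monitor_timeouts /var/VRTSvcs/log/engine_A.log
```

A count that keeps climbing for one resource suggests its monitor is timing out regularly rather than as a one-off.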

Thank you,

Wally

kjbss · ‎11-25-2014

As Wally says, "it depends on a lot of factors to say if it will eventually cause the group to fault or not".

Therefore, also check to see what your FaultOnMonitorTimeouts attribute is set to for the DiskGroup Type, and consider localizing and then increasing it.

-$ hatype -display DiskGroup | grep -i fault
DiskGroup FaultOnMonitorTimeouts 4

Usually when VCS resource monitor routines do not complete within their allotted timeout periods, it means you either have to add more physical resources to your server (RAM, CPUs, etc.) or move some workloads off your server.
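
A minimal sketch of raising FaultOnMonitorTimeouts for a single resource, assuming "localizing" here means overriding the type-level attribute on that resource with `hares -override`. The resource name DG is taken from the log line in the first post, the value 8 is an arbitrary example, and `run` and `raise_fault_threshold` are hypothetical helper names. By default the script only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Dry-run sketch: prints the VCS commands instead of running them.
# Set DRY_RUN=0 to execute them on a cluster node.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# raise_fault_threshold is a hypothetical helper; $1 = resource, $2 = new count.
raise_fault_threshold() {
    res="$1"; count="$2"
    run haconf -makerw                                  # open the configuration for writing
    run hares -override "$res" FaultOnMonitorTimeouts   # allow a per-resource value
    run hares -modify "$res" FaultOnMonitorTimeouts "$count"
    run haconf -dump -makero                            # save and close the configuration
}

raise_fault_threshold DG 8   # example values only
```

Overriding on the resource leaves the DiskGroup type default (4 in the output above) untouched for every other DiskGroup resource.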

Zahid_Haseeb · ‎11-25-2014

Thanks all for your valuable comments. Is it possible to use the new Diskgroup without VCS monitoring it?

As this is not a production Diskgroup, removing it from VCS monitoring will not cost us anything. Furthermore, these WARNINGs and ERRORs (WARNING V-16-2-13139, ERROR V-16-2-13027) will not appear on production if we can disable monitoring on the new Diskgroup.

Zahid_Haseeb · ‎11-26-2014

Disable a service group to prevent it from coming online. This process temporarily stops VCS from monitoring a service group on a system that is undergoing maintenance operations

Above is the extract from the below link

https://sort.symantec.com/public/documents/sfha/6.0/linux/productguides/html/vcs_admin/ch06s12s07.htm

Correct me if I am wrong.

I disabled the service group that contains the new Diskgroup (while my DG and MOUNT resources were online). This will prevent the service group and its resources from being monitored by VCS.

kjbss · ‎11-26-2014

I think you are wrong in that assumption...

On my VRTSvcs-6.1.1.000-RHEL6.i686 system, if I bring a service group online, and then disable the service group:

hagrp -modify service_group Enabled 0

The resources contained within that service group are still being monitored. I think this is the correct behaviour, though I might have argued that the attempt to "disable" an online service group should have been prevented by the VCS state engine (but I guess somebody had a use-case to support that).

Now, even though the service group is ONLINE (and disabled), you can offline the service group, no problem. However, if you then try to start the "disabled" service group, you get:

VCS WARNING V-16-1-10161 Group service_group is not enabled on system n0-rhel64

kjbss · ‎11-26-2014

I am not sure what exactly it is you are trying to achieve, but in general, if you do not want the "non-production" Disk Group to be monitored by VCS, why don't you leave it outside of VCS management and simply import the Disk Group manually when you want to?

Zahid_Haseeb · ‎11-26-2014

simply manually import the Disk Group when you want to?

It's easy for me, but not so simple for my client :)

Zahid_Haseeb · ‎11-26-2014

Thanks for your words. What about the document which says that when a service group is disabled it will no longer be monitored by VCS?

I don't want to offline the service group. I just want the new DG to stay online without being monitored by VCS. If I disable the DG it will not be online, but if I disable the SG my DG stays online. That's why I disabled the SG: to keep using the DG while it is not monitored by VCS.

Marianne · ‎11-26-2014

I agree - simply remove the new DG resource from VCS and import manually.


Zahid_Haseeb · ‎11-26-2014

Extract from the below Symantec document

Disable a service group to prevent it from coming online. This process temporarily stops VCS from monitoring a service group on a system that is undergoing maintenance operations

https://sort.symantec.com/public/documents/sfha/6.0/linux/productguides/html/vcs_admin/ch06s12s07.htm

As per my understanding, the above Symantec document only applies when the diskgroup (under the service group) is offline. If the diskgroup is online, the above Symantec document does not apply.

kjbss · ‎11-27-2014

You can override the new DG resource's MonitorTimeout, and then you can change its MonitorInterval to a very high number, say 9999, which would cause it to run the monitor routines every 9999 seconds, or 166.65 minutes. That should effectively give you a "most of the time" non-monitored resource.

I am not sure how large you can set the MonitorInterval to, but 9999 should certainly be enough for what it sounds like you are looking for...
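
To make the arithmetic concrete, here is a small sketch that converts a MonitorInterval value in seconds to hours, minutes and seconds. The helper name `interval_hms` is hypothetical. The commented commands underneath are what one would expect to run on a cluster node, assuming the resource is named DG (as in the log lines in the first post) and that MonitorInterval has to be overridden on the resource before it can be modified there.

```shell
#!/bin/sh
# Convert a MonitorInterval value in seconds to hours/minutes/seconds.
# interval_hms is a hypothetical helper name, not a VCS command.
interval_hms() {
    s="$1"
    printf '%dh %dm %ds\n' $((s / 3600)) $((s % 3600 / 60)) $((s % 60))
}

interval_hms 9999   # prints "2h 46m 39s", i.e. about 166.65 minutes

# Expected commands on a cluster node (left as comments because they need
# a running VCS cluster; DG is an assumed resource name):
#   haconf -makerw
#   hares -override DG MonitorInterval
#   hares -modify DG MonitorInterval 9999
#   hares -value DG MonitorInterval
#   haconf -dump -makero
```

With an interval this long, a real failure of the Disk Group could go unnoticed by VCS for close to three hours.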

Zahid_Haseeb · ‎11-27-2014

Yes KJ. This is the point we are working on right now. It seems this can resolve our problem.

Zahid_Haseeb · ‎11-27-2014

Importing a DG and mounting a resource manually is a command-prompt task for which the user doesn't have rights :(

Zahid_Haseeb · ‎11-27-2014

I have two resources, MOUNT and DISKGROUP, in the service group (non-production SG). I did the same for both resources.

Zahid_Haseeb · ‎11-28-2014

Although my problem is resolved via the workaround I mentioned above, I have a question:

Why can't a single node have multiple Diskgroups (under multiple SGs), each DG for a different purpose?

kjbss · ‎11-28-2014

Do you have a question or are you just showing what you've done?

BTW: I do not recommend such large numbers, especially for MonitorTimeout, but I guess if you have attained the desired behavior and your tests have so far proven no ill side-effects, then I suppose that is fine.

It sure would be a better solution to determine why the system cannot handle these minuscule monitor routines and either add system resources (primarily more RAM, which usually does the trick and isn't that expensive these days), find non-essential services that you can remove from the server, or otherwise re-tune your server via various kernel tuning methods.

Of course, a review of the various service and application logs might reveal additional issues to resolve that might free up system resources (and make the server capable of handling more processes/tasks).

Zahid_Haseeb · ‎11-28-2014

Yes, the reason for showing the snaps is that I changed the VCS attributes to get the desired result.

Marianne · ‎11-28-2014

What makes you ask this question?

Many of our customers have multiple SGs with own dg(s) running on the same node.

If there are no service group dependencies that specify 2 SGs cannot run on the same node (for various business or physical resource reasons, not VCS), then it is perfectly fine to have more than one SG with its own dg(s) on the same node.

I have never seen documentation saying that a single node cannot have multiple DGs under multiple SGs.
You cannot have the same DG in multiple SGs.

In this particular environment, it seems that your cluster nodes are struggling for physical resources, which makes more than one SG/dg on a single node undesirable.


Zahid_Haseeb · ‎12-01-2014

If there are no service group dependencies that specifies 2 SGs cannot run on the same node

So can we not have multiple DGs for multiple applications on a single cluster node?
