RemoteGroup Resource - MonitorOnly

jstucki
Level 4

I created a RemoteGroup resource last night, in VCS 5.1 SP1 RP1 on Solaris 10, to do some testing and validation.  I needed a resource that monitors a remote service group (an Oracle database service group).  It doesn't need to bring the remote service group online or take it offline.  It simply needs to show an online status when the remote service group is online, and show an offline (or faulted) status when the remote service group is not online.

I created the local RemoteGroup resource, and set the ControlMode attribute to "MonitorOnly".  I thought that by setting up this resource in this way, it would behave much like a NIC resource.  I was wrong.  The NIC resource doesn't have to be brought online or offline.  It just automatically is online when the NIC is working, and is offline (or faulted) when the NIC isn't working.  Just what you'd expect.
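
For reference, a MonitorOnly RemoteGroup resource of the kind described above might be declared in main.cf roughly as follows. This is a sketch, not the actual configuration from this cluster: the resource name, IP address, user name, password, and remote group name are all placeholders, and the attribute names are the ones documented for the RemoteGroup agent in the bundled agents guide.

```
// Hypothetical resource; every value below is a placeholder.
RemoteGroup rg_oradb (
	IpAddress = "192.0.2.10"      // a node or virtual IP of the remote cluster (placeholder)
	Username = admin              // remote cluster user (placeholder)
	Password = xxxxxx             // stored encrypted in a real main.cf (placeholder)
	GroupName = ora_grp           // remote Oracle service group (placeholder name)
	VCSSysName = ANY              // report on the remote group wherever it is online
	ControlMode = MonitorOnly     // never online or offline the remote group
	)
```

Note that ControlMode only governs what the resource does to the remote group. It does not change the resource's own Operations type.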

When I first created the local RemoteGroup resource, it turned blue after probing the remote service group, and indicated an online status.  This is what I expected.  Then the remote service group was manually taken offline, and the local RemoteGroup resource became faulted.  Again, just what I expected. 

However, when the remote service group was manually brought online again, the local RemoteGroup resource did not automatically clear its fault and come back online (like a NIC resource would).  It just remained faulted.  I don't think this is the expected behavior.

Another thing that was odd:  When the local service group (with the local RemoteGroup resource) was taken offline, the local RemoteGroup resource went offline.  I would have expected the local RemoteGroup resource to remain online when the local group was offlined, like a NIC or NFS (Server) resource does.

My question is this:   Is this the expected behavior of the MonitorOnly setting for the RemoteGroup resource?  If so, is anyone aware of some settings that will allow this resource to behave much like a NIC resource, when the ControlMode attribute is set to MonitorOnly? 

Thanks, -John

Accepted Solutions

mikebounds
Level 6
Partner Accredited

John,

The behavior you describe is expected, because RemoteGroup is an OnOff resource: its Operations attribute is not "None" as it is for NIC.  Suppose the resource were in a failover service group, it faulted, and the group came online on another system.  If the faulted resource then cleared itself and came online again when the monitored DB service group came back online (as a NIC does on a faulted system), you would have an OnOff resource online on 2 systems, causing a concurrency violation.  If the database were in the same cluster and it faulted, you would also have to clear the fault manually, and there is no way round this apart from scripting, so using the RemoteGroup resource is really no different.

If you want it to act like a NIC then you would need to write your own agent with Operations set to None.  You could actually clone the RemoteGroup agent and set its Operations to OnOnly, but this only partly solves your problem: the RemoteGroup resource would no longer go offline when you offlined the service group it was contained in, and I am not sure what, if anything, you gain from this.  I am pretty sure you can't clone it and set Operations to None, as I think the online entry point creates a lock file that the monitor entry point checks.
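
To make the "own agent with operations of None" idea concrete, here is a minimal sketch of what the type definition for such a custom agent might look like in types.cf. The type name RemoteGroupWatch and its attribute list are hypothetical and not part of VCS. The only line that matters for the NIC-like behavior is the Operations setting, which is how the bundled NIC type declares itself persistent. The agent's monitor entry point would still have to be written separately.

```
// Hypothetical custom agent type; the name and attributes are illustrative only.
type RemoteGroupWatch (
	static str ArgList[] = { IpAddress, Port, GroupName }
	static str Operations = None    // persistent: monitored only, never onlined or offlined
	str IpAddress
	int Port = 14141                // default VCS engine port
	str GroupName
	)
```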

Another alternative would be to put the RemoteGroup resource in a separate parallel service group (this is only supported from 5.1 SP1), set OnlineRetryLimit on the service group (NOT the resource) to a very high number, which is effectively infinite, and use a PreOnline trigger for the service group to clear the fault.
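
As a rough sketch of that last alternative, the separate parallel group might look something like this in main.cf. The group, resource, and system names, the retry count, and all resource values are placeholders chosen for illustration. The preonline trigger script itself is not shown: it would need to clear the resource fault (for example with hares -clear) and then continue the online with hagrp -online -nopre.

```
// Hypothetical group; names and values are placeholders.
group rg_monitor (
	SystemList = { sysA = 0, sysB = 1 }
	Parallel = 1                  // parallel group, so no concurrency violation
	AutoStartList = { sysA, sysB }
	OnlineRetryLimit = 10000      // set on the group, NOT the resource; illustrative "effectively infinite" value
	PreOnline = 1                 // run the preonline trigger before the group onlines
	)

	RemoteGroup rg_oradb (
		IpAddress = "192.0.2.10"
		Username = admin
		Password = xxxxxx
		GroupName = ora_grp
		VCSSysName = ANY
		ControlMode = MonitorOnly
		)
```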

Mike
