Re: VCS5 & Solaris 10 NFS Cluster - NFS Restart agent problem

Sel_Soy · ‎09-13-2006

Hi, I am setting up an NFS cluster on Solaris 10 with VCS 5. I created the NFS service group using the Java Console and the NFS template. The problem is the NFSRestart agent, which won't start. I have set the resource's properties (LockPathName, NFSLockFailover, NFSRes), but the resource consistently shows a red ? in the Resources tab of the service group view in the Java Console. If you try to start it, it complains that "Resource has not been probed on system ". Probing the resource does not resolve the problem or remove the red question mark.

The Unix messages file repeatedly shows the following error:

Sep 13 13:24:42 nfscluster2 AgentFramework: VCS ERROR V-16-1-16515 NFSRestart:dNFS_NFSRestart:monitor:NFSRestart:Service Management Facility monitoring is not disabled for NFS lockd daemon!! Returning Unknown

So it is complaining that an NFS daemon (lockd, according to the message) is still under Service Management Facility (SMF) monitoring. But I removed the documented daemons from SMF control as per the NFS agent documentation, i.e. I have already issued the following commands:

Disable SMF for nfsd and mountd
svccfg delete -f svc:/network/nfs/server:default
Disable SMF for nfsmapid
svccfg delete -f svc:/network/nfs/mapid:default

So it would appear that there is another command which needs to be run but I can't find any documentation on it anywhere. Has anyone got any ideas? This one has me baffled.

Oh, and I also tried this one after seeing it in the output of the svccfg list command (see below), but it hasn't resolved the problem either.
svccfg delete -f svc:/network/nfs/nlockmgr:default

-bash-3.00# svccfg list | grep nfs
network/nfs/cbd
network/nfs/client
network/nfs/mapid
network/nfs/nlockmgr
network/nfs/status
network/nfs/rquota
network/nfs/server
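For anyone repeating this check: the listing above shows service names, so it may be easier to ask SMF about each instance directly. The following is only a minimal sketch. It assumes that `svcs -H -o state <FMRI>` prints the instance state and exits non-zero when the instance is not known to SMF. The FMRI list is simply the set of instances named in this thread, and `report_smf_state` is a hypothetical helper name.

```shell
#!/bin/sh
# Instance FMRIs named in this thread. Whether this is the full set the
# NFSRestart agent checks is an assumption, not something from the VCS docs.
NFS_FMRIS="svc:/network/nfs/server:default
svc:/network/nfs/mapid:default
svc:/network/nfs/nlockmgr:default
svc:/network/nfs/status:default"

# Hypothetical helper: print one line per instance, either its SMF state
# or "absent" when svcs cannot find the instance (or svcs itself fails).
report_smf_state() {
  for fmri in $NFS_FMRIS; do
    if state=$(svcs -H -o state "$fmri" 2>/dev/null); then
      echo "$fmri $state"
    else
      echo "$fmri absent"
    fi
  done
}

report_smf_state
```

Any instance that still reports a state rather than "absent" is a candidate for the agent's "Service Management Facility monitoring is not disabled" complaint.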

Hope you can help.

Many thanks,
Sel

Sel_Soy · ‎09-13-2006

Resolved it!!! There is another daemon which needs to be removed from SMF and kept under VCS control, and that is the NFS status service:

svccfg delete -f svc:/network/nfs/status:default

Once that was removed, the error stopped appearing in the messages file and the resource could be probed and brought online successfully. It would have been nice if this were in the VCS documentation, as I am sure anyone trying to cluster NFS on Solaris 10 will run into this problem. Well, at least the answer is here for them to find!!! :)
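To put everything from this thread in one place, here is a rough sketch that collects the four `svccfg delete -f` commands used above. It is not an official procedure. The `DRY_RUN` variable and the `remove_from_smf` function are hypothetical names added for illustration, and by default the script only prints the commands, since deleting an instance from SMF is not something to run blindly.

```shell
#!/bin/sh
# The four instances removed from SMF in this thread.
NFS_FMRIS="svc:/network/nfs/server:default
svc:/network/nfs/mapid:default
svc:/network/nfs/nlockmgr:default
svc:/network/nfs/status:default"

# Hypothetical helper: with DRY_RUN=1 (the default here) only print each
# command; with DRY_RUN=0 actually run svccfg, as was done in this thread.
remove_from_smf() {
  for fmri in $NFS_FMRIS; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "svccfg delete -f $fmri"
    else
      svccfg delete -f "$fmri"
    fi
  done
}

remove_from_smf
```

Run it once as is to review the commands, then with `DRY_RUN=0` as root on each cluster node if they look right for your setup.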

Sel
