Clustering NFS - first disk commit hangs for 30 seconds

Sel_Soy
Level 3
Hi, my NFS clustering hell continues. I am setting up a (SPARC) Solaris 10, VCS 5 NFS cluster and I have the service group set up and working. However, whenever I mount one of the NFS volumes from a client server, the first command that accesses the disk hangs for approx 30 seconds. I have tried touching a file and creating a directory, and the result is always the same.

Now, looking at some discussion boards, the consensus seems to be that this is related to the NFS server doing a reverse DNS lookup on the client making the connection. But I have spent all of today setting up a DNS server and trying various settings, all of which have annoyingly made no difference whatsoever!!! Arrrr!
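For reference, this is roughly how I have been checking the reverse lookup on the NFS server (192.168.10.101 is a placeholder client address, not one from my setup):

# Which sources does Solaris consult for host lookups?
grep '^hosts' /etc/nsswitch.conf

# What name, if any, does the client's IP reverse-resolve to?
getent hosts 192.168.10.101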

Does anyone have any ideas? I am open to suggestions.

Many thanks,
Sel
5 REPLIES

Gene_Henriksen
Level 6
Accredited Certified
Have you tried sharing manually, mounting from a remote and seeing if you get the same problem?
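Something like this is what I mean (the path and host name here are placeholders, not from your setup):

On the NFS server:
  share -F nfs -o rw /export/test

On the client:
  mount -F nfs nfsserver:/export/test /mnt
  time touch /mnt/testfile   # if the 30-second hang is going to appear, it shows up here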

Sel_Soy
Level 3
Hi Gene, yes I tried that this morning and I get exactly the same behaviour on the first disk write after mounting. Nothing appears in the messages file, and I have set up DNS on the NFS server so that it can do a reverse lookup on the client making the NFS request.

I found this on the web: "when Solaris receives an NFS mount request Solaris does a reverse lookup of the IP address of the system making the request.
The result of the reverse lookup must EXACTLY (including case) match an entry in the exports file."
However, it makes no sense to add an entry to the exports file, as the mount and NFS share must be controlled by VCS.
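For context, under VCS the equivalent of an exports entry lives in the Share resource rather than in dfstab, so a hypothetical main.cf fragment (made-up resource name, path and client names) would look like:

Share nfs_share (
    PathName = "/export/data"
    Options = "rw=client1:client2"
    )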

Still none the wiser...
Sel

Gene_Henriksen
Level 6
Accredited Certified
One of the things we stress in VCS class is to test out the resources manually before putting them under VCS control. This is to catch problems of this sort before you put them into VCS.

Since VCS will share it with the Share resource, whatever you put there is what the NFS server will be looking for. So what if DNS returns a fully qualified domain name but the share only specifies the directory to be shared? Where does Solaris expect you to put the name that it will resolve?
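One way to see a mismatch like that (placeholder names, just a sketch) is to compare the name the server actually resolves against what the active share lists:

# What does the client's IP reverse-resolve to on the server?
getent hosts 192.168.10.101

# What names does the currently active share expect?
share
# a short name like "client1" will not exactly match "client1.example.com"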

Sel_Soy
Level 3
Right, I seem to have resolved this one. It would appear that you don't need DNS working so long as the NFS server's hosts file has an entry for each host that will be using it. I added an entry for each client server that would be connecting to the NFS server. To confirm, I made an ssh connection from each client server to the NFS server and then issued a "who" command on the NFS server, which displays the host name as it appears in the hosts file.

bash-3.00# who
root pts/1 Sep 20 15:15 (node1_nfs)

I then used this name in the VCS Share properties (root=node1_nfs:node2_nfs:node3_nfs:node4_nfs) and my clients can now connect and read/write to an NFS volume delay-free.
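Roughly, the setup ends up looking like this (the IP addresses are placeholders, and I am assuming a Share resource named nfs_share for the hares command):

/etc/hosts on the NFS server:
  192.168.10.101  node1_nfs
  192.168.10.102  node2_nfs
  192.168.10.103  node3_nfs
  192.168.10.104  node4_nfs

Pushing the matching names into the Share resource:
  haconf -makerw
  hares -modify nfs_share Options "root=node1_nfs:node2_nfs:node3_nfs:node4_nfs"
  haconf -dump -makero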

I have now run into another problem, however: any mounted NFS shares report "Stale NFS file handle" on the clients if you fail over the NFS cluster... not such a great NFS cluster if the connections don't work after failover!!! So the investigative work starts all over again. I may open a new posting on this one if I can't find the answer. The truth is out there.
Thx
Sel

Gene_Henriksen
Level 6
Accredited Certified
Stale File Handle is covered in several places. Essentially, Solaris builds the file handle handed to the client using the major/minor device numbers. If the device you are sharing does not have the same major/minor numbers on every cluster node, the handle changes after failover and this error occurs.

See the haremajor command for more info.
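A quick way to compare (a sketch; /dev/vx/dsk/nfsdg/vol1 is a placeholder path, assuming VxVM volumes) is to run these on each node and check that the numbers match:

# Driver major numbers are assigned in /etc/name_to_major
grep -w vxio /etc/name_to_major

# The major,minor pair of the shared device itself
ls -lL /dev/vx/dsk/nfsdg/vol1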

Also remember you can award points for correct answers.