FileShare resource fails to come online only after a server reboot

Dejan
Level 2
Hi

I have created an SQL service group for one SQL instance. I then added a FileShare resource for a share hosted on a shared disk, where the SQL database files will also be placed.

Then I linked the FileShare resource to the Lanman resource (VirtualServer 1, for example) and to the MountV resource (the shared disk with partition D:\, for example).

I brought the resources online and everything works: SQL is up and running, and so is the file share.

But when I reboot the server and it comes back up and brings this SQL service group online, all resources come online except the FileShare resource. When I clear the faulted FileShare resource and manually bring it online, it comes online without a problem.

Does anyone know why the FileShare resource fails to come online only after a server reboot?

Thank you for any advice

jlockley
Level 3
Employee Accredited Certified

Since you can online the resource again successfully with no further action, this suggests some kind of timing issue with the online.  Check that you have the FileShare linked to the correct MountV.

Usually FileShare fails to online because of a permissions issue, an incorrect path, or some other configuration issue that needs to be resolved before the online will work.  Any file system issue will fault the MountV, not the FileShare.

The reason it is faulting should be logged in the application event log.  Test the online and then look in that log for "AgentFramework" and you will see a message in the form of:

FileShare:<name>:online:Failed to open folder <some_folder_name> [X:Y]
where X and Y are numbers.

Take the second number and run "net helpmsg Y" from the command line; hopefully this will give you the reason why.
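As a rough illustration of the step above, here is a minimal Python sketch that pulls the [X:Y] pair out of such a message and passes Y to "net helpmsg". The function names and the sample message (share name, folder, and codes) are made up for the example; only the message shape and the "net helpmsg" command come from this thread.

```python
import re
import subprocess
import sys

# Matches the trailing "[X:Y]" pair at the end of a FileShare agent message.
CODE_PAIR = re.compile(r"\[(\d+):(\d+)\]\s*$")


def extract_codes(message):
    """Return (X, Y) as ints from a message ending in "[X:Y]", or None."""
    match = CODE_PAIR.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def describe_code(code):
    """Return the text printed by "net helpmsg <code>", or None if unavailable."""
    if sys.platform != "win32":
        return None  # "net helpmsg" is a Windows command
    result = subprocess.run(
        ["net", "helpmsg", str(code)],
        capture_output=True, text=True, check=False,
    )
    return result.stdout.strip() or None


if __name__ == "__main__":
    # Hypothetical message, shaped like the one described above.
    sample = "FileShare:ExampleShare:online:Failed to open folder E:\\example [2:3]"
    codes = extract_codes(sample)
    if codes is not None:
        print("codes:", codes)
        print("helpmsg:", describe_code(codes[1]))
```

On a non-Windows machine the lookup simply returns None, so the parsing part can still be tried anywhere.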

James

 

Jay_Kim
Level 5
Employee Accredited Certified
I agree with James that it does indeed sound like a timing issue with something that it needs to depend on.

You can also check the fileshare_A.txt log file in the "%VCS_HOME%\log" folder.
This is the agent log, which records all the events that occur with a FileShare resource.
In most cases it should tell you straight up why the resource has faulted, so it is definitely worth a look.
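If the log is long, a small script can narrow it down. The following is only a sketch: it assumes each log line has the shape "date time VCS SEVERITY message-id text", and the helper names and the sample lines in the comments are invented for illustration. The file name and folder are the ones mentioned above.

```python
import os
import re
from pathlib import Path

# Assumed line shape: "YYYY/MM/DD HH:MM:SS VCS <SEVERITY> <V-id> <text>"
LINE = re.compile(
    r"^(?P<stamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"VCS (?P<severity>[A-Z]+) (?P<msg_id>V-[\d-]+) (?P<text>.*)$"
)


def parse_line(line):
    """Split one log line into its fields, or return None if it does not match."""
    match = LINE.match(line.strip())
    return match.groupdict() if match else None


def errors_for(lines, resource):
    """Return parsed ERROR entries whose text mentions the given resource name."""
    hits = []
    for line in lines:
        entry = parse_line(line)
        if entry and entry["severity"] == "ERROR" and resource in entry["text"]:
            hits.append(entry)
    return hits


def default_log_path():
    """Build %VCS_HOME%\\log\\fileshare_A.txt, or None if VCS_HOME is not set."""
    home = os.environ.get("VCS_HOME")
    return Path(home) / "log" / "fileshare_A.txt" if home else None


if __name__ == "__main__":
    path = default_log_path()
    if path is not None and path.exists():
        with open(path, errors="replace") as handle:
            # "ExampleShare" is a placeholder; use your own resource name.
            for entry in errors_for(handle, "ExampleShare"):
                print(entry["stamp"], entry["msg_id"], entry["text"])
```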

Dejan
Level 2
Hi again

Thank you very much for both replies.

The MountV is OK, because the share goes online after I clear the faulted FileShare resource right after the reboot. It also works perfectly if I do a switchover.

I looked at the FileShare log and I saw this:

2009/08/24 11:31:17 VCS ERROR V-16-2-13120 Thread(2416) Error receiving from the engine. Agent(FileShare) is exiting.
2009/08/24 11:37:43 VCS ERROR V-16-10051-10506 FileShare:TestShareName:online:Unknown error for folder D:\exchange [14:19]
2009/08/24 11:39:44 VCS ERROR V-16-2-13066 Thread(3092) Agent is calling clean for resource(TestShareName) because the resource is not up even after online completed.
2009/08/24 11:39:44 VCS ERROR V-16-2-13068 Thread(3092) Resource(TestShareName) - clean completed successfully.
2009/08/24 11:39:44 VCS ERROR V-16-2-13071 Thread(3092) Resource(TestShareName): reached OnlineRetryLimit(0).

As you can see, I got "Unknown error for folder D:\exchange [14:19]", and when I ran "net helpmsg 19" I got "The media is write protected".

But this volume is not write protected, because the SQL Server data files are on it. If it really were write protected, the SQL service group would not come online without problems.

Any idea?

Thank you, Dejan



jlockley (Accepted Solution)
Level 3
Employee Accredited Certified

That message is reported by the file system. I don't know why, after a reboot, it would report that the folder for that file share is write protected.  SQL Server may see no problem because whatever is causing the FileShare to fault has passed by the time SQL Server tries to write to the volume.

I suppose you could try to find out by onlining just the MountV and then browsing the folder from the local server before you share it, but you might have to be quick and try to create a file before the problem condition passes.
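Since being quick by hand is difficult, a script could do the probing instead. This is only a sketch of the idea: the function name, the probe file name, and the defaults are made up, and it simply tries to create and delete a small file in the folder until it succeeds, recording the error from each failed attempt.

```python
import os
import time
import uuid


def probe_folder(folder, attempts=10, delay=0.5):
    """Try to create and delete a small file in `folder`, up to `attempts` times.

    Returns a list of (seconds_since_start, error) tuples, one per attempt,
    where error is None on success or the OSError that was raised.
    Stops at the first successful attempt.
    """
    results = []
    start = time.monotonic()
    for _ in range(attempts):
        probe = os.path.join(folder, f"probe-{uuid.uuid4().hex}.tmp")
        try:
            with open(probe, "w") as handle:
                handle.write("probe")
            os.remove(probe)
            error = None
        except OSError as exc:
            error = exc
        results.append((time.monotonic() - start, error))
        if error is None:
            break
        time.sleep(delay)
    return results


if __name__ == "__main__":
    # Hypothetical folder; point this at the folder behind the share.
    for elapsed, error in probe_folder(r"D:\example", attempts=20, delay=0.25):
        print(f"{elapsed:6.2f}s", "ok" if error is None else repr(error))
```

Run it right after the MountV comes online. If the first attempts fail and a later one succeeds, that would support the timing theory, and the recorded errors show what the file system reported in the meantime.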

I suggest you have the cluster apply the workaround you are doing manually (retry the online).  In the properties for the FileShare agent (right-click FileShare, View -> Properties View), select ShowAllAttributes, look for OnlineRetryLimit and set it to 1.  The cluster will then retry the online operation once, which, based on your experience, should bring the resource online.  Set this higher if needed.
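To show why a retry limit of 1 is enough for a problem that clears by itself, here is a toy Python model of an online operation with a retry limit. It is not the agent framework's code, and the function names are invented. It only mirrors the behaviour described in this thread: with a limit of 0 the first failed online faults the resource, and with a limit of 1 a second attempt is made.

```python
def bring_online(online_attempt, online_retry_limit=0):
    """Toy model of an online operation with a retry limit.

    `online_attempt` is a callable that returns True when the resource came up.
    Returns ("ONLINE", attempts) or ("FAULTED", attempts).
    """
    attempts = 0
    while True:
        attempts += 1
        if online_attempt():
            return "ONLINE", attempts
        # The first attempt is not a retry, so a limit of N allows N + 1 attempts.
        if attempts > online_retry_limit:
            return "FAULTED", attempts


def transient_failure(failures):
    """Build a fake online attempt that fails `failures` times, then succeeds."""
    remaining = [failures]

    def attempt():
        if remaining[0] > 0:
            remaining[0] -= 1
            return False
        return True

    return attempt


if __name__ == "__main__":
    # One transient failure right after reboot, as in this thread.
    print(bring_online(transient_failure(1), online_retry_limit=0))
    print(bring_online(transient_failure(1), online_retry_limit=1))
```

If the condition sometimes lasts longer than one attempt, the same model shows why a higher limit would be needed.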

James.

Dejan
Level 2
Hi James

That workaround is good and it works. Right after the reboot the resource faults, then OnlineRetryLimit=1 brings it online.
I tested it several times and it works every time.

Thank you for the workaround. If I find the root cause, I will let you know.

Dejan