File cluster instance failover takes more than 5 minutes after installing the FSA Agent (case 414-736-536)
- The customer has a Windows 2008 two-node (active-active) file cluster.
- There are several instances on the cluster.
- Moving an instance normally takes approx. 30 seconds.
- After installing the FSA Agent and creating the resources, a move takes about 8 minutes.
- In cluster administration we see that the placeholder service keeps running; after 8 minutes it is shut down (forcefully, by the system??).
--> we do not have any GPT disks in the cluster
We do not see any Event Viewer entries.
There are no entries in the cluster log.
Any ideas? Has anyone seen this before?
I would enable logging.
If this does not yield anything, we may need to gather some dumps of the resource manager.
What versions of EV and the FSA Agent are running on the nodes?
Was the clustered target file server configured with the HA option using the FSA Cluster Wizard?
You could try to add or remove the HA option to see if it makes any difference.
The FSA services run on all nodes permanently and are not stopped by the cluster service during failover. If the services are being stopped, it sounds like you may have configured the FSA Agent services to be controlled within the cluster group. If this is the case, you need to remove these service resources from the group.
Actually, you seem to have already given the hint here: "after installing the fsa agent and creating the resources, move takes about 8 minutes"
My interpretation is that you have manually added the service resources to the cluster group so that the cluster manages them. This is not correct, and you do not need to do it.
As long as the FSA Agent is installed on all nodes that are part of the cluster, all you need to do is configure the clustered file server as a target in the VAC and run the FSA Cluster Wizard. The wizard makes the resource group highly available (HA) and adds the FSA resource to the cluster group.
If you have added the resources, just remove them.
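If it helps to double-check which resources are affected before removing anything, here is a rough Python sketch of the check I have in mind. It does not talk to the cluster at all: the resource list is typed in by hand, the ClusterResource class and the find_misplaced_service_resources function are made up for illustration, and the "placeholder" marker and the service name in the sample data are assumptions you would replace with the service names actually shown on your nodes.

```python
from dataclasses import dataclass


@dataclass
class ClusterResource:
    """One resource in a cluster group, as copied by hand from the cluster console."""
    name: str
    resource_type: str      # e.g. "Physical Disk", "Network Name", "Generic Service"
    service_name: str = ""  # only meaningful for Generic Service resources


def find_misplaced_service_resources(resources, agent_service_markers):
    """Return the Generic Service resources whose service name contains any marker.

    These are the candidates that were probably added to the group by hand
    and should not be controlled by the cluster.
    """
    flagged = []
    for res in resources:
        if res.resource_type != "Generic Service":
            continue
        lowered = res.service_name.lower()
        if any(marker.lower() in lowered for marker in agent_service_markers):
            flagged.append(res)
    return flagged


# Hypothetical inventory of one cluster group (all names are illustrative).
group = [
    ClusterResource("Disk F:", "Physical Disk"),
    ClusterResource("FS1 Name", "Network Name"),
    ClusterResource("Placeholder Svc", "Generic Service",
                    "Enterprise Vault File Placeholder Service"),
    ClusterResource("Print Spooler", "Generic Service", "Spooler"),
]

# Assumption: the FSA Agent service can be recognised by this substring.
markers = ["placeholder"]

flagged = find_misplaced_service_resources(group, markers)
for res in flagged:
    print(f"Candidate for removal from the group: {res.name} ({res.service_name})")
```

Anything the sketch flags is only a candidate; confirm in the cluster console that it really is a manually added FSA Agent service resource before you remove it.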
The current status is:
- Actually, we DO have GPT disks.
- We managed to update every component to EV 9.0.2.
- To be sure, we wiped out the complete file archiving config (that is, uninstalling/removing all components from the cluster nodes and, where necessary, cleaning up remaining files/folders under c:\prog..(x86)\enterprise vault).
- No luck; failover still does not work.
- To make it funnier, we have a test file server that had the same issue, but after removing and reinstalling for the THIRD TIME (!!!) it now works there. Retrying until it works is not something we want to do on a production system.