We are running EV8 SP2 on a W2k8 failover cluster. Recently our SQL Server was unavailable (a DNS issue, not a SQL issue), but our EV8 SP2 cluster resources didn't fail. Does anyone know how to have the Vault alert us if SQL is unreachable? Is there a resource or dependency I need to add? How do I monitor SQL from the Vault's perspective?
From EV's perspective, any transaction that fails will usually be logged to the EV event log (Symantec Enterprise Vault in EV8+, just Enterprise Vault in older versions). If SQL is down and uncontactable, I'm assuming you would accumulate lots and lots of errors, especially 13360 errors. If 10,000 errors are hit in 7200 seconds, the Admin Service would initiate an EV Frenzy and cause the services to fail.
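To make the "10,000 errors in 7200 seconds" idea concrete, here is a minimal sketch of a sliding-window counter. It only illustrates the threshold as described above and is not how the Admin Service is actually implemented; the class name `ErrorWindow` and its parameters are made up for the example, and it assumes timestamps arrive in increasing order.

```python
from collections import deque


class ErrorWindow:
    """Counts errors seen within the last `window_seconds` (hypothetical helper)."""

    def __init__(self, threshold=10_000, window_seconds=7200):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._times = deque()  # timestamps of errors still inside the window

    def record(self, timestamp):
        """Record one error at `timestamp` (seconds).

        Returns True once the number of errors inside the window
        reaches the threshold, False otherwise.
        """
        self._times.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop errors that have aged out of the window.
        while self._times and self._times[0] <= cutoff:
            self._times.popleft()
        return len(self._times) >= self.threshold
```

A monitoring script could use a much lower threshold than 10,000 so that an alert goes out well before the services themselves give up.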
So if you have anything that can monitor the Enterprise Vault event log and keep an eye out for that 13360, you can be alerted to the problem a lot quicker. If you can find a specific event that says the SQL Server is completely unreachable, you may be able to trigger some custom rules on it.
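As a rough sketch of that kind of check, the snippet below shells out to the built-in Windows `wevtutil` tool to pull the most recent 13360 events. The log name is the one mentioned earlier in this thread for EV8+; the function names and the default of 20 events are assumptions for illustration, and the script would need to run on the EV server itself.

```python
import subprocess


def build_query_command(event_id, log_name="Symantec Enterprise Vault", max_events=20):
    """Build a wevtutil command line that returns the newest matching events as text."""
    return [
        "wevtutil", "qe", log_name,
        f"/q:*[System[(EventID={event_id})]]",  # XPath filter on the event ID
        f"/c:{max_events}",                     # cap the number of events returned
        "/rd:true",                             # newest events first
        "/f:text",                              # human-readable output
    ]


def fetch_recent_events(event_id=13360):
    """Run the query and return wevtutil's text output (Windows only)."""
    result = subprocess.run(
        build_query_command(event_id),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

If `fetch_recent_events()` returns anything other than an empty string, a scheduled task could send a mail or raise an alert in whatever monitoring system is in use.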
The only issue I can see is this: if the SQL Server is unreachable from Server A, will it be reachable from Server B? I'm assuming EV's failover server also couldn't resolve the address. And if the SQL Server is completely dead in the water, the second EV server would face the same problem and the cluster would go into a loop of failing over.
In our environment, none of our EV servers are clustered, but each of our SQL Servers is, since an issue is more likely with the SQL Server than with EV itself. Normally, when you fail EV over, the problems follow it to the other node.
Since my Admin Service didn't fail, maybe I am missing a dependency? I have the index, MSMQ, shopping, and vault store LUNs.
You're right about event log monitoring. It looks like W2K8 has an easy way to send event log notifications via email, so I'll be testing this out. I'm also going to see if SCOM R2 can monitor the events.
Yes, you are correct that a failover wouldn't resolve the problem, since SQL wouldn't be accessible from the other node either, but the services would then likely fail and trigger alerts.
Thanks for your help!
Yes, it's a situation that occurred with Dynamic DNS. The SQL DNS entry was changed, so when EV updated its cache it was told that SQL had a different IP. SQL was up and running the whole time with the correct IP.
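A DNS change like this could be caught from the EV server with a small check that compares what the name resolves to against the address SQL is known to have. This is only a sketch: the function names are invented for the example, and the hostname and IP in the usage note are placeholders, not values from our environment.

```python
import socket


def resolve_ipv4(hostname):
    """Return the sorted IPv4 addresses the local resolver gives for `hostname`."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def check_sql_dns(hostname, expected_ip, resolver=resolve_ipv4):
    """Return (ok, message) describing whether `hostname` still resolves to `expected_ip`.

    `resolver` can be swapped out for testing; by default the local
    resolver is used, so the result reflects what this server sees.
    """
    try:
        addresses = resolver(hostname)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return False, f"{hostname} did not resolve: {exc}"
    if expected_ip in addresses:
        return True, f"{hostname} resolves to {expected_ip}"
    return False, f"{hostname} resolves to {', '.join(addresses)}, expected {expected_ip}"
```

Calling something like `check_sql_dns("sql01.example.local", "10.0.0.5")` from a scheduled task on each EV node would flag the mismatch even while SQL itself is healthy.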