VMware backup of SQL fails to run ASC
Hi
I've configured a VMware backup of a single SQL virtual machine running SQL2008 on a Windows 2008 platform. I've also installed the Veritas VSS provider.
Basically the ASC is failing to run with the following error:
Error nbpem (pid=3288) Cannot perform application state capture because a client name cannot be determined for <client FQDN>
This looks identical to the following post, as far as I can tell.
Name resolution seems to be good from both the master server (RH Linux) and the media server (NBU 5240 Appliance). Both are running NBU 8.1 and both pass the bpclntcmd checks. I have tried creating host entries on all the machines but get the same error.
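For anyone wanting to repeat the forward and reverse lookup outside of bpclntcmd, here is a minimal Python sketch. It is not part of the NetBackup tooling and does not replace bpclntcmd; it only uses the OS resolver, so it will not show what NetBackup has cached. The function name `check_name_resolution` and its `forward`/`reverse` parameters are made up for illustration.

```python
import socket


def check_name_resolution(hostname,
                          forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Do a forward lookup, then a reverse lookup, and compare the names.

    `forward` and `reverse` are hypothetical injection points so the check
    can be exercised without a live DNS server; by default they use the
    operating system resolver.
    """
    result = {"hostname": hostname, "ip": None,
              "reverse_name": None, "consistent": False}
    try:
        result["ip"] = forward(hostname)
    except OSError:
        # Forward lookup failed (socket.gaierror is a subclass of OSError).
        return result
    try:
        # gethostbyaddr returns (primary_name, alias_list, ip_list).
        result["reverse_name"] = reverse(result["ip"])[0]
    except OSError:
        # Reverse lookup failed (socket.herror is a subclass of OSError).
        return result
    # Compare case-insensitively and ignore a trailing dot on either name.
    wanted = hostname.lower().rstrip(".")
    found = result["reverse_name"].lower().rstrip(".")
    result["consistent"] = (wanted == found)
    return result
```

Running `check_name_resolution` with the client FQDN on the master, the media server, and the client itself and comparing the three results would show whether any of them disagrees about the name. A short name coming back from the reverse lookup when the policy uses the FQDN would be reported as not consistent.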
I have also noticed the following. VMware Tools was showing 'not running' on the SQL VM, so I rebooted. It was running once the VM had booted but seemed to change back to 'not running' when I tried a test backup. The VMware Tools version is 10.1.15.
Within the netbackup policy, if I browse to the vm instead of doing a query I can see 'DNS name' is blank but the IP details are there. I reinstalled the vmtools (which I assume overwrites the Veritas VSS provider?) and this info came back.
If I set the query to find the VM via the display name, it finds the VM ok and the backup of the actual VM completes ok, but I get the ASC error. If I set the query to find the VM based on the VM hostname, it finds the VM ok but the job fails straight away with an error 200 in Activity Monitor.
I guess I need the ASC bit working so I can do the SQL GRT but perhaps I don't need the Veritas VSS providers as it sounds like this will only allow me to truncate the logs (and I can do that by other methods).
Amazingly this did work once only - the very first time I tried it having just installed the Veritas VSS provider - but failed on the next time I ran the job and hasn't worked since. Nothing was changed.
Can anyone tell me the best log files to look into to help troubleshoot this, and whether those log files are on the master, the media server or the client?
thanks in advance and kind regards
Hi CadenL,
I've been down this same path before unfortunately :smileysad:
Exact same symptoms - VMware Tools appears to be running fine, then boom... the IP address and DNS name are gone when you look at your vCenter console, and of course backups that rely on them fail. If you look at the services remotely, you will find VMTools has stopped. Try logging in to that server and you'll probably find that VMTools is magically running again.
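If you want to catch the moment the service drops without logging in to the server, the remote check can be scripted. This is only a rough sketch, assuming a Windows admin host, that the service is registered under the name `VMTools`, and that `sc` prints its usual English `STATE : <number> <NAME>` line; the function names are made up for illustration.

```python
import re
import subprocess

# Matches the state line of `sc query` output, e.g. "STATE : 4  RUNNING".
STATE_RE = re.compile(r"STATE\s*:\s*(\d+)\s+(\w+)")


def parse_sc_state(sc_output):
    """Return the state name (e.g. 'RUNNING', 'STOPPED') or None if absent."""
    match = STATE_RE.search(sc_output)
    return match.group(2) if match else None


def query_service_state(server, service="VMTools"):
    """Query a service on a remote Windows host with sc and parse its state.

    Assumption: the account running this has rights to query services on
    `server`, and `service` is the short service name, not the display name.
    """
    completed = subprocess.run(
        ["sc", rf"\\{server}", "query", service],
        capture_output=True, text=True, check=False,
    )
    return parse_sc_state(completed.stdout)
```

Calling `query_service_state` every minute or so around the backup window and logging the result with a timestamp should show whether the service stops when the snapshot is taken.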
You'll probably also see ghost disk drives under Computer Management->Disk Management (likely with a capacity of 0 (zero) MB). Also under Device Manager, if you turn on "Show Hidden Devices", I bet you will find ghost disks under Disk Drives, possibly ghost "Generic volume shadow copy" entries under Storage volume shadow copies, and ghost Generic volumes under Storage volumes. The ghost items in Device Manager will be in a very light grey compared to the real, active devices. If you look under their properties on the General tab, you'll probably see "Currently, this hardware device is not connected to the computer. (Code 45)". I see all these items as leftovers from each occurrence of this issue.
You can delete all these items, and when the issue happens again they will reappear. If you do a VSSADMIN List Shadows, interestingly there are no orphaned shadow copies, just these ghost devices.
The cause appears to be a number of buggy versions of VMware Tools. We saw this first on a very early 10.0.x version, and on several versions following that. We tried 10.1.0, and initially that seemed good, but then we started having the same problem with it. We currently run 10.1.10 and so far so good. It's disappointing to see you are on 10.1.15 and still seeing this, which means the same or a similar bug keeps getting reintroduced.
As a temporary workaround, we had to revert to pure agent-based backups as an interim method. With those, VMTools stayed stable and didn't bomb out.
Not a fix, but hope this info helps.
Steve