Backup Exec 2010 and 2012 leaving snapshots behind
Hi there,
I have a number of customers who run VMware infrastructures and have Backup Exec in place to perform their backups, with varying degrees of success. However, I've discovered an issue which is of some concern: Backup Exec is frequently leaving snapshots behind when backup jobs fail. As a result the VMs continue running on the snapshots, and I'm having to go and consolidate them manually. This issue is happening in the following scenarios (that I've found so far):
Physical Backup Server (Win2k3, all patches installed) + BE2010 R3 > AVVI Backup of ESXi 4 with all patches installed, VMs running 2k3 and 2k8r2
Physical Backup Server (Win2k3, all patches installed) + BE2012 > AVVI Backup of ESXi 4 with all patches installed, VMs running 2k3 and 2k8r2
Physical Backup Server (Win2k3, all patches installed) + BE2012 > AVVI Backup of ESXi 5.1 with patches installed, VMs running Linux and 2k3
I appreciate there are some 'quality issues' with 2012 and VMware, as per a Symantec blog post. However, I'm only trying to perform non-GRT backups of a Linux machine, as I can accept not having granular restores available for the time being. My primary concern is the snapshots frequently being left behind: the BE media services aren't crashing; the job simply fails and BE doesn't instruct vSphere to consolidate the snapshot.
This issue was first discovered on Wednesday morning, when a customer lost access to their most important database. The datastore had filled up after a housekeeping process, which is scheduled to run after the backup, kicked in and caused the snapshot delta to grow rapidly. Consequently my customer lost 6 hours of their day whilst I consolidated snapshots and got the machine back up and running. On top of this, the backup had failed, meaning that if I'd had any corruption issues I would have had to roll them back another day for a working backup.
Presently I'm having to check for snapshots being left behind and consolidate them manually, which I did 3 times yesterday and twice this morning.
First off, am I doing something wrong? The configuration isn't particularly complex and I can't see any obvious options I'm missing that would lead to this scenario, but I'm all ears if someone has a pointer or two. Secondly, if it is a known issue going as far back as 2010 and ESXi 4, is there a patch scheduled to deal with this?
I accept that if the media server crashes then it wouldn't be around to make an API call to VMware to perform the consolidation, so in that situation I know to go and tidy up manually. But if the services aren't failing and are still leaving things behind, this is a nightmare: I end up spending more time making checks and working around the product. That's not ideal, as I sell Backup Exec to my customers on the theory that it should reduce their support bills by being a superior product. Sadly, at present, I'm finding I spend more time supporting BE than I did NTBackup or the Server Backup built into Windows, after convincing them to shell out ~£1k on software to improve reliability and reduce support.
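In case it helps anyone reproduce the check I'm currently doing by hand, the logic is roughly as below. This is only a minimal sketch in Python: the SnapshotRecord type, the find_stale_snapshots function and the 12-hour threshold are my own illustrative assumptions, not anything provided by Backup Exec or VMware, and it assumes the list of snapshots has already been pulled out of vSphere by whatever means you prefer.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass(frozen=True)
class SnapshotRecord:
    """One snapshot found on a VM (hypothetical record format)."""
    vm_name: str
    snapshot_name: str
    created: datetime


def find_stale_snapshots(
    records: List[SnapshotRecord],
    now: datetime,
    max_age: timedelta = timedelta(hours=12),  # assumed threshold, tune to the backup window
) -> List[SnapshotRecord]:
    """Return snapshots older than max_age, oldest first.

    Anything that outlives the backup window is a candidate for having
    been left behind by a failed job and needing manual consolidation.
    """
    stale = [r for r in records if now - r.created > max_age]
    return sorted(stale, key=lambda r: r.created)


if __name__ == "__main__":
    # Made-up example inventory, purely for illustration.
    now = datetime(2013, 1, 10, 9, 0)
    inventory = [
        SnapshotRecord("db01", "backup-snap", datetime(2013, 1, 9, 1, 0)),
        SnapshotRecord("web01", "backup-snap", datetime(2013, 1, 10, 8, 30)),
    ]
    for snap in find_stale_snapshots(inventory, now):
        age = now - snap.created
        print(f"{snap.vm_name}: '{snap.snapshot_name}' is {age} old, check and consolidate")
```

The threshold only needs to be longer than the longest backup job, so that a snapshot still in use by a running job isn't flagged. It doesn't fix the underlying problem, it just shortens the time a leftover snapshot can sit there growing.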
Many Thanks
Phill