Using NBU for VMware

smakovits
Level 6

I am running 7.1.0.3 to back up my virtual machines. When the policy executes, the majority of systems start and finish successfully; however, a few others have issues when it comes to creating the snapshot. Instead of a single snapshot for a machine, vCenter starts trying to create 5+ snapshots.

The jobs fail in NBU, while the snapshots just sit there.

48 REPLIES

Mark_Solutions
Level 6
Partner Accredited Certified

Maybe the client itself then - perhaps it is having difficulty quiescing - does it run SQL or Exchange or something?

The tools.conf can be edited to disable AppQuiesce.
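
A minimal sketch of that edit, in case scripting it across several guests is easier than doing it by hand. It assumes the setting is the `vss.disableAppQuiescing = true` key under a `[vmbackup]` section, which is how VMware's documentation describes it as far as I know; please verify this against your VMware Tools version. The function name and the path passed in are hypothetical.

```python
import configparser
from pathlib import Path


def disable_app_quiesce(tools_conf: Path) -> None:
    """Set vss.disableAppQuiescing = true under [vmbackup] in a tools.conf file.

    Assumption: section and key names follow VMware's documentation; check them
    for your Tools version. Note that configparser drops any comments already
    in the file when it rewrites it, so keep a backup copy.
    """
    parser = configparser.ConfigParser()
    parser.optionxform = str  # preserve key case (the default lowercases keys)
    if tools_conf.exists():
        parser.read(tools_conf)
    if not parser.has_section("vmbackup"):
        parser.add_section("vmbackup")
    parser.set("vmbackup", "vss.disableAppQuiescing", "true")
    with tools_conf.open("w") as fh:
        parser.write(fh)
```

Call it with the location of tools.conf inside the guest, e.g. `disable_app_quiesce(Path(r"C:\ProgramData\VMware\VMware Tools\tools.conf"))` (that path is an assumption for Windows 2008), and restart the VMware Tools service afterwards.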

Stuart_Green
Level 6

Here's a thing. You say some VMs back up OK and some don't.

I would start to rule out differences at the VM level.

{So here's the tip}

A good tool I use to inventory and check all VM settings is RVTools.

Download and run RVTools: http://www.robware.net/

Export to an Excel spreadsheet and compare and contrast your working and non-working VMs.

Check VM hardware version, OS version, network adapters, SCSI adapters, persistent disks... the whole nine yards for all your VMs.

Run this as an Administrator of the vSphere infrastructure (it uses passthrough authentication).

From the command line:

"C:\Program Files\RobWare\RVTools\RVTools.exe" -passthroughAuth -s barney.bedrock.net -c ExportAll2xls -d c:\Temp\ -f RVTools.xls

(barney.bedrock.net is the name of the vCenter server; the path to the executable is quoted because it contains a space)
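
If the spreadsheet is large, the comparison can be roughed out in a few lines of Python. This is only a sketch: it assumes the relevant sheet has been saved from Excel as CSV, and the column name `VM` and the function names are hypothetical, so adjust them to whatever headers your export actually has. It reports settings whose value is the same on every failing VM and is not seen on any working VM.

```python
import csv


def load_rows(csv_path):
    """Read an exported sheet (saved as CSV) into a list of dicts, one per VM."""
    with open(csv_path, newline="") as fh:
        return list(csv.DictReader(fh))


def suspicious_settings(rows, failing_names, name_col="VM"):
    """Return {column: value} where all failing VMs share a value
    that no working VM has. name_col is an assumed header name."""
    failing = [r for r in rows if r[name_col] in failing_names]
    working = [r for r in rows if r[name_col] not in failing_names]
    result = {}
    if not failing:
        return result
    for col in failing[0]:
        if col == name_col:
            continue
        failing_values = {r.get(col) for r in failing}
        if len(failing_values) != 1:
            continue  # failing VMs disagree on this column, so it is not a common factor
        value = next(iter(failing_values))
        if all(r.get(col) != value for r in working):
            result[col] = value
    return result
```

For example, `suspicious_settings(load_rows("vInfo.csv"), {"vm-a", "vm-b"})` (file and VM names made up) would list the columns worth a closer look first. It is a heuristic, not proof of the cause.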


Also worth a try: reset Changed Block Tracking (CBT).
Changed Block Tracking (CBT) on virtual machines

Enabling Changed Block Tracking (CBT) on virtual machines


I would also be interested at the VMware level.

Review the vmware.log of the problem VMs during the backups. It tells you a lot about the running of the VM and the snapshot operation.

You can find this file in the same directory as the VM's configuration (i.e. the .vmx file), so get it through the Datastore Browser view of the vSphere client.
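
Once the log is downloaded, a quick filter helps cut it down to the interesting part. The sketch below just does a case-insensitive keyword search; the keyword list is my own guess at what is relevant here, not an official set of log markers, and the function name is hypothetical.

```python
import re

# Assumed keywords of interest for snapshot/quiesce troubleshooting.
KEYWORDS = re.compile(r"snapshot|quiesc|vss", re.IGNORECASE)


def snapshot_lines(lines):
    """Return (line_number, text) pairs for lines mentioning a keyword."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if KEYWORDS.search(line)
    ]
```

Usage would be something like `with open("vmware.log", errors="replace") as fh: hits = snapshot_lines(fh)`, then compare the timestamps of the hits against the time the backup job started.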

Pravs
Level 4
Employee
smakovits, did you hear back from VMware support on this?

smakovits
Level 6

Got this last week, "The latest update from our engineering team is that the "VSSVC.exe" process is in a hung state for approximately two-three minutes.  They're using debugging tools to analyze why this process is getting into this state."

smakovits
Level 6

In the end, this is where I may end up if VMware doesn't find a fix.

smakovits
Level 6

Finally got new direction from VMware and, as a result, I am engaging Microsoft.

"On a fresh install of Windows 2008 with VMware tools installed we see 4 events, 2 to bring the volume online and 2 offline.  In your virtual machine we're seeing 14 additional events (similar to the one above).  The first one hangs for minutes.

Ultimately, we believe this is either an issue with a) VSS b) the volume manager or c) something else causing these events to pause.

At this point we will need to have Microsoft engaged in order to debug VSS further."

Pravs
Level 4
Employee

This looks interesting. Keep us posted and good luck with Microsoft :)

Pravs
Level 4
Employee

Well, you can also contact Symantec Support (referencing ETrack 2892912) so that they can request the option to set the number of times a snapshot is re-attempted.

I.e. if the first snapshot fails, NetBackup by default tries it 9 more times. If these left-over snapshots are a big concern in your environment (for me they always will be), please contact us through a case so we can provide a configuration to help you. Of course it won’t help with the snapshot failures you are facing, but you should be able to control the number of snapshots.

smakovits
Level 6

Thanks Pravs, in the short term this will help, since the 10 snapshot attempts keep other backups from starting for over an hour; fewer retries means the job fails faster.

On a separate note, Microsoft has told me it's not them and VMware says it's Microsoft, so I am working to make the two of them talk... we shall see.

The one VM that was failing for 3 months suddenly started to work, but now others that did work do not, so this is most enjoyable to say the least...