Forum Discussion

JAM1991
Level 4
10 years ago

VMware - Snapshot error encountered (156)

I have a VMware policy configured using annotation query to select clients.

Yesterday one of the clients failed with a status code of 156.

Last night the same client completed successfully.

In the detailed status, I noticed what look like errors relating to a resource:

23/09/2014 18:00:52 - Info nbrb(pid=2888) Limit has been reached for the logical resource pd-dc2-nbumst.VMware.snapshot.vCenter.pd-dc2-vcenter.pdports.co.uk 

What do I need to check or change to stop this from recurring?

Thanks,


5 Replies

  • The line above is an Info message, not an error.

    Please post all text in Details tab of the failed job.

    Logs needed on the Backup Host:

    bpfis and VxMS.

  • Please see details tab text below as requested.

     

    23/09/2014 18:00:52 - Info nbjm(pid=3764) starting backup job (jobid=42117) for client PD-DC2-CITRIX2, policy VMware_NO_BLIB_DC2, schedule Daily_Incremental  
    23/09/2014 18:00:52 - Info nbjm(pid=3764) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=42117, request id:{B549646F-EA56-4F1B-B1FA-AA1528A31651})  
    23/09/2014 18:00:52 - requesting resource stu_disk_pd-dc2-nbu1
    23/09/2014 18:00:52 - requesting resource pd-dc2-nbumst.NBU_CLIENT.MAXJOBS.PD-DC2-CITRIX2
    23/09/2014 18:00:52 - requesting resource pd-dc2-nbumst.VMware.Datastore.pd-dc2-vcenter.pdports.co.uk/DC2/DC2 SAN Volume 1
    23/09/2014 18:00:52 - requesting resource pd-dc2-nbumst.VMware.snapshot.vCenter.pd-dc2-vcenter.pdports.co.uk
    23/09/2014 18:00:52 - Info nbrb(pid=2888) Limit has been reached for the logical resource pd-dc2-nbumst.VMware.snapshot.vCenter.pd-dc2-vcenter.pdports.co.uk    
    23/09/2014 18:03:01 - awaiting resource stu_disk_pd-dc2-nbu1 - Maximum job count has been reached for the storage unit
    23/09/2014 18:03:05 - Info nbrb(pid=2888) Limit has been reached for the logical resource pd-dc2-nbumst.VMware.snapshot.vCenter.pd-dc2-vcenter.pdports.co.uk    
    23/09/2014 18:04:25 - awaiting resource stu_disk_pd-dc2-nbu1 - Maximum job count has been reached for the storage unit
    23/09/2014 18:04:29 - Info nbrb(pid=2888) Limit has been reached for the logical resource pd-dc2-nbumst.VMware.snapshot.vCenter.pd-dc2-vcenter.pdports.co.uk    
    23/09/2014 18:04:36 - awaiting resource stu_disk_pd-dc2-nbu1 - Maximum job count has been reached for the storage unit
    23/09/2014 18:04:40 - Info nbrb(pid=2888) Limit has been reached for the logical resource pd-dc2-nbumst.VMware.Datastore.pd-dc2-vcenter.pdports.co.uk/DC2/DC2 SAN Volume 1 
    23/09/2014 18:50:47 - Info bpbrm(pid=106712) PD-DC2-CITRIX2 is the host to backup data from     
    23/09/2014 18:50:47 - Info bpbrm(pid=106712) reading file list for client        
    23/09/2014 18:50:47 - Info bpbrm(pid=106712) start bpfis on client         
    23/09/2014 18:50:47 - Info bpbrm(pid=106712) Starting create snapshot processing         
    23/09/2014 18:50:48 - Info bpfis(pid=4680) Backup started           
    23/09/2014 18:50:49 - snapshot backup of client PD-DC2-CITRIX2 using method VMware_v2
    23/09/2014 18:56:55 - Critical bpbrm(pid=106712) from client PD-DC2-CITRIX2: FTL - VMware_freeze: VIXAPI freeze (VMware snapshot) failed with -1: Unrecognized error
    23/09/2014 18:56:55 - Critical bpbrm(pid=106712) from client PD-DC2-CITRIX2: FTL - VMware error received: An error occurred while saving the snapshot: Failed to quiesce the virtual machine.
    23/09/2014 18:56:55 - Critical bpbrm(pid=106712) from client PD-DC2-CITRIX2: FTL - vfm_freeze: method: VMware_v2, type: FIM, function: VMware_v2_freeze 
    23/09/2014 18:56:56 - Critical bpbrm(pid=106712) from client PD-DC2-CITRIX2: FTL -        
    23/09/2014 18:56:56 - Critical bpbrm(pid=106712) from client PD-DC2-CITRIX2: FTL - vfm_freeze: method: VMware_v2, type: FIM, function: VMware_v2_freeze 
    23/09/2014 18:56:56 - Critical bpbrm(pid=106712) from client PD-DC2-CITRIX2: FTL -        
    23/09/2014 18:56:56 - Critical bpbrm(pid=106712) from client PD-DC2-CITRIX2: FTL - snapshot processing failed, status 156   
    23/09/2014 18:56:56 - Critical bpbrm(pid=106712) from client PD-DC2-CITRIX2: FTL - snapshot creation failed, status 156   
    23/09/2014 18:56:56 - Warning bpbrm(pid=106712) from client PD-DC2-CITRIX2: WRN - ALL_LOCAL_DRIVES is not frozen    
    23/09/2014 18:56:56 - Info bpfis(pid=4680) done. status: 156          
    23/09/2014 18:56:56 - end operation
    23/09/2014 18:56:56 - Info bpfis(pid=4680) done. status: 156: snapshot error encountered       
    23/09/2014 18:57:00 - Info bpbrm(pid=107568) Starting delete snapshot processing         
    23/09/2014 18:57:03 - Info bpfis(pid=3036) Backup started           
    23/09/2014 18:57:03 - Critical bpbrm(pid=107568) from client PD-DC2-CITRIX2: FTL - cannot open C:\Program Files\Veritas\NetBackup\online_util\fi_cntl\bpfis.fim.PD-DC2-CITRIX2_1411495262.1.0    
    23/09/2014 18:57:03 - Info bpfis(pid=3036) done. status: 4207          
    23/09/2014 18:57:03 - end operation
    23/09/2014 18:57:03 - Info bpfis(pid=3036) done. status: 4207: Could not fetch snapshot metadata or state files  
    23/09/2014 19:01:02 - granted resource pd-dc2-nbumst.NBU_CLIENT.MAXJOBS.PD-DC2-CITRIX2
    23/09/2014 19:01:02 - granted resource pd-dc2-nbumst.VMware.Datastore.pd-dc2-vcenter.pdports.co.uk/DC2/DC2
    23/09/2014 19:01:02 - granted resource pd-dc2-nbumst.VMware.snapshot.vCenter.pd-dc2-vcenter.pdports.co.uk
    23/09/2014 19:01:02 - granted resource MediaID=@aaaab;DiskVolume=PureDiskVolume;DiskPool=dp_disk_pd-dc2-nbu1;Path=PureDiskVolume;StorageServer=pd-dc2-nbu1;MediaServer=pd-dc2-nbu1
    23/09/2014 19:01:02 - granted resource stu_disk_pd-dc2-nbu1
    23/09/2014 19:01:02 - estimated 9292488 Kbytes needed
    23/09/2014 19:01:02 - begin Parent Job
    23/09/2014 19:01:02 - begin Application Snapshot, Step By Condition
    Status 0
    23/09/2014 19:01:02 - end Application Snapshot, Step By Condition; elapsed time: 0:00:00
    23/09/2014 19:01:02 - begin Application Snapshot, Read File List
    Status 0
    23/09/2014 19:01:02 - end Application Snapshot, Read File List; elapsed time: 0:00:00
    23/09/2014 19:01:02 - begin Application Snapshot, Create Snapshot
    23/09/2014 19:01:02 - started
    23/09/2014 19:01:03 - started process bpbrm (106712)
    23/09/2014 19:07:14 - end writing
    Status 156
    23/09/2014 19:07:14 - end Application Snapshot, Create Snapshot; elapsed time: 0:06:12
    23/09/2014 19:07:14 - begin Application Snapshot, Stop On Error
    Status 0
    23/09/2014 19:07:14 - end Application Snapshot, Stop On Error; elapsed time: 0:00:00
    23/09/2014 19:07:14 - begin Application Snapshot, Resources For Cleanup
    23/09/2014 19:07:14 - requesting resource pd-dc2-nbumst.VMware.snapshot.vCenter.pd-dc2-vcenter.pdports.co.uk
    23/09/2014 19:07:14 - granted resource pd-dc2-nbumst.VMware.snapshot.vCenter.pd-dc2-vcenter.pdports.co.uk
    Status 0
    23/09/2014 19:07:14 - end Application Snapshot, Resources For Cleanup; elapsed time: 0:00:00
    23/09/2014 19:07:14 - begin Application Snapshot, Delete Snapshot
    23/09/2014 19:07:15 - started process bpbrm (107568)
    23/09/2014 19:07:21 - end writing
    Status 4207
    23/09/2014 19:07:21 - end Application Snapshot, Delete Snapshot; elapsed time: 0:00:07
    Status 156
    23/09/2014 19:07:21 - end Parent Job; elapsed time: 0:06:19
    snapshot error encountered(156)

  • 23/09/2014 18:56:55 - Critical bpbrm(pid=106712) from client PD-DC2-CITRIX2: FTL - VMware error received: An error occurred while saving the snapshot: Failed to quiesce the virtual machine.

    This error looks to be the cause. Look at the bpfis log with vmcloglevel = 6 for more detail (see the NetBackup for VMware Administrator's Guide for more information on vmcloglevel).

    This error can occur when I/O in the virtual machine is very high and the quiesce operation cannot flush all the data to disk because new I/O keeps arriving. Possible fixes include pausing active applications with high I/O or restarting the VMware Tools service in the guest. I have even seen reinstalling or upgrading to the latest version of VMware Tools fix it.
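
As a rough aid for picking lines like the one quoted above out of a long Details tab, the sketch below filters the pasted text by severity. It assumes the line layout seen in this thread (`DD/MM/YYYY HH:MM:SS - Severity process(pid=N) message`); the function name `important_lines` and the severity list are illustrative choices, not part of NetBackup.

```python
import re

# Assumed layout, taken from the job details in this thread:
#   23/09/2014 18:56:55 - Critical bpbrm(pid=106712) <message>
LINE_RE = re.compile(
    r"^\s*(?P<ts>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) - "
    r"(?P<severity>Info|Warning|Error|Critical) "
    r"(?P<process>\w+)\(pid=(?P<pid>\d+)\) "
    r"(?P<message>.*\S)?\s*$"
)


def important_lines(details, severities=("Critical", "Error")):
    """Return (timestamp, process, message) for lines at the given severities."""
    hits = []
    for line in details.splitlines():
        m = LINE_RE.match(line)
        if not m or m.group("severity") not in severities:
            continue
        message = m.group("message") or ""
        # Skip the empty "FTL -" filler lines that carry no text.
        if message.endswith("FTL -"):
            continue
        hits.append((m.group("ts"), m.group("process"), message))
    return hits


# A few lines copied from the failed job in this thread.
SAMPLE = """\
23/09/2014 18:00:52 - Info nbrb(pid=2888) Limit has been reached for the logical resource pd-dc2-nbumst.VMware.snapshot.vCenter.pd-dc2-vcenter.pdports.co.uk
23/09/2014 18:50:49 - snapshot backup of client PD-DC2-CITRIX2 using method VMware_v2
23/09/2014 18:56:55 - Critical bpbrm(pid=106712) from client PD-DC2-CITRIX2: FTL - VMware error received: An error occurred while saving the snapshot: Failed to quiesce the virtual machine.
23/09/2014 18:56:56 - Critical bpbrm(pid=106712) from client PD-DC2-CITRIX2: FTL -
23/09/2014 18:56:56 - Warning bpbrm(pid=106712) from client PD-DC2-CITRIX2: WRN - ALL_LOCAL_DRIVES is not frozen
"""

hits = important_lines(SAMPLE)
for ts, process, message in hits:
    print(ts, process, message)
```

Info and Warning lines are dropped by default, so the nbrb "Limit has been reached" messages do not show up alongside the quiesce failure.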

     

  • The message you initially posted relates to resource limits, which are set in the master server's host properties to cap the number of permitted snapshots.

    The actual error will need troubleshooting for why the snapshot failed.

    You can check vCenter to see if that gives you any clues, and also enable bpfis logging on the VMware backup host, then examine those logs after the next failed backup.

    There are so many possible causes of a 156 that we cannot guess here.

    Full details of all the logging you can set up are in the troubleshooting section of the NBU VMware admin guide.

    #EDIT#

    Good spot symterry - missed that line!!

    So is this a Citrix server, as its name suggests? I guess that could be pretty busy.

    #EDIT 2#

    If you want to increase the resource limits, you can. If I remember, I set them at reasonably low levels so that neither the vCenter nor the appliance was overstretched, especially when running VMware Accelerator backups. If everything else is running well, you can try increasing them, but with VMware backups, especially accelerated ones, the more you run at the same time, the slower each one often gets.
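
Before changing any limit, it may help to see which logical resources the job actually queued on. The sketch below is only an illustration: it counts the nbrb "Limit has been reached" messages per resource in pasted job details, matching the message wording seen in this thread. The function name `count_limit_hits` is made up for the example.

```python
import re
from collections import Counter

# Matches the nbrb message wording seen in this thread and captures the
# resource name, which may contain spaces (for example a datastore name).
LIMIT_RE = re.compile(r"Limit has been reached for the logical resource (.+?)\s*$")


def count_limit_hits(details):
    """Count 'Limit has been reached' messages per logical resource."""
    counts = Counter()
    for line in details.splitlines():
        m = LIMIT_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts


# Lines copied from the failed job in this thread.
SAMPLE = """\
23/09/2014 18:00:52 - Info nbrb(pid=2888) Limit has been reached for the logical resource pd-dc2-nbumst.VMware.snapshot.vCenter.pd-dc2-vcenter.pdports.co.uk
23/09/2014 18:03:01 - awaiting resource stu_disk_pd-dc2-nbu1 - Maximum job count has been reached for the storage unit
23/09/2014 18:03:05 - Info nbrb(pid=2888) Limit has been reached for the logical resource pd-dc2-nbumst.VMware.snapshot.vCenter.pd-dc2-vcenter.pdports.co.uk
23/09/2014 18:04:29 - Info nbrb(pid=2888) Limit has been reached for the logical resource pd-dc2-nbumst.VMware.snapshot.vCenter.pd-dc2-vcenter.pdports.co.uk
23/09/2014 18:04:40 - Info nbrb(pid=2888) Limit has been reached for the logical resource pd-dc2-nbumst.VMware.Datastore.pd-dc2-vcenter.pdports.co.uk/DC2/DC2 SAN Volume 1
"""

limit_counts = count_limit_hits(SAMPLE)
for resource, hits in limit_counts.most_common():
    print(hits, resource)
```

A resource that appears many times across a backup window is the one being throttled most often. These messages only mean the job waited in the queue; they are not the cause of the status 156.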

  • So the error to focus on is:

     VMware error received: An error occurred while saving the snapshot: Failed to quiesce the virtual machine.

     

    What this is trying to say is that the problem is inside the virtual machine.  VMware tools was unable to quiesce the drives inside the VM.

    For Windows, the VMware VSS Provider inside the VM is used to quiesce the VM, unless you have installed a 3rd party quiesce agent such as the Symantec VSS Provider.

    There should be a VSS error in the Event Log in most cases.  Maybe you have a VSS Writer in a bad state.  Here is a helpful VMware article:

    http://kb.vmware.com/kb/1007696
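
One way to look for a VSS writer in a bad state is to run `vssadmin list writers` from an elevated prompt inside the guest and check each writer's state. The sketch below parses that output and reports writers that are not Stable or that show a last error. It assumes the English-locale field labels (`Writer name:`, `State:`, `Last error:`); the sample text is a trimmed illustration with a hypothetical writer called 'Example Writer', not output from the VM in this thread.

```python
def unstable_writers(vssadmin_output):
    """Return (name, state, last_error) for writers that are not healthy."""
    problems = []
    name = None
    state = None
    for raw in vssadmin_output.splitlines():
        line = raw.strip()
        if line.startswith("Writer name:"):
            # Names are printed in single quotes, e.g. 'System Writer'.
            name = line.split(":", 1)[1].strip().strip("'")
            state = None
        elif line.startswith("State:"):
            state = line.split(":", 1)[1].strip()
        elif line.startswith("Last error:") and name is not None:
            last_error = line.split(":", 1)[1].strip()
            healthy = (
                state is not None
                and "Stable" in state
                and last_error == "No error"
            )
            if not healthy:
                problems.append((name, state, last_error))
            name = None
    return problems


# Illustrative sample only; the Writer Id lines are omitted for brevity.
SAMPLE_OUTPUT = """\
Writer name: 'System Writer'
   State: [1] Stable
   Last error: No error

Writer name: 'Example Writer'
   State: [7] Failed
   Last error: Timed out
"""

problems = unstable_writers(SAMPLE_OUTPUT)
for writer, state, last_error in problems:
    print(f"{writer}: state={state}, last error={last_error}")
```

On a real guest, the text could come from `subprocess.run(["vssadmin", "list", "writers"], capture_output=True, text=True).stdout`. Any writer reported here is a reasonable place to start looking in the Event Log.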

    For Linux, depending on the VMware Tools options chosen at installation, you may have vmsync installed in the VM to quiesce it, or you may be using a 3rd party tool such as SYMCquiesce.

    Everything you need to know about SYMCquiesce:

    http://www.symantec.com/docs/HOWTO70978