Status 156 VMware error occurred while quiescing the virtual machine

DuckTape
Level 4

NetBackup appliance 5220 version 2.5.1, NetBackup version 7.5.0.4. Linux master server backing up a Windows 2008 R2 VM.

My VM admin moved several servers from one datastore to another (he called it v-motioning). Only some of the servers fail with status 156. A reboot of the server clears up the problem, but we have some servers that run 24/7, and rebooting them takes an act of Congress. I have seen that reinstalling VMware Tools or changing VSS providers would help, but everything will cause a reboot, so I don't really know what fixed it, the change or the reboot. All VSS services are up and running. I can back up without quiescing on selected servers, but this is not recommended and not possible on some servers. I have tried stopping NetBackup, clearing the

/usr/openv/netbackup/online_util/fi_cntl/ folder, and restarting.

Any idea how to fix quiescing without rebooting the client?

Detail log:

05/30/2013 04:33:50 - Info nbjm (pid=11843) starting backup job (jobid=137332) for client stsymantec, policy vmware_highland, schedule Full
05/30/2013 04:33:50 - Info nbjm (pid=11843) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=137332, request id:{A724CD60-C903-11E2-B1D8-D2F6FB7B2377})
05/30/2013 04:33:50 - requesting resource stu_disk_primaster
05/30/2013 04:33:50 - requesting resource primaster.NBU_CLIENT.MAXJOBS.stsymantec
05/30/2013 04:33:50 - requesting resource primaster.NBU_POLICY.MAXJOBS.vmware_highland
05/30/2013 04:33:50 - requesting resource primaster.VMware.Datastore.vmwhl-hixiv-san3
05/30/2013 04:33:50 - granted resource  primaster.NBU_CLIENT.MAXJOBS.stsymantec
05/30/2013 04:33:50 - granted resource  primaster.NBU_POLICY.MAXJOBS.vmware_highland
05/30/2013 04:33:50 - granted resource  primaster.VMware.Datastore.vmwhl-hixiv-san3
05/30/2013 04:33:50 - granted resource  MediaID=@aaaac;DiskVolume=PureDiskVolume;DiskPool=dp_disk_primaster;Path=PureDiskVolume;StorageServer=primaster;MediaServer=primaster
05/30/2013 04:33:50 - granted resource  stu_disk_primaster
05/30/2013 04:33:51 - Info bpbrm (pid=25075) stsymantec is the host to backup data from
05/30/2013 04:33:51 - Info bpbrm (pid=25075) reading file list from client
05/30/2013 04:33:51 - Info bpbrm (pid=25075) start bpfis on client
05/30/2013 04:33:51 - Info bpbrm (pid=25075) Starting create snapshot processing
05/30/2013 04:33:51 - estimated 0 kbytes needed
05/30/2013 04:33:51 - begin Parent Job
05/30/2013 04:33:51 - begin Application Snapshot: Step By Condition
Operation Status: 0
05/30/2013 04:33:51 - end Application Snapshot: Step By Condition; elapsed time 0:00:00
05/30/2013 04:33:51 - begin Application Snapshot: Read File List
Operation Status: 0
05/30/2013 04:33:51 - end Application Snapshot: Read File List; elapsed time 0:00:00
05/30/2013 04:33:51 - begin Application Snapshot: Create Snapshot
05/30/2013 04:33:51 - started process bpbrm (pid=25075)
05/30/2013 04:33:52 - Info bpfis (pid=25083) Backup started
05/30/2013 04:33:52 - snapshot backup of client stsymantec using method VMware_v2
05/30/2013 04:38:33 - Critical bpbrm (pid=25075) from client stsymantec: FTL - VMware snapshot failed: SYM_VMC_TASK_REACHED_ERROR_STATE
05/30/2013 04:38:33 - Critical bpbrm (pid=25075) from client stsymantec: FTL - VMware error received: An error occurred while quiescing the virtual machine. See the virtual machine's event log for details.
05/30/2013 04:38:33 - Critical bpbrm (pid=25075) from client stsymantec: FTL - snapshot creation failed, status 156
05/30/2013 04:38:33 - Warning bpbrm (pid=25075) from client stsymantec: WRN - ALL_LOCAL_DRIVES is not frozen
05/30/2013 04:38:33 - Info bpfis (pid=25083) done. status: 156
05/30/2013 04:38:33 - end Application Snapshot: Create Snapshot; elapsed time 0:04:42
05/30/2013 04:38:33 - Info bpfis (pid=0) done. status: 156: snapshot error encountered
05/30/2013 04:38:33 - Info bpbrm (pid=25515) Starting delete snapshot processing
05/30/2013 04:38:33 - Info bpfis (pid=0) Snapshot will not be deleted
05/30/2013 04:38:33 - end writing
Operation Status: 156
05/30/2013 04:38:33 - begin Application Snapshot: Stop On Error
Operation Status: 0
05/30/2013 04:38:33 - end Application Snapshot: Stop On Error; elapsed time 0:00:00
05/30/2013 04:38:33 - begin Application Snapshot: Cleanup Resources
Operation Status: 4294957297
05/30/2013 04:38:33 - end Application Snapshot: Cleanup Resources; elapsed time 0:00:00
05/30/2013 04:38:33 - begin Application Snapshot: Delete Snapshot
05/30/2013 04:38:33 - started process bpbrm (pid=25515)
05/30/2013 04:38:34 - Info bpfis (pid=25523) Backup started
05/30/2013 04:38:34 - Critical bpbrm (pid=25515) from client stsymantec: FTL - cannot open /usr/openv/netbackup/online_util/fi_cntl/bpfis.fim.stsymantec_1369902831.1.0
05/30/2013 04:38:34 - Info bpfis (pid=25523) done. status: 1542
05/30/2013 04:38:34 - end Parent Job; elapsed time 0:04:43
05/30/2013 04:38:34 - Info bpfis (pid=0) done. status: 1542: An existing snapshot is no longer valid and cannot be mounted for subsequent operations
05/30/2013 04:38:34 - end writing
Operation Status: 1542
05/30/2013 04:38:34 - end Application Snapshot: Delete Snapshot; elapsed time 0:00:01
Operation Status: 156
snapshot error encountered  (156)
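
A minimal sketch for pulling the interesting lines out of a job detail log like the one above, assuming the log has been saved as plain text in the same layout. The names `LINE_RE`, `STATUS_RE`, and `summarize` are hypothetical helpers for illustration, not part of NetBackup.

```python
import re

# Matches detail-log lines with a severity worth reading, e.g.
# "05/30/2013 04:38:33 - Critical bpbrm (pid=25075) from client ...: FTL - ..."
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) - "
    r"(?P<severity>Critical|Error|Warning) "
    r"(?P<process>\w+) \(pid=(?P<pid>\d+)\) "
    r"(?P<message>.*)$"
)

# Matches lowercase "status 156" and "status: 156" as written by bpbrm/bpfis.
# The capitalised "Operation Status:" lines are intentionally not matched.
STATUS_RE = re.compile(r"\bstatus:? (\d+)")


def summarize(log_text):
    """Return (problems, statuses) extracted from a job detail log.

    problems: list of (timestamp, severity, process, message) tuples
    statuses: non-zero status codes in order of first appearance
    """
    problems = []
    statuses = []
    for raw in log_text.splitlines():
        line = raw.strip()
        match = LINE_RE.match(line)
        if match:
            problems.append(
                (match["ts"], match["severity"], match["process"], match["message"])
            )
        for code in STATUS_RE.findall(line):
            if code != "0" and code not in statuses:
                statuses.append(code)
    return problems, statuses
```

Run against the log above, this would surface the three Critical lines from the snapshot attempt, the ALL_LOCAL_DRIVES warning, and the later fi_cntl "cannot open" failure, without the resource-request noise.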
 

9 REPLIES

Shilpa_Gadey
Level 3

How is the client selection in the policy configured? Did you do a manual selection by going into the datastore and selecting the client? The client selection may need to be reconfigured.

bpdown
Level 4

This generally happens to us when the VM client has very low disk space on one of its drives...

Or if the VM is running SQL, Exchange, or other databases where a lot of CPU is being used...

RamNagalla
Moderator
Partner    VIP    Certified

newbee,

You already have the solution:

"I have seen that reinstalling VMware Tools or changing VSS providers would help, but everything will cause a reboot."

I am not seeing any other way than this...

DuckTape
Level 4

I am placing this here in case these tests help someone else. After weeks of researching, we tried things like:

  1. Stopped NetBackup, cleared the /usr/openv/netbackup/online_util/fi_cntl/ directory, then restarted NetBackup.

  2. Applied the VSS fix from this page: https://www-secure.symantec.com/connect/blogs/vss-fixbat

  3. Disabled VMware application quiescing by creating a file called tools.conf in C:\ProgramData\VMware\VMware Tools that contains [vmbackup] vss.disableAppQuiescing = true. This is the link: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=200906...
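
For reference, the tools.conf described in step 3 would look roughly like this when the section header and the setting are placed on separate lines. This is a sketch built only from the two values quoted in this post; the linked VMware KB article remains the authority on the exact format.

```
[vmbackup]
vss.disableAppQuiescing = true
```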

The last thing we were asked to do was provide the VMware log file by browsing the datastore the VM is on, going into the folder for the VM, right-clicking the file, and selecting Download. This is a log file created by VMware.

We could not get to this file because of an error, which pointed to a fault or lock error on this server within VMware.

What we were hoping was to find which process had things hung up, and whether that process could be killed or restarted.

After growing battle weary, we chose the only remaining option and will just schedule a REBOOT.

 

 

dealing with support on netbackup & my

DuckTape
Level 4

Found somewhat of a workaround.

 

When we v-motioned a server, some servers errored during the v-motion because of a file lock coming from NetBackup, but this did not stop the v-motion from continuing. If we continued past this error, we always had to reboot. So when this occurred for a server, we stopped the v-motion process and ran a manual backup, which released the file lock. We then restarted the v-motion, and it completed without errors and NetBackup ran OK.
 

 

DG-2005
Level 5

Do you have a copy of this handy?

 

  1. VSS fix from this page https://www-secure.symantec.com/connect/blogs/vss-fixbat

 

DuckTape
Level 4

This was used on Windows 2008 R2.

I attached the file I used, but in case it doesn't work, here are the contents:

rem FILENAME: FIXVSS08.BAT
rem
rem Stop the services that depend on or host VSS components.
net stop "System Event Notification Service"
net stop "Background Intelligent Transfer Service"
net stop "COM+ Event System"
net stop "Microsoft Software Shadow Copy Provider"
net stop "Volume Shadow Copy"
cd /d %windir%\system32
rem Same two VSS services as above, addressed by their short service names.
net stop vss
net stop swprv
rem Re-register the DLLs silently (/s).
regsvr32 /s ATL.DLL
regsvr32 /s comsvcs.DLL
regsvr32 /s credui.DLL
regsvr32 /s CRYPTNET.DLL
regsvr32 /s CRYPTUI.DLL
regsvr32 /s dhcpqec.DLL
regsvr32 /s dssenh.DLL
regsvr32 /s eapqec.DLL
regsvr32 /s esscli.DLL
regsvr32 /s FastProx.DLL
regsvr32 /s FirewallAPI.DLL
regsvr32 /s kmsvc.DLL
regsvr32 /s lsmproxy.DLL
regsvr32 /s MSCTF.DLL
regsvr32 /s msi.DLL
regsvr32 /s msxml3.DLL
regsvr32 /s ncprov.DLL
regsvr32 /s ole32.DLL
regsvr32 /s OLEACC.DLL
regsvr32 /s OLEAUT32.DLL
regsvr32 /s PROPSYS.DLL
regsvr32 /s QAgent.DLL
regsvr32 /s qagentrt.DLL
regsvr32 /s QUtil.DLL
regsvr32 /s raschap.DLL
regsvr32 /s RASQEC.DLL
regsvr32 /s rastls.DLL
regsvr32 /s repdrvfs.DLL
regsvr32 /s RPCRT4.DLL
regsvr32 /s rsaenh.DLL
regsvr32 /s SHELL32.DLL
regsvr32 /s shsvcs.DLL
regsvr32 /s /i swprv.DLL
regsvr32 /s tschannel.DLL
regsvr32 /s USERENV.DLL
regsvr32 /s vss_ps.DLL
regsvr32 /s wbemcons.DLL
regsvr32 /s wbemcore.DLL
regsvr32 /s wbemess.DLL
regsvr32 /s wbemsvc.DLL
regsvr32 /s WINHTTP.DLL
regsvr32 /s WINTRUST.DLL
regsvr32 /s wmiprvsd.DLL
regsvr32 /s wmisvc.DLL
regsvr32 /s wmiutils.DLL
regsvr32 /s wuaueng.DLL
rem Have System File Checker verify the COM+ catalog DLLs.
sfc /SCANFILE=%windir%\system32\catsrv.DLL
sfc /SCANFILE=%windir%\system32\catsrvut.DLL
sfc /SCANFILE=%windir%\system32\CLBCatQ.DLL
rem Only this service is started again in this listing.
net start "COM+ Event System"

 

DG-2005
Level 5

Excellent, thank you! The link above had been removed; I just wanted to add this to my collection :)

Bmitche
Level 5

We have the same issue, which I believe is this:

Start the VMDK-level backup via the VMware policy. NetBackup tells vSphere to create a snapshot. vSphere flips some bit somewhere, preventing the VM from being vMotioned while the snapshot is in effect. NetBackup does the backup, then tells vSphere to remove the snapshot.

 

What I have run into is that on some VMs (not all, and not the same ones every time), the bit apparently does not get flipped back to "normal" after the snapshot is removed, leaving the VM un-vMotionable.

 

As in the previous post, the fix I use is running another backup. Sometimes I have to re-run the backup a couple of times, but it will eventually be "fixed". We had our DBA team write a process that identifies the VMs with the vMotion bit set. It is very handy when we have to re-run 50-60 backups because of the issue.

We have spoken to Symantec and VMware about the issue, and they each point the finger at the other. The re-run "fix" works, but it would be nice if we didn't have to do it.
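
For anyone wanting to build a similar check, here is a rough sketch of the filtering step only. It assumes (this is not confirmed anywhere in this thread) that the "bit" shows up as migration methods listed in each VM's `disabledMethod` property in the vSphere API, and that you have already exported that list per VM by whatever means you have. `MIGRATION_METHODS` and `vms_with_migration_disabled` are hypothetical names, and the VM names in the usage note are made up.

```python
# Assumption: a VM that cannot be vMotioned lists one or both of these
# vSphere API method names in its disabledMethod property.
MIGRATION_METHODS = {"MigrateVM_Task", "RelocateVM_Task"}


def vms_with_migration_disabled(disabled_by_vm):
    """Return {vm_name: [blocked migration methods]} for affected VMs.

    disabled_by_vm maps a VM name to an iterable of disabled method names,
    as exported from vCenter beforehand (the export itself is not shown).
    """
    flagged = {}
    for name, methods in disabled_by_vm.items():
        blocked = sorted(MIGRATION_METHODS.intersection(methods))
        if blocked:
            flagged[name] = blocked
    return flagged
```

Called with something like `{"vm-a": ["RelocateVM_Task"], "vm-b": []}`, it keeps only `vm-a`, which gives a list of candidates for the re-run "fix".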