Forum Discussion

Chi_Chris_2012's avatar
11 years ago

VMWare snapshots being locked by NetBackup since 7.6.0.2 upgrade.

Since upgrading from 7.5.0.6 to 7.6.0.2 and our appliances from 2.5.3 to 2.6.2 we have been experiencing sporadic status 156 errors on some VM based backups.  The issue seems to be that NetBackup  locks the snapshot down and does not come back and deletes it.  Since the snapshot isn't being delete the netbackup doesn't release the lock and therefore the disk cannot be consolidated.  This causes the next day's backups to fail with status 156.

Our VMWare backup policies go to a single 5230 appliance and run early evening and finish early morning.  However, hours after the jobs finish and zero jobs are running to the appliance there are still  bpbkar processes running that correlate to the previous night's backup jobs .

If I don't kill these process then the disk wont consolidate and causes some of the next night's VMware backups to fail.  The only way to VM admin's to consolidate the disks is for me to kill the pids associated with the hung bpbkar process.

 

Below are the job details of one of the failed jobs:

 

7/21/2014 10:41:56 AM - Info nbjm(pid=30146) starting backup job (jobid=1798854) for client mp1cucm-sub03, policy VMQUERY_CISCO, schedule DIFF-2WK 
7/21/2014 10:41:56 AM - Info nbjm(pid=30146) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=1798854, request id:{8B81D818-10ED-11E4-B49E-B151A9591D42}) 
7/21/2014 10:41:56 AM - requesting resource stu_disk_mpnbuapp02
7/21/2014 10:41:56 AM - requesting resource mpnbu01.NBU_CLIENT.MAXJOBS.mp1cucm-sub03
7/21/2014 10:41:56 AM - requesting resource mpnbu01.VMware.Datastore.chivmvc03/Cisco/Application Datastore
7/21/2014 10:41:56 AM - granted resource mpnbu01.NBU_CLIENT.MAXJOBS.mp1cucm-sub03
7/21/2014 10:41:56 AM - granted resource mpnbu01.VMware.Datastore.chivmvc03/Cisco/Application
7/21/2014 10:41:56 AM - granted resource MediaID=@aaaah;DiskVolume=PureDiskVolume;DiskPool=dp_disk_mpnbuapp02;Path=PureDiskVolume;StorageServer=mpnbuapp02;MediaServer=mpnbuapp02
7/21/2014 10:41:56 AM - granted resource stu_disk_mpnbuapp02
7/21/2014 10:41:56 AM - estimated 13991947 Kbytes needed
7/21/2014 10:41:56 AM - begin Parent Job
7/21/2014 10:41:56 AM - begin Application Snapshot, Step By Condition
Status 0
7/21/2014 10:41:56 AM - end Application Snapshot, Step By Condition; elapsed time: 0:00:00
7/21/2014 10:41:56 AM - begin Application Snapshot, Read File List
Status 0
7/21/2014 10:41:56 AM - end Application Snapshot, Read File List; elapsed time: 0:00:00
7/21/2014 10:41:56 AM - begin Application Snapshot, Create Snapshot
7/21/2014 10:41:56 AM - started process bpbrm (21073)
7/21/2014 10:41:56 AM - started
7/21/2014 10:41:57 AM - Info bpbrm(pid=21073) mp1cucm-sub03 is the host to backup data from    
7/21/2014 10:41:57 AM - Info bpbrm(pid=21073) reading file list for client       
7/21/2014 10:41:57 AM - Info bpbrm(pid=21073) start bpfis on client        
7/21/2014 10:41:57 AM - Info bpbrm(pid=21073) Starting create snapshot processing        
7/21/2014 10:41:57 AM - Info bpfis(pid=21082) Backup started          
7/21/2014 10:41:57 AM - snapshot backup of client mp1cucm-sub03 using method VMware_v2
7/21/2014 10:42:14 AM - Info bpbrm(pid=21073) INF - vmwareLogger: WaitForTaskCompleteEx: Unable to access file <unspecified filename> since it is locked <232>
7/21/2014 10:42:14 AM - Info bpbrm(pid=21073) INF - vmwareLogger: WaitForTaskCompleteEx: SYM_VMC_ERROR:  TASK_REACHED_ERROR_STATE     
7/21/2014 10:42:14 AM - Info bpbrm(pid=21073) INF - vmwareLogger: ConsolidateVMDisks: SYM_VMC_ERROR:  TASK_REACHED_ERROR_STATE     
7/21/2014 10:42:14 AM - Info bpbrm(pid=21073) INF - vmwareLogger: ConsolidateVMDisksAPI: SYM_VMC_ERROR:  TASK_REACHED_ERROR_STATE     
7/21/2014 10:42:15 AM - Critical bpbrm(pid=21073) from client mp1cucm-sub03: FTL - VMware_freeze: VIXAPI freeze (VMware snapshot) failed with 36: SYM_VMC_TASK_REACHED_ERROR_STATE
7/21/2014 10:42:15 AM - Critical bpbrm(pid=21073) from client mp1cucm-sub03: FTL - VMware error received: Unable to access file <unspecified filename> since it is locked
7/21/2014 10:42:15 AM - Critical bpbrm(pid=21073) from client mp1cucm-sub03: FTL - vfm_freeze: method: VMware_v2, type: FIM, function: VMware_v2_freeze
7/21/2014 10:42:15 AM - Critical bpbrm(pid=21073) from client mp1cucm-sub03: FTL -       
7/21/2014 10:42:15 AM - Critical bpbrm(pid=21073) from client mp1cucm-sub03: FTL - vfm_freeze: method: VMware_v2, type: FIM, function: VMware_v2_freeze
7/21/2014 10:42:15 AM - Critical bpbrm(pid=21073) from client mp1cucm-sub03: FTL -       
7/21/2014 10:42:15 AM - Critical bpbrm(pid=21073) from client mp1cucm-sub03: FTL - snapshot processing failed, status 156  
7/21/2014 10:42:15 AM - Critical bpbrm(pid=21073) from client mp1cucm-sub03: FTL - snapshot creation failed, status 156  
7/21/2014 10:42:15 AM - Warning bpbrm(pid=21073) from client mp1cucm-sub03: WRN - ALL_LOCAL_DRIVES is not frozen   
7/21/2014 10:42:15 AM - Info bpfis(pid=21082) done. status: 156         
7/21/2014 10:42:15 AM - end Application Snapshot, Create Snapshot; elapsed time: 0:00:19
7/21/2014 10:42:15 AM - Info bpfis(pid=21082) done. status: 156: snapshot error encountered      
7/21/2014 10:42:15 AM - end writing
Status 156
7/21/2014 10:42:15 AM - end Parent Job; elapsed time: 0:00:19
7/21/2014 10:42:15 AM - begin Application Snapshot, Stop On Error
Status 0
7/21/2014 10:42:15 AM - end Application Snapshot, Stop On Error; elapsed time: 0:00:00
7/21/2014 10:42:15 AM - begin Application Snapshot, Resources For Cleanup
Status -9999
7/21/2014 10:42:15 AM - end Application Snapshot, Resources For Cleanup; elapsed time: 0:00:00
7/21/2014 10:42:15 AM - begin Application Snapshot, Delete Snapshot
7/21/2014 10:42:15 AM - started process bpbrm (21152)
7/21/2014 10:42:16 AM - Info bpbrm(pid=21152) Starting delete snapshot processing        
7/21/2014 10:42:16 AM - Info bpfis(pid=21161) Backup started          
7/21/2014 10:42:16 AM - Critical bpbrm(pid=21152) from client mp1cucm-sub03: FTL - cannot open /usr/openv/netbackup/online_util/fi_cntl/bpfis.fim.mp1cucm-sub03_1405957316.1.0    
7/21/2014 10:42:16 AM - Info bpfis(pid=21161) done. status: 4207         
7/21/2014 10:42:16 AM - end Application Snapshot, Delete Snapshot; elapsed time: 0:00:01
7/21/2014 10:42:16 AM - Info bpfis(pid=21161) done. status: 4207: Could not fetch snapshot metadata or state files 
7/21/2014 10:42:16 AM - end writing
Status 4207
7/21/2014 10:42:16 AM - end operation
Status 156
7/21/2014 10:42:16 AM - end operation
snapshot error encountered(156)

 

 

Has anyone come across this before? Any help would be greatly appreciated.

 

Thanks,

 

Chris

  • This bug has been patched for 7.6.0.2 and 7.6.0.3. The fix will be included in the 7.6.0.4 release of NBU, and only effects Linux backup hosts. (does not apply to windows)

    Please see:

    http://www.symantec.com/docs/TECH223790

    *PLEASE NOTE*

    If you are running an NBU appliance, there is a second patch that needs to be applied along with the regular bpbkar EEB. This smaller EEB (see TN above for list of patches - 1 set for 2.6.0.2, and 1 set for 2.6.0.3 appliances) needs to be applied before the larger EEB.

     

    Best regards,

     

    Chris Evensen