Forum Discussion

elanmbx's avatar
elanmbx
Level 6
10 years ago

2.6.1.2 Appliance VMware snapshot issue - creates but unable to delete snapshots

Recently upgraded entire environment to 7.6.1.2/2.6.1.2 from 7.6.1/2.6.1

Environment briefly:

Master: Solaris 10, NetBackup 7.6.1.2
Media: 4 x 5220 Appliances, NetBackup 7.6.1.2
VMware: vCenter 4.1

So was not having issues with the vCenter 4.1 environment prior to the ugprade.  But post upgrade I started getting Status 156s *after* a night of successful backups.  What I have narrowed it down to is that the appliance is able to request and get a snapshot done, but when it comes time to clean up the snapshot, it fails.  Strange thing is that this job completes with a Status 0 - but with the following errors:

ERRORS FROM JOB DETAILS:
08/08/2015 21:56:19 - Info bpbrm (pid=4754) INF - vmwareLogger: removeVirtualMachineSnapshot: SYM_VMC_ERROR: SOAP_ERROR
08/08/2015 21:56:20 - Info bpbrm (pid=4754) INF - vmwareLogger: RegisterExtensionAPI: SYM_VMC_ERROR: SOAP_ERROR
08/08/2015 21:56:20 - Info bpbrm (pid=4754) INF - vmwareLogger: SOAP 1.1 fault: "":ServerFaultCode [no subcode]
08/08/2015 21:56:20 - Critical bpbrm (pid=4754) from client denamrd8: FTL - VMware_thaw: VMware delete snapshot failed: SYM_VMC_SOAP_ERROR
08/08/2015 21:56:20 - Critical bpbrm (pid=4754) from client denamrd8: FTL - VMware error received: Permission to perform this operation was denied.
08/08/2015 21:56:20 - Info bpfis (pid=4771) done. status: 0
08/08/2015 21:56:20 - end Application Snapshot: Delete Snapshot; elapsed time 0:00:09

ERRORS FROM LOG FILE:
11:59:51.731 [28259] <2> onlfi_vfms_logf: INF - do_thaw: Retrieving snapshots returned  0. 1 snapshots were found.
11:59:51.731 [28259] <2> onlfi_vfms_logf: INF - do_thaw: Snapshot name=NBU_SNAPSHOT densyma02p 1439401488, mor=snapshot-5265965
11:59:51.731 [28259] <2> onlfi_vfms_logf: INF - vmwareLogger: Thread <1> acquired lock in removeVirtualMachineSnapshot, level 1
11:59:51.732 [28259] <2> onlfi_vfms_logf: INF - vmwareLogger: removeVirtualMachineSnapshot: <InvalidRequestFault xmlns="urn:vim25" xsi:type="InvalidRequest"></InvalidRequestFault>
11:59:51.733 [28259] <2> onlfi_vfms_logf: INF - vmwareLogger: removeVirtualMachineSnapshot: Starting ns2:RemoveSnapshotRequestType dump.
11:59:51.733 [28259] <2> onlfi_vfms_logf: INF - vmwareLogger: <?xml version="1.0" encoding="UTF-8"?>
<ns2:RemoveSnapshotRequestType xsi:type="ns2:RemoveSnapshotRequestType"><ns2:_this xsi:type="ns2:ManagedObjectReference" type="VirtualMachineSnapshot">snapshot-5265965</ns2:_this><ns2:removeChildre
n>false</ns2:removeChildren><ns2:consolidate>true</ns2:consolida
11:59:51.733 [28259] <2> onlfi_vfms_logf: INF - vmwareLogger: te></ns2:RemoveSnapshotRequestType>
11:59:51.733 [28259] <2> onlfi_vfms_logf: INF - vmwareLogger: removeVirtualMachineSnapshot: Completed ns2:RemoveSnapshotRequestType dump.
11:59:51.733 [28259] <2> onlfi_vfms_logf: INF - vmwareLogger: removeVirtualMachineSnapshot: SYM_VMC_ERROR:  SOAP_ERROR
11:59:51.733 [28259] <2> onlfi_vfms_logf: INF - vmwareLogger: Thread <1> releasing lock in removeVirtualMachineSnapshot, level 1
11:59:51.733 [28259] <2> onlfi_vfms_logf: INF - do_thaw: vmcRemoveSnapshot returned 6

Subsequent backups fail due to the old snapshot being orphaned.  I have to manually go in and delete the orphaned snapshots, then I can get another good backup... which promptly leaves another orphaned snapshot.

Support has been somewhat helpful, but at this point they are saying things point to VMware... and I don't really know where to start looking.  I'm somewhat hopeful since it is easily reproducable, but since the "official" word is that 4.1 is not supported in conjuction with 7.6.1 I'm not overly confident.

Anyone run into anything similar?

  • More pertinent info - I have created a role in VMware with FULL privileges, and that did not have any effect on this issue.  Still unable to remove snapshots.  To add to the strangeness - I can login with my NetBackup service account into vCenter and create/delete snapshots on these VMs without a problem.  It looks like some sort of communication issue between 7.6.1.2 and vCenter 4.1.

  • Bottom line is the version of vcenter is not supported. Symantec wont further if something is out of support.

    To get to know the *reason* behind the unsuccessful removal of the snapshot we need to get a set of logs and analyze them.

    There are several questions which we need to answer before we get into any t/s direction - 

    1. Do you see a delete snapshot operation in the vcenter while Netbackup finishes backup. 

    2. What is the state of the backup is you run it after deleting the shapshot. Successful/partial/fail?

    3. Please state the final answer from support. Which part of vmware process are they referring to?

  • 1. The delete snapshot job never shows up in vCenter.

    2. The state of both the snapshot *and* the backup job in NBU is "Successful."

    3. Support has indicated "The SOAP messages are coming from the vmwareLogger lines, which indicates this would be coming from the VMware side."

    As I stated prior - I'm not a VMware guy... so I'm not even sure what logs I should be pulling together... or where to get them?  ESX host?  vCenter?

    Thanks!

  • Here are the guidelines to create VxMS logging

    Create vxms log directory under netbackup log directory
    Windows:
    You will create vxms directory under \Veritas\NetBackup\logs
    UNIX and Solaris:
    You will create vxms directory under usr/openv/netbackup/logs/
    Step 2:
    Enable logging
    Windows:
    – Create DWORD registry ‘VXMS_VERBOSE’ under
    HKEY_LOCAL_MACHINE\SOFTWARE\Veritas\NetBackup\CurrentVersion\Config
    – Set any nonnegative numeric value including 0 to configure log level
    UNIX and Solaris:
    – Edit /usr/openv/netbackup/bp.conf and add new key ‘VXMS_VERBOSE=’

  • Hi,

     

    I don't see anything in these logs. I think you'll need to upgrade the VMware environment.

  • You can refer to following logs to the VMware tech support guys. As suggested by Riaan, upgrading VMware may help you as even if the issue is not resolved, you can atleast work with Symantec support till resolution. 

    Even from VMware perspective an upgrade will be essential as version 4.x is nearing EOL. As per KB2039567 The End of General Support Life date remains May 21, 2014. The end of Technical Guidance remains May 21, 2016. For information on what is provided during General Support and Technical Guidance, see the VMware support lifecycle policy.

    All the best!! 

    08/14/2015 14:01:16 : findSystemRoot:RvpFileSystemMgr.cpp:1337 <ERROR> : Not able to find system root
       08/14/2015 14:01:16 : g_vixInterfaceLogger:libvix.cpp:1775 <DEBUG> : [VFM_ESINFO] VixDiskLib: VixDiskLib_GetMetadataKeys: Insufficient buffer size (has 0, need 202): Error 24 at 4519.
       08/14/2015 14:01:16 : g_vixInterfaceLogger:libvix.cpp:1775 <DEBUG> : [VFM_ESINFO] VixDiskLib: VixDiskLib_ReadMetadata: Insufficient buffer size (has 0, need 9): Error 24 at 4611.

     08/14/2015 14:01:17 : g_vixInterfaceLogger:libvix.cpp:1775 <DEBUG> : [VFM_ESINFO] VixDiskLib: VixDiskLib_GetMetadataKeys: Insufficient buffer size (has 0, need 202): Error 24 at 4519.

       08/14/2015 14:01:17 : g_vixInterfaceLogger:libvix.cpp:1775 <DEBUG> : [VFM_ESINFO] VixDiskLib: VixDiskLib_ReadMetadata: Insufficient buffer size (has 0, need 2): Error 24 at 4611.
        EMCPOWER emcpower_parse_devname        : /nbcm_builds/NB/7.6.1.2/src/vxms/plugin/platforms/linux/osdep_emcpower.c.70 <INFO> : emcpower_parse_devname: is returning with ERROR
       08/14/2015 14:01:17 : rvp_map_open_file_index:mapintf.cpp:730 <ERROR> : Returning: 2058

       08/14/2015 14:01:49 : g_vixInterfaceLogger:libvix.cpp:1775 <DEBUG> : [VFM_ESINFO] 2015-08-14T14:01:49.061-06:00 [7F4423021700 trivia 'ThreadPool'] ThreadPool[idle:4, busy_io:0, busy_long:0] HandleWork(type: 1, fun: N5boost3_bi6bind_tINS0_11unspecifiedENS_8functionIFvRKNS_6system10error_codeEEEENS0_5list1INS0_5valueIS5_EEEEEE)