10-25-2021 01:07 PM
Since upgrading to 9.1, I have random VM's, sometimes a handful and sometimes 20+ where the snapshot is taken, the backup starts and completes then the snapshot goes to get deleted and what appears to be an error/timeout occurs which states that unlocking the VM failed and the snapshot wasn't deleted. However, vSphere shows no snapshot deletion errors and the snapshots are deleted. So, we are still getting successful backups but unnecessary status 1's.
This is the output when the snapshot reports status 1.
07:55:22.935 [127742] <2> RemoteVmLock::requestLockOperation: Sent 243 bytes on request send '<VddkHandler EOD="26"><operation>unlockVm</operation><vsphereServer>vSphere Server</vsphereServer><masterServer>Master Server</masterServer><jobId>Job ID</jobId><vmId>vm-2677765</vmId><retryCount>8</retryCount><retryDelay>1</retryDelay></VddkHandler>'
07:55:22.942 [127742] <2> RemoteVmLock::requestLockOperation: Received 38 bytes: <?xml version="1.0" encoding="UTF-8"?>
08:04:32.018 [127742] <8> RemoteVmLock::requestLockOperation: Did not receive response, timer expired (0)
08:04:32.018 [127742] <2> onlfi_vfms_logf: INF - VMware_thaw: vmLock.release() returned -1 (../vmware.cpp:7405)
08:04:32.018 [127742] <2> onlfi_vfms_logf: INF - VMware_thaw: FTL - VMware_thaw: Unable to unlock virtual machine: /vmmor/vm-2677765. (../vmware.cpp:770)
08:04:32.019 [127742] <2> onlfi_vfms_logf: INF - VMware_thaw: Send to job details returned 72 (../vmware.cpp:738)
08:04:32.024 [127742] <2> onlfi_vfms_logf: INF - vmwareLogger: LogoutAPI: Caller's ThreadId = 1, Login key=LoginKey, returned 0
08:04:32.025 [127742] <2> onlfi_vfms_logf: INF - dovSphereDisconnect: disconnect for vSphere Server completed with flags 0 and status 0
08:04:32.025 [127742] <2> onlfi_vfms_logf: INF - dovSphereDisconnect: disconnect for vSphere Server completed with flags 0 and status 0
08:04:32.025 [127742] <16> do_thaw: ERR - do_thaw : fis_unclone() failed with 48
08:04:32.026 [127742] <4> onlfi_thaw: INF - do_thaw return value: 48
08:04:32.026 [127742] <2> onlfi_thaw: INF - Extracting snapshot client error = 1
08:04:32.026 [127742] <32> onlfi_thaw: FTL - VfMS error 48; see following messages:
08:04:32.026 [127742] <32> onlfi_thaw: FTL - Failed to unlock virtual machine
08:04:32.026 [127742] <32> onlfi_thaw: FTL - vfm_thaw: method: VMware_v2, type: FIM, function: VMware_v2_thaw
From what I've seen in the logs, when the call is successful, the logging ends here.
07:55:22.942 [127742] <2> RemoteVmLock::requestLockOperation: Received 38 bytes: <?xml version="1.0" encoding="UTF-8"?>
10-25-2021 02:43 PM
Hi @Krutons
Can you provide a little more detail on the environment, such as OS version of the backup host and vSphere version. Also is there anything common with VM's affected (such as hardware version, OS , number of IP addresses or otherwise)?
I can see some internal issues related to VDDK versions and also VM clients with lots of IP addresses, but nothing specific to version 9.1.
Have you reviewed the vpVMreq and bpVMutil logs and do they provide any further relevant information?
Cheers
David
10-26-2021 06:52 AM
Hey @davidmoline ,
The VMware backup hosts are NetBackup 5240 Appliances running 4.1. The vSphere Client version is 6.7.0.50000.
I don't think there is anything common between the VM's as one might status 1 tomorrow and not status 1 again for another 2 weeks for example but I'll go through the list of ones that threw a status 1 most recently.
From the 5 that threw the status 1 last night, here is the info for each.
Hypervisor: VMware ESXi 6.7.0, 17700523
VMware Tools version: 11269 (11.0.5)
Number of IP's: 2
Hypervisor: VMware ESXi 6.7.0, 17700523
VMware Tools version: 11269 (11.0.5)
Number of IP's: 2
Hypervisor: VMware ESXi 6.7.0, 17700523
VMware Tools version: 11334 (11.2.6)
Number of IP's: 1
Hypervisor: VMware ESXi 6.7.0, 17700523
VMware Tools version: 2147483647
Number of IP's: 1
Hypervisor: VMware ESXi 6.7.0, 17700523
VMware Tools version: 11360 (11.3.0)
Number of IP's: 1
I've tried to gather bpVMutil logs but never seem to get on the VMware backup host. The current logging I have on the VMware backup host are the following.
BPFIS_VERBOSE = 4
VXMS_VERBOSE = 3
VDDK_VERBOSE = 3
How/where would I get the vpVMreq logs?
10-26-2021 12:02 PM
@Krutons I noticed a increased number of VMware VM backup failures on 9.1 (using VIP). I don't remember if I saw the same behaviour but since upgrading to 9.1.0.1, there has been a marked improvement.
10-26-2021 12:09 PM
Thanks for that info @X2 I might have to consider upgrading sooner rather than later if I keep drawing a blank. I really wanted to wait until 9.2 but might not wait now haha.
10-26-2021 12:13 PM
This is what I gathered from the release notes for 9.1.0.1:
New features for 9.1.0.1: none. Major improvements: latest version of Tomcat server, fixes to OpsCenter reporting and communication, crashing NetBackup services, licensing reporting issues, minor fixes for VMware backups, VxUpdate, etc.
and I just finished upgrading my last domain this morning. Hoping to rest a bit before jumping to 9.2.
10-26-2021 03:35 PM
Hi @Krutons
That other log was a typo - sorry - meant bpVMreq.
As for the suggesion of upgrading - you could start with just the backup host (I know this goes against normal practice of starting with the master - but this is all supported) and upgrade using the MR1 for the appliance.
https://www.veritas.com/content/support/en_US/downloads/update.UPD364543
David
10-27-2021 06:22 AM
@davidmoline Upgrading just the VMware backup host (each appliance has an MSDP disk pool) won't be an issue with the MSDP catalog being from a different version than the master?
10-28-2021 01:10 PM
Windows or Linux VM???
11-04-2021 12:05 PM
Both Linux and Windows servers.