VMware backups failing RC69 after NB upgrade to 10.0.1.1

Tomas_Pospichal
Level 4

Hello,

Yesterday I upgraded a Media Server from version 10.0.0.1 to 10.0.1.1. Everything was done according to the official NetBackup Tech Notes, and the upgrade finished successfully.

However, VMware backups (and only those) suddenly stopped working and now fail with error 69 - Invalid filelist specification.

Strangely, not all machines are affected, but about 90% of them are, which is the problem.

Log of snapshot part from WebUI:

Feb 08, 2023 7:54:27 AM - snapshot backup of client XXXXXXX using method VMware_v2

Feb 08, 2023 7:54:28 AM - Info bpbrm (pid=4592) INF - vmwareLogger: Creating snapshot for vCenter server XXXXXXX , ESX host XXXXXXX , BIOS UUID XXXXXXX , Instance UUID XXXXXXX , Display Name BBEPRN01, Hostname XXXXXXX

Feb 08, 2023 7:54:29 AM - Info bpbrm (pid=4592) INF - vmwareLogger: Connection state of virtual machine: connected.

Feb 08, 2023 7:54:42 AM - Info bpbrm (pid=4592) INF - vmwareLogger: httpsgetStrm: HTTP Err: err = <500>

Feb 08, 2023 7:54:44 AM - Info bpbrm (pid=4592) INF - virtLogger: Failed to backup VMware NVRAM settings.INF - REMAP FILE BACKUP NEW_STREAM

Feb 08, 2023 7:54:44 AM - end Application Snapshot: Create Snapshot; elapsed time 0:00:24
Feb 08, 2023 7:54:44 AM - Info bpfis (pid=7016) done. status: 0
Feb 08, 2023 7:54:44 AM - Info bpfis (pid=7016) done. status: 0: the requested operation was successfully completed
Feb 08, 2023 7:54:44 AM - end writing
Operation Status: 0

Feb 08, 2023 7:54:44 AM - end Child Job; elapsed time 0:00:24
Feb 08, 2023 7:54:44 AM - Info nbjm (pid=4356) snapshotid=XXXXXXX
Feb 08, 2023 7:54:44 AM - begin Application Snapshot: Stop On Error
Operation Status: 0

Feb 08, 2023 7:54:44 AM - end Application Snapshot: Stop On Error; elapsed time 0:00:00
Feb 08, 2023 7:54:44 AM - begin Application Snapshot: Cleanup Resources
Feb 08, 2023 7:54:44 AM - requesting resource XXXXXXX
invalid filelist specification(69)


Could this be the issue? What does "REMAP FILE BACKUP NEW_STREAM" mean?
"Feb 08, 2023 7:54:44 AM - Info bpbrm (pid=4592) INF - virtLogger: Failed to backup VMware NVRAM settings.INF - REMAP FILE BACKUP NEW_STREAM"


The logs of successfully backed-up VMware machines also contain the following messages, but there is nothing about "REMAP FILE BACKUP NEW_STREAM", which is the difference:

 

Feb 07, 2023 6:35:13 PM - Info bpbrm (pid=6676) INF - vmwareLogger: httpsgetStrm: HTTP Err: err = <500>
Feb 07, 2023 6:35:16 PM - Info bpbrm (pid=6676) INF - virtLogger: Failed to backup VMware NVRAM settings.
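
To check whether this marker really separates failing jobs from successful ones across many jobs, the job details can be scanned for it. The following is a minimal sketch, assuming the job details have been saved as plain text; `has_remap_marker` is a hypothetical helper name and not part of NetBackup. The two sample strings are the log lines quoted in this post.

```python
# Marker taken from the failing job's log line quoted in this post.
MARKER = "REMAP FILE BACKUP NEW_STREAM"


def has_remap_marker(job_details: str) -> bool:
    """Return True if any line of the job details contains the marker."""
    return any(MARKER in line for line in job_details.splitlines())


# Log lines copied from the post: a failing job, then a successful one.
failing = (
    "Feb 08, 2023 7:54:44 AM - Info bpbrm (pid=4592) INF - virtLogger: "
    "Failed to backup VMware NVRAM settings.INF - REMAP FILE BACKUP NEW_STREAM"
)
succeeding = (
    "Feb 07, 2023 6:35:16 PM - Info bpbrm (pid=6676) INF - virtLogger: "
    "Failed to backup VMware NVRAM settings."
)

for name, text in (("failing", failing), ("succeeding", succeeding)):
    print(name, has_remap_marker(text))
```

Both jobs log the NVRAM failure, so this check only confirms that the marker is present in one and absent in the other; it does not by itself explain status 69.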

 

bpfis errors:
08:37:24.089 [8328.6996] <4> bpfis: Starting keep alive thread.
08:37:24.089 [8328.6996] <4> bpfis: INF - BACKUP START 8328
08:37:24.089 [8328.6996] <2> bpfis main: receive filelist:<NEW_STREAM>
08:37:24.089 [8328.6996] <2> bpfis main: dynamic_stream_count:<0>
08:37:24.089 [8328.6996] <2> check_special_names: got path entry as :<NEW_STREAM>
08:37:24.089 [8328.6996] <2> check_special_names: after conversion returning :<NEW_STREAM>
08:37:24.089 [8328.6996] <2> path_exists: name NEW_STREAM
08:37:24.089 [8328.6996] <2> path_exists: name NEW_STREAM has UTF-8 chars.
08:37:24.089 [8328.6996] <2> path_exists: failed to get attribute for NEW_STREAM
08:37:24.089 [8328.6996] <2> path_exists: name NEW_STREAM doesn't exist.
08:37:24.089 [8328.6996] <2> bpfis main: receive filelist:<ALL_LOCAL_DRIVES>
08:37:24.089 [8328.6996] <2> bpfis main: dynamic_stream_count:<0>
08:37:24.089 [8328.6996] <2> check_special_names: got path entry as :<ALL_LOCAL_DRIVES>
08:37:24.089 [8328.6996] <2> check_special_names: after conversion returning :<ALL_LOCAL_DRIVES>
08:37:24.089 [8328.6996] <2> path_exists: name ALL_LOCAL_DRIVES
08:37:24.089 [8328.6996] <2> path_exists: name ALL_LOCAL_DRIVES has UTF-8 chars.
08:37:24.089 [8328.6996] <2> path_exists: failed to get attribute for ALL_LOCAL_DRIVES
08:37:24.089 [8328.6996] <2> path_exists: name ALL_LOCAL_DRIVES doesn't exist.


08:37:31.539 [8328.6996] <2> getCertInfoForVirtualization: isVirtualizationHostsSecureConnectEnabled returned false, returning EC_unimplemented.
08:37:31.539 [8328.6996] <2> vSphereConnect: Unable to read bp.conf for VIRTUALIZATION_HOSTS_CONNECT_TIMEOUT
08:37:31.539 [8328.6996] <2> getVmwareCipherList: Unable to read bp.conf for VMWARE_CIPHER_LIST - using default vmware cipher list XXXXXXX
08:37:31.853 [8328.6996] <2> getCrlCheckLevelFromConfig: crlCheckFlag value read from bpconf : 1
08:37:31.853 [8328.6996] <2> getCachedCertMapInfo: fstat file mod time = [1675818584], file size [1322849927934]
08:37:31.853 [8328.6996] <2> getCachedCertMapInfo: nout = [766], memMappedCertInfo = [[{"hostID": "XXXXXXX", "serverName": "XXXXXXX", "serverAltNames": "", "issuerName": "XXXXXXX", "certType": 1, "isServerMaster": 1, "issuedBy": "/CN=broker G1/OU=root@XXXXXXX/O=vx", "crlPath": "C:\\Program Files\\VERITAS\\NetBackup\\var\\vxss\\crl\\585ff46b.crl", "securityLevel": 2, "trustVersion": "Jtk5l9UoqMbJVDV", "trustStoreActions": [{"action": "ADD", "fingerprints": ["XXXXXXX"]}], "crlNextRefreshTime": 1675847384, "crlLastRefreshTime": 1675818584, "masterHostId": "XXXXXXX", "tlsSessionResumption": {"enable": 1, "handshakeIntervalInMinutes": 60}}]]
08:37:31.853 [8328.6996] <2> getCertDataByCAtypeExEx: Private key file path is not included in the response.
08:37:31.853 [8328.6996] <2> LoginWithCertManager::getLocalToken: tokenIssueTimeSec = 1675782701
08:37:31.853 [8328.6996] <2> LoginWithCertManager::getLocalToken: tokenExpTimeSec = 1675869101
08:37:31.853 [8328.6996] <2> LoginWithCertManager::getLocalToken: tokenLastFailTimeSec = 0
08:37:31.853 [8328.6996] <2> LoginWithCertManager::isJWTRefreshRequired: tokenExpTimeSec = 1675869101, timeNow = 1675841851, (tokenExpTimeSec-timeNow) = 27250
08:37:31.853 [8328.6996] <2> LoginWithCertManager::isJWTRefreshRequired: Token-refresh IS NOT required
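
The token lines above can be checked by hand. This is a small sketch that uses only the epoch values printed in the log; it reproduces the logged difference and converts the timestamps to UTC. The variable and function names are illustrative.

```python
from datetime import datetime, timezone

# Values copied from the LoginWithCertManager log lines above.
token_issue = 1675782701
token_exp = 1675869101
time_now = 1675841851


def utc(ts: int) -> str:
    """Format an epoch timestamp as a UTC date and time."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


remaining = token_exp - time_now   # seconds until the token expires
lifetime = token_exp - token_issue  # total validity period in seconds

print("now      ", utc(time_now))
print("expires  ", utc(token_exp))
print("remaining", remaining, "s")
print("lifetime ", lifetime, "s")
```

The remaining time matches the 27250 seconds in the log (about 7.6 hours), which agrees with the "Token-refresh IS NOT required" message, so this part of the log does not point to an expired token.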

 

08:37:51.992 [7784.8580] <8> bpfis: WRN - VfMS error 10; see following messages:
08:37:51.992 [7784.8580] <8> bpfis: WRN - Non-fatal method error was reported
08:37:51.992 [7784.8580] <8> bpfis: WRN -
08:37:51.992 [7784.8580] <8> bpfis: WRN - VfMS method error 0; see following message:
08:37:51.992 [7784.8580] <8> bpfis: WRN -
08:37:52.007 [7784.8580] <4> remote_vxfs: remote_vxfs_init called
08:37:52.007 [7784.8580] <8> bpfis: WRN - VfMS error 10; see following messages:
08:37:52.007 [7784.8580] <8> bpfis: WRN - Non-fatal method error was reported
08:37:52.007 [7784.8580] <8> bpfis: WRN -
08:37:52.007 [7784.8580] <8> bpfis: WRN - VfMS method error 0; see following message:
08:37:52.007 [7784.8580] <8> bpfis: WRN -

 

08:38:01.936 [7784.8580] <4> bpfis: INF - Failed to delete mount point C:\Program Files\VERITAS\NetBackup\online_util\fi_cntl\bpfis.fim.XXXXXXX_1675841839.1.0.NBU_DATA.xml; errno=22: Invalid argument
08:38:01.936 [7784.8580] <2> NBAllowedList::validatePath: Match found. Allowing C:\Program Files\VERITAS\NetBackup\online_util\fi_cntl\bpfis.fim.XXXXXXX_1675841839.1.0.snapreplicapair

 

Thank you in advance for any suggestions.

Tom

1 ACCEPTED SOLUTION


Tomas_Pospichal
Level 4

In the end, it was a general error caused by incorrectly configured vSphere user permissions. In our environment, there are security restrictions on that user, and it worked fine with the previous version 10.0.0.1. I'm not really sure what changed so drastically in NetBackup 10.1.1, but it is what it is.


For correct permissions, please follow:
https://www.veritas.com/support/en_US/article.100001960
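
One way to compare a restricted vSphere role against the requirements is a simple set difference. The following is a minimal sketch under stated assumptions: the privilege IDs in `required` are examples only and must be replaced with the list from the article above, and `granted` is a hypothetical export of the role assigned to the backup user. It does not claim these were the privileges missing in this case.

```python
# Example privilege IDs only; take the authoritative list from the article.
required = {
    "Datastore.Browse",
    "VirtualMachine.State.CreateSnapshot",
    "VirtualMachine.State.RemoveSnapshot",
    "VirtualMachine.Provisioning.GetVmFiles",
}

# Hypothetical: privileges exported from the role assigned to the backup user.
granted = {
    "VirtualMachine.State.CreateSnapshot",
    "VirtualMachine.State.RemoveSnapshot",
}


def missing_privileges(required: set[str], granted: set[str]) -> list[str]:
    """Return the required privileges that the role does not grant, sorted."""
    return sorted(required - granted)


for priv in missing_privileges(required, granted):
    print("missing:", priv)
```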


11 REPLIES

Tomas_Pospichal
Level 4

I rolled back (uninstall/install) to the previous version 10.0.0.1 and it started working again, but this is not a solution.
Is there anything that needs to be reconfigured after upgrading Media Servers to version 10.0.1.1?

Hi @Tomas_Pospichal 

What are the VMware environment versions (ESXi, vCenter, etc.)? What is the OS and version of the backup host? What storage is used for the datastores (and is it different from that of the hosts that did back up)? Is it only VMware backups that are the problem - or, as you say, only 90% of VMware backups? Is there anything different about the other 10% - differences in policy settings, or the ESXi cluster they are attached to?

I also assume you mean you upgraded to NetBackup version 10.1.1? I trust the primary server is also at this version?

Cheers
David

 

Hello @davidmoline,

ESXi version - VMware ESXi, 7.0.3, 20842708
vCenter version - 7.0.3.01100
OS version of upgraded Media Server - Windows 2016
OS versions of NB client servers - Windows 2012 R2 to Windows 2019
Storage - Dell DataDomain

Yes, only VMware backups are affected.
The policy is exactly the same as before the upgrade, with no changes at all. The ESXi clusters differ from time to time because migration between hosts is configured, but they are all at the same version.

I also checked one of the ESXi hosts: some of its VMs back up successfully and some fail.

Yes, the primary server was the first to be upgraded to version 10.0.1.1, without any issue. (Actually, there was one minor issue after the upgrade: Policy Execution Manager was stuck and a restart fixed it, so I guess it has nothing to do with the issue described above.)

It worked fine before the upgrade; the only change was installing version 10.0.1.1.

 

 

Thank you.

 

Tom

Hi @Tomas_Pospichal 

Version looks okay to me. 

Going to version 10.1, the VDDK version NetBackup uses is now 8.0 - maybe the issue is a VMware virtual hardware version issue. 

David

Hello @davidmoline,

It seems that the very same problem with 'RC69 Invalid filelist specification' now also occurs on the second Media Server, which stayed on version 10.0.0.1.
Both Media Servers are used for one specific VMware backup policy.

Do you have any idea what else to check in terms of configuration, or where to look?

 

Thank you very much.

 

Tom

Hi @Tomas_Pospichal 

I can see that the issue you are seeing is not isolated. I can also see you have a support case open for this. 

I have nothing further to offer beyond what I asked about previously - that is, have you checked the VM hardware version of the machines that are failing, in particular for low versions?

The problem you are now seeing on the other media server may still be related to the 10.1.1 media server acting as the VMware backup host for the backup (you can check in the job details to see which host is performing that action). 

Support is working on this - there is at least one Etrack open to investigate. I'm not sure I can provide anything further than what support can.

Cheers
David

@davidmoline 

Yes, there is a support case open as well; I was wondering whether someone here had already faced the issue.

However, I did a quick check, and for VMs with hardware version 11, for example, I can see that some of them failed and some did not when using the upgraded 10.1.1 Media Server.

Anyway thanks a lot.

Tom

Hi @Tomas_Pospichal 

Sorry I can't help more. That said - can you report back the fix to the list - I'm sure it will help others down the track.

Cheers
David

Hi @Tomas_Pospichal

Can you check the VMware permissions of the user connecting to vCenter, to make sure everything required is there? I'm not sure, though, what may be different now compared to before (and for those particular VMs). The relevant article is https://www.veritas.com/support/en_US/article.100001960

David

Sure, once I have a clarification, I will share it here.

Thank you.
