cancel
Showing results for 
Search instead for 
Did you mean: 

NBU 8.1.2 Media Server communication failure with Master Server following upgrade

Dockers
Level 2

we've come across a couple of challenges with our 8.1.2 upgrade.


After the master server upgrade we started on the media servers and straight away encountered issues downloading certificates required for communication between the Media and Master Server.


The media Servers are configured with a preferred network to backup our client database servers on a separate vLAN. This is where the crux of our issues currently reside:


If we disable the preferred network (rename PREFERRED_NETWORK to something else in the registry) clear to host cache communication between master/media server is established and the certificates can be uploaded. After making the certificates available for the media server we then reenable the preferred network and communication is lost again.


Running “bpclntcmd -pn” from the media server with the preferred network disabled proves connectivity however running it with the preferred network enabled doesn't provide a response.


Additionally, running “bpclntcmd -pn -verbose -debug” from the media server with preferred network enabled throws out the following errors:
13:47:37.508 [7764.9212] <2> check_vnetd_process_dup_permission: OpenProcessToken() failed: 5
13:47:37.508 [7764.9212] <2> vnet_check_windows_privileges: enabled SeDebugPrivilege privilege
13:47:37.524 [7764.9212] <8> init_local_connect_recs: [vnet_connect.c:1398] remote connections are prohibited 2 0x2
13:47:37.524 [7764.9212] <8> init_local_connect_recs: [vnet_connect.c:1398] remote connections are prohibited 1 0x1
13:47:37.524 [7764.9212] <16> connect_to_service: connect failed STATUS (18) CONNECT_FAILED
13:47:37.524 [7764.9212] <16> connect_to_service: JSON data = {"allow_large_status": [TRUNCATED] "connect failed STATUS (18) CONNECT_FAILED"}}
13:47:37.524 [7764.9212] <8> vnet_connect_to_service: [vnet_connect.c:290] connect_to_service() failed 18 0x12
13:47:37.524 [7764.9212] <2> bprd_connect: vnet_connect_to_service(MasterServer)failed: 18
13:47:37.524 [7764.9212] <16> do_request: Can't connect to host MasterServer: cannot connect on socket (25)Can't connect to host MasterServer: cannot connect on socket (25)

When running “bptestbpcd -host MediaServer -verbose -debug” from the Master Server (Preferred Network enabled) I get the following:


Host MasterServer (ip address) is not an authorized server for host MediaServer.
bptestbpcd: EXIT status = 46 server not allowed access


The preferred Network settings only contain the client names and IP Addresses of the Database clients, not the media servers or master server.


There is no preferred network configured between the master and media servers so I cannot understand why communication is lost when the preferred network settings are in place?


I suspect it’s relating to certificates but haven’t as yet found anything online similar to the issues we have.

Support have suggested speaking to our Network Team, however they do not know what the issue is. 


Does anyone have any idea’s?

5 REPLIES 5

Mike_Gavrilov
Moderator
Moderator
Partner    VIP    Accredited Certified

The fact that you don't get correct answer using bpclntcmd -pn shows that you're sending reqest using incorrct interface. It's better to open bprd log on  the master and check source ip that sends the request. 

Mike_Gavrilov
Moderator
Moderator
Partner    VIP    Accredited Certified

You may also try to run on media-server:

# bpclntcmd -pn -verbose -debug

It'll give you an idea which interface was used and what master-server  thinks about it.

While running the bpclntcmd -pn command from the media server you could create a folder named “bpclntcmd” under netbackup\logs. This would directly tell you the source IP used as well as the IP address of master server to which the request is sent. Then you could follow the bprd logs on the master server in order to trace back the connection on how the master responds to this request.

Run the bpclntcmd -pn command with these 2 logs once with the preferred network entry in place and once without it and you could compare the logs to determine the difference in the connections or what may seem to cause an issue

Thanks all, 

I'll test and respond on how it goes.

okay, so the bprd isn't providing me with any detail whilst the preferred network is configured on the media server.

When running bpclntcmd -pn -verbose from the media server with the preferred network enabled i get this output:

12:11:44.856 [956.7144] <2> logparams: -pn

12:11:44.966 [956.7144] <2> check_vnetd_process_dup_permission: OpenProcessToken() failed: 5

12:11:44.966 [956.7144] <2> vnet_check_windows_privileges: enabled SeDebugPrivilege privilege

12:11:45.044 [956.7144] <8> init_local_connect_recs: [vnet_connect.c:1398] remote connections are prohibited 2 0x2

12:11:45.060 [956.7144] <8> init_local_connect_recs: [vnet_connect.c:1398] remote connections are prohibited 1 0x1

12:11:45.060 [956.7144] <16> connect_to_service: connect failed STATUS (18) CONNECT_FAILED

12:11:45.060 [956.7144] <16> connect_to_service: JSON data = {"allow_large_status": {"timestamp": 1552479104, "who": "vnet_tss_init", "line_number": 32, "comment": "allow vnet status > 255", "data": false}, "direct_connect": {"timestamp": 1552479104, "who": "connect_to_service", "line_number": 838, "comment": "connect parameters", "data": {"who": "vnet_connect_to_service", "host": "Master Server", "service": "bprd", "override_required_interface": null, "extra_tries_on_connect": 0, "getsock_disable_to": 0, "overide_connect_timeout": 0, "connect_options": null}}, "status": {"timestamp": 1552479105, "who": "connect_to_service", "line_number": 985, "comment": "vnet status", "data": 18}, "connect_recs": {"timestamp": 1552479105, "who": "vnet_tss_get", "line_number": 97, "comment": "connect rec status messages", "data": "connect failed STATUS (18) CONNECT_FAILED"}}

12:11:45.060 [956.7144] <8> vnet_connect_to_service: [vnet_connect.c:290] connect_to_service() failed 18 0x12

12:11:45.060 [956.7144] <2> bprd_connect: vnet_connect_to_service(Master Server) failed: 18

12:11:45.060 [956.7144] <16> do_request: Can't connect to host Master Server: cannot connect on socket (25)

When preferred Network is disabled:

12:10:42.830 [6736.972] <2> logparams: -pn
12:10:42.846 [6736.972] <2> check_vnetd_process_dup_permission: OpenProcessToken() failed: 5
12:10:42.846 [6736.972] <2> vnet_check_windows_privileges: enabled SeDebugPrivilege privilege
12:10:42.877 [6736.972] <2> vnet_pbxConnect_ex: pbxConnectExEx Succeeded
12:10:42.877 [6736.972] <4> create_user_group_id_marker_WIN: connecting to named pipe:\\.\pipe\{GUID}:OUTBOUND
12:10:42.877 [6736.972] <4> create_user_group_id_marker_WIN: successfully connected to server named pipe
12:10:43.283 [6736.972] <2> logconnections: PROXY CONNECT FROM 10.72.199.27.62548 TO 10.72.199.33.1556 fd = 524
12:10:43.283 [6736.972] <2> logconnections: BPRD CONNECT FROM 127.0.0.1.62551 TO 127.0.0.1.62552 fd = 524

I can also see from the bprd log that when the preferred network is disabled bprd is communicating with the media server using the correct IP addresses (both media/master) however doesn't write out anything when the preferred network is enabled.

The following is an extract of the bpcd log from the media server when running bptestbpcd -pn -verbose:

12:17:26.182 [3124.1292] <2> LoginWithCertManager::getLocalToken: tokenIssueTimeSec = 1552479020

12:17:26.182 [3124.1292] <2> LoginWithCertManager::getLocalToken: tokenExpTimeSec = 1552565420

12:17:26.182 [3124.1292] <2> LoginWithCertManager::getLocalToken: tokenLastFailTimeSec = 0

12:17:26.182 [3124.1292] <2> LoginWithCertManager::isJWTRefreshRequired: tokenExpTimeSec = 1552565420, timeNow = 1552479446, (tokenExpTimeSec-timeNow) = 85974

12:17:26.182 [3124.1292] <2> LoginWithCertManager::isJWTRefreshRequired: Token-refresh IS NOT required

12:17:26.182 [3124.1292] <2> NBClientCURL::performCurlOperation: Obtained login token

12:17:26.244 [3124.1292] <8> nbclnt_curl_prefnet::usable_prefnet_settings: Preferred network setting(s) PROHIBITs use of [10.72.199.33] IP.

12:17:26.244 [3124.1292] <8> nbclnt_curl_prefnet::establish_initial_connection: Failed to get usable preferred network settings.Return Code [-1].

12:17:26.244 [3124.1292] <8> nbclnt_curl_prefnet::bio_connect: Connect FAILED with return code [-1] and status [-3].

12:17:26.244 [3124.1292] <8> NBClientCURL::connectToPBXExchange: Failed to connect to [Master Server]. VNET erro = -1.

12:17:26.244 [3124.1292] <2> curl_debug_logger(): == Info: Added Master Server:1556:BAD:ADD::125 to DNS cache

12:17:26.244 [3124.1292] <2> curl_debug_logger(): == Info: Hostname Master Server was found in DNS cache

12:17:26.244 [3124.1292] <2> curl_debug_logger(): == Info: Closing connection 0

12:17:26.244 [3124.1292] <16> NBClientCURL::performCurlOperation: Failed to perform operation: Couldn't connect to server

12:17:26.244 [3124.1292] <2> mapCurlError: Translating CURL status [7] to NetBackup status [8500]

12:17:26.244 [3124.1292] <2> NBClientCURL::performCurlOperation: Fetched httpcode = 0

12:17:26.244 [3124.1292] <2> NBClientCURL:~NBClientCURL: Performing curl_easy_cleanup()

12:17:26.244 [3124.1292] <16> getAliasesForHostidSyncWebapi: curl call fail with error:8500

12:17:26.244 [3124.1292] <8> getAliasesForHostid: Server call failed; hence clearing host alias cache

12:17:26.260 [3124.1292] <16> isHostidMatchesWithHostname: error:8500, while fetching hostdata by hostid:GUID failed

12:17:26.260 [3124.1292] <4> setPeerValidationStatusInTSS: Peer validation status in TSS is [0], resetting it to [7658]

12:17:26.260 [3124.1292] <2> bpcd valid_server: comparing Media Server and Master Server.domain

12:17:26.260 [3124.1292] <2> web_sync: invoking:https://Master Server:1556/netbackup/config/hosts/GUID/hostmappings?isapproved=true

12:17:26.260 [3124.1292] <2> NBClientCURL::NBClientCURL: Performing curl_easy_init()

12:17:26.260 [3124.1292] <2> NBClientCURL::NBClientCURL: Disabling any proxy option set in the environment.

12:17:26.260 [3124.1292] <2> NBClientCURL::NBClientCURL: Using paths as caCertFilePath [[install_path]\NetBackup\var\webtruststore\cacert.pem] privateKeyPath [[install_path]\NetBackup\var\vxss\credentials\keystore\PrivKeyFile.pem]

12:17:26.260 [3124.1292] <2> NBClientCURL::NBClientCURL: Setting default CURL timeout = [300] seconds

12:17:26.260 [3124.1292] <2> NBClientCURL::NBClientCURL: Disable signal during name resolution.

12:17:26.260 [3124.1292] <2> NBClientCURL::nbcurl_bpconf_keyvalue_state: ENABLE_NBCURL_VERBOSE is set

12:17:26.260 [3124.1292] <2> NBClientCURL::nbcurl_bpconf_keyvalue_state: DISABLE_NBCURL_CONNECTIONS is not set

12:17:26.260 [3124.1292] <2> NBClientCURL::nbcurl_bpconf_keyvalue_state: DISABLE_NBCURL_PREFERRED_NETWORKS is not set

12:17:26.260 [3124.1292] <2> web_sync: performing GET call

12:17:26.260 [3124.1292] <2> NBClientCURL::Get: URL: https://Master Server:1556/netbackup/config/hosts/GUID/hostmappings?isapproved=true

12:17:26.260 [3124.1292] <2> LoginWithCertManager::getLocalToken: tokenIssueTimeSec = 1552479020

12:17:26.260 [3124.1292] <2> LoginWithCertManager::getLocalToken: tokenExpTimeSec = 1552565420

12:17:26.260 [3124.1292] <2> LoginWithCertManager::getLocalToken: tokenLastFailTimeSec = 0

12:17:26.260 [3124.1292] <2> LoginWithCertManager::isJWTRefreshRequired: tokenExpTimeSec = 1552565420, timeNow = 1552479446, (tokenExpTimeSec-timeNow) = 85974

12:17:26.260 [3124.1292] <2> LoginWithCertManager::isJWTRefreshRequired: Token-refresh IS NOT required

12:17:26.260 [3124.1292] <2> NBClientCURL::performCurlOperation: Obtained login token

12:17:26.260 [3124.1292] <8> nbclnt_curl_prefnet::usable_prefnet_settings: Preferred network setting(s) PROHIBITs use of [10.72.199.33] IP.

12:17:26.260 [3124.1292] <8> nbclnt_curl_prefnet::establish_initial_connection: Failed to get usable preferred network settings.Return Code [-1].

12:17:26.260 [3124.1292] <8> nbclnt_curl_prefnet::bio_connect: Connect FAILED with return code [-1] and status [-3].

12:17:26.260 [3124.1292] <8> NBClientCURL::connectToPBXExchange: Failed to connect to [Master Server]. VNET erro = -1.

12:17:26.260 [3124.1292] <2> curl_debug_logger(): == Info: Added Master Server:1556:BAD:ADD::125 to DNS cache

12:17:26.260 [3124.1292] <2> curl_debug_logger(): == Info: Hostname Master Server was found in DNS cache

12:17:26.260 [3124.1292] <2> curl_debug_logger(): == Info: Closing connection 0

12:17:26.260 [3124.1292] <16> NBClientCURL::performCurlOperation: Failed to perform operation: Couldn't connect to server

By the looks of the last log it appears to me that the media server is telling me it's prohibiting the use of my master server IP Address when preferred network is enabled.