10-22-2020 06:17 PM
My nbu v8.1.2 daily backups of VMware & MSSQL polices have been running successfully.
Suddenly, I had encountered the below error only for VMware policy:
Error bptm cannot add fragment to image database, error = cannot connect on socket
(11) system call failed
My MSSQL backups are still running successfully.
i had checked firewall rules are already opened.
tcp 443 & 902 for nbu master/media & vmware vcenter & ESX hosts
- nbu master/media able to telnet vcenter tcp port 443 & ESX tcp port 902
tcp 1556 & 13724 for nbu master/media
- able to telnet nbu master/media tcp ports 1556/13724
I had searched the knowledgebase and tried the technotes.
https://www.veritas.com/support/en_US/article.100001950
https://www.veritas.com/support/en_US/article.100013415
https://vox.veritas.com/t5/NetBackup/cannot-connect-on-socket-25/td-p/620192
https://vox.veritas.com/t5/NetBackup/My-FS-Bakup-is-failing-with-25/td-p/525005
https://vox.veritas.com/t5/NetBackup/VMware-Backups-fails-with-status-11/m-p/342301#M69430
Does anyone have similiar error and managed to resolve?
Regards
Joshua
Solved! Go to Solution.
10-28-2020 11:08 PM - edited 10-29-2020 01:39 AM
Unfortunately my veritas support had lapsed recently.
But good news! I did the following steps to resolve the backup issue.
1) Deactivate the CDPUserSvc service via Registry Editor
2) Set the value Start to 4
3) Invole an administrative command prompt window and enter the following command:
4) Reboot the both NBU Master & Media servers
5) Check that CDPUserSvc service is no longer running via Services control panel applet.
It seems the KB4048953 from Microsoft Windows Update had caused my NBU services to go haywire.
10-23-2020 01:29 AM
I checked the logs in bprd folder and noticed this error in the log:
00:01:52.255 [15764.15204] <2> emmlib_QueryAppCredentials: (0) EMM_AppCredentials_Record => { MediaServerName => vcenter.abc.net AppServerName => vcenter.abc.net AppServerType => 10 (CEMM_NBU_VIRTUAL_MACHINE) AppServerSubType => 1 PortNumber => 443 UserName => nbuadmin Password => ****** PasswordKey => ****** CredentialType => 1563766176 (INVALID_CREDENTIAL_TYPE) SubTypeName => DisplayName =>
However, i could login to my vcenter using the same account & password and browse the vcenter to select additional VMs to add to my VMware_Backup policy.
10-23-2020 03:21 AM
The line that you saw in bprd is probably a 'red herring'. Plus it is a <2> - debug, not a warning <8>, error <16> or severe error <32>.
I would start with with bpbrm log on the media server - level 3.
This looks like comms error between bpbrm on media server and bpdbm on the master:
... cannot add fragment to image database, error = cannot connect on socket
10-25-2020 10:08 PM - edited 10-25-2020 10:32 PM
Hi Marianne,
I have tried to filter thru both logs for <16> & <32> : bpbrm from the media server and bpdbm from the master
I search the knowledgebase for solution on "The proxy protocol has been violated."
https://www.veritas.com/support/en_US/article.100043072
I checked my Netbackup Client Service was as per default set to start as LocalSystem and running.
bpdbm log (master server)
11:34:10.840 [7532.7860] <16> dump_proxy_info: statusmsg: The proxy protocol has been violated. The remote connector of connection ([local] [AF_INET:[10.20.34.38]:1556] -- [remote] [AF_INET:[10.20.36.21]:58215]) is not correctly using the proxy., nbu status = 7647, severity = 2, Additional Message: [PROXY] Encountered error (CERT_PROTOCOL_READING_JSON_LENGTH) while processing(CertProtocol)., nbu status = 2, severity = 1
bpbrm log (media server)
10:41:07.906 [11364.11332] <32> bpbrm Exit: unexpected termination of client cute_cats.abc.net
10:41:07.906 [11364.11332] <2> set_job_details: Tfile (219762): LOG 1603680067 32 bpbrm 11364 unexpected termination of client cute_cats.abc.net
10:41:07.906 [11364.11332] <2> send_job_file: job ID 219762, ftype = 3 msg len = 91, msg = LOG 1603680067 32 bpbrm 11364 unexpected termination of client cute_cats.abc.net
10:41:07.906 [11364.11332] <2> ConnectionCache::connectAndCache: Acquiring new connection for host masterserver01.abc.net, query type 1
Regards
Terence
10-25-2020 11:31 PM
You need to find log entries that correspond with this timestamp:
23 Oct, 2020 8:26:31 cannot add fragment to image database, error = cannot connect on socket
Best to locate the PID in Job details, then extract all entries for that PID.
Among the bpbrm entries, there will be attemp to connect to bpbdm.
If you cannot post full logs, please log a call with Veritas Support and submit logs to them.
10-26-2020 01:39 AM - edited 10-26-2020 01:43 AM
I hope you can help me take a look.
I have attached new logs instead.
Job ID: 219837 (Snapshot) & 219838 (Backup)
10-26-2020 11:40 PM
10-27-2020 12:08 AM
Thank you for your help!
Managed to locate the following PIDs from the Job Details:
Job Type: Snapshot (Job ID: 219837)
Job Type: Backup (Job ID: 219838)
10-27-2020 01:04 AM
Thanks!
So, the snapshot was successful:
14:50:30.420 [9136.8680] <2> bpbrm Exit: client snapshot EXIT STATUS 0: the requested operation was successfully completed
Backup job started 1 second later :
14:50:31.139 [7404.1340] <2> logparams: -backup -S masterserver01.abc.net -c cute_cats.abc.net -ct 40 -ru root -cl VMware_Backup -sched Full .....
Weird to see that the Scedule name is Full, but schedule type is INCR ?
14:50:32.795 [7404.1340] <2> build_cmd_for_non_mpx_backup_archive_verify_import: backup_cmd = /usr/openv/netbackup/bin/bpbkar bpbkar32 -r 1209600 -dt 0 -to 0 -bpstart_time 1603695360 -clnt cute_cats.abc.net -class VMware_Backup -sched Full -st INCR .....
It seems that the backup then fails within a few seconds without any data being transferred.
This time we do not see the same "cannot add fragment to image database, .... " error as before, just status 25 and 11:
14:50:39.858 [7404.1340] <2> bpbrm wait_for_child: start
14:50:39.858 [7404.1340] <2> bpbrm wait_for_child: child exit_status = 25
14:50:39.874 [7404.1340] <2> inform_client_of_status: INF - Server status = 11
14:50:39.874 [7404.1340] <2> put_string: cannot write data to network: An existing connection was forcibly closed by the remote host.
14:50:39.874 [7404.1340] <16> inform_client_of_status: could not send server status message to client
I interpret 'bpbrm wait_for_child' as waiting for bpbkar.
So, here it looks like issue between bpbrm and bpbkar.
Can we assume that the media server is also the backup host?
Do you have bpbkar log on the media server?
Have you perhaps checked Task Manager for orphaned bpbkar processes after completed or failed backups?
Curious about the schedule name and type - this failed backup is actually running an INCR backup.
When last did a Full backup (with type FULL schedule) run?
10-28-2020 12:11 AM
Yes, the backup schedule was originally named as Full.
I was troubleshooting by testing out Differential Incremental Backup and Full Backup.
The last successful Full backup was on 20th Oct.
Yes, media server is also the backup host.
10-28-2020 12:34 AM
I do not see much in bpbkar other than the status 11 and then socket errors because bpbrm has already terminated the backup.
I am wondering about this debug <2> message:
14:50:35.186 [1704.4896] <2> tar_base::V_vTarMsgW: INF - FIML_initialize_vxms failed (14) (../vxms_bridge.cpp:257)
A search for this error found an old TN that I do not believe is relevant in your situation:
https://www.veritas.com/support/en_US/article.100003025
Probably best to log a Support call with Veritas.
Be sure that you have VxMS logging enable - IMHO, this is probably where the issue will be found.
10-28-2020 11:08 PM - edited 10-29-2020 01:39 AM
Unfortunately my veritas support had lapsed recently.
But good news! I did the following steps to resolve the backup issue.
1) Deactivate the CDPUserSvc service via Registry Editor
2) Set the value Start to 4
3) Invole an administrative command prompt window and enter the following command:
4) Reboot the both NBU Master & Media servers
5) Check that CDPUserSvc service is no longer running via Services control panel applet.
It seems the KB4048953 from Microsoft Windows Update had caused my NBU services to go haywire.
10-29-2020 01:23 AM
10-29-2020 01:38 AM
I searched all over the net any possible solution and chanced upon an article which mentioned KB4048953 & CDPUsersvc causing problem on Window 10 & Windows Server 2016 platform.