β04-04-2014 02:55 AM
We are installing NetBackup 7.6.0.1 on Windows 2012 and backing up NetApp OnTap 8.2.1RC2 Cluster-Mode.
Large jobs (over 800 GB) are failing with Status 23 after the dump to tape has successfully completed. Small jobs are successful. I have not determined the threshhold at which they will fail but a test job of 37 GB ran successfully. Successful and failed jobs are all on the same NetApp storage node and SVM and use the same tape devices. The large jobs always fail and the small jobs always pass so the problem is consistent.
I've done some digging around and thought it could be solved by setting a NDMP timeout but I could not see this documented in 7.6
End of Activity Monitor log:
01/04/2014 19:48:28 - Info ndmpagent(pid=5764) xxxxsn0202: DUMP: Tue Apr 1 19:48:29 2014 : We have written 817613361 KB.
01/04/2014 19:53:28 - Info ndmpagent(pid=5764) xxxxsn0202: DUMP: Tue Apr 1 19:53:29 2014 : We have written 853167693 KB.
01/04/2014 19:57:02 - Info ndmpagent(pid=5764) xxxxsn0202: ENHANCED_DAR_ENABLED is 'T'
01/04/2014 19:57:02 - Info ndmpagent(pid=5764) xxxxsn0202: ACL_START is '910273288192'
01/04/2014 19:57:02 - Info ndmpagent(pid=5764) xxxxsn0202: DUMP: dumping (Pass V) [ACLs]
01/04/2014 19:57:03 - Info ndmpagent(pid=5764) xxxxsn0202: DUMP: 888938765 KB
01/04/2014 19:57:03 - Info ndmpagent(pid=5764) xxxxsn0202: DUMP: DUMP IS DONE
01/04/2014 19:57:03 - Info ndmpagent(pid=5764) NDMP backup successfully completed, path = /xxxxsvminmixpb0001/xxxxvolindtspb_vsvr_datastore_0001
01/04/2014 19:57:17 - Info bptm(pid=5040) EXITING with status 23 <----------
01/04/2014 19:57:17 - Critical bpbrm(pid=616) unexpected termination of client xxxxsn0202
01/04/2014 19:57:17 - Info ndmpagent(pid=0) done. status: 23: socket read failed
01/04/2014 19:57:17 - end writing; write time: 1:01:34
socket read failed(23)
BPTM log
19:57:05.337 [5040.5132] <2> NdmpMoverSession[0]: ndmp_mover_set_window(0, 18446744073709551615)
19:57:05.337 [5040.5132] <2> NdmpMoverSession[0]: Attempting to create server to which remote host xxxxsn0202 can connect - IPv6
19:57:05.337 [5040.5132] <2> NdmpMoverSession[0]: Using hostname from config file: NDMP_DATA_CONNECTION_HOST_NAME=192.168.120.80
19:57:05.337 [5040.5132] <8> retry_getaddrinfo_for_real: [vnet_addrinfo.c:1059] getaddrinfo() failed RV=11001 NAME=192.168.120.80 SVC=0
19:57:05.337 [5040.5132] <8> retry_getaddrinfo: [vnet_addrinfo.c:894] retry_getaddrinfo_for_real failed RV=11001 NAME=192.168.120.80 SVC=0
19:57:05.337 [5040.5132] <8> vnet_cached_getaddrinfo_and_update: [vnet_addrinfo.c:1643] retry_getaddrinfo() failed RV=11001 NAME=192.168.120.80 SVC=NULL
19:57:05.337 [5040.5132] <8> vnet_cached_getaddrinfo: [vnet_addrinfo.c:1271] vnet_cached_getaddrinfo_and_update() failed 6 0x6
19:57:05.337 [5040.5132] <2> NdmpMoverSession[0]: CreateServer: vnet_cached_getaddrinfo failed, host = 192.168.120.80, status = 6, error = 11001
19:57:05.337 [5040.5132] <2> NdmpMoverSession[0]: MoverConnect method failed for IPv6. Reinitializing and attempting to use IPv4.
19:57:05.353 [5040.5132] <2> NdmpMoverSession[0]: ndmp_mover_set_window(0, 18446744073709551615)
19:57:05.353 [5040.5132] <2> NdmpMoverSession[0]: Attempting to create server to which remote host xxxxsn0202 can connect - IPv4
19:57:05.353 [5040.5132] <2> NdmpMoverSession[0]: Using hostname from config file: NDMP_DATA_CONNECTION_HOST_NAME=192.168.120.80
19:57:05.353 [5040.5132] <2> vnet_cached_getaddrinfo_and_update: [vnet_addrinfo.c:1649] found via getaddrinfo NAME=192.168.120.80 SVC=NULL
19:57:05.353 [5040.5132] <2> NdmpMoverSession[0]: Using IP address: 192.168.120.80 (for configured host 192.168.120.80)
19:57:05.353 [5040.5132] <2> NdmpMoverSession[0]: Creating server for 192.168.120.80 in Server Port Window 1025 to 5000
19:57:05.368 [5040.5132] <2> vnet_bind_to_port_addr_extra_ipi: vnet_bind.c.195: 0: Function failed: 10 0x0000000a
19:57:05.368 [5040.5132] <2> NdmpMoverSession[0]: Failed to bind to port in Server Port Window (check Port Ranges in Host Properties)
19:57:05.368 [5040.5132] <2> NdmpMoverSession[0]: MoverConnect method failed. Reinitializing and attempting to use MoverListen method.
19:57:05.384 [5040.5132] <2> NdmpMoverSession[0]: ndmp_mover_set_window(0, 18446744073709551615)
19:57:05.384 [5040.5132] <2> NdmpMoverSession[0]: ndmp_mover_listen failed, status = NDMP_CONN_TYPE_UNCONFIGURED_ERR
19:57:05.384 [5040.5132] <16> NdmpMoverSession[0]: ERROR Start failed
19:57:05.384 [5040.5132] <16> check_and_process_mover_tasks: NDMP Mover Client Setup failed
19:57:05.399 [5040.5132] <2> NdmpMoverSession[0]: Shutdown complete
19:57:05.399 [5040.5132] <2> write_data_tir: ndmp_task = 1
19:57:05.399 [5040.5132] <2> write_data_tir: status_to_return = -1
19:57:05.399 [5040.5132] <2> check_error_history: just tpunmount: called from bptm line 21767, EXIT_Status = 23
19:57:05.399 [5040.5132] <2> io_close: closing D:\Program Files\Veritas\NetBackup\db\media\tpreq\drive_HP.ULTRIUM6-SCSI.003, from bptm.c.16640
β04-04-2014 04:59 AM
Have you tried to set NDMP progress timeout ?
See http://www.symantec.com/docs/TECH21246
Similar discussion over here, but unfortunately no solution:
https://www-secure.symantec.com/connect/forums/ndmp-backup-failed-status-23
β04-05-2014 12:54 PM
β04-06-2014 09:10 AM
This is your error here.
19:57:05.353 [5040.5132] <2> NdmpMoverSession[0]: Creating server for 192.168.120.80 in Server Port Window 1025 to 5000
19:57:05.368 [5040.5132] <2> vnet_bind_to_port_addr_extra_ipi: vnet_bind.c.195: 0: Function failed: 10 0x0000000a
19:57:05.368 [5040.5132] <2> NdmpMoverSession[0]: Failed to bind to port in Server Port Window (check Port Ranges in Host Properties)
Here is one technote with the exact error.
http://www.symantec.com/docs/TECH138437
Which points back to an issue with the ndmp.cfg which we know you have modified from the following line.
19:57:05.353 [5040.5132] <2> NdmpMoverSession[0]: Using hostname from config file: NDMP_DATA_CONNECTION_HOST_NAME=192.168.120.80
I suggest removing this line and allowing NetBackup to setup the ndmp server based on the interface being used to talk to the filer.
β04-07-2014 01:10 AM
Yes Marianne, I have tried the NDMP progress timeout as suggested in TECH21246 but it made no difference. I wasn't sure if it was still valid in NBU 7.6. I couldn't find reference to NDMP_PROGRESS_TIMEOUT in my ndmp log file.
I also had already seen that very promising previous discussion and had sent a personal message to the poster to see if he had a resolution. Unfortunately I've had no reply.
β04-07-2014 01:14 AM
Hi Amol. Before reading your post we were already in the process of trying that :)
Unfortunately the backup we started before the weekend is still running at a pathetic 916 KB/Sec! So useless :(
β04-07-2014 01:16 AM
I think that error is a red herring as I got the same error before I tinkered with the ndmp.cfg file. We've taken it out now anyway.
β04-07-2014 05:21 AM
Is there anything else on the server running on the following? "Server Port Window 1025 to 5000"
Or blocking it?
Even if the job is not doing Remote NDMP, NetBackup may still open a server port for easier meta data trasmission.
Ndmpagent log should show exactly what port was being opened.