NDMP backup failed status 23

ccftr
Level 2

Hello Community,

 

I have a problem with large NDMP backups from a NetApp filer to a tape library (NetBackup 7.5.0.4).

The Job log of the NDMP-Policy reports:

5/23/2013 1:24:35 PM - Info ndmpagent(pid=4392) XM10PFL: DUMP: 1214705364 KB         
5/23/2013 1:24:36 PM - Info ndmpagent(pid=4392) XM10PFL: DUMP: DUMP IS DONE        
5/23/2013 1:24:36 PM - Info ndmpagent(pid=4392) XM10PFL: DUMP: Deleting "/vol/dat_90/../snapshot_for_backup.501" snapshot.        
5/23/2013 1:24:36 PM - Info ndmpagent(pid=4392) NDMP backup successfully completed, path = /vol/dat_90     
5/23/2013 1:26:03 PM - Info bptm(pid=5864) EXITING with status 23 <----------      

and the NDMP agent log on the media server:

0:,0:,0,(114|A80:XM10PFL: DUMP: Thu May 23 13:24:09 2013 : We have written 1211250887 KB.|)
0:,0:,0,(114|A44:XM10PFL: ENHANCED_DAR_ENABLED is 'T'|)
0:,0:,0,(114|A46:XM10PFL: DUMP: dumping (Pass V) [ACLs]|)
0:,0:,0,(114|A36:XM10PFL: DUMP: 1214705364 KB|)
0:,0,(114|A35:XM10PFL: DUMP: DUMP IS DONE|)
0:,0,(114|A90:XM10PFL: DUMP: Deleting "/vol/dat_90/../snapshot_for_backup.501" snapshot.|)
35:NdmpBackupManager::NotifyDataHalted,2,(33|A1:1|A25:NDMP_DATA_HALT_SUCCESSFUL|)
36:NdmpBackupManager::NotifyMoverHalted,2,(35|A1:1|A30:NDMP_MOVER_HALT_CONNECT_CLOSED|)
,36:NdmpBackupManager::UpdateTotalKbytes,4,(24|u32:14348771|u64:1214705536|u64:918321344|)
27:NdmpFhManager::UpdateCounts,4,(188|i32:169|i32:15662|)
0:,0:,0,(31|A19:/vol/dat_90|)
73:Sending EXIT STATUS 0: the requested operation was successfully completed,31:ConnectionToBrm::SendExitStatus,1
0:,0:,12:MainShutdown,2,(7|A1:0|u32:4392|u32:2872|)
 

To me, the NDMP backup itself looks successful, but the NetBackup server reports a "status 23" error. Small NDMP backups from the same NetApp filer work. I set the registry key "KeepAliveTime" to 30 minutes and the NetBackup media and master server timeouts to 30 minutes, but neither helped.
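As a quick sanity check on the timing, here is a minimal Python sketch that measures the gap between the ndmpagent success message and the bptm exit in the job log above. The two log lines are copied from this post; the helper names (`parse_timestamp`, `gap_seconds`) are made up for illustration, and the timestamp format is assumed to be the US-style one shown in the job log.

```python
from datetime import datetime

# Two lines copied from the job log in this post.
JOB_LOG = """\
5/23/2013 1:24:36 PM - Info ndmpagent(pid=4392) NDMP backup successfully completed, path = /vol/dat_90
5/23/2013 1:26:03 PM - Info bptm(pid=5864) EXITING with status 23
"""


def parse_timestamp(line):
    """Return the datetime found at the start of a job-log line."""
    stamp = line.split(" - ", 1)[0]
    # Assumed format: month/day/year, 12-hour clock with AM/PM.
    return datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p")


def gap_seconds(log_text, first_marker, second_marker):
    """Seconds between the first lines containing each marker."""
    lines = log_text.splitlines()
    first = next(l for l in lines if first_marker in l)
    second = next(l for l in lines if second_marker in l)
    return (parse_timestamp(second) - parse_timestamp(first)).total_seconds()


gap = gap_seconds(
    JOB_LOG,
    "NDMP backup successfully completed",
    "EXITING with status 23",
)
print(f"bptm exited {gap:.0f} s after ndmpagent reported success")
```

The gap between the two messages is short compared with the 30-minute timeouts. These two lines alone do not show whether a connection was already dropped earlier during the long dump and only noticed at the end.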

Does anyone have an idea how to solve this?

Best regards

Florian

 

5 REPLIES

huanglao2002
Level 6

ccftr
Level 2

Since I'm working with NetBackup 7.5.0.4, the touch file is no longer used, so I set the parameter "client connect timeout" to 1800 seconds in the properties of all media servers, but it did not help.

watsons
Level 6

Another touch file is available: NDMP_PROGRESS_TIMEOUT, check this out:

https://www-secure.symantec.com/connect/forums/does-ndmpprogresstimeout-apply-nb7x-and

ccftr
Level 2

The touch file does not help (I tried it), because the core NDMP backup does not crash. It finishes (with the correct backup size) with the message:

"ndmpagent(pid=4392) NDMP backup successfully completed, path = /vol/dat_90"

But afterwards the NetBackup Server said:

 "Info bptm(pid=5864) EXITING with status 23 <----------      "

I do not have any idea how to solve this.

Scott_Gourley
Level 2
Employee Accredited Certified

You'll need to look at the bptm log (and isolate the pid) on the media server.

bptm(pid=5864) EXITING with status 23

Enable verbose 5 for bptm and bpbrm on the media server to review the logs.
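To isolate the pid once the verbose logs exist, something like the following Python sketch may help. It is only a sketch under assumptions: it assumes each debug log line carries the process id as the first number inside square brackets (for example `[5864.2872]`), so check that against your actual bptm log first. The function name `lines_for_pid` and the sample lines are made up for illustration and are not real bptm output.

```python
import re

# ASSUMPTION: each log line contains "[PID]" or "[PID.TID]", with the
# process id as the first number inside the brackets. Verify this
# against a real bptm log before relying on the filter.
PID_PATTERN = re.compile(r"\[(\d+)(?:\.\d+)?\]")


def lines_for_pid(lines, pid):
    """Yield only the lines whose bracketed process id equals pid."""
    for line in lines:
        match = PID_PATTERN.search(line)
        if match and int(match.group(1)) == pid:
            yield line.rstrip("\n")


# Synthetic placeholder lines (NOT real bptm output), used only to
# show how the filter behaves.
SAMPLE = [
    "13:25:59.000 [5864.2872] <2> placeholder message A\n",
    "13:26:00.000 [4392.1000] <2> placeholder message B\n",
    "13:26:03.000 [5864.2872] <16> placeholder message C\n",
]

for line in lines_for_pid(SAMPLE, 5864):
    print(line)
```

With a real log you would pass an open file instead of `SAMPLE`, for example `with open(path, errors="replace") as fh:` followed by the same loop, where `path` points to the bptm log for the day of the failed job.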

 

This is from the troubleshooter:

Status 23: socket read failed

Do the following, as appropriate:
* Check the NetBackup Problems report for clues on where and why the failure occurred. If you cannot determine the cause from the Problems report, create debug log directories for the processes that could have returned this status code. Then, retry the operation and check the resulting debug logs.
* Corrupt binaries are one possible cause for this error. Load a fresh bptm from the install media to try to resolve the problem.
* The following information applies only to Sun Solaris: Verify that all operating system patches are installed. See the Operating Notes section of the NetBackup Release Notes.
* The following information applies only to Windows systems: Verify that the recommended service packs are installed.
* On Windows master servers, check the LIST_FILES_TIMEOUT value and ensure that this value is at least 1800.