03-28-2014 01:40 PM
Hello community, I have been troubleshooting a client issue for about 4 days and wanted to see if anyone has any additional ideas on resolving it. I'll do my best to explain and give some background information:
Client: Windows 2003 x64 Enterprise with NBU v7.1.0.4 client installed (physical server). The policy is a Windows filesystem policy with ALL_LOCAL_DRIVES (no multistreaming enabled).
3 media servers and 1 master server: Windows 2008 Standard x64 SP2 with NBU client v7.1.0.4 (media and master servers are all separate physical servers, totaling 4).
The error occurs when writing to either tape or a Data Domain device.
Brief timeline of events
1:25:59.101 PM: [19572.32028] <2> TransporterRemote::write[2](): DBG - | An Exception of type [SocketWriteException] has occured at: | Module: @(#) $Source: src/ncf/tfi/lib/TransporterRemote.cpp,v $ $Revision: 1.54 $ , Function: TransporterRemote::write[2](), Line: 321 | Local Address: [0.0.0.0]:0 | Remote Address: [0.0.0.0]:0 | OS Error: 10053 (An established connection was aborted by the software in your host machine.
) | Expected bytes: 16384 | (../TransporterRemote.cpp:321)
1:25:59.101 PM: [19572.32028] <16> tar_tfi::processException:
An Exception of type [SocketWriteException] has occured at:
Module: @(#) $Source: src/ncf/tfi/lib/TransporterRemote.cpp,v $ $Revision: 1.54 $ , Function: TransporterRemote::write[2](), Line: 321
Module: @(#) $Source: src/ncf/tfi/lib/Packer.cpp,v $ $Revision: 1.89 $ , Function: Packer::getBuffer(), Line: 656
Module: tar_tfi::getBuffer, Function: H:\7104\src\cl\clientpc\util\tar_tfi.cpp, Line: 312
Local Address: [0.0.0.0]:0
Remote Address: [0.0.0.0]:0
OS Error: 10053 (An established connection was aborted by the software in your host machine.
)
Expected bytes: 16384
and I also see this in bpbkar:
1:26:00.194 PM: [19572.32028] <2> ov_log::V_GlobalLog: INF - FS_DleBEAO::DeInit - exiting.
1:26:00.194 PM: [19572.32028] <2> ov_log::V_GlobalLog: INF - unloading bedssql2.dll
1:26:00.194 PM: [19572.32028] <2> ov_log::V_GlobalLog: INF - unloading bedsshadow.dll
1:26:00.194 PM: [19572.32028] <2> ov_log::V_GlobalLog: INF - unloading bedsss.dll
1:26:00.194 PM: [19572.32028] <2> ov_log::V_GlobalLog: INF - unloading bedsadgran.dll
1:26:00.194 PM: [19572.32028] <2> ov_log::V_GlobalLog: INF - unloading bedsnt5.dll
1:26:00.194 PM: [19572.32028] <2> ov_log::V_GlobalLog: INF - unloading bedsev.dll
1:26:00.194 PM: [19572.32028] <2> ov_log::V_GlobalLog: INF - unloading bedsxese.dll
1:26:00.194 PM: [19572.32028] <16> dtcp_read: TCP - failure: recv socket (592) (TCP 10053: Software caused connection abort)
1:26:01.194 PM: [19572.32028] <16> dtcp_read: TCP - failure: recv socket (592) (TCP 10053: Software caused connection abort)
1:26:02.194 PM: [19572.32028] <16> dtcp_read: TCP - failure: recv socket (592) (TCP 10053: Software caused connection abort)
1:26:03.194 PM: [19572.32028] <16> dtcp_read: TCP - failure: recv socket (592) (TCP 10053: Software caused connection abort)
1:26:03.194 PM: [19572.32028] <4> OVShutdown: INF - Shutdown wait finished
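When a bpbkar log runs to thousands of lines, it can help to filter it down to the higher-severity entries, such as the `<16>` lines in the excerpt above. The following is a minimal sketch, not a NetBackup tool. It assumes every line follows the `time: [pid.tid] <severity> message` layout seen in this excerpt and that larger numbers in angle brackets mean higher severity. The names `LINE_RE` and `error_lines` are hypothetical.

```python
import re

# Pattern for log lines shaped like the excerpt above, e.g.
#   1:26:00.194 PM: [19572.32028] <16> dtcp_read: TCP - failure: ...
# Assumption: all lines of interest follow this layout.
LINE_RE = re.compile(
    r"^(?P<time>\d{1,2}:\d{2}:\d{2}\.\d{3} [AP]M): "
    r"\[(?P<pid>\d+)\.(?P<tid>\d+)\] "
    r"<(?P<sev>\d+)> "
    r"(?P<msg>.*)$"
)


def error_lines(lines, min_severity=16):
    """Return (time, severity, message) tuples for lines at or above min_severity."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and int(match.group("sev")) >= min_severity:
            hits.append(
                (match.group("time"), int(match.group("sev")), match.group("msg"))
            )
    return hits


# Three lines copied from the excerpt above.
sample = [
    "1:26:00.194 PM: [19572.32028] <2> ov_log::V_GlobalLog: INF - unloading bedsxese.dll",
    "1:26:00.194 PM: [19572.32028] <16> dtcp_read: TCP - failure: recv socket (592) (TCP 10053: Software caused connection abort)",
    "1:26:03.194 PM: [19572.32028] <4> OVShutdown: INF - Shutdown wait finished",
]

for time_stamp, severity, message in error_lines(sample):
    print(time_stamp, severity, message)
```

Lines that wrap, such as the multi-line exception dump earlier in the post, will not match the pattern and are skipped, so treat the output as a starting point rather than a complete list.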
Commands run so far:
* bpclntcmd -ip --> from both client and server
* bpclntcmd -hn
* bpclntcmd -pn
* bpcoverage -c clientname
Does anyone have any other suggestions to help troubleshoot? Or am I missing anything??
Thank you for any help.
03-28-2014 01:51 PM
Apologies, I have to go out so have not read all the details you posted, will do so later.
In general, status 24 is not caused by NBU, so you need to look at the OS level; in fact, I have never seen NBU cause a 24.
Check this post for comments, and in particular the post I made (sorry it is long, but it may give you a solution).
https://www-secure.symantec.com/connect/forums/netbackup-status-code-24-possible-parameters-check
Also this one,
https://www-secure.symantec.com/connect/forums/netbackup-solaris-10-media-server-issue
Make sure all interfaces are fully resolvable; this is a very common cause of 24s.
Martin
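As a rough illustration of the "fully resolvable" check, the sketch below does a forward lookup on a hostname and then a reverse lookup on each address it returns, using only Python's standard `socket` module. It is not a replacement for `bpclntcmd`. The function name `check_resolution` is hypothetical, and `gethostbyname_ex` only covers IPv4.

```python
import socket


def check_resolution(host):
    """Forward-resolve host, then reverse-resolve each IPv4 address found."""
    result = {"host": host, "addresses": [], "reverse": {}, "errors": []}
    try:
        # Returns (canonical_name, alias_list, ipv4_address_list).
        _name, _aliases, addresses = socket.gethostbyname_ex(host)
        result["addresses"] = addresses
    except OSError as exc:  # socket.gaierror and socket.herror subclass OSError
        result["errors"].append(f"forward lookup of {host} failed: {exc}")
    for addr in result["addresses"]:
        try:
            result["reverse"][addr] = socket.gethostbyaddr(addr)[0]
        except OSError as exc:
            result["errors"].append(f"reverse lookup of {addr} failed: {exc}")
    return result


if __name__ == "__main__":
    # Placeholder list: substitute the client, master, and media server names,
    # including the names bound to any backup-network interfaces.
    for host in ["localhost"]:
        print(check_resolution(host))
```

Run it on the client and on each server; a name that resolves on one side but not the other, or a reverse lookup that returns an unexpected name, is worth following up.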
04-03-2014 10:20 AM
Thank you for the reply.
I've gone through the links and we've also gone through the TCP Chimney settings, DNS resolutions etc.
The latest change in the symptoms is that I can now back up D:\ and the other non-OS partitions successfully. However, once the backup gets a certain way through C:\, it fails with a 24. That being said, I think it's safe to rule out a network issue or a DNS issue. I have excluded the virus scan directories and the NBU directories, but I am still seeing a failure.
Thanks for the feedback.
04-03-2014 10:44 AM
So you mean to say that you are unable to back up only the C drive, right?
What are the free space and used space on C?
Please verify this
https://www-secure.symantec.com/connect/forums/incremental-backup-failing-only-d-drive-status-code-4224-full-backup-completing-successfully#comment-9435631
Check C-drive for fragmentation.
A large drive with heavy fragmentation may cause a timeout while 'walking' the filesystem looking for changed files.
Please post all text in Job Details for failed job and ensure all of the following log folders exist:
On media server: bptm and bpbrm
On client: bpbkar and bpfis
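A quick way to confirm the folders listed above are in place is to test for them under the NetBackup logs directory on each host. This is only a sketch: the default path `C:\Program Files\Veritas\NetBackup\logs` is an assumption about a standard Windows install, and `missing_log_dirs` is a hypothetical helper name.

```python
from pathlib import Path

# Log folders requested above, grouped by the host they belong on.
REQUIRED = {
    "media server": ["bptm", "bpbrm"],
    "client": ["bpbkar", "bpfis"],
}

# Assumed default location on Windows; adjust to the actual install path.
DEFAULT_LOGS_ROOT = r"C:\Program Files\Veritas\NetBackup\logs"


def missing_log_dirs(logs_root, names):
    """Return the names from `names` that are not directories under logs_root."""
    root = Path(logs_root)
    return [name for name in names if not (root / name).is_dir()]


if __name__ == "__main__":
    # Run on the client; use REQUIRED["media server"] on a media server.
    missing = missing_log_dirs(DEFAULT_LOGS_ROOT, REQUIRED["client"])
    if missing:
        print("Missing log folders:", ", ".join(missing))
    else:
        print("All requested log folders exist.")
```

Any folder reported as missing needs to be created before the next failed job is reproduced, otherwise that process will have nothing to post.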