SQL backup file read failed (13)

Amirr88
Level 0

Hi all,

Master = 7.7.2 SunOS(5.10)

Media = 7.7.2 Windows2008(6)

Client = 7.7.2 Windows2008(6)

I have an issue with SQL backups: one of the databases is failing with error (13) "file read failed" when backed up through the Windows media server. The other databases are not affected.

Can someone help? I have already logged a call with Veritas, and they advised changing settings on the network side, which I believe is not related to this issue.

 

08/26/2017 20:00:18 - Info nbjm (pid=19376) starting backup job (jobid=1083363) for client EAPMSSQLK232, policy KONE_EAPMSSQLK23Q_SQL, schedule Default-Application-Backup
08/26/2017 20:00:18 - Info nbjm (pid=19376) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=1083363, request id:{20E6F19C-8A56-11E7-9712-002128C38D7A})
08/26/2017 20:00:18 - requesting resource DTC_STG_TL_ALLMS
08/26/2017 20:00:18 - requesting resource TSGNBMSUX01.NBU_CLIENT.MAXJOBS.EAPMSSQLK232
08/26/2017 20:00:18 - requesting resource TSGNBMSUX01.NBU_POLICY.MAXJOBS.KONE_EAPMSSQLK23Q_SQL
08/26/2017 20:00:18 - granted resource  TSGNBMSUX01.NBU_CLIENT.MAXJOBS.EAPMSSQLK232
08/26/2017 20:00:18 - granted resource  TSGNBMSUX01.NBU_POLICY.MAXJOBS.KONE_EAPMSSQLK23Q_SQL
08/26/2017 20:00:18 - granted resource  TSC178
08/26/2017 20:00:18 - granted resource  HP.ULTRIUM5-SCSI.016
08/26/2017 20:00:18 - granted resource  TSGNBMAWI01-hcart2-robot-tld-0
08/26/2017 20:00:28 - estimated 0 kbytes needed
08/26/2017 20:00:28 - Info nbjm (pid=19376) started backup (backupid=EAPMSSQLK232_1503748828) job for client EAPMSSQLK232, policy KONE_EAPMSSQLK23Q_SQL, schedule Default-Application-Backup on storage unit TSGNBMAWI01-hcart2-robot-tld-0
08/26/2017 20:00:29 - started process bpbrm (pid=20376)
08/26/2017 20:00:31 - connecting
08/26/2017 20:00:31 - Info bpbrm (pid=20376) EAPMSSQLK232 is the host to backup data from
08/26/2017 20:00:31 - Info bpbrm (pid=20376) reading file list for client
08/26/2017 20:00:33 - Info bpbrm (pid=20376) listening for client connection
08/26/2017 20:00:40 - Info bpbrm (pid=20376) INF - Client read timeout = 1800
08/26/2017 20:00:40 - Info bpbrm (pid=20376) accepted connection from client
08/26/2017 20:00:40 - Info dbclient (pid=11776) Backup started
08/26/2017 20:00:40 - connected; connect time: 0:00:00
08/26/2017 20:00:41 - Info bptm (pid=28668) start
08/26/2017 20:00:41 - Info bptm (pid=28668) using 262144 data buffer size
08/26/2017 20:00:43 - Info bptm (pid=28668) setting receive network buffer to 262144 bytes
08/26/2017 20:00:43 - Info bptm (pid=28668) using 32 data buffers
08/26/2017 20:00:43 - Info bptm (pid=28668) start backup
08/26/2017 20:00:43 - Info bptm (pid=28668) backup child process is pid 6468.6600
08/26/2017 20:00:43 - Info bptm (pid=28668) Waiting for mount of media id TSC178 (copy 1) on server TSGNBMAWI01.
08/26/2017 20:00:43 - Info bptm (pid=6468) start
08/26/2017 20:00:43 - mounting TSC178
08/26/2017 20:01:45 - Info bptm (pid=28668) media id TSC178 mounted on drive index 26, drivepath {4,0,1,4}, drivename HP.ULTRIUM5-SCSI.016, copy 1
08/26/2017 20:01:45 - mounted TSC178; mount time: 0:01:02
08/26/2017 20:01:51 - positioning TSC178 to file 11
08/26/2017 20:03:39 - positioned TSC178; position time: 0:01:48
08/26/2017 20:03:39 - begin writing
08/26/2017 20:03:41 - Info dbclient (pid=11776) dbclient(pid=11776) wrote first buffer(size=65536)
08/26/2017 21:08:49 - Error bpbrm (pid=20376) socket read failed, An existing connection was forcibly closed by the remote host.  (10054)
08/26/2017 21:09:51 - Info dbclient (pid=11776) done. status: 13: file read failed
08/26/2017 21:09:52 - end writing; write time: 1:06:13
file read failed  (13)

 

2 REPLIES

Marianne
Level 6
Partner    VIP    Accredited Certified

This looks like a timeout to me.
The default Client Read Timeout of 5 minutes is hardly ever enough for large databases.

Please post these sections of the bpbrm and bptm logs (on the media server) and the dbclient log (on the client):

bpbrm: PID 20376
08/26/2017 20:00:29 - started process bpbrm (pid=20376)
up to 
08/26/2017 21:08:49 - Error bpbrm (pid=20376) socket read failed, An existing connection was forcibly closed by the remote host.  (10054)

bptm: PID 28668
08/26/2017 20:00:41 - Info bptm (pid=28668) start
and child bptm 
08/26/2017 20:00:43 - Info bptm (pid=6468) start
up to last entry for these two PIDs.

dbclient: PID 11776
08/26/2017 20:00:40 - Info dbclient (pid=11776) Backup started
08/26/2017 21:09:51 - Info dbclient (pid=11776) done. status: 13: file read failed

If there is a lot of info (I guess Veritas would have asked for level 5 logs), please attach it as separate .txt files (e.g. bpbrm.txt).
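If the logs are large, a small script can pull out just the lines for the PIDs listed above. This is only a rough sketch: the function name `lines_for_pids` is made up for illustration, the `pid=NNNN` pattern is taken from the Job Details in this thread, and the `[NNNN.` pattern is an assumption about how the debug logs tag each line, so adjust it to whatever your log files actually contain.

```python
import re

def lines_for_pids(log_text, pids):
    """Return the log lines that mention any of the given PIDs."""
    # "pid=20376" is the form seen in the Job Details; the "[20376." form is
    # an assumption about the debug log layout and may need changing.
    pattern = re.compile("|".join(rf"pid={p}\b|\[{p}\." for p in pids))
    return [line for line in log_text.splitlines() if pattern.search(line)]

# Three lines copied from the Job Details in the original post.
sample = """\
08/26/2017 20:00:29 - started process bpbrm (pid=20376)
08/26/2017 20:00:41 - Info bptm (pid=28668) start
08/26/2017 21:08:49 - Error bpbrm (pid=20376) socket read failed, An existing connection was forcibly closed by the remote host.  (10054)
"""

bpbrm_lines = lines_for_pids(sample, [20376])
for line in bpbrm_lines:
    print(line)
```

Run it once per process (bpbrm, the two bptm PIDs, dbclient) and save each result to its own .txt file.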

As Marianne says, this looks like a timeout. I have a couple of questions:

Is the problematic database big compared to the working ones?

What happens if you run a backup of just that database?

Can your SQL DBA see anything that might cause heavy load on the database while you're running the backup?

Two things that often cause 10054 errors are antivirus software and external firewalls, the latter especially for jobs with long idle times.
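One quick way to check the idle-time theory against the Job Details in the original post is to look for the longest gap between consecutive log entries and compare it with the "Client read timeout = 1800" value the job reports. The sketch below does that for the last four entries; the helper names are made up for illustration. Keep in mind that the Job Details do not log every buffer, so a long silent gap only shows where to look in the bpbrm/bptm/dbclient logs; it does not prove the socket was idle.

```python
from datetime import datetime

# Last four entries copied from the Job Details in the original post.
job_details = """\
08/26/2017 20:03:39 - begin writing
08/26/2017 20:03:41 - Info dbclient (pid=11776) dbclient(pid=11776) wrote first buffer(size=65536)
08/26/2017 21:08:49 - Error bpbrm (pid=20376) socket read failed, An existing connection was forcibly closed by the remote host.  (10054)
08/26/2017 21:09:51 - Info dbclient (pid=11776) done. status: 13: file read failed
"""

CLIENT_READ_TIMEOUT = 1800  # seconds, from "Client read timeout = 1800" in the job log

def parse_ts(line):
    # The first 19 characters are the "MM/DD/YYYY HH:MM:SS" timestamp.
    return datetime.strptime(line[:19], "%m/%d/%Y %H:%M:%S")

def longest_gap(text):
    """Return the longest gap, in seconds, between consecutive entries."""
    stamps = [parse_ts(l) for l in text.splitlines() if l.strip()]
    return max((b - a).total_seconds() for a, b in zip(stamps, stamps[1:]))

gap = longest_gap(job_details)
print(f"longest silent gap: {gap:.0f} s")  # prints "longest silent gap: 3908 s"
print("exceeds client read timeout" if gap > CLIENT_READ_TIMEOUT else "within timeout")
```

Here the gap between "wrote first buffer" and the 10054 error is 1 hour 5 minutes, so the detailed logs for that window are the ones worth reading.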

Increasing CLIENT_CONNECT_TIMEOUT and CLIENT_READ_TIMEOUT, and/or creating or decreasing the TCP keepalive setting on the client and servers, might help.

I think the default TCP keepalive time on Windows is about 2 hours, and the Microsoft recommendation is about 5 minutes.
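For reference, as far as I know the Windows setting is the `KeepAliveTime` DWORD under `HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters`, and it is expressed in milliseconds. A tiny sketch of the conversion for the two figures above (the helper name `keepalive_ms` is made up; please check the value and whether a reboot is required against Microsoft's documentation for your Windows version before changing anything):

```python
from datetime import timedelta

def keepalive_ms(interval):
    """Convert a timedelta to milliseconds, the unit KeepAliveTime uses."""
    return int(interval.total_seconds() * 1000)

windows_default = keepalive_ms(timedelta(hours=2))   # the roughly 2 hour default
recommended = keepalive_ms(timedelta(minutes=5))     # the roughly 5 minute recommendation

print(windows_default, recommended)  # prints "7200000 300000"
```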

The standard questions: Have you checked 1) what has changed, 2) the manual, and 3) whether there are any tech notes or VOX posts regarding the issue?