Forum Discussion

Colin_North's avatar
14 years ago

SQL Agent backup jobs end with Status 6 or 150

Hi guys,

I have a problem backing up one particular database on an SQL 2005 server using the Netbackup SQL agent. There are about 8 or 10 databases on this server and all backup OK except one, which generally fails with status 6 or status 150. The database is 400GB in size, so is quite large, and used to backup without any issues.

I have looked through the SQL server's event logs and NBU logs - some extracts below. I did find an SQL 995 Operating System error, and wondered if this is an SQL bug? But as mentioned all other databases for this server backup fine.

Any help anyone can provide would be appreciated. Colin.

Event Type: Error
Event Source: MSSQLSERVER
Event Category: (6)
Event ID: 17055
Date:  19/04/2012
Time:  07:00:09
User:  NT AUTHORITY\SYSTEM
Computer: server-name
Description:
3041 :
BACKUP failed to complete the command backup database "database-name" to VIRTUAL_DEVICE='VNBU0-5188-2076-1334779681' with blocksize = 65536, maxtransfersize = 65536, buffercount = 1


For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: e1 0b 00 00 10 00 00 00   á.......
0008: 0f 00 00 00 47 00 42 00   ....G.B.
0010: 45 00 44 00 43 00 2d 00   E.D.C.-.
0018: 53 00 51 00 4c 00 53 00   S.Q.L.S.
0020: 43 00 50 00 30 00 32 00   C.P.0.2.
0028: 00 00 00 00 00 00         ......

Event Type: Error
Event Source: MSSQLSERVER
Event Category: (2)
Event ID: 17055
Date:  19/04/2012
Time:  07:00:09
User:  NT AUTHORITY\SYSTEM
Computer: server-name
Description:
18210 :
BackupMedium::ReportIoError: write failure on backup device 'VNBU0-5188-2076-1334779681'. Operating system error 995(error not found).


For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 22 47 00 00 10 00 00 00   "G......
0008: 0f 00 00 00 47 00 42 00   ....G.B.
0010: 45 00 44 00 43 00 2d 00   E.D.C.-.
0018: 53 00 51 00 4c 00 53 00   S.Q.L.S.
0020: 43 00 50 00 30 00 32 00   C.P.0.2.
0028: 00 00 00 00 00 00         ...... 

Event Type: Error
Event Source: MSSQLSERVER
Event Category: (2)
Event ID: 17055
Date:  19/04/2012
Time:  07:00:09
User:  NT AUTHORITY\SYSTEM
Computer: server-name
Description:
18210 :
BackupVirtualDeviceFile::RequestDurableMedia: Flush failure on backup device 'VNBU0-5188-2076-1334779681'. Operating system error 995(error not found).


For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 22 47 00 00 10 00 00 00   "G......
0008: 0f 00 00 00 47 00 42 00   ....G.B.
0010: 45 00 44 00 43 00 2d 00   E.D.C.-.
0018: 53 00 51 00 4c 00 53 00   S.Q.L.S.
0020: 43 00 50 00 30 00 32 00   C.P.0.2.
0028: 00 00 00 00 00 00         ...... 

07:00:04.451 [5188.5652] <16> writeToServer: ERR - send() to server on socket failed: An existing connection was forcibly closed by the remote host.
07:00:04.451 [5188.5652] <16> dbc_put: ERR - writeToServer() failed when sending keepalive
07:00:04.451 [5188.5652] <16> writeToServer: ERR - send() to server on socket failed: An existing connection was forcibly closed by the remote host.
07:00:04.451 [5188.5652] <16> closeApi: ERR - could not write client EXIT STATUS message to the NAME socket
07:00:04.451 [5188.5652] <16> VxBSASendData: ERR - Could not do a bsa_put().
07:00:04.451 [5188.5652] <4> getServerName: Read server name from nb_master_config: master-server
07:00:04.482 [5188.5652] <2> vnet_vnetd_service_socket: ..\libvlibs\vnet_vnetd.c.2034: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
07:00:04.482 [5188.5652] <2> vnet_vnetd_service_socket: ..\libvlibs\vnet_vnetd.c.2048: service: bprd
07:00:04.701 [5188.5652] <2> logconnections: BPRD CONNECT FROM client-ip.4610 TO master-ip.13724
07:00:04.982 [5188.5652] <4> getServerName: Read server name from nb_master_config: master-server
07:00:04.998 [5188.5652] <2> vnet_vnetd_service_socket: ..\libvlibs\vnet_vnetd.c.2034: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
07:00:04.998 [5188.5652] <2> vnet_vnetd_service_socket: ..\libvlibs\vnet_vnetd.c.2048: service: bprd
07:00:05.217 [5188.5652] <2> logconnections: BPRD CONNECT FROM client-ip.4611 TO master-ip.13724
07:00:09.279 [5188.5652] <16> DBthreads::dbclient: ERR - Error in VxBSASendData: 1.
07:00:09.279 [5188.5652] <1> DBthreads::dbclient:     CONTINUATION: - All backup jobs have completed.
07:00:09.279 [5188.5652] <8> close_image: Session being terminated abnormally, cleaning up
07:00:09.279 [5188.5652] <4> close_image: INF - backup FAILED
07:00:09.279 [5188.5652] <4> close_image: INF ---- end of Backup ---

07:00:09.279 [5188.5652] <4> KillAllThreads: INF - Killing group #0
07:00:09.279 [5188.5652] <4> KillAllThreads: INF - Issuing SignalAbort to MS SQL Server VDI
07:00:09.279 [5188.2040] <4> KillAllThreads: INF - Killing group #0
07:00:09.279 [5188.2228] <4> KillAllThreads: INF - Killing group #0
07:00:09.435 [5188.5652] <16> CDBbackrec::ProcessVxBSAerror: ERR - Error in DBthreads::dbclient: 6.
07:00:09.435 [5188.5652] <1> CDBbackrec::ProcessVxBSAerror:     CONTINUATION: - The system cannot find the file specified.
07:00:09.435 [5188.5652] <16> DBthreads::dbclient: ERR - Error in VxBSAEndData: 6.
07:00:09.435 [5188.5652] <1> DBthreads::dbclient:     CONTINUATION: - The handle used to associate this call with a previous VxBSAInit() call is invalid.
07:00:12.279 [5188.2076] <4> DBDisconnect: INF - Logging out of SQL Server with handle <0x00b21a20>
07:00:12.279 [5188.2076] <4> getServerName: Read server name from nb_master_config: master-server
07:00:12.295 [5188.2076] <2> vnet_vnetd_service_socket: ..\libvlibs\vnet_vnetd.c.2034: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
07:00:12.295 [5188.2076] <2> vnet_vnetd_service_socket: ..\libvlibs\vnet_vnetd.c.2048: service: bprd
07:00:12.514 [5188.2076] <2> logconnections: BPRD CONNECT FROM client-ip.4618 TO master-ip.13724
07:00:16.608 [5188.2076] <16> CDBbackrec::FreeDeviceSet(): ERR - Error in VDS->Close: 0x80770004.
07:00:16.608 [5188.2076] <1> CDBbackrec::FreeDeviceSet():     CONTINUATION: - An abort request is preventing anything except termination actions.
07:00:16.608 [5188.2076] <4> Dbbackrec::Perform: INF - OPERATION #7 of batch C:\Program Files\VERITAS\NetBackup\DbExt\MsSql\sqlfull.bch FAILED with STATUS 1 (0 is normal). Elapsed time = 31935(0) seconds.
07:00:18.623 [5188.4596] <4> CDBbackmain::dbbackup: INF - Results of executing <C:\Program Files\VERITAS\NetBackup\DbExt\MsSql\sqlfull.bch>:

<6> operations succeeded. <1> operations failed.

INFO Server Status:  Communication with the server has not been iniatated or the server status has not been retrieved from the server.
INFO Error in VxBSASendData: 1.
INFO All backup jobs have completed.
INFO Error in DBthreads::dbclient: 6.
INFO The system cannot find the file specified.
INFO Error in VxBSAEndData: 6.
INFO The handle used to associate this call with a previous VxBSAInit() call is invalid.
ENDOPER TRUE
 

6 Replies

  • You say it generally fails now; does it mean it works sometimes?

    From the log it would seem that the media server has diconnected the session, hence the write error to the BSA object. If this is an incremental backup, it could be due to client read timeout setting, as incremental backups sometimes does not write data for an extended period, thus the media server disconnects.

    You could check on the media server's bpbrm/bptm log files if this is the case. If so, adjust the client read timeout for the DB server in media server settings.

    /A

     

  • Wee need to see full dbclient log to see what happened prior to this:

    07:00:04.451 [5188.5652] <16> writeToServer: ERR - send() to server on socket failed: An existing connection was forcibly closed by the remote host.

    You will also need bpbrm and bptm logs on the media server.

    The above error indicates a possible timeout on the media server, but we do not have sufficient info to say for sure.

    We often see large database backups fail due to timeout.
    For this reason we increase 'Client Connect Timeout' and 'Client Read Timeout' to 1800 (as a start) on the media server(s).

     

  • Thanks for the replies - I have adjusted the client read timeout value from 300 to 1800, I had thought that this could be a cause.

    I'll run the backup job again this evening, will post if there's any further issues.

    Colin.

  • Looked like a timeout issue, so adjusting the client read timeout settings helped. I did open a call with Symantec, but that wasn't much help.

  • We would love to assist, but without the logs that I mentioned above we are merely guessing.
  • create dbclient , bphdb folders on client create a new policy to run problematic DB backup only and share the logs