01-12-2015 08:18 AM
Hello,
We recently had a case open with Symantec about the error that I am pasting below. It was thought to be ghost tape drives at the time which we fixed. We had 3 more failures come out this weekend, and was wondering if anyone else has had this issue prior?
We recently upgraded from LTO-4 to LTO-6.
We within the last few months have gone from 7.1.0.4 to 7.6.0.3.
We have added (2) new media servers to our environment to bring us up to (4).
Any advice or hints would be welcome.
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of backup plus archivelog command at 01/12/2015 01:56:02
ORA-27192: skgfcls: sbtclose2 returned error - failed to close file
ORA-19511: Error received from media manager layer, error text:
Failed to process backup file <bk_2911_1_868757927>
ORA-19502: write error on file "bk_2911_1_868757927", block number 90785 (block size=8192)
ORA-27030: skgfwrt: sbtwrite2 returned error
ORA-19511: Error received from media manager layer, error text:
VxBSASendData: Failed with error:
Solved! Go to Solution.
01-12-2015 10:03 AM
Hi,
This seems like a comms issue, there are no errors from the bptm / bpbrm logs. They show that connection/backup was closed from the client side. In the client logs you see this. It was writing then timed out.
Is this a RAC or single Instance?
01:39:12.658 [29981] <2> int_WriteData: INF - writing buffer # 1 of size 262144
01:39:39.376 [29981] <2> int_WriteData: INF - writing buffer # 1000 of size 262144
01:40:07.354 [29981] <2> int_WriteData: INF - writing buffer # 2000 of size 262144
01:56:00.281 [29981] <16> writeToServer: ERR - send() to server on socket failed: Connection timed out (110)
01:56:00.281 [29981] <16> dbc_put: ERR - failed sending data to server
01:56:00.281 [29981] <16> VxBSASendData: ERR - Could not do a bsa_put().
01:56:00.281 [29981] <16> xbsa_SendData: ERR - VxBSASendData: Failed with error:
Server Status: Communication with the server has not been initiated or the server status has not been retrieved from the serve
01:56:00.320 [29981] <16> dbc_put: ERR - invalid handle received from the application
01:56:00.320 [29981] <16> VxBSASendData: ERR - Could not do a bsa_put().
01:56:00.320 [29981] <16> xbsa_SendData: ERR - VxBSASendData: Failed with error:
Server Status: Communication with the server has not been initiated or the server status has not been retrieved from the serve
01:56:00.320 [29981] <16> dbc_put: ERR - invalid handle received from the application
01:56:00.320 [29981] <16> VxBSASendData: ERR - Could not do a bsa_put().
01:56:00.320 [29981] <16> xbsa_SendData: ERR - VxBSASendData: Failed with error:
Server Status: Communication with the server has not been initiated or the server status has not been retrieved from the serve
01:56:00.320 [29981] <2> int_CloseImage: INF - Backup - closing <bk_2911_1_868757927>
01:56:00.320 [29981] <16> writeToServer: ERR - send() to server on socket failed: Bad file descriptor (9)
01:56:00.320 [29981] <16> write_LF_END_tarHeader: ERR - failed writing LF_END_U_LEN_FILE record on DATA socket
01:56:00.320 [29981] <16> finishTarImage: ERR - write_LF_END_tarHeader() failed.
01:56:00.320 [29981] <16> VxBSAEndData: ERR - EndData unable to bsa_finishTarImage().
01:56:00.320 [29981] <16> xbsa_EndData: ERR - VxBSAEndData: Failed with error:
Server Status: Communication with the server has not been initiated or the server status has not been retrieved from the serve
01:56:00.320 [29981] <16> int_CloseImage: ERR - Failed to process backup file <bk_2911_1_868757927>
01-12-2015 08:42 AM
What does the bptm/bpbrm logs show on the media server?
01-12-2015 08:47 AM
And dbclient log on the oracle client.
01-12-2015 08:53 AM
Hello in the bptm log I am seeing a lot of no connection could be made even though the server never went down, nor did Netbackup
00:13:25.075 [4692.4960] <2> vnet_pbxConnect: pbxConnectEx Succeeded
00:13:25.075 [4692.4960] <2> do_pbx_service: [vnet_connect.c:2160] via PBX bpdbm CONNECT FROM 172.28.25.40.59762 TO 172.28.25.25.1556 fd = 728
00:13:25.075 [4692.4960] <2> async_connect: [vnet_connect.c:1691] connect async CONNECT FROM 172.28.25.40.59762 TO 172.28.25.25.1556 fd = 728
00:13:25.075 [4692.4960] <2> connect_to_service: connect succeeded STATUS (0) SUCCESS FROM 0.0.0.0 TO veritasarsenal 172.28.25.25 bpdbm VIA pbx
00:13:25.075 [4692.4960] <2> logconnections: BPDBM CONNECT FROM 172.28.25.40.59762 TO 172.28.25.25.1556 fd = 728
00:13:25.075 [4692.4960] <2> vnet_check_vxss_client_magic_with_info_ex: [vnet_vxss_helper.c:943] Ignoring VxSS authentication 2 0x2
00:13:25.278 [4692.4960] <2> db_CLIENTsend: reset client protocol version from 0 to 8
00:13:25.278 [4692.4960] <2> db_CLIENTsend: prec->offline=0
00:13:25.278 [4692.4960] <2> db_CLIENTsend: prec->online_at=0
00:13:25.278 [4692.4960] <2> db_CLIENTsend: prec->offline_at=0
00:13:25.278 [4692.4960] <2> db_CLIENTsend: sending 46 bytes
00:13:25.496 [4692.4960] <2> isDONE: reply DONE 227 <?xml version="1.0" encoding="utf-8"?><StatusMsgDoc><StatusMsg NBUStatus="227" Severity="2" VXULoid="271" VXULtid="227"><MsgString>no entity was found</MsgString></StatusMsg></StatusMsgDoc>
00:13:25.496 [4692.4960] <2> db_getCLIENT: db_CLIENTreceive: no entity was found 227
00:13:25.496 [4692.4960] <2> ConnectionCache::connectAndCache: Acquiring new connection for host veritasarsenal, query type 223
00:13:25.496 [4692.4960] <2> vnet_sortaddrs: [vnet_addrinfo.c:3856] sorted addrs: 1 0x1
00:13:25.496 [4692.4960] <2> vnet_get_pref_netconnection: [vnet_addrinfo.c:4754] using interface ANY
00:13:25.496 [4692.4960] <2> vnet_sortaddrs: [vnet_addrinfo.c:3856] sorted addrs: 1 0x1
00:13:25.496 [4692.4960] <2> vnet_get_pref_netconnection: [vnet_addrinfo.c:4754] using interface ANY
00:13:25.496 [4692.4960] <2> async_connect: [vnet_connect.c:1499] connect in progress 1 0x1
00:13:25.496 [4692.4960] <2> vnet_pbxConnect: pbxConnectEx Succeeded
00:13:25.496 [4692.4960] <2> do_pbx_service: [vnet_connect.c:2160] via PBX bpdbm CONNECT FROM 172.28.25.40.59763 TO 172.28.25.25.1556 fd = 732
00:13:25.496 [4692.4960] <2> async_connect: [vnet_connect.c:1691] connect async CONNECT FROM 172.28.25.40.59763 TO 172.28.25.25.1556 fd = 732
00:13:25.496 [4692.4960] <2> connect_to_service: connect succeeded STATUS (0) SUCCESS FROM 0.0.0.0 TO veritasarsenal 172.28.25.25 bpdbm VIA pbx
00:13:25.496 [4692.4960] <2> logconnections: BPDBM CONNECT FROM 172.28.25.40.59763 TO 172.28.25.25.1556 fd = 732
00:13:25.496 [4692.4960] <2> vnet_check_vxss_client_magic_with_info_ex: [vnet_vxss_helper.c:943] Ignoring VxSS authentication 2 0x2
00:13:25.730 [4692.4960] <2> db_CLIENTsend: reset client protocol version from 0 to 8
00:13:25.730 [4692.4960] <2> db_CLIENTsend: prec->offline=0
00:13:25.730 [4692.4960] <2> db_CLIENTsend: prec->online_at=0
00:13:25.730 [4692.4960] <2> db_CLIENTsend: prec->offline_at=0
00:13:25.730 [4692.4960] <2> db_CLIENTsend: sending 46 bytes
00:13:25.949 [4692.4960] <2> isDONE: reply DONE 227 <?xml version="1.0" encoding="utf-8"?><StatusMsgDoc><StatusMsg NBUStatus="227" Severity="2" VXULoid="271" VXULtid="227"><MsgString>no entity was found</MsgString></StatusMsg></StatusMsgDoc>
00:13:25.949 [4692.4960] <2> db_getCLIENT: db_CLIENTreceive: no entity was found 227
00:13:25.949 [4692.4960] <2> verify_client: ../bpbrm.c.41740: db_getCLIENT failed for CLIENT: db135
00:13:25.949 [4692.4960] <2> verify_client: ../bpbrm.c.41910: db_getCLIENT failed: 227 227 0x000000e3
00:13:25.949 [4692.4960] <2> bpbrm start_bpcd_stat: ../bpbrm.c.23199: bpcd_client: db135
00:13:25.949 [4692.4960] <2> bpbrm start_bpcd_stat: ../bpbrm.c.23202: bpcd_client_hostname: db135
00:13:25.949 [4692.4960] <2> local_bpcr_connect: bpcr.c.333: connect_opts = 0x01000100 connect_opts2 = 0x01010100
00:13:25.949 [4692.4960] <2> local_bpcr_connect: bpcr.c.340: connect_opts = 0x01000100
00:13:25.949 [4692.4960] <2> local_bpcr_connect: bpcr.c.379: daemon_port_type = 0
00:13:25.949 [4692.4960] <2> vnet_sortaddrs: [vnet_addrinfo.c:3856] sorted addrs: 1 0x1
00:13:25.949 [4692.4960] <2> vnet_get_pref_netconnection: [vnet_addrinfo.c:4754] using interface ANY
00:13:25.949 [4692.4960] <2> vnet_sortaddrs: [vnet_addrinfo.c:3856] sorted addrs: 1 0x1
00:13:25.949 [4692.4960] <2> vnet_get_pref_netconnection: [vnet_addrinfo.c:4754] using interface ANY
00:13:25.949 [4692.4960] <2> vnet_sortaddrs: [vnet_addrinfo.c:3856] sorted addrs: 1 0x1
00:13:25.964 [4692.4960] <2> vnet_get_pref_netconnection: [vnet_addrinfo.c:4754] using interface ANY
00:13:25.964 [4692.4960] <2> async_connect: [vnet_connect.c:1499] connect in progress 1 0x1
00:13:26.963 [4692.4960] <8> async_connect: [vnet_connect.c:1700] getsockopt SO_ERROR returned 10061 0x274d
00:13:26.963 [4692.4960] <2> async_connect: [vnet_connect.c:1499] connect in progress 1 0x1
00:13:27.977 [4692.4960] <8> async_connect: [vnet_connect.c:1700] getsockopt SO_ERROR returned 10061 0x274d
00:13:27.977 [4692.4960] <2> async_connect: [vnet_connect.c:1499] connect in progress 1 0x1
00:13:27.992 [4692.4960] <2> async_connect: [vnet_connect.c:1691] connect async CONNECT FROM 172.28.25.40.785 TO 172.28.26.243.13782 fd = 728
00:13:27.992 [4692.4960] <2> connect_to_service: connect succeeded STATUS (0) SUCCESS
status: RETRY, (42) CONNECT_REFUSED; system: (10061) No connection could be made because the target machine actively refused it. ; FROM 0.0.0.0 TO db135 172.28.26.243 bpcd VIA pbx
status: RETRY, (42) CONNECT_REFUSED; system: (10061) No connection could be made because the target machine actively refused it. ; FROM 0.0.0.0 TO db135 172.28.26.243 bpcd VIA vnetd
status: SUCCESS; FROM 0.0.0.0 TO db135 172.28.26.243 bpcd
01-12-2015 08:57 AM
I have attached both error logs, anything relating to db135 is related to the error I am seeing. I saw a lot of connection refused when looking through the log even though both were up.
01-12-2015 09:06 AM
Hello here is the output of the dbclient log attached.
01-12-2015 10:03 AM
Hi,
This seems like a comms issue, there are no errors from the bptm / bpbrm logs. They show that connection/backup was closed from the client side. In the client logs you see this. It was writing then timed out.
Is this a RAC or single Instance?
01:39:12.658 [29981] <2> int_WriteData: INF - writing buffer # 1 of size 262144
01:39:39.376 [29981] <2> int_WriteData: INF - writing buffer # 1000 of size 262144
01:40:07.354 [29981] <2> int_WriteData: INF - writing buffer # 2000 of size 262144
01:56:00.281 [29981] <16> writeToServer: ERR - send() to server on socket failed: Connection timed out (110)
01:56:00.281 [29981] <16> dbc_put: ERR - failed sending data to server
01:56:00.281 [29981] <16> VxBSASendData: ERR - Could not do a bsa_put().
01:56:00.281 [29981] <16> xbsa_SendData: ERR - VxBSASendData: Failed with error:
Server Status: Communication with the server has not been initiated or the server status has not been retrieved from the serve
01:56:00.320 [29981] <16> dbc_put: ERR - invalid handle received from the application
01:56:00.320 [29981] <16> VxBSASendData: ERR - Could not do a bsa_put().
01:56:00.320 [29981] <16> xbsa_SendData: ERR - VxBSASendData: Failed with error:
Server Status: Communication with the server has not been initiated or the server status has not been retrieved from the serve
01:56:00.320 [29981] <16> dbc_put: ERR - invalid handle received from the application
01:56:00.320 [29981] <16> VxBSASendData: ERR - Could not do a bsa_put().
01:56:00.320 [29981] <16> xbsa_SendData: ERR - VxBSASendData: Failed with error:
Server Status: Communication with the server has not been initiated or the server status has not been retrieved from the serve
01:56:00.320 [29981] <2> int_CloseImage: INF - Backup - closing <bk_2911_1_868757927>
01:56:00.320 [29981] <16> writeToServer: ERR - send() to server on socket failed: Bad file descriptor (9)
01:56:00.320 [29981] <16> write_LF_END_tarHeader: ERR - failed writing LF_END_U_LEN_FILE record on DATA socket
01:56:00.320 [29981] <16> finishTarImage: ERR - write_LF_END_tarHeader() failed.
01:56:00.320 [29981] <16> VxBSAEndData: ERR - EndData unable to bsa_finishTarImage().
01:56:00.320 [29981] <16> xbsa_EndData: ERR - VxBSAEndData: Failed with error:
Server Status: Communication with the server has not been initiated or the server status has not been retrieved from the serve
01:56:00.320 [29981] <16> int_CloseImage: ERR - Failed to process backup file <bk_2911_1_868757927>
01-12-2015 10:55 AM
If I understand correctly this backup is the only backup on this host.
01-12-2015 11:13 AM
Please attach debug text as a file next time. The post was held back by the anti spam system.
Best Regrads
Nicolai
01-12-2015 08:31 PM
Is this an intermittent of constant failure?
01-13-2015 12:04 AM
Lets go back to basics and troubleshoot as if this is a new setup.
Is master also the media server?
Which OS and NBU version on the client?
Is there a firewall between server(s) and clients?
If so, ensure port 1556 is open in all directions.
If Windows client, is Windows firewall disabled?
Any security software that will prevent port connectivity or connect-back? (Check AV settings)
Are NetBackup Client services and Symantec Pivate Branch Exchange service running on the client?
Test port connectivity and hostname lookup as follows: