cancel
Showing results for 
Search instead for 
Did you mean: 

Error received from media manager layer

CoMurphy
Level 4

Hello,

 

We recently had a case open with Symantec about the error that I am pasting below. It was thought to be ghost tape drives at the time which we fixed. We had 3 more failures come out this weekend, and was wondering if anyone else has had this issue prior?

We recently upgraded from LTO-4 to LTO-6.

We within the last few months have gone from 7.1.0.4 to 7.6.0.3.

We have added (2) new media servers to our environment to bring us up to (4).

 

Any advice or hints would be welcome.

 

RMAN-00571: ===========================================================

RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============

RMAN-00571: ===========================================================

RMAN-03002: failure of backup plus archivelog command at 01/12/2015 01:56:02

ORA-27192: skgfcls: sbtclose2 returned error - failed to close file

ORA-19511: Error received from media manager layer, error text:

   Failed to process backup file <bk_2911_1_868757927>

ORA-19502: write error on file "bk_2911_1_868757927", block number 90785 (block size=8192)

ORA-27030: skgfwrt: sbtwrite2 returned error

ORA-19511: Error received from media manager layer, error text:

   VxBSASendData: Failed with error:

1 ACCEPTED SOLUTION

Accepted Solutions

RiaanBadenhorst
Level 6
Partner    VIP    Accredited Certified

Hi,

 

This seems like a comms issue, there are no errors from the bptm / bpbrm logs. They show that connection/backup was closed from the client side. In the client logs you see this. It was writing then timed out.

 

Is this a RAC or single Instance?

 

01:39:12.658 [29981] <2> int_WriteData: INF - writing buffer # 1 of size 262144
01:39:39.376 [29981] <2> int_WriteData: INF - writing buffer # 1000 of size 262144

01:40:07.354 [29981] <2> int_WriteData: INF - writing buffer # 2000 of size 262144
01:56:00.281 [29981] <16> writeToServer: ERR - send() to server on socket failed: Connection timed out (110)
01:56:00.281 [29981] <16> dbc_put: ERR - failed sending data to server
01:56:00.281 [29981] <16> VxBSASendData: ERR - Could not do a bsa_put().

01:56:00.281 [29981] <16> xbsa_SendData: ERR - VxBSASendData: Failed with error:
   Server Status:  Communication with the server has not been initiated or the server status has not been retrieved from the serve
01:56:00.320 [29981] <16> dbc_put: ERR - invalid handle received from the application
01:56:00.320 [29981] <16> VxBSASendData: ERR - Could not do a bsa_put().
01:56:00.320 [29981] <16> xbsa_SendData: ERR - VxBSASendData: Failed with error:
   Server Status:  Communication with the server has not been initiated or the server status has not been retrieved from the serve
01:56:00.320 [29981] <16> dbc_put: ERR - invalid handle received from the application
01:56:00.320 [29981] <16> VxBSASendData: ERR - Could not do a bsa_put().
01:56:00.320 [29981] <16> xbsa_SendData: ERR - VxBSASendData: Failed with error:
   Server Status:  Communication with the server has not been initiated or the server status has not been retrieved from the serve
01:56:00.320 [29981] <2> int_CloseImage: INF - Backup - closing <bk_2911_1_868757927>
01:56:00.320 [29981] <16> writeToServer: ERR - send() to server on socket failed: Bad file descriptor (9)
01:56:00.320 [29981] <16> write_LF_END_tarHeader: ERR - failed writing LF_END_U_LEN_FILE record on DATA socket
01:56:00.320 [29981] <16> finishTarImage: ERR - write_LF_END_tarHeader() failed.
01:56:00.320 [29981] <16> VxBSAEndData: ERR - EndData unable to bsa_finishTarImage().
01:56:00.320 [29981] <16> xbsa_EndData: ERR - VxBSAEndData: Failed with error:
   Server Status:  Communication with the server has not been initiated or the server status has not been retrieved from the serve
01:56:00.320 [29981] <16> int_CloseImage: ERR - Failed to process backup file <bk_2911_1_868757927>

View solution in original post

10 REPLIES 10

RiaanBadenhorst
Level 6
Partner    VIP    Accredited Certified

What does the bptm/bpbrm logs show on the media server?

Will_Restore
Level 6

And dbclient log on the oracle client.

CoMurphy
Level 4

Hello in the bptm log I am seeing a lot of no connection could be made even though the server never went down, nor did Netbackup

 

00:13:25.075 [4692.4960] <2> vnet_pbxConnect: pbxConnectEx Succeeded
00:13:25.075 [4692.4960] <2> do_pbx_service: [vnet_connect.c:2160] via PBX bpdbm CONNECT FROM 172.28.25.40.59762 TO 172.28.25.25.1556 fd = 728
00:13:25.075 [4692.4960] <2> async_connect: [vnet_connect.c:1691] connect async CONNECT FROM 172.28.25.40.59762 TO 172.28.25.25.1556 fd = 728
00:13:25.075 [4692.4960] <2> connect_to_service: connect succeeded STATUS (0) SUCCESS FROM 0.0.0.0 TO veritasarsenal 172.28.25.25 bpdbm VIA pbx
00:13:25.075 [4692.4960] <2> logconnections: BPDBM CONNECT FROM 172.28.25.40.59762 TO 172.28.25.25.1556 fd = 728
00:13:25.075 [4692.4960] <2> vnet_check_vxss_client_magic_with_info_ex: [vnet_vxss_helper.c:943] Ignoring VxSS authentication 2 0x2
00:13:25.278 [4692.4960] <2> db_CLIENTsend: reset client protocol version from 0 to 8
00:13:25.278 [4692.4960] <2> db_CLIENTsend: prec->offline=0
00:13:25.278 [4692.4960] <2> db_CLIENTsend: prec->online_at=0
00:13:25.278 [4692.4960] <2> db_CLIENTsend: prec->offline_at=0
00:13:25.278 [4692.4960] <2> db_CLIENTsend: sending 46 bytes
00:13:25.496 [4692.4960] <2> isDONE: reply DONE 227 <?xml version="1.0" encoding="utf-8"?><StatusMsgDoc><StatusMsg NBUStatus="227" Severity="2" VXULoid="271" VXULtid="227"><MsgString>no&#32;entity&#32;was&#32;found</MsgString></StatusMsg></StatusMsgDoc>
00:13:25.496 [4692.4960] <2> db_getCLIENT: db_CLIENTreceive: no entity was found 227
00:13:25.496 [4692.4960] <2> ConnectionCache::connectAndCache: Acquiring new connection for host veritasarsenal, query type 223
00:13:25.496 [4692.4960] <2> vnet_sortaddrs: [vnet_addrinfo.c:3856] sorted addrs: 1 0x1
00:13:25.496 [4692.4960] <2> vnet_get_pref_netconnection: [vnet_addrinfo.c:4754] using interface  ANY
00:13:25.496 [4692.4960] <2> vnet_sortaddrs: [vnet_addrinfo.c:3856] sorted addrs: 1 0x1
00:13:25.496 [4692.4960] <2> vnet_get_pref_netconnection: [vnet_addrinfo.c:4754] using interface  ANY
00:13:25.496 [4692.4960] <2> async_connect: [vnet_connect.c:1499] connect in progress 1 0x1
00:13:25.496 [4692.4960] <2> vnet_pbxConnect: pbxConnectEx Succeeded
00:13:25.496 [4692.4960] <2> do_pbx_service: [vnet_connect.c:2160] via PBX bpdbm CONNECT FROM 172.28.25.40.59763 TO 172.28.25.25.1556 fd = 732
00:13:25.496 [4692.4960] <2> async_connect: [vnet_connect.c:1691] connect async CONNECT FROM 172.28.25.40.59763 TO 172.28.25.25.1556 fd = 732
00:13:25.496 [4692.4960] <2> connect_to_service: connect succeeded STATUS (0) SUCCESS FROM 0.0.0.0 TO veritasarsenal 172.28.25.25 bpdbm VIA pbx
00:13:25.496 [4692.4960] <2> logconnections: BPDBM CONNECT FROM 172.28.25.40.59763 TO 172.28.25.25.1556 fd = 732
00:13:25.496 [4692.4960] <2> vnet_check_vxss_client_magic_with_info_ex: [vnet_vxss_helper.c:943] Ignoring VxSS authentication 2 0x2
00:13:25.730 [4692.4960] <2> db_CLIENTsend: reset client protocol version from 0 to 8
00:13:25.730 [4692.4960] <2> db_CLIENTsend: prec->offline=0
00:13:25.730 [4692.4960] <2> db_CLIENTsend: prec->online_at=0
00:13:25.730 [4692.4960] <2> db_CLIENTsend: prec->offline_at=0
00:13:25.730 [4692.4960] <2> db_CLIENTsend: sending 46 bytes
00:13:25.949 [4692.4960] <2> isDONE: reply DONE 227 <?xml version="1.0" encoding="utf-8"?><StatusMsgDoc><StatusMsg NBUStatus="227" Severity="2" VXULoid="271" VXULtid="227"><MsgString>no&#32;entity&#32;was&#32;found</MsgString></StatusMsg></StatusMsgDoc>
00:13:25.949 [4692.4960] <2> db_getCLIENT: db_CLIENTreceive: no entity was found 227
00:13:25.949 [4692.4960] <2> verify_client: ../bpbrm.c.41740: db_getCLIENT failed for CLIENT: db135
00:13:25.949 [4692.4960] <2> verify_client: ../bpbrm.c.41910: db_getCLIENT failed: 227 227 0x000000e3
00:13:25.949 [4692.4960] <2> bpbrm start_bpcd_stat: ../bpbrm.c.23199: bpcd_client: db135
00:13:25.949 [4692.4960] <2> bpbrm start_bpcd_stat: ../bpbrm.c.23202: bpcd_client_hostname: db135
00:13:25.949 [4692.4960] <2> local_bpcr_connect: bpcr.c.333: connect_opts = 0x01000100 connect_opts2 = 0x01010100
00:13:25.949 [4692.4960] <2> local_bpcr_connect: bpcr.c.340: connect_opts = 0x01000100
00:13:25.949 [4692.4960] <2> local_bpcr_connect: bpcr.c.379: daemon_port_type = 0
00:13:25.949 [4692.4960] <2> vnet_sortaddrs: [vnet_addrinfo.c:3856] sorted addrs: 1 0x1
00:13:25.949 [4692.4960] <2> vnet_get_pref_netconnection: [vnet_addrinfo.c:4754] using interface  ANY
00:13:25.949 [4692.4960] <2> vnet_sortaddrs: [vnet_addrinfo.c:3856] sorted addrs: 1 0x1
00:13:25.949 [4692.4960] <2> vnet_get_pref_netconnection: [vnet_addrinfo.c:4754] using interface  ANY
00:13:25.949 [4692.4960] <2> vnet_sortaddrs: [vnet_addrinfo.c:3856] sorted addrs: 1 0x1
00:13:25.964 [4692.4960] <2> vnet_get_pref_netconnection: [vnet_addrinfo.c:4754] using interface  ANY
00:13:25.964 [4692.4960] <2> async_connect: [vnet_connect.c:1499] connect in progress 1 0x1
00:13:26.963 [4692.4960] <8> async_connect: [vnet_connect.c:1700] getsockopt SO_ERROR returned 10061 0x274d
00:13:26.963 [4692.4960] <2> async_connect: [vnet_connect.c:1499] connect in progress 1 0x1
00:13:27.977 [4692.4960] <8> async_connect: [vnet_connect.c:1700] getsockopt SO_ERROR returned 10061 0x274d
00:13:27.977 [4692.4960] <2> async_connect: [vnet_connect.c:1499] connect in progress 1 0x1
00:13:27.992 [4692.4960] <2> async_connect: [vnet_connect.c:1691] connect async CONNECT FROM 172.28.25.40.785 TO 172.28.26.243.13782 fd = 728
00:13:27.992 [4692.4960] <2> connect_to_service: connect succeeded STATUS (0) SUCCESS
    status: RETRY, (42) CONNECT_REFUSED; system: (10061) No connection could be made because the target machine actively refused it. ; FROM 0.0.0.0 TO db135 172.28.26.243 bpcd VIA pbx
    status: RETRY, (42) CONNECT_REFUSED; system: (10061) No connection could be made because the target machine actively refused it. ; FROM 0.0.0.0 TO db135 172.28.26.243 bpcd VIA vnetd
    status: SUCCESS; FROM 0.0.0.0 TO db135 172.28.26.243 bpcd

CoMurphy
Level 4

I have attached both error logs, anything relating to db135 is related to the error I am seeing. I saw a lot of connection refused when looking through the log even though both were up.

CoMurphy
Level 4

Hello here is the output of the dbclient log attached.

RiaanBadenhorst
Level 6
Partner    VIP    Accredited Certified

Hi,

 

This seems like a comms issue, there are no errors from the bptm / bpbrm logs. They show that connection/backup was closed from the client side. In the client logs you see this. It was writing then timed out.

 

Is this a RAC or single Instance?

 

01:39:12.658 [29981] <2> int_WriteData: INF - writing buffer # 1 of size 262144
01:39:39.376 [29981] <2> int_WriteData: INF - writing buffer # 1000 of size 262144

01:40:07.354 [29981] <2> int_WriteData: INF - writing buffer # 2000 of size 262144
01:56:00.281 [29981] <16> writeToServer: ERR - send() to server on socket failed: Connection timed out (110)
01:56:00.281 [29981] <16> dbc_put: ERR - failed sending data to server
01:56:00.281 [29981] <16> VxBSASendData: ERR - Could not do a bsa_put().

01:56:00.281 [29981] <16> xbsa_SendData: ERR - VxBSASendData: Failed with error:
   Server Status:  Communication with the server has not been initiated or the server status has not been retrieved from the serve
01:56:00.320 [29981] <16> dbc_put: ERR - invalid handle received from the application
01:56:00.320 [29981] <16> VxBSASendData: ERR - Could not do a bsa_put().
01:56:00.320 [29981] <16> xbsa_SendData: ERR - VxBSASendData: Failed with error:
   Server Status:  Communication with the server has not been initiated or the server status has not been retrieved from the serve
01:56:00.320 [29981] <16> dbc_put: ERR - invalid handle received from the application
01:56:00.320 [29981] <16> VxBSASendData: ERR - Could not do a bsa_put().
01:56:00.320 [29981] <16> xbsa_SendData: ERR - VxBSASendData: Failed with error:
   Server Status:  Communication with the server has not been initiated or the server status has not been retrieved from the serve
01:56:00.320 [29981] <2> int_CloseImage: INF - Backup - closing <bk_2911_1_868757927>
01:56:00.320 [29981] <16> writeToServer: ERR - send() to server on socket failed: Bad file descriptor (9)
01:56:00.320 [29981] <16> write_LF_END_tarHeader: ERR - failed writing LF_END_U_LEN_FILE record on DATA socket
01:56:00.320 [29981] <16> finishTarImage: ERR - write_LF_END_tarHeader() failed.
01:56:00.320 [29981] <16> VxBSAEndData: ERR - EndData unable to bsa_finishTarImage().
01:56:00.320 [29981] <16> xbsa_EndData: ERR - VxBSAEndData: Failed with error:
   Server Status:  Communication with the server has not been initiated or the server status has not been retrieved from the serve
01:56:00.320 [29981] <16> int_CloseImage: ERR - Failed to process backup file <bk_2911_1_868757927>

CoMurphy
Level 4

If I understand correctly this backup is the only backup on this host.

Nicolai
Moderator
Moderator
Partner    VIP   

Please attach debug text as a file next time. The post was held back by the anti spam system.

Best Regrads

Nicolai

RiaanBadenhorst
Level 6
Partner    VIP    Accredited Certified

Is this an intermittent of constant failure?

Marianne
Level 6
Partner    VIP    Accredited Certified

Lets go back to basics and troubleshoot as if this is a new setup.

Is master also the media server?
Which OS and NBU version on the client?

Is there a firewall between server(s) and clients?
If so, ensure port 1556 is open in all directions.

If Windows client, is Windows firewall disabled?
Any security software that will prevent port connectivity or connect-back? (Check AV settings)

Are NetBackup Client services and Symantec Pivate Branch Exchange service running on the client?

Test port connectivity and hostname lookup as follows:

  1. Create bpcd log folder on client.
    Create bprd log folder on master if not already there followed by restart of NBU on the master.
  2. On master, run this command:
    bptestbpcd -client <client-name> -verbose -debug
    (Command is in ...netbackup\bin\admincmd)
    Check output for errors as well as client's bpcd log.
    If nothing is logged in bpcd, there is a firewall preventing connection.
    If there is separate media server, repeat above steps.
  3. On client, run this command:
    bpclntcmd -pn
    (Command is in ...netbackup\bin)
    Check output to see if master responds with client name.
    If not, check bprd on the master to see if client connection request was received. Look for IP address from client and how the master is resolving IP address to hostname.
    If nothing is logged, the client request did not get there.
    Check that client has correct hostname configured for Master server in NBU and OS level. Check at OS level on the client if Master server hostname is resolving to correct IP address.
    See : 

    How to verify name resolution for NetBackup (tm) systems, using the "bpclntcmd" command
    http://www.symantec.com/docs/TECH27430