03-15-2016 06:13 AM
Hi Team,
We are seeing MSSQL and Oracle backups fail with status code 24 on Linux media servers,
whereas the same backups work fine through UNIX media servers.
The backups fail immediately.
We get the errors below:
bpbkar
10:54:39.284 [9625] <16> flush_archive(): ERR - Cannot write to STDOUT. Errno = 110: Connection timed out
10:54:39.284 [9625] <16> bpbkar Exit: ERR - bpbkar FATAL exit status = 24: socket write failed
10:54:39.284 [9625] <4> bpbkar Exit: INF - EXIT STATUS 24: socket write failed
10:54:39.284 [9625] <2> bpbkar Exit: INF - Close of stdout complete
10:54:39.284 [9625] <4> bpbkar Exit: INF - setenv FINISHED=0
bpbrm
10:54:24.439 [24284] <2> bpbrm main: ADDED FILES TO DB FOR <clientname>_1457951888 1 /xxxxxx/xxxxx
10:54:39.284 [24284] <16> bpbrm main: from client <clientname>: ERR - Cannot write to STDOUT. Errno = 110: Connection timed out
10:54:39.284 [24284] <2> set_job_details: Tfile (6978384): LOG 1457952879 16 bpbrm 24284 from client <clientname>-b: ERR - Cannot write to STDOUT. Errno = 110: Connection timed out
bptestbpcd and bpclntcmd output looks fine.
Could you please help identify the issue?
Thanks in advance.
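When reading excerpts like the ones above, it can help to pull out only the error-level lines. In these logs the ERR entries carry the <16> severity tag, so a minimal sketch is a fixed-string grep. The function name errors_only is hypothetical, and the log file path is whatever your bpbkar or bpbrm debug log happens to be.

```shell
#!/bin/sh
# Print only the lines tagged <16> (the ERR entries in the excerpts above)
# from a NetBackup debug log passed as the first argument.
errors_only() {
    grep -F '<16>' "$1"
}

# Example (the path is an assumption; use your own log file):
#   errors_only /path/to/bpbkar/log.file
```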
03-15-2016 07:47 AM
From the Linux media servers, run the command
/usr/openv/netbackup/bin/admincmd/bptestbpcd -client xxx.acme.com -debug -verbose
Show us what the command returns.
Also try adding CLIENT_READ_TIMEOUT = 1800 to bp.conf on the media servers.
What about file system backups? Do they run on the Linux media servers?
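To confirm what each media server actually has configured, something like the sketch below could be used. It assumes bp.conf lives at the usual /usr/openv/netbackup/bp.conf location and uses the "KEY = value" layout; get_bpconf_value is a hypothetical helper name, not a NetBackup command.

```shell
#!/bin/sh
# Print the value of a bp.conf entry ("KEY = value"); prints nothing if absent.
# The second argument defaults to the usual UNIX/Linux bp.conf location.
get_bpconf_value() {
    key=$1
    file=${2:-/usr/openv/netbackup/bp.conf}
    # If the key appears more than once, report the last occurrence.
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$file" | tail -n 1
}

# Example (run on each Linux media server):
#   get_bpconf_value CLIENT_READ_TIMEOUT
```

An empty result means the entry is not present in the file, so the default timeout would apply.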
03-15-2016 09:51 AM
Other file system backups are working fine. I also tested a file system backup from the SQL host, and it completed successfully.
CLIENT_READ_TIMEOUT = 1800 is already set in bp.conf on the media servers.
Below are the logs:
bptestbpcd <client name> -verbose -debug:
16:46:25.993 [24841] <2> bptestbpcd: VERBOSE = 0
16:46:25.993 [24841] <2> ConnectionCache::connectAndCache: Acquiring new connection for host <master server>, query type 223
16:46:26.007 [24841] <2> vnet_pbxConnect: pbxConnectEx Succeeded
16:46:26.007 [24841] <2> logconnections: BPDBM CONNECT FROM 10.84.128.24.37006 TO 10.84.128.8.1556 fd = 3
16:46:26.028 [24841] <2> db_CLIENTsend: reset client protocol version from 0 to 7
16:46:26.103 [24841] <2> db_end: Need to collect reply
16:46:26.103 [24841] <2> db_freeEXDB_INFO: ?
16:46:26.103 [24841] <2> file_to_addrinfo: ../../libvlibs/vnet_addrinfo.c.6635: 0: fopen() failed: 2 0x00000002
16:46:26.103 [24841] <2> file_to_addrinfo: ../../libvlibs/vnet_addrinfo.c.6636: 0: fopen() failed: /usr/openv/var/host_cache/127/dbcc7327+veritas_pbx,1,20,2,1,0+lonwv21260ms01-b.txt
16:46:26.108 [24841] <2> file_to_addrinfo: ../../libvlibs/vnet_addrinfo.c.6635: 0: fopen() failed: 2 0x00000002
16:46:26.108 [24841] <2> file_to_addrinfo: ../../libvlibs/vnet_addrinfo.c.6636: 0: fopen() failed: /usr/openv/var/host_cache/0e4/58ce2ee4+0,1,402,0,1,0+10.116.68.77.txt
16:46:26.108 [24841] <2> file_to_addrinfo: ../../libvlibs/vnet_addrinfo.c.6635: 0: fopen() failed: 2 0x00000002
16:46:26.108 [24841] <2> file_to_addrinfo: ../../libvlibs/vnet_addrinfo.c.6636: 0: fopen() failed: /usr/openv/var/host_cache/127/dbcc7327+vnetd,1,20,2,1,0+lonwv21260ms01-b.txt
16:46:26.113 [24841] <2> file_to_addrinfo: ../../libvlibs/vnet_addrinfo.c.6635: 0: fopen() failed: 2 0x00000002
16:46:26.113 [24841] <2> file_to_addrinfo: ../../libvlibs/vnet_addrinfo.c.6636: 0: fopen() failed: /usr/openv/var/host_cache/127/dbcc7327+bpcd,1,20,2,1,0+lonwv21260ms01-b.txt
16:46:26.123 [24841] <2> vnet_pbxConnect: pbxConnectEx Succeeded
16:46:26.123 [24841] <2> logconnections: BPCD CONNECT FROM 10.84.128.24.43989 TO 10.116.68.77.1556 fd = 3
16:46:26.123 [24841] <2> file_to_addrinfo: ../../libvlibs/vnet_addrinfo.c.6635: 0: fopen() failed: 2 0x00000002
16:46:26.123 [24841] <2> file_to_addrinfo: ../../libvlibs/vnet_addrinfo.c.6636: 0: fopen() failed: /usr/openv/var/host_cache/0e4/58ce2ee4+veritas_pbx,1,20,2,1,0+10.116.68.77.txt
16:46:26.126 [24841] <2> file_to_addrinfo: ../../libvlibs/vnet_addrinfo.c.6635: 0: fopen() failed: 2 0x00000002
16:46:26.126 [24841] <2> file_to_addrinfo: ../../libvlibs/vnet_addrinfo.c.6636: 0: fopen() failed: /usr/openv/var/host_cache/0e4/58ce2ee4+vnetd,1,20,2,1,0+10.116.68.77.txt
16:46:26.132 [24841] <2> vnet_pbxConnect: pbxConnectEx Succeeded
16:46:26.155 [24841] <2> do_pbx_service: ../../libvlibs/vnet_connect.c.1776: 0: via PBX: VNETD CONNECT FROM 10.84.128.24.36750 TO 10.116.68.77.1556 fd = 4
16:46:26.155 [24841] <2> vnet_vnetd_connect_forward_socket_begin: ../../libvlibs/vnet_vnetd.c.445: 0: VN_REQUEST_CONNECT_FORWARD_SOCKET: 10 0x0000000a
16:46:26.369 [24841] <2> vnet_vnetd_connect_forward_socket_begin: ../../libvlibs/vnet_vnetd.c.462: 0: ipc_string: 51303
1 1 1
10.84.128.24:43989 -> 10.116.68.77:1556
16:46:26.460 [24841] <2> file_to_addrinfo: ../../libvlibs/vnet_addrinfo.c.6635: 0: fopen() failed: 2 0x00000002
16:46:26.460 [24841] <2> file_to_addrinfo: ../../libvlibs/vnet_addrinfo.c.6636: 0: fopen() failed: /usr/openv/var/host_cache/1ff/ffffffff+vnetd,1,1,0,1,0+.txt
10.84.128.24:36750 -> 10.116.68.77:1556
16:46:26.468 [24841] <2> bpcr_get_peername_rqst: Server peername length = 23
16:46:26.470 [24841] <2> bpcr_get_hostname_rqst: Server hostname length = 14
16:46:26.472 [24841] <2> bpcr_get_clientname_rqst: Server client name length = 14
16:46:26.473 [24841] <2> bpcr_get_version_rqst: bpcd version: 07100000
16:46:26.474 [24841] <2> bpcr_get_platform_rqst: Server client platform length = 7
16:46:26.476 [24841] <2> bpcr_get_version_rqst: bpcd version: 07100000
16:46:26.518 [24841] <2> bpcr_patch_version_rqst: theRest == > <
16:46:26.519 [24841] <2> bpcr_get_version_rqst: bpcd version: 07100000
16:46:26.561 [24841] <2> bpcr_patch_version_rqst: theRest == > <
16:46:26.562 [24841] <2> bpcr_get_version_rqst: bpcd version: 07100000
PEER_NAME = <media server>.xxxxxx.com
HOST_NAME = <client host>
CLIENT_NAME = <client host>
VERSION = 0x07100000
PLATFORM = win_x64
PATCH_VERSION = 7.1.0.0
SERVER_PATCH_VERSION = 7.1.0.0
MASTER_SERVER = <master server>
EMM_SERVER = <master server>
16:46:26.582 [24841] <2> vnet_pbxConnect: pbxConnectEx Succeeded
16:46:26.609 [24841] <2> do_pbx_service: ../../libvlibs/vnet_connect.c.1776: 0: via PBX: VNETD CONNECT FROM 10.84.128.24.43174 TO 10.116.68.77.1556 fd = 5
16:46:26.609 [24841] <2> vnet_vnetd_connect_forward_socket_begin: ../../libvlibs/vnet_vnetd.c.445: 0: VN_REQUEST_CONNECT_FORWARD_SOCKET: 10 0x0000000a
16:46:26.816 [24841] <2> vnet_vnetd_connect_forward_socket_begin: ../../libvlibs/vnet_vnetd.c.462: 0: ipc_string: 51305
10.84.128.24:43174 -> 10.116.68.77:1556
<2>bptestbpcd: EXIT status = 0
16:46:26.878 [24841] <2> bptestbpcd: EXIT status = 0
03-15-2016 02:00 PM
Status 24 is really hard to troubleshoot. I would start by figuring out when the issue started and what changed at that time.
Have you checked TCP window scaling? https://www.veritas.com/support/en_US/article.TECH124766
What value is set in /usr/openv/netbackup/NET_BUFFER_SZ?
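Both values can be read directly on a Linux media server. The sketch below is only a convenience wrapper: show_setting is a hypothetical helper, /proc/sys/net/ipv4/tcp_window_scaling is the standard Linux location for the window scaling flag, and a missing NET_BUFFER_SZ file is reported as "(not set)".

```shell
#!/bin/sh
# Print "<label>: <value>", or "<label>: (not set)" when the file is missing.
show_setting() {
    label=$1
    file=$2
    if [ -r "$file" ]; then
        # Strip the trailing newline and any stray whitespace from the value.
        echo "$label: $(tr -d '[:space:]' < "$file")"
    else
        echo "$label: (not set)"
    fi
}

show_setting tcp_window_scaling /proc/sys/net/ipv4/tcp_window_scaling
show_setting NET_BUFFER_SZ /usr/openv/netbackup/NET_BUFFER_SZ
```

Running it on both a working UNIX media server (where the /proc path will not exist) and a failing Linux one at least shows whether NET_BUFFER_SZ differs between them.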
03-15-2016 11:26 PM
03-16-2016 01:41 AM
Thanks for the input. Previously we were backing up through UNIX media servers; we recently switched to Linux media servers. Since then we have been seeing intermittent status 24 failures for DB backups.
03-16-2016 02:14 AM
03-16-2016 02:24 AM
Changes in /etc/sysctl.conf take effect immediately when you run sysctl -p (no reboot required).
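One way to tell whether the file and the running kernel agree is to compare the two, roughly as sketched here. The helper names are hypothetical, the key net.ipv4.tcp_window_scaling is used only as an example, and the sketch assumes the plain "key = value" layout of /etc/sysctl.conf.

```shell
#!/bin/sh
# Map a sysctl key to its /proc/sys path, e.g.
# net.ipv4.tcp_window_scaling -> /proc/sys/net/ipv4/tcp_window_scaling
sysctl_proc_path() {
    echo "/proc/sys/$(printf '%s' "$1" | tr '.' '/')"
}

# Print the value configured for a key in a sysctl.conf-style file.
sysctl_conf_value() {
    file=${2:-/etc/sysctl.conf}
    # Escape the dots so they match literally in the sed pattern.
    key_re=$(printf '%s' "$1" | sed 's/\./\\./g')
    sed -n "s/^[[:space:]]*${key_re}[[:space:]]*=[[:space:]]*//p" "$file" | tail -n 1
}

# Example:
#   sysctl_conf_value net.ipv4.tcp_window_scaling          # value in the file
#   cat "$(sysctl_proc_path net.ipv4.tcp_window_scaling)"  # value in the kernel
#   sysctl -p                                              # as root, if they differ
```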
03-16-2016 02:45 AM
Do you use port bonding?
If so, which mode? LACP?
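On Linux the bonding driver reports its mode under /proc/net/bonding/, so the answer can be read from there. A minimal sketch, assuming the bond interface is named bond0 (yours may differ) and using a hypothetical helper name:

```shell
#!/bin/sh
# Print the bonding mode reported by the Linux bonding driver.
# The argument defaults to bond0, which is an assumed interface name.
bonding_mode() {
    file=${1:-/proc/net/bonding/bond0}
    sed -n 's/^Bonding Mode:[[:space:]]*//p' "$file"
}

# Example:
#   bonding_mode                          # uses /proc/net/bonding/bond0
#   bonding_mode /proc/net/bonding/bond1
```

An LACP bond shows up as an IEEE 802.3ad mode in that output.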