Socket Write Failed on RHEL 5 Client

Jason_James
Level 3

I've been troubleshooting backup errors on a RHEL 5 client for a while and am receiving error 24, socket write failed. Initially we were getting a status 57 error, but we remedied that by installing xinetd, starting the service, and then reinstalling the NBU client software. All the services that need to run, such as xinetd, bpcd, and bpjava-msvc, are running, and I don't really know where to go from here. Everything I've done immediately followed the initial install, so backups on this client have never worked. However, the same configuration on my workstation (iptables rules are correct as well) works fine.
ps -ef | grep xinetd
returns
root      3246     1  0 Mar18 ?        00:00:00 xinetd -stayalive -pidfile /var/run/xinetd.pid

and all xinetd based services that are necessary show as on from a chkconfig --list

At this point after checking iptables, checking the running processes, and ensuring that the services are turned on, I don't know where to look next, so any help would be appreciated.




1 ACCEPTED SOLUTION

Accepted Solutions

CY
Level 6
Certified
Take a look at this TechNote and see if it helps.

http://seer.entsupport.symantec.com/docs/337118.htm

14 REPLIES

rjrumfelt
Level 6
Have you looked through the myriad tech notes regarding socket errors?

Before getting into the technotes, you'll want to enable verbose 5 logging by adding VERBOSE = 5 to the bp.conf file, and creating the following directories on the client in question:

<install_path>/openv/netbackup/logs/bpcd
<install_path>/openv/netbackup/logs/bpbkar
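
As a rough sketch of the setup described above: this assumes a UNIX client installed under /usr/openv (the path that shows up later in this thread) with bp.conf at <install_path>/openv/netbackup/bp.conf. The NBU_ROOT variable and the function name are made up for illustration, so adjust them to your install and run as root.

```shell
#!/bin/sh
# Sketch only: NBU_ROOT and the function name are assumptions for illustration.
# /usr/openv is the usual UNIX location; change it if your install differs.
NBU_ROOT="${NBU_ROOT:-/usr/openv}"

enable_client_debug_logs() {
    root="$1"
    conf="$root/netbackup/bp.conf"

    # bpcd and bpbkar only write these debug logs if the directories exist.
    mkdir -p "$root/netbackup/logs/bpcd" "$root/netbackup/logs/bpbkar" || return 1

    # Append VERBOSE = 5 unless a VERBOSE entry is already present.
    if grep -q '^VERBOSE[[:space:]]*=' "$conf" 2>/dev/null; then
        echo "VERBOSE already set in $conf; edit it by hand" >&2
    else
        echo "VERBOSE = 5" >> "$conf"
    fi
}

# Usage (as root on the client):
#   enable_client_debug_logs "$NBU_ROOT"
```

Remember to lower or remove the VERBOSE entry once you have the logs you need, since level 5 logs grow quickly.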

If you have not checked the t/n's out yet, below are some good places to start:

http://seer.entsupport.symantec.com/docs/347002.htm

http://seer.entsupport.symantec.com/docs/336452.htm

As soon as you get some logs, you can paste snippets here to help us determine exactly where and why socket connections are giving you trouble.

Jason_James
Level 3
Thanks for the input

rjrumfelt
Level 6

Let us know the NBU and OS version of your master server.  Are there any media servers?

Run the following command and paste the output here, stripping out any IPs or hostnames, of course; just replace them with something that lets us know what is going on. (I'm assuming your master is a UNIX variant; if not, the same command exists on Windows master servers as well.)

<install_path>/openv/netbackup/bin/admincmd/bptestbpcd -client <client_hostname> -verbose -debug
The output might help us out.

Jason_James
Level 3

Master server is NBU 6.5, Red Hat 2.6, we've got 2 media servers.  As for the output from <install_path>/openv/netbackup/bin/admincmd/bptestbpcd -client <client_hostname> -verbose -debug:

sdbckmstr[/usr/openv/netbackup/bin/admincmd]sudo ./bptestbpcd -client xxxxxxxxxx -verbose -debug
16:26:59.439 [16523] <2> bptestbpcd: VERBOSE = 0
16:26:59.462 [16523] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2046: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
16:26:59.462 [16523] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2060: service: bpcd
16:26:59.502 [16523] <2> logconnections: BPCD CONNECT FROM xxx.xxx.xxx.xxx.xxxxx TO xxx.xxx.xxx.xxx13724
16:26:59.502 [16523] <2> vnet_connect_to_vnetd_extra: vnet_vnetd.c.180: msg: VNETD CONNECT FROM xxx.xxx.xxx.xxx.xxxxx TO xxx.xxx.xxx.xxx.13724 fd = 4
16:26:59.515 [16523] <2> vnet_vnetd_connect_forward_socket_begin: vnet_vnetd.c.533: VN_REQUEST_CONNECT_FORWARD_SOCKET: 10 0x0000000a
16:26:59.556 [16523] <2> vnet_vnetd_connect_forward_socket_begin: vnet_vnetd.c.550: ipc_string: /tmp/vnet-18656270499306868983000000000-a9KUEo
16:26:59.596 [16523] <2> put_long: (11) network write() error: Connection reset by peer (104); socket = 3
16:26:59.596 [16523] <2> bpcr_put_vnetd_forward_socket: put_string /tmp/vnet-18656270499306868983000000000-a9KUEo failed: 104
16:26:59.596 [16523] <2> local_bpcr_connect: bpcr_put_vnetd_forward_socket failed: 24
16:26:59.597 [16523] <2> ConnectToBPCD: bpcd_connect_and_verify(hostname, hostname) failed: 24
<16>bptestbpcd main: Function ConnectToBPCD(hostname) failed: 24
16:26:59.597 [16523] <16> bptestbpcd main: Function ConnectToBPCD(hostname) failed: 24
<2>bptestbpcd: socket write failed
16:26:59.597 [16523] <2> bptestbpcd: socket write failed
<2>bptestbpcd: EXIT status = 24
16:26:59.597 [16523] <2> bptestbpcd: EXIT status = 24
socket write failed
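
For anyone sifting through output like the bptestbpcd run above, here is a small sketch that pulls out the exit status and the lines where things went wrong. The function names are made up for illustration, and it assumes you saved the command's output to a file first. In the run above the first failure is the "Connection reset by peer (104)" on the network write, which is what surfaces as status 24.

```shell
#!/bin/sh
# Sketch only: the function names are made up for illustration.

# Print the last "EXIT status = N" value found in saved bptestbpcd output.
bptestbpcd_exit_status() {
    sed -n 's/.*EXIT status = \([0-9][0-9]*\).*/\1/p' "$1" | tail -n 1
}

# Print the lines that mention a failure or an error.
bptestbpcd_failures() {
    grep -E 'failed|error' "$1"
}

# Usage: save the output first, then inspect it, e.g.
#   ./bptestbpcd -client <client_hostname> -verbose -debug > /tmp/bptestbpcd.out 2>&1
#   bptestbpcd_exit_status /tmp/bptestbpcd.out
#   bptestbpcd_failures /tmp/bptestbpcd.out
```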


Also, both of our main NBU admins are out this week, so I, along with a couple of others, am helping out while they're gone. My knowledge in this field is very limited, so I really appreciate the help.

rjrumfelt
Level 6
some good logging after last night's backup run.  Can you parse through them (bpcd primarily for now) and see if you see any errors?

Also have you done the usual name resolution checks?

From the client:

bpclntcmd -pn

If you have a bprd log directory enabled on the master, the output of this command and any errors will be logged there.

bpclntcmd -self

bpclntcmd -hn <hostname_of_master>

bpclntcmd -ip <ip_of_master>

From the master server:

bpclntcmd -hn <hostname_of_client>

bpclntcmd -ip <ip_of_client>

You can paste the results of those commands here as well if they return any errors.

I know I had an issue last week where someone had switched around the nsswitch.conf file on one of my clients, which resulted in backup failures. I didn't get socket errors, but sometimes these socket errors are masked as other types of connection errors. Are you relying on DNS in this environment or on hosts files? Check the nsswitch.conf file and make sure it tells your client to consult the appropriate source first, either hosts files or DNS.
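
A quick way to see the lookup order mentioned above, as a minimal sketch: the function name is made up for illustration, and it just reads the "hosts:" line from /etc/nsswitch.conf (or whatever file you point it at).

```shell
#!/bin/sh
# Sketch only: the function name is made up for illustration.

# Print the lookup order from the "hosts:" line of an nsswitch.conf-style
# file, with comments and trailing whitespace stripped.
hosts_lookup_order() {
    file="${1:-/etc/nsswitch.conf}"
    sed -n 's/#.*//; s/[[:space:]]*$//; s/^[[:space:]]*hosts:[[:space:]]*//p' "$file" | head -n 1
}

# Usage:
#   hosts_lookup_order          # reads /etc/nsswitch.conf
#   getent hosts <hostname>     # resolves through that same order
```

If the output lists "files" before "dns", an entry in /etc/hosts wins over DNS for that name, so a stale hosts entry can cause trouble even when DNS is correct.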

fieldnmultistre
Level 3
Also make sure the bprd log directory exists on the master server. Try a test backup with verbose 5 set on the master server, but be careful: the logs grow quickly, so turn the level back down after the test.

Jason_James
Level 3

and if no joy there, I'll post some snippets of the commands that were suggested. Also, I'm not sure if this has any bearing, but we had an issue with DNS where the IP for the client in question was resolving to the wrong hostname. Our network technicians resolved the issue, and everything seems to be in order, but I'm wondering if that has anything to do with this.
I can't get to the client until after noon, but as soon as I can, I'll get all of the information that is needed to troubleshoot this further.

rjrumfelt
Level 6

Are you using your hosts file at all?

Will_Restore
Level 6
Probably. NetBackup is very picky about name resolution, both forward and reverse.
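
To check forward and reverse resolution outside of NetBackup, something like the following sketch may help. It assumes getent is available (it is on Linux boxes such as the RHEL client here), and the function names are made up for illustration. It is not a substitute for the bpclntcmd checks, just a second opinion from the OS resolver.

```shell
#!/bin/sh
# Sketch only: function names are made up; getent is assumed to be available.

# Compare two host names case-insensitively, ignoring a trailing dot.
same_host() {
    a=$(printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/\.$//')
    b=$(printf '%s\n' "$2" | tr '[:upper:]' '[:lower:]' | sed 's/\.$//')
    [ "$a" = "$b" ]
}

# Resolve name -> IP -> name and report whether the round trip matches.
check_round_trip() {
    name="$1"
    ip=$(getent hosts "$name" | awk 'NR==1 {print $1}')
    [ -n "$ip" ] || { echo "forward lookup failed for $name"; return 1; }
    back=$(getent hosts "$ip" | awk 'NR==1 {print $2}')
    [ -n "$back" ] || { echo "reverse lookup failed for $ip"; return 1; }
    if same_host "$name" "$back"; then
        echo "OK: $name -> $ip -> $back"
    else
        echo "MISMATCH: $name -> $ip -> $back"
        return 1
    fi
}

# Usage (run on the client for the master's name, and on the master for the client's):
#   check_round_trip <hostname>
```

Pass the name in the same form NetBackup is configured with: a short name that reverses to a fully qualified name will be reported as a mismatch, which is worth knowing about anyway.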

rjrumfelt
Level 6
will help determine if there is an issue with forward/reverse name resolution

Jason_James
Level 3
It's all DNS

Jason_James
Level 3

I've disabled it for the time being and am getting a full backup.  I appreciate all of the input.

rjrumfelt
Level 6
as the solution.