09-11-2014 11:46 PM
The backup is failing with status code 25 (cannot connect on socket). Backups are also running very slowly: a job may run for 2 to 3 days and still end in failure.
NBU version: 7.6
09/11/2014 20:00:00 - Info nbjm (pid=49490) starting backup job (jobid=9549) for client hmsboifile, policy PROD_REMOTE_SERVERS, schedule DLY_DIFF
09/11/2014 20:00:00 - Info nbjm (pid=49490) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=9549, request id:{1EF05662-3A18-11E4-9A90-B667FB42B3B8})
09/11/2014 20:00:00 - requesting resource stu_disk_nbulv1
09/11/2014 20:00:00 - requesting resource nbulv1.NBU_CLIENT.MAXJOBS.hmsboifile
09/11/2014 20:00:00 - requesting resource nbulv1.NBU_POLICY.MAXJOBS.PROD_REMOTE_SERVERS
09/11/2014 20:00:02 - granted resource nbulv1.NBU_CLIENT.MAXJOBS.hmsboifile
09/11/2014 20:00:02 - granted resource nbulv1.NBU_POLICY.MAXJOBS.PROD_REMOTE_SERVERS
09/11/2014 20:00:02 - granted resource MediaID=@aaaad;DiskVolume=PureDiskVolume;DiskPool=dp_disk_nbulv1;Path=PureDiskVolume;StorageServer=nbulv1;MediaServer=nbulv1
09/11/2014 20:00:02 - granted resource stu_disk_nbulv1
09/11/2014 20:00:03 - estimated 0 kbytes needed
09/11/2014 20:00:03 - begin Parent Job
09/11/2014 20:00:03 - begin Stream Discovery: Start Notify Script
09/11/2014 20:00:03 - Info RUNCMD (pid=95057) started
09/11/2014 20:00:04 - Info RUNCMD (pid=95057) exiting with status: 0
Operation Status: 0
09/11/2014 20:00:04 - end Stream Discovery: Start Notify Script; elapsed time 0:00:01
09/11/2014 20:00:04 - begin Stream Discovery: Stream Discovery
Operation Status: 25
09/11/2014 20:00:20 - end Stream Discovery: Stream Discovery; elapsed time 0:00:16
09/11/2014 20:00:20 - begin Stream Discovery: Stop On Error
Operation Status: 0
09/11/2014 20:00:20 - end Stream Discovery: Stop On Error; elapsed time 0:00:00
09/11/2014 20:00:20 - begin Stream Discovery: End Notify Script
09/11/2014 20:00:20 - Info RUNCMD (pid=95424) started
09/11/2014 20:00:20 - Info RUNCMD (pid=95424) exiting with status: 0
Operation Status: 0
09/11/2014 20:00:20 - end Stream Discovery: End Notify Script; elapsed time 0:00:00
Operation Status: 25
09/11/2014 20:00:20 - end Parent Job; elapsed time 0:00:17
cannot connect on socket (25)
09-12-2014 12:22 AM
Well, this backup attempt failed within 20 seconds because of a connection issue.
Which OS and NBU version on the client?
Which NBU version on master and/or media server?
Is OS firewall disabled on the client?
Is there a firewall between the client and master and/or media server?
Have you ensured that ports 1556 and 13724 are open between the client and the server(s)?
Your policy indicates that the client(s) may be in a remote location.
Please tell us more about this setup.
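A quick way to answer the port question above is to attempt a plain TCP connection to each port. The following is only a minimal sketch: the function name `check_ports` and the 5-second timeout are illustrative choices, not part of NetBackup. A successful connect shows the port is reachable at the TCP level; it does not prove that the NetBackup services behind it are healthy.

```python
import socket

# Ports named in the thread: 1556 (PBX) and 13724 (vnetd).
NBU_PORTS = (1556, 13724)


def check_ports(host, ports=NBU_PORTS, timeout=5.0):
    """Return {port: bool}: True if a TCP connect to host:port succeeded."""
    results = {}
    for port in ports:
        try:
            # create_connection resolves the name and performs the TCP handshake.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts, and name-resolution failures.
            results[port] = False
    return results
```

For example, `check_ports("hmsboifile")` run on the master or media server tests the direction server to client; running it on the client against the server name tests the reverse direction, which matters if a firewall only allows one of them.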
09-12-2014 01:12 AM
09-12-2014 01:31 AM
NBU Service automatically gets stuck or stopped.
Which NBU service?
I guess you are referring to the client?
If NBU services are getting stopped, you need to find out what is stopping them.
Look for security and/or AV software that may be stopping services.
McAfee is normally a 'suspect'.
Check Event Viewer System and Application log for errors.
09-12-2014 02:20 AM
09-12-2014 02:43 AM
I have never seen NBU services stopping for these reasons (slowness or hung status).
The bpbkar process will terminate when a backup fails due to a timeout.
Status 25 means that the connection to the NBU Client service (via PBX) has failed.
It is normally security or AV software killing NBU services.
09-12-2014 06:22 AM
I think you should perform network testing between the media server and the client. You might have a flaky link along the path or a network device that is subject to heavy congestion. There are tools that can track the number of dropped packets and report connection quality statistics. You might also want to perform some NIC tuning along the path.
09-15-2014 06:58 AM
09-15-2014 07:09 AM
You need to help us to help you.
The only information you have provided thus far is the initial connection failure.
Connection failure and slow backups are 2 different issues.
Please show us the details of a job where you see some data being written before the job fails.
Logs needed to troubleshoot:
On media server: bptm and bpbrm
On client: bpbkar and bpcd.
Create the log folders under ...\netbackup\logs on media server and client.
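If it helps, the folder creation can be scripted. This is a rough sketch only: `create_log_dirs` and the `role` keys are made-up names for illustration, and `logs_root` must be the full path to the `netbackup\logs` directory of your own install, which is not spelled out in this thread. It only creates the folders listed above; it does not change any logging verbosity settings.

```python
from pathlib import Path

# Log folders requested in the thread, grouped by host role.
LOG_DIRS = {
    "media": ("bptm", "bpbrm"),
    "client": ("bpbkar", "bpcd"),
}


def create_log_dirs(logs_root, role):
    """Create the per-process log folders under logs_root; return their paths."""
    root = Path(logs_root)
    created = []
    for name in LOG_DIRS[role]:
        path = root / name
        # exist_ok=True makes the call safe to repeat on a host
        # where some of the folders already exist.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Call it with `role="media"` on the media server and `role="client"` on the client, then rerun the backup so that the new folders capture the failing attempt.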
09-15-2014 07:49 AM
09-29-2014 07:53 AM
Hi, please find the attached logs and advise.
Thanks in advance
09-29-2014 08:42 AM
(ignore this, I was looking at the wrong files)
Did the backup retry multiple times? That's what it looks like in the bpcd log. Is there any indication of what might be happening to the client service in the Windows event logs on the client or the media server?
09-29-2014 08:55 AM
This looks like a timeout. From the description, it seems to be a remote office location using NBU De-Dupe and Accelerator that has yet to complete a good backup.
Firstly, make sure that the client is set to use VSS for its backups (set under Master Server Host Properties - Client Attributes). Then set the client connect and client read timeouts on your media servers to 3600, and set that client as having a resilient network, as it clearly has issues with the remote connection (again under Master Server Host Properties).
See if those help get that first backup.
09-29-2014 09:03 AM
I would double check the child job and post the detailed status here. The parent job finished in 20 seconds, but was eventually marked with status 25 after the child presumably failed with that error. The final message is not timestamped, indicating that it came from NBJM (it is not part of the message before it).
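When posting detailed status, it can help to pull out only the phases that ended with a non-zero status. Below is a small sketch that assumes the layout seen in the job details at the top of this thread, where each `Operation Status: N` line is immediately followed by the matching `end <phase>; elapsed time` line. The function name `failed_phases` is purely illustrative, and other NBU versions may format the details differently.

```python
import re

# "Operation Status: 25"
STATUS_RE = re.compile(r"^Operation Status: (?P<status>\d+)")
# "09/11/2014 20:00:20 - end Parent Job; elapsed time 0:00:17"
END_RE = re.compile(r"- end (?P<phase>.+?); elapsed time (?P<elapsed>[\d:]+)")


def failed_phases(lines):
    """Return (phase, status, elapsed) for every phase that ended non-zero."""
    failures = []
    pending = None  # status from the most recent "Operation Status" line
    for raw in lines:
        line = raw.strip()
        status_match = STATUS_RE.match(line)
        if status_match:
            pending = int(status_match.group("status"))
            continue
        end_match = END_RE.search(line)
        if end_match and pending is not None:
            if pending != 0:
                failures.append(
                    (end_match.group("phase"), pending, end_match.group("elapsed"))
                )
            pending = None  # each status line is paired with one "end" line
    return failures
```

Fed the job details from the first post, this reports the `Stream Discovery: Stream Discovery` step and the `Parent Job` itself, both with status 25, which is a reminder that the failure already shows up during stream discovery in the parent job.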
09-30-2014 12:52 AM
10-01-2014 05:55 AM
Kindly suggest what further could be done.
10-01-2014 06:25 AM
Did you perform network testing and optimization as I suggested? The problem you are having is a network problem.
Have you engaged the network administrator?
Does this connection cross a WAN link?
What is the slowest link on the path?
10-01-2014 06:43 AM