
Backup fails with error code 25 and is very slow

ankur1809
Level 5

The backup is failing with error code 25 (cannot connect on socket). Backups are also running very slowly: a job may run for 2 to 3 days and still end in failure.


NBU version: 7.6


09/11/2014 20:00:00 - Info nbjm (pid=49490) starting backup job (jobid=9549) for client hmsboifile, policy PROD_REMOTE_SERVERS, schedule DLY_DIFF
09/11/2014 20:00:00 - Info nbjm (pid=49490) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=9549, request id:{1EF05662-3A18-11E4-9A90-B667FB42B3B8})
09/11/2014 20:00:00 - requesting resource stu_disk_nbulv1
09/11/2014 20:00:00 - requesting resource nbulv1.NBU_CLIENT.MAXJOBS.hmsboifile
09/11/2014 20:00:00 - requesting resource nbulv1.NBU_POLICY.MAXJOBS.PROD_REMOTE_SERVERS
09/11/2014 20:00:02 - granted resource  nbulv1.NBU_CLIENT.MAXJOBS.hmsboifile
09/11/2014 20:00:02 - granted resource  nbulv1.NBU_POLICY.MAXJOBS.PROD_REMOTE_SERVERS
09/11/2014 20:00:02 - granted resource  MediaID=@aaaad;DiskVolume=PureDiskVolume;DiskPool=dp_disk_nbulv1;Path=PureDiskVolume;StorageServer=nbulv1;MediaServer=nbulv1
09/11/2014 20:00:02 - granted resource  stu_disk_nbulv1
09/11/2014 20:00:03 - estimated 0 kbytes needed
09/11/2014 20:00:03 - begin Parent Job
09/11/2014 20:00:03 - begin Stream Discovery: Start Notify Script
09/11/2014 20:00:03 - Info RUNCMD (pid=95057) started
09/11/2014 20:00:04 - Info RUNCMD (pid=95057) exiting with status: 0
Operation Status: 0
09/11/2014 20:00:04 - end Stream Discovery: Start Notify Script; elapsed time 0:00:01
09/11/2014 20:00:04 - begin Stream Discovery: Stream Discovery
Operation Status: 25
09/11/2014 20:00:20 - end Stream Discovery: Stream Discovery; elapsed time 0:00:16
09/11/2014 20:00:20 - begin Stream Discovery: Stop On Error
Operation Status: 0
09/11/2014 20:00:20 - end Stream Discovery: Stop On Error; elapsed time 0:00:00
09/11/2014 20:00:20 - begin Stream Discovery: End Notify Script
09/11/2014 20:00:20 - Info RUNCMD (pid=95424) started
09/11/2014 20:00:20 - Info RUNCMD (pid=95424) exiting with status: 0
Operation Status: 0
09/11/2014 20:00:20 - end Stream Discovery: End Notify Script; elapsed time 0:00:00
Operation Status: 25
09/11/2014 20:00:20 - end Parent Job; elapsed time 0:00:17
cannot connect on socket  (25)
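When reading job details like the ones above, it can help to pull out just the steps that ended with a non-zero status. The following is a minimal Python sketch, not a NetBackup tool. It assumes the layout seen in this post, where each "Operation Status: N" line is followed by the "end ..." line of the step it belongs to. The function name `failed_steps` is made up for illustration.

```python
import re
from typing import List, Tuple

# Patterns based only on the job-detail layout shown in this post (an assumption).
STATUS_RE = re.compile(r"^Operation Status: (?P<code>\d+)")
END_RE = re.compile(r"- end (?P<step>.+?); elapsed time")


def failed_steps(job_details: str) -> List[Tuple[str, int]]:
    """Return (step name, status) for every step that ended with a non-zero status."""
    failures = []
    pending = None  # most recent status not yet matched to an "end" line
    for raw in job_details.splitlines():
        line = raw.strip()
        status = STATUS_RE.match(line)
        if status:
            pending = int(status.group("code"))
            continue
        end = END_RE.search(line)
        if end and pending is not None:
            if pending != 0:
                failures.append((end.group("step"), pending))
            pending = None
    return failures
```

Applied to the job details in this post, it reports status 25 for "Stream Discovery: Stream Discovery" and for "Parent Job", which shows the failure happens during stream discovery, before any data is moved.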

 

17 REPLIES

Marianne
Moderator
Partner    VIP    Accredited Certified

Well, this backup attempt failed within 20 seconds because of a connection issue.

Which OS and NBU version on the client?
Which NBU version on master and/or media server?

Is OS firewall disabled on the client?

Is there a firewall between the client and master and/or media server?

Have you ensured that ports 1556 and 13724 are open between the client and the server(s)?

Your policy indicates that the client(s) may be in a remote location.
Please tell us more about this setup.
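For the port question above, a plain TCP connect test (the same thing a telnet test does) can be scripted. This is a minimal Python sketch under stated assumptions: the helper names `port_open` and `check_ports` are made up for illustration, and a successful connect only shows that the port is reachable, not that the NetBackup service behind it is healthy.

```python
import socket


def port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_ports(host: str, ports=(1556, 13724), timeout: float = 5.0) -> dict:
    """Check the two ports mentioned in this thread; returns {port: reachable}."""
    return {port: port_open(host, port, timeout) for port in ports}


# Example (run from the master or media server; client name taken from the job log):
# print(check_ports("hmsboifile"))
```

It is worth running the check in both directions, from the server(s) to the client and from the client back to the server(s).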

ankur1809
Level 5
Hi,
OS: Windows 2008 R2
NBU version on client: 7.6.0.1
Master server version: 7.6.0.1
The NBU service automatically gets stuck or stopped.
Yes, the firewall is already disabled.
The telnet test to ports 1556 and 13724 is successful.
The master server for these particular clients is at a remote location, but this is not an AIR-style backup. It is a normal backup written to a disk pool on the same site.

Marianne
Moderator
Partner    VIP    Accredited Certified

NBU Service automatically gets stuck or stopped.

 

Which NBU service? 
I guess you are referring to the client?

If NBU services are getting stopped, you need to find out what is stopping them.

Look for security and/or AV software that may be stopping services.
McAfee is normally a 'suspect'.

Check Event Viewer System and Application log for errors.

ankur1809
Level 5
Yes, the NetBackup Client Service on the client. It may be due to slowness or a hung state. As of now the services are up, so let me re-run the backups and I will share the next result.

Marianne
Moderator
Partner    VIP    Accredited Certified

I have never seen NBU services stopping for these reasons (slowness or a hung state).
The bpbkar process will terminate when a backup fails due to a timeout.

The status 25 means that connection to NBU Client service (via PBX) has failed.

It is normally security or AV software killing NBU services.

INT_RND
Level 6
Employee Accredited

I think you should perform network testing between the media server and the client. You might have a flaky link along the path or a network device that is subject to heavy congestion. There are tools that can track the number of dropped packets and report connection quality statistics. You might also want to perform some NIC tuning along the path.
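As a rough first look before bringing in dedicated network tools, repeated TCP connects can give a feel for how stable the path is. The sketch below is Python and only an approximation: it counts failed connects and connect latency, which is a proxy for link quality and not a packet-loss measurement. The function name `sample_connects` and its parameters are made up for illustration.

```python
import socket
import statistics
import time


def sample_connects(host, port, attempts=20, timeout=5.0, pause=0.5):
    """Open a TCP connection `attempts` times; report failures and latency in ms."""
    latencies = []
    failures = 0
    for _ in range(attempts):
        start = time.monotonic()
        try:
            with socket.create_connection((host, port), timeout=timeout):
                latencies.append((time.monotonic() - start) * 1000.0)
        except OSError:
            failures += 1
        time.sleep(pause)  # spread the samples out over time
    return {
        "attempts": attempts,
        "failures": failures,
        "min_ms": min(latencies) if latencies else None,
        "median_ms": statistics.median(latencies) if latencies else None,
        "max_ms": max(latencies) if latencies else None,
    }


# Example (media server to client, port 1556 as mentioned earlier in the thread):
# print(sample_connects("hmsboifile", 1556))
```

Occasional failures or a large gap between the median and maximum latency would support the flaky-link theory and give the network team something concrete to trace.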

ankur1809
Level 5
Can you please guide me on how to proceed, because the backup is still failing? In a few streams, a few KB of data are written and then the job fails.

Marianne
Moderator
Partner    VIP    Accredited Certified

You need to help us to help you.

The only information you have provided thus far is initial connection failure.

Connection failure and slow backups are 2 different issues. 

Please show us the details of a job where you see some data being written before the job fails.

Logs needed to troubleshoot:

On media server: bptm and bpbrm

On client: bpbkar and bpcd.

Create the log folders under ...\netbackup\logs on media server and client.
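Creating the folders by hand is quick, but if several hosts are involved it can be scripted. This is a minimal Python sketch, assuming the folder names are exactly the four process names listed above. The logs root is passed in by the caller because the full install path is not given in this thread, and the function name `create_log_dirs` is made up for illustration.

```python
from pathlib import Path

# Folder names taken from the list of logs requested above.
MEDIA_SERVER_LOGS = ("bptm", "bpbrm")
CLIENT_LOGS = ("bpbkar", "bpcd")


def create_log_dirs(logs_root, names):
    """Create one folder per process name under an existing netbackup\\logs folder."""
    root = Path(logs_root)
    if not root.is_dir():
        # Refuse to guess: a missing root usually means the wrong path was given.
        raise FileNotFoundError(f"logs folder not found: {root}")
    created = []
    for name in names:
        folder = root / name
        folder.mkdir(exist_ok=True)  # no error if the folder is already there
        created.append(folder)
    return created


# Example: on the media server, pass its own logs folder and MEDIA_SERVER_LOGS;
# on the client, pass the client's logs folder and CLIENT_LOGS.
```

The sketch deliberately fails when the logs root does not exist, so that a mistyped path does not silently create folders in the wrong place.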

ankur1809
Level 5
Please see the data being written while the backups are still running; eventually this leads to failure of the backup. I will send you the required logs as well.

ankur1809
Level 5

Hi, please find the attached logs and suggest what to do.

Thanks in advance

RonCaplinger
Level 6

(ignore this, I was looking at the wrong files)

Did the backup retry multiple times? That's what it looks like in the bpcd log. Is there any indication of what might be happening to the client service in the Windows event logs on the client or the media server?

Mark_Solutions
Level 6
Partner Accredited Certified

This looks to be a timeout. From the description, it seems to be a remote office location using NBU De-Dupe and Accelerator that has yet to complete a good backup.

First, make sure that the client is set to use VSS for its backups (set under Master Server Host Properties - Client Attributes). Then set the client connect and client read timeouts on your media servers to 3600, and set that client as having a resilient network, as it clearly has issues with the remote connection (again under Master Server Host Properties).

See if those help you get that first backup.

mnolan
Level 6
Employee Accredited Certified

I would double-check the child job and post its detailed status here. The parent job finished in 20 seconds, but it was eventually marked with status 25 after the child presumably failed with that error. The message is not timestamped, which indicates it came from NBJM (it is not part of the preceding message).

ankur1809
Level 5
Mark, VSS is already enabled for the client. I have increased the client connect and client timeouts to 3600. Please see the attachment: several KB of data have been written, though the % of backup done still shows 0.

ankur1809
Level 5

Kindly suggest what else could be done.

INT_RND
Level 6
Employee Accredited

Did you perform network testing and optimization as I suggested? The problem you are having is a network problem.

Have you engaged the network administrator? 

Does this connection cross a WAN link?

What is the slowest link on the path?

ankur1809
Level 5
I have also involved the network team to trace dropped packets and check connectivity.