cancel
Showing results for 
Search instead for 
Did you mean: 

Multistream backup failed error code 25

Sivaranjani
Level 4
Partner

Netbackup version : 7.1.0.4

Master OS : windows 2008

 

Backup is failing with error code 25 (cannot connect to socket) for windows client [Multistream enabled].

The connectivity between the master and client is OK.

I have tried the backup of the same client without Multistream and it completed successfully.

 

Why the backup is failing??

 

 

 

 

9 REPLIES 9

RamNagalla
Moderator
Moderator
Partner    VIP    Certified

hi

it seems there is a reverse communication issue.that is from client to Master server.

what is the netbackup and OS versions of client.

and give the output of bpclntcmd -pn 

bpcltcmd -self from the client.. 

and also you need to enable the bpfis logs in the client and need to look for error message. provide the log also., if log dir is not avaliable, please create it and trigger the backup.

installpath\veritas\netbackup/logs/bpfis.

Andy_Welburn
Level 6

I would doubt it's a config issue if it works one way but not the other....

 

What's the OS of the client?

Is it purely load on the client when multi-streaming is enabled?

Can you check resource utlization during a multistreamed backup?

i.e. instead of the single job utilising x% of resources you now have 'y' jobs using ~y*x% of resources.

 

You could leave multiple data streams enabled but reduce max jobs per client (or policy) to see if this alleviates the issue - this could essentially be the same as switching multiple data streams off tho'

Sivaranjani
Level 4
Partner

Client Netbackup version : 7.1.0.4

OS: windows 2003

 

During the backup the cpu utilization is below 50% and memory utilization is around 50%

there is no much load on the client.

 

I have attached the bpfis logs from client.

 

Failed job logs.
 


13/12/2012 4:19:05 PM - Info nbjm(pid=5080) starting backup job (jobid=214265) for client HSLCP2, policy HSLCP2, schedule CC-CP_Wed  
13/12/2012 4:19:05 PM - estimated 0 Kbytes needed
13/12/2012 4:19:05 PM - Info nbjm(pid=5080) started backup job for client HSLCP2, policy HSLCP2, schedule CC-CP_Wed on storage unit symantec-hcart-robot-tld-0
13/12/2012 4:19:27 PM - Info bpbrm(pid=2556) HSLCP2 is the host to backup data from    
13/12/2012 4:19:27 PM - Info bpbrm(pid=2556) telling media manager to start backup on client    
13/12/2012 4:19:27 PM - Info bptm(pid=8092) using 65536 data buffer size        
13/12/2012 4:19:27 PM - Info bptm(pid=8092) using 256 data buffers        
13/12/2012 4:21:32 PM - Info bpbrm(pid=10096) sending bpsched msg: CONNECTING TO CLIENT FOR HSLCP2_1355395745    
13/12/2012 4:21:32 PM - connecting
13/12/2012 4:22:14 PM - Error bpbrm(pid=10096) cannot create data socket, The operation completed successfully.  (0)  
13/12/2012 4:23:18 PM - Info bpbrm(pid=2556) child done, status 25        
13/12/2012 4:23:18 PM - Info bpbrm(pid=2556) sending message to media manager: STOP BACKUP HSLCP2_1355395745    
13/12/2012 4:23:19 PM - Info bpbrm(pid=2556) media manager for backup id HSLCP2_1355395745 exited with status 150: termination requested by administrator
13/12/2012 4:23:19 PM - end writing
cannot connect on socket(25)
================================================================================================================================
13/12/2012 4:19:05 PM - Info nbjm(pid=5080) starting backup job (jobid=214266) for client HSLCP2, policy HSLCP2, schedule CC-CP_Wed  
13/12/2012 4:19:05 PM - Info nbjm(pid=5080) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=214266, request id:{1B990197-4962-48DB-87E2-1F79A4C1848B})  
13/12/2012 4:19:05 PM - requesting resource symantec-hcart-robot-tld-0
13/12/2012 4:19:05 PM - requesting resource symantec.NBU_CLIENT.MAXJOBS.HSLCP2
13/12/2012 4:19:05 PM - requesting resource symantec.NBU_POLICY.MAXJOBS.HSLCP2
13/12/2012 4:19:05 PM - granted resource symantec.NBU_CLIENT.MAXJOBS.HSLCP2
13/12/2012 4:19:05 PM - granted resource symantec.NBU_POLICY.MAXJOBS.HSLCP2
13/12/2012 4:19:05 PM - granted resource DC2WD2
13/12/2012 4:19:05 PM - granted resource IBM.ULTRIUM-TD4.007
13/12/2012 4:19:05 PM - granted resource symantec-hcart-robot-tld-0
13/12/2012 4:19:06 PM - estimated 0 Kbytes needed
13/12/2012 4:19:06 PM - Info nbjm(pid=5080) started backup job for client HSLCP2, policy HSLCP2, schedule CC-CP_Wed on storage unit symantec-hcart-robot-tld-0
13/12/2012 4:21:32 PM - Info bpbrm(pid=2556) HSLCP2 is the host to backup data from    
13/12/2012 4:21:32 PM - Info bpbrm(pid=2556) telling media manager to start backup on client    
13/12/2012 4:21:32 PM - Info bptm(pid=8092) using 65536 data buffer size        
13/12/2012 4:21:32 PM - Info bptm(pid=8092) using 256 data buffers        
13/12/2012 4:23:41 PM - Info bpbrm(pid=9272) sending bpsched msg: CONNECTING TO CLIENT FOR HSLCP2_1355395746    
13/12/2012 4:23:41 PM - connecting
13/12/2012 4:24:22 PM - Error bpbrm(pid=9272) cannot create data socket, The operation completed successfully.  (0)  
13/12/2012 4:25:46 PM - Info bpbrm(pid=2556) child done, status 25        
13/12/2012 4:25:46 PM - Info bpbrm(pid=2556) sending message to media manager: STOP BACKUP HSLCP2_1355395746    
13/12/2012 4:25:47 PM - Info bpbrm(pid=2556) media manager for backup id HSLCP2_1355395746 exited with status 150: termination requested by administrator
13/12/2012 4:25:47 PM - end writing
cannot connect on socket(25)
=====================================================================================================================================
3/12/2012 4:19:05 PM - Info nbjm(pid=5080) starting backup job (jobid=214267) for client HSLCP2, policy HSLCP2, schedule CC-CP_Wed  
13/12/2012 4:19:05 PM - Info nbjm(pid=5080) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=214267, request id:{A4B0AD11-E6F4-4C74-9D6E-8180EA355058})  
13/12/2012 4:19:05 PM - requesting resource symantec-hcart-robot-tld-0
13/12/2012 4:19:05 PM - requesting resource symantec.NBU_CLIENT.MAXJOBS.HSLCP2
13/12/2012 4:19:05 PM - requesting resource symantec.NBU_POLICY.MAXJOBS.HSLCP2
13/12/2012 4:19:05 PM - granted resource symantec.NBU_CLIENT.MAXJOBS.HSLCP2
13/12/2012 4:19:05 PM - granted resource symantec.NBU_POLICY.MAXJOBS.HSLCP2
13/12/2012 4:19:05 PM - granted resource DC2WD2
13/12/2012 4:19:05 PM - granted resource IBM.ULTRIUM-TD4.007
13/12/2012 4:19:05 PM - granted resource symantec-hcart-robot-tld-0
13/12/2012 4:19:07 PM - estimated 0 Kbytes needed
13/12/2012 4:19:07 PM - Info nbjm(pid=5080) started backup job for client HSLCP2, policy HSLCP2, schedule CC-CP_Wed on storage unit symantec-hcart-robot-tld-0
13/12/2012 4:24:03 PM - Info bpbrm(pid=2556) HSLCP2 is the host to backup data from    
13/12/2012 4:24:03 PM - Info bpbrm(pid=2556) telling media manager to start backup on client    
13/12/2012 4:24:04 PM - Info bptm(pid=8092) using 65536 data buffer size        
13/12/2012 4:24:04 PM - Info bptm(pid=8092) using 256 data buffers        
13/12/2012 4:26:09 PM - Info bpbrm(pid=6288) sending bpsched msg: CONNECTING TO CLIENT FOR HSLCP2_1355395747    
13/12/2012 4:26:09 PM - connecting
13/12/2012 4:26:50 PM - Error bpbrm(pid=6288) cannot create data socket, The operation completed successfully.  (0)  
13/12/2012 4:28:15 PM - Info bpbrm(pid=2556) child done, status 25        
13/12/2012 4:28:15 PM - Info bpbrm(pid=2556) sending message to media manager: STOP BACKUP HSLCP2_1355395747    
13/12/2012 4:28:16 PM - Info bpbrm(pid=2556) media manager for backup id HSLCP2_1355395747 exited with status 150: termination requested by administrator
13/12/2012 4:28:16 PM - end writing
cannot connect on socket(25)
===================================================================================================================================
13/12/2012 4:16:32 PM - Info nbjm(pid=5080) starting backup job (jobid=214263) for client HSLCP2, policy HSLCP2, schedule CC-CP_Wed  
13/12/2012 4:16:32 PM - Info nbjm(pid=5080) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=214263, request id:{C0B97312-C3E4-4688-8F99-76CA89051FD2})  
13/12/2012 4:16:32 PM - requesting resource symantec-hcart-robot-tld-0
13/12/2012 4:16:32 PM - requesting resource symantec.NBU_CLIENT.MAXJOBS.HSLCP2
13/12/2012 4:16:32 PM - requesting resource symantec.NBU_POLICY.MAXJOBS.HSLCP2
13/12/2012 4:16:32 PM - granted resource symantec.NBU_CLIENT.MAXJOBS.HSLCP2
13/12/2012 4:16:32 PM - granted resource symantec.NBU_POLICY.MAXJOBS.HSLCP2
13/12/2012 4:16:32 PM - granted resource DC2WD2
13/12/2012 4:16:32 PM - granted resource IBM.ULTRIUM-TD4.007
13/12/2012 4:16:32 PM - granted resource symantec-hcart-robot-tld-0
13/12/2012 4:16:32 PM - estimated 0 Kbytes needed
13/12/2012 4:16:32 PM - begin Parent Job
13/12/2012 4:16:32 PM - begin Snapshot, Start Notify Script
13/12/2012 4:16:32 PM - started
13/12/2012 4:16:33 PM - Info RUNCMD(pid=8552) started            
13/12/2012 4:16:33 PM - Info RUNCMD(pid=8552) exiting with status: 0        
Status 0
13/12/2012 4:16:33 PM - end Snapshot, Start Notify Script; elapsed time: 00:00:01
13/12/2012 4:16:33 PM - begin Snapshot, Step By Condition
Status 0
13/12/2012 4:16:33 PM - end Snapshot, Step By Condition; elapsed time: 00:00:00
13/12/2012 4:16:33 PM - begin Snapshot, Stream Discovery
Status 0
13/12/2012 4:16:33 PM - end Snapshot, Stream Discovery; elapsed time: 00:00:00
13/12/2012 4:16:33 PM - begin Snapshot, Read File List
Status 0
13/12/2012 4:16:33 PM - end Snapshot, Read File List; elapsed time: 00:00:00
13/12/2012 4:16:33 PM - begin Snapshot, Create Snapshot
13/12/2012 4:16:33 PM - started process bpbrm (9676)
13/12/2012 4:17:19 PM - Info bpbrm(pid=9676) HSLCP2 is the host to backup data from    
13/12/2012 4:17:19 PM - Info bpbrm(pid=9676) reading file list from client        
13/12/2012 4:17:19 PM - Info bpbrm(pid=9676) start bpfis on client        
13/12/2012 4:17:19 PM - begin Create Snapshot
13/12/2012 4:17:43 PM - Info bpfis(pid=928) Backup started          
13/12/2012 4:18:43 PM - Info bpfis(pid=928) done. status: 0          
13/12/2012 4:18:43 PM - end Create Snapshot; elapsed time: 00:01:24
13/12/2012 4:19:04 PM - end writing
Status 0
13/12/2012 4:19:04 PM - end Snapshot, Create Snapshot; elapsed time: 00:02:31
13/12/2012 4:19:04 PM - begin Snapshot, Policy Execution Manager Preprocessed
13/12/2012 4:19:09 PM - Info bpfis(pid=0) done. status: 0: the requested operation was successfully completed    
Status 25
13/12/2012 4:31:48 PM - end Snapshot, Policy Execution Manager Preprocessed; elapsed time: 00:12:44
13/12/2012 4:31:48 PM - begin Snapshot, Stop On Error
Status 0
13/12/2012 4:31:48 PM - end Snapshot, Stop On Error; elapsed time: 00:00:00
13/12/2012 4:31:48 PM - begin Snapshot, Delete Snapshot On Exit
13/12/2012 4:31:48 PM - begin Snapshot, Delete Snapshot On Exit
13/12/2012 4:31:48 PM - started process bpbrm (10100)
13/12/2012 4:33:29 PM - end writing
Status 0
13/12/2012 4:33:29 PM - end Snapshot, Delete Snapshot On Exit; elapsed time: 00:01:41
13/12/2012 4:33:29 PM - begin Snapshot, End Notify Script
13/12/2012 4:33:29 PM - Info RUNCMD(pid=10208) started            
13/12/2012 4:33:30 PM - Info RUNCMD(pid=10208) exiting with status: 0        
Status 0
13/12/2012 4:33:30 PM - end Snapshot, End Notify Script; elapsed time: 00:00:01
Status 25
13/12/2012 4:33:30 PM - end Snapshot, Delete Snapshot On Exit; elapsed time: 00:01:42
cannot connect on socket(25)

 

RamNagalla
Moderator
Moderator
Partner    VIP    Certified

hi,

what is bpclntcmd -pn  saying when you run this on client side..

Marianne
Level 6
Partner    VIP    Accredited Certified

Is Master server also the media server?

Do you have bpcd log on the client?
Can you see that bpcd log is written to when backup fails with status 25?

RamNagalla
Moderator
Moderator
Partner    VIP    Certified

hi ,

below is the Critical message from bpfis log that you have attached... 

16:34:49.937 [4736.4884] <32> fis_rebuild_from_db: FTL - cannot open C:\Program Files\Veritas\NetBackup\online_util\fi_cntl\bpfis.fim.HSLCP2_1355395592.0

 

below is the discussion 

https://www-secure.symantec.com/connect/forums/failed-snapshot

copied form above discussion

#####

 

19:22:53.559 [6052.5744] <32> fis_rebuild_from_db: cannot open C:\Program Files\Veritas\NetBackup\online_util\fi_cntl\bpfis.fim.Client.com_1311903706.0

... which compared to yours ...

18:06:42.368 [6632.8844] <32> fis_rebuild_from_db: FTL - cannot open C:\Program Files\Veritas\NetBackup\online_util\fi_cntl\bpfis.fim.tus3monclupib02_1340154395.0

... is very similar.

Here was the solution :

"I raised a case with Symantec, took the Vxms logs and found that, they receiving a null on some data index, it could be caused by corruption.

So, adviced to run chkdsk/defrag command in volume D drive.

After doing a chkdsk/defrag on the drive, the backups are happening and no failures till date."

##########

you can also check 

http://www.symantec.com/docs/TECH168824

and sometimes reboot of the client serves also would help.. 

 

hope this helps. 

Sivaranjani
Level 4
Partner

C:\Program Files\VERITAS\NetBackup\bin>Bpclntcmd.exe -pn
expecting response from server 10.60.0.120
hslcp2 hslcp2 124.30.56.102 1541

Sivaranjani
Level 4
Partner

I ahve attached the bpcd logs.

C:\Program Files\VERITAS\NetBackup\bin>Bpclntcmd.exe -pn
expecting response from server 10.60.0.120
hslcp2 hslcp2 124.30.56.102 1541

 

Marianne
Level 6
Partner    VIP    Accredited Certified

Errors in bpcd are similar to those seen in TN http://www.symantec.com/docs/TECH162303

 

Cause

Starting with NetBackup 7.0.1, server processes will attempt to communicate with client processes using the PBX port (1556) before attempting to utilize the vnetd port (13724) or the bpcd daemon port (13782).  If a firewall is present between the hosts and discards the connection request to PBX without responding, then the server process will wait 10 seconds before attempting to connect to vnetd or the daemon port.

Note that this delay will not occur;

·                             If a firewall is not present and PBX is not running on the remote client, because the TCP stack on the remote host will return a TCP RST immediately upon receipt of the TCP SYN for port 1556.

·                             If a firewall is present, and does not permit connection requests to port 1556, and returns a TCP RST to the originator when it blocks connection to the port.

Solution

Opening the PBX port (1556) through the firewall - bidirectional - between all NetBackup hosts will avoid the 10 second delay waiting for the initial connection to PBX to timeout.  Having the PBX port open is the long term direction of NetBackup and is required to use many of the new features such as SAN Client, Oracle Cloning, etc.

Other possible workarounds include:

A) Disabling multiplexing in the backup policy.  Because bptm is started much later in the process flow, there are fewer cumulative delays and the vnetd process is less likely to timeout.

B) Extending the vnetd timeout.  While this will prevent this particular failure, it will allow these delays to accumulate which may have undesirable effects on concurrent processes or later events in this process.  This should only be used of if other options are not available.