
bptm error: get_exactly_n_bytes_or_eof_abs: read from socket failed: Connection timed out (145). Only with full backups (for the past week)

oui_mm
Level 2
Hello (my first post :)

Since last week we have had a strange issue with a small (<40 GB) Solaris backup that fails.

Environment:
Media server + client: same Solaris SPARC host (sol10)
Master server: Windows 2008
Storage node: Sepaton-based hardware
Reference time: 12:00:32

BPTM messages (I think at the end of the job, after sending about 80% of the data to the Sepaton storage node):

(bptm sending the last data to Sepaton ...)
11:39:58.056 [19855] <2> 6590726:bptm:19855:uxvenusc052: 1457347198.56163 :: SEPOST: stspi_write_image :: 1057 :: length: 262144, offset: 3670528
11:39:58.057 [19855] <2> 6590726:bptm:19855:uxvenusc052: 1457347198.57526 :: SEPOST: stspi_write_image :: 1091 :: return status: 0
11:44:31.895 [8821] <2> SetMaxDataLimit: maximum data size: current=-3 max=-3
11:44:31.896 [8821] <2> initialize: fd values STDOUTSOCK=4 STDERRSOCK=5
11:44:31.901 [8821] <2> bptm: INITIATING (VERBOSE = 5): -rptdrv -jobid -1450998735 -jm
11:44:31.901 [8821] <2> bptm: PORT_STATUS = 0x00000000
11:44:31.903 [8821] <2> main: Sending [EXIT STATUS 0] to NBJM
11:44:31.903 [8821] <2> bptm: EXITING with status 0 <----------
11:54:31.834 [17702] <2> SetMaxDataLimit: maximum data size: current=-3 max=-3
11:54:31.834 [17702] <2> initialize: fd values STDOUTSOCK=4 STDERRSOCK=5
11:54:31.839 [17702] <2> bptm: INITIATING (VERBOSE = 5): -rptdrv -jobid -1450998787 -jm
11:54:31.840 [17702] <2> bptm: PORT_STATUS = 0x00000000
11:54:31.840 [17702] <2> main: Sending [EXIT STATUS 0] to NBJM
11:54:31.841 [17702] <2> bptm: EXITING with status 0 <----------

(Then, for every full backup, always the same error, and this since last week:)
12:00:32.705 [19855] <2> get_exactly_n_bytes_or_eof_abs: read from socket failed: Connection timed out (145)
12:00:32.706 [19855] <2> set_job_details: Tfile (6590726): LOG 1457348432 16 bptm 19855 system call failed - Connection timed out (at bptm.c.27404)
12:00:32.706 [19855] <2> send_job_file: job ID 6590726, ftype = 3 msg len = 89, msg = LOG 1457348432 16 bptm 19855 system call failed - Connection timed out (at bptm.c.27404)
12:00:32.706 [19855] <2> ConnectionCache::connectAndCache: Acquiring new connection for host hictbrumzbu010, query type 1
12:00:32.710 [19855] <2> vnet_pbxConnect: pbxConnectEx Succeeded
12:00:32.710 [19855] <2> logconnections: BPDBM CONNECT FROM 10.25.13.2.49269 TO 10.35.10.38.1556 fd = 11
12:00:32.767 [19855] <2> db_end: Need to collect reply
12:00:32.783 [19855] <16> write_data_tir: system call failed - Connection timed out (at bptm.c.27404)
12:00:32.783 [19855] <2> 6590726:bptm:19855:uxvenusc052: 1457348432.783616 :: SEPOST: stspi_get_image_prop_v10 :: 358 :: image_name: uxvenusc052_1457345992_C1_TIR, server_name: 10.15.10.167

BPBRM messages:

11:37:38.742 [19839] <2> bpbrm wait_for_child: start
12:00:45.877 [19839] <2> bpbrm wait_for_child: child exit_status = 23 signal_status = 0
12:00:45.877 [19839] <2> bpbrm kill_child_process: start
12:00:45.877 [19839] <2> bpbrm Exit: attempting to send mail to root on uxvenusc052

Bpbkar messages (nothing special):

...
11:37:38.439 [19851] <2> bpbkar delete_old_files_recur: INF - checking files in directory /usr/openv/netbackup/logs/user_ops/root/jobs for prefix = jbp and older than 3 days
11:37:38.439 [19851] <4> bpbkar Exit: INF - bpbkar exit normal
11:37:38.439 [19851] <4> bpbkar Exit: INF - EXIT STATUS 0: the requested operation was successfully completed
11:37:38.439 [19851] <4> bpbkar Exit: INF - setenv FINISHED=

The Veritas troubleshooting course describes the NetBackup processes and data flow as follows:

-a- bpbkar sends data to the bptm child process, and bptm stores it in shared memory segments. Because client and media server are the same host, bpbkar sends it directly to the shared memory segments.
-b- bptm directs the shared memory segments to the allocated storage media (Sepaton, block by block).
-c- bptm connects to the bpdbm processes on the master server and updates an image header in the image database. This happens for each fragment.
-d- bpbkar sends the backup metadata to bpbrm after the data has been sent to bptm.
-e- Finally, bpbrm sends the metadata to bpdbm (master) -> image catalog update.

My problem is pinpointing the cause: either a bptm issue with the Sepaton, or a bptm shared memory segment issue (see -c- above), or something else. (bpbkar already has a 'FINISHED' state, bpbrm is waiting for bptm input, and bptm is sending data to the Sepaton and metadata to bpdbm.)

And what is the meaning of "get_exactly_n_bytes_or_eof_abs: read from socket failed: Connection timed out (145)"? ("Normally this is linked to a network issue.")

Also, the output of vxlogview on the master server did not help me (not one message referencing the job). Command used: vxlogview.cmd -d all -X "jobid=6590726"

And ... incremental backups are working fine.

K. Regards
2 REPLIES

Marianne
Level 6
Partner    VIP    Accredited Certified
Snippets of log files do not help. Please copy logs to .txt files ( bpbrm.txt and bptm.txt ) and upload as File attachments. If logs are too big, extract all references for the relevant PIDs ( [19855] for bptm) and save in .txt files to upload.
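
For the PID extraction, something along these lines could work. This is only a minimal sketch: it assumes the legacy log layout seen in the excerpts above (timestamp, then the PID in square brackets), and the function names and file names are made up for illustration.

```python
import re

# Start of a legacy log line as quoted in the post:
# "HH:MM:SS.mmm [PID] <severity> ..."
LINE_RE = re.compile(r"^\d{2}:\d{2}:\d{2}\.\d{3} \[(\d+)\] ")


def extract_pid_lines(lines, pid):
    """Return the lines that belong to the given PID.

    Lines that do not start with a timestamp are treated as
    continuations of the previous log line and kept with it.
    """
    wanted = str(pid)
    keep = False
    selected = []
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            keep = match.group(1) == wanted
        if keep:
            selected.append(line)
    return selected


def write_pid_extract(src_path, pid, dst_path):
    """Copy all lines for one PID from src_path into dst_path."""
    with open(src_path, errors="replace") as src:
        selected = extract_pid_lines(src, pid)
    with open(dst_path, "w") as dst:
        dst.writelines(selected)
    return len(selected)
```

For example, write_pid_extract("bptm_full.log", 19855, "bptm.txt") would write every line for bptm PID 19855 to bptm.txt, where bptm_full.log stands in for whatever the actual log file is called.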

mph999
Level 6
Employee Accredited

I think the issue is between bptm and bpdbm (I think, but am not 100% sure without testing, that bptm and bpdbm communicate for the TIR info).

Suggest:

1.  Turn off TIR and try again (at least that way you get a backup).

2.  Get the full bptm log at verbose 5, plus the bpdbm and bpbrm logs.
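
Once the full logs are available, one quick check is to look for the longest silence between consecutive lines of a single PID, since that shows where the process was waiting. A rough sketch, assuming the timestamp and [PID] layout shown in the excerpts in this thread and that all lines are from the same day; the function names are hypothetical.

```python
import re
from datetime import timedelta

# "HH:MM:SS.mmm [PID] ..." as seen in the quoted bptm/bpbrm lines
LINE_RE = re.compile(r"^(\d{2}):(\d{2}):(\d{2})\.(\d{3}) \[(\d+)\] ")


def parse_time(line):
    """Return (time since midnight, pid), or None for non-log lines."""
    match = LINE_RE.match(line)
    if not match:
        return None
    hours, minutes, seconds, millis, pid = (int(x) for x in match.groups())
    stamp = timedelta(hours=hours, minutes=minutes,
                      seconds=seconds, milliseconds=millis)
    return stamp, pid


def largest_gap(lines, pid):
    """Find the longest gap between consecutive lines of one PID.

    Returns (gap, line_before, line_after), or None when fewer than
    two lines for that PID are present.
    """
    best = None
    prev_time = prev_line = None
    for line in lines:
        parsed = parse_time(line)
        if parsed is None or parsed[1] != pid:
            continue
        stamp = parsed[0]
        if prev_time is not None:
            gap = stamp - prev_time
            if best is None or gap > best[0]:
                best = (gap, prev_line, line)
        prev_time, prev_line = stamp, line
    return best
```

Run against only the snippets posted so far, this reports a gap of 0:20:34.648 for PID 19855, between the last stspi_write_image line at 11:39:58.057 and the timeout at 12:00:32.705. Since the snippets are not the complete log, that number only means something once it is repeated on the full bptm log.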