cancel
Showing results for 
Search instead for 
Did you mean: 

phase 2 import problem

FelixTam
Level 3

import fail log as following:

10/19/2010 10:17:04 AM - begin Import
10/19/2010 10:17:07 AM - started process bptm (1892)
10/19/2010 10:17:07 AM - started process bptm (1888)
10/19/2010 10:17:09 AM - started process bptm (1892)
10/19/2010 10:17:09 AM - mounting A00066
10/19/2010 10:17:07 AM - requesting resource A00066
10/19/2010 10:17:07 AM - granted resource A00066
10/19/2010 10:17:07 AM - granted resource IBM.ULTRIUM-TD4.000
10/19/2010 10:18:07 AM - mounted; mount time: 00:00:58
10/19/2010 10:18:07 AM - positioning A00066 to file 1
10/19/2010 10:18:07 AM - positioned A00066; position time: 00:00:00
10/19/2010 10:18:07 AM - begin reading
10/19/2010 1:39:03 PM - end reading; read time: 03:20:56
10/19/2010 1:39:03 PM - Error bptm(pid=1892) wait for child pid 1888 timeout       
10/19/2010 1:39:03 PM - Critical bpbrm(pid=3788) from client backup-server: FTL - socket read failed     
10/19/2010 1:39:03 PM - Error bpimport(pid=1164) from host backup-server, FTL - socket read failed     
10/19/2010 1:39:03 PM - Error bpimport(pid=1164) Import of policy policy, schedule time (backup-server_1281944821) failed, tar had an unexpected error.
10/19/2010 1:39:03 PM - Error bpimport(pid=1164) Status = no images were successfully processed.      
10/19/2010 1:39:03 PM - Error bpbrm(pid=3788) could not write EXIT STATUS to stderr      
10/19/2010 1:39:03 PM - end Import; elapsed time: 03:21:59
no images were successfully processed(191)

phase 1 import successfully, but phase 2 import fail

any help?

1 ACCEPTED SOLUTION

Accepted Solutions

Marianne
Level 6
Partner    VIP    Accredited Certified

Probably best to update NBU to 6.5.6 before logging a call with Symantec.

I found reference to a number of fixes that are related to imports (not exactly what we see here, but worth a try. Support would probably suggest the same).

http://www.symantec.com/business/support/index?page=content&id=TECH129076

View solution in original post

14 REPLIES 14

Marianne
Level 6
Partner    VIP    Accredited Certified

Hopefully you have bptm log?

We need to see what "child pid 1888" was doing...

Otherwise you need to enable bptm log on media server, retry phase 2 and see what is logged in bptm log.

FelixTam
Level 3

thank for the response

the attachment is the first half of the log

in my first post

the catelog job will "hang" on the begin read line.and you will find that the bptm log have these lines:

10:05:37.156 [3740.4080] <2> bptm: INITIATING (VERBOSE = 0): -rptdrv -jobid -1286899669 -jm
10:05:37.156 [3740.4080] <2> drivename_open: Called with Create 0, file IBM.ULTRIUM-TD4.000
10:05:37.156 [3740.4080] <2> drivename_checklock: Called
10:05:37.156 [3740.4080] <2> drivename_checklock: File is locked
10:05:37.156 [3740.4080] <2> report_drives: DRIVE = IBM.ULTRIUM-TD4.000 LOCK = TRUE CURTIME = 1287673537
10:05:37.156 [3740.4080] <2> drivename_close: Called for file IBM.ULTRIUM-TD4.000
10:05:37.156 [3740.4080] <2> report_drives: MODE = 0
10:05:37.156 [3740.4080] <2> report_drives: TIME = 1287672817
10:05:37.156 [3740.4080] <2> report_drives: MASTER = backup-server
10:05:37.156 [3740.4080] <2> report_drives: SR_KEY = 0 1
10:05:37.156 [3740.4080] <2> report_drives: PATH = {3,0,0,0}
10:05:37.156 [3740.4080] <2> report_drives: MEDIA = A00066
10:05:37.156 [3740.4080] <2> report_drives: REQID = 902
10:05:37.156 [3740.4080] <2> report_drives: ALOCID = 461
10:05:37.156 [3740.4080] <2> report_drives: RBID = {0F9596F4-E009-487E-9686-FC613C9AC5BE}
10:05:37.156 [3740.4080] <2> report_drives: PID = 3844
10:05:37.156 [3740.4080] <2> report_drives: FILE = C:\Program Files\Veritas\NetBackup\db\media\tpreq\drive_IBM.ULTRIUM-TD4.000
10:05:37.156 [3740.4080] <2> main: Sending [EXIT STATUS 0] to NBJM
10:05:37.156 [3740.4080] <2> bptm: EXITING with status 0 <----------

that is the place that the job hanging

FelixTam
Level 3

this is the entire log that you need

thank for you help

Marianne
Level 6
Partner    VIP    Accredited Certified

Seems there is a disconnect between bptm processes 3844 and 1048.

bptm PID 3844 did the right thing - there are updates throughout bptm log confirming that data is read from tape:

09:54:38.359 [3844.3036] <4> read_backup: begin reading backup id backup-server_1281944821 (import), copy 1, fragment 1, from media id A00066 on drive IBM.ULTRIUM-TD4.000 (index 0)

10:14:39.968 [3844.3036] <2> read_data: Total Kbytes transferred 73899392

10:34:39.484 [3844.3036] <2> read_data: Total Kbytes transferred 147288512

10:54:39.578 [3844.3036] <2> read_data: Total Kbytes transferred 220887744

11:14:39.562 [3844.3036] <2> read_data: Total Kbytes transferred 294276864

11:34:39.578 [3844.3036] <2> read_data: Total Kbytes transferred 367635968

11:54:39.750 [3844.3036] <2> read_data: Total Kbytes transferred 440995072

12:14:39.531 [3844.3036] <2> read_data: Total Kbytes transferred 514474240

12:34:39.812 [3844.3036] <2> read_data: Total Kbytes transferred 588133504

12:54:39.781 [3844.3036] <2> read_data: Total Kbytes transferred 661162432

13:14:39.421 [3844.3036] <2> read_data: Total Kbytes transferred 734881728

13:16:20.109 [3844.3036] <2> read_data: read short block, bytes = 23552, remainder = 0
13:16:20.125 [3844.3036] <2> read_data: ReadFile returned FALSE, A tape access reached a filemark. (1101);bytes = 0
13:16:20.125 [3844.3036] <2> read_data: ReadFile detected EOM or EOF, bytes = 0
13:16:20.125 [3844.3036] <2> read_data: waited for empty buffer 384149 times, delayed 767930 times

13:16:20.125 [3844.3036] <2> read_data: Total Kbytes transferred 740921431

13:16:20.125 [3844.3036] <16> wait_for_sigcld: wait for child pid 1048 timeout
13:16:20.125 [3844.3036] <2> wait_for_sigcld: child 1048 still active, terminating
13:16:20.125 [3844.3036] <2> wait_for_sigcld: child 1048 still active, ignoring
13:16:20.125 [3844.3036] <2> mm_child_done: SIGCHLD: exit=82, pid=1048

 

PID 1048:

09:53:35.343 [3844.3036] <2> read_backup: import child process is pid 1048.2868
09:53:35.359 [1048.2868] <2> bptm: INITIATING (VERBOSE = 0): -import -pid 1080 -c backup-server -bt 1287672811 -b backup-server_1281944821 -hostname backup-server -L C:\Program..Files\Veritas\NetBackup\Logs\user_ops\BACKUP-SERVER\ix1540-1\logs\jbpImport2-20101021095330.log -ru root -rclnt backup-server -rclnthostname backup-server -everything -no_callback -connect_options 0x01010100 -jobid 902 -den 6 -mmflush 3844 0 65536 30 10 0 0 0 0 774592875 0 0

09:53:35.671 [1048.2868] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2057: service: bpjobd
09:53:35.671 [1048.2868] <2> logconnections: BPJOBD CONNECT FROM 192.168.204.208.1370 TO 192.168.204.208.13724

09:53:35.671 [1048.2868] <2> job_connect: Connected to the host backup-server contype 10 jobid <902> socket <1344>
09:53:35.671 [1048.2868] <2> job_connect: Connected on port 1370
09:53:35.671 [1048.2868] <2> setup_mm_child: [3844] child using 30 data buffers
09:53:35.671 [1048.2868] <2> setup_mm_child: [3844] child buffer size is 65536
09:53:35.671 [1048.2868] <2> setup_mm_child: [3844] buffer address = 0x13c0000, buf control = 0x15a0000, ReadyPtr = 0x15a02d0, res_cntl = 0x15a02d8

 

No more updates for PID 1048. Just the disconnect between 3844 & 1048:

13:16:20.125 [3844.3036] <16> wait_for_sigcld: wait for child pid 1048 timeout
13:16:20.125 [3844.3036] <2> wait_for_sigcld: child 1048 still active, terminating
13:16:20.125 [3844.3036] <2> wait_for_sigcld: child 1048 still active, ignoring
13:16:20.125 [3844.3036] <2> mm_child_done: SIGCHLD: exit=82, pid=1048

 

You never mentioned your O/S and NBU version?

Seems you will need to log a call with Symantec Support...

 

 

FelixTam
Level 3

Oh...

my os is windows 2003 R2

netbackup version is 6.5.2A

what do you means call symantec support?send mail to symantec call for help?

Marianne
Level 6
Partner    VIP    Accredited Certified

Probably best to update NBU to 6.5.6 before logging a call with Symantec.

I found reference to a number of fixes that are related to imports (not exactly what we see here, but worth a try. Support would probably suggest the same).

http://www.symantec.com/business/support/index?page=content&id=TECH129076

Yasuhisa_Ishika
Level 6
Partner Accredited Certified
to increase Client Read Timeout?

FelixTam
Level 3

i have increase as 3000 but still no effect

and i remember that last time i catelog the tape, it will have transmit rate on the GUI interface, but this time the GUI don't have any transmit rate, but this time only the time usage will increase.

Marianne
Level 6
Partner    VIP    Accredited Certified

I don't believe that any NBU timeout will have any effect.  I suspect some sort of disconnect between NBU bptm processes. That is why I suggested upgrading to 6.5.6. If the problem still persists after the upgrade, open a support call with Symantec.

Rajesh_s1
Level 6
Certified

I think its basically due to some networt issue , Please monitor ping and findout any network drop is there or not ...

Marianne
Level 6
Partner    VIP    Accredited Certified

Rajesh - have another look at bptm log and Activity Monitor details - every bit of processing happens on the master (backup-server):

09:53:35.359 [1048.2868] <2> bptm: INITIATING (VERBOSE = 0): -import -pid 1080 -c backup-server -bt 1287672811 -b backup-server_1281944821 -hostname backup-server -L C:\Program..Files\Veritas\NetBackup\Logs\user_ops\BACKUP-SERVER\ix1540-1

The two bptm processes that are losing connection are both running on the same server (see full bptm log).

Zahid_Haseeb
Moderator
Moderator
Partner    VIP    Accredited

Initially i request you to update your NBU Server

First check and make sure the Tape Drive, Tape Cartridge and third the import procedure of Netbackup Server:-

Does your import is failing just ? and your backup/Restore are going good. Actually i want to confirm that your Tape Drive might not doing some problem after a specific time. If all things are going good then insert another tape cartridge and try to import any backup image. if import successful thn there might be some problem with the Tape Cartridge. or if fails thn there might be some problem with the import procedure.

FelixTam
Level 3

after update the netbackup

the problem solved and i no need to do the phase  2 import and got the catelog

Zahid_Haseeb
Moderator
Moderator
Partner    VIP    Accredited

thats gr8. (I found sometimes my problem solved in at least using NBU 6.5.5)