10-20-2010 08:15 AM
import fail log as following:
10/19/2010 10:17:04 AM - begin Import
10/19/2010 10:17:07 AM - started process bptm (1892)
10/19/2010 10:17:07 AM - started process bptm (1888)
10/19/2010 10:17:09 AM - started process bptm (1892)
10/19/2010 10:17:09 AM - mounting A00066
10/19/2010 10:17:07 AM - requesting resource A00066
10/19/2010 10:17:07 AM - granted resource A00066
10/19/2010 10:17:07 AM - granted resource IBM.ULTRIUM-TD4.000
10/19/2010 10:18:07 AM - mounted; mount time: 00:00:58
10/19/2010 10:18:07 AM - positioning A00066 to file 1
10/19/2010 10:18:07 AM - positioned A00066; position time: 00:00:00
10/19/2010 10:18:07 AM - begin reading
10/19/2010 1:39:03 PM - end reading; read time: 03:20:56
10/19/2010 1:39:03 PM - Error bptm(pid=1892) wait for child pid 1888 timeout
10/19/2010 1:39:03 PM - Critical bpbrm(pid=3788) from client backup-server: FTL - socket read failed
10/19/2010 1:39:03 PM - Error bpimport(pid=1164) from host backup-server, FTL - socket read failed
10/19/2010 1:39:03 PM - Error bpimport(pid=1164) Import of policy policy, schedule time (backup-server_1281944821) failed, tar had an unexpected error.
10/19/2010 1:39:03 PM - Error bpimport(pid=1164) Status = no images were successfully processed.
10/19/2010 1:39:03 PM - Error bpbrm(pid=3788) could not write EXIT STATUS to stderr
10/19/2010 1:39:03 PM - end Import; elapsed time: 03:21:59
no images were successfully processed(191)
phase 1 import successfully, but phase 2 import fail
any help?
Solved! Go to Solution.
10-21-2010 07:17 AM
Probably best to update NBU to 6.5.6 before logging a call with Symantec.
I found reference to a number of fixes that are related to imports (not exactly what we see here, but worth a try. Support would probably suggest the same).
http://www.symantec.com/business/support/index?page=content&id=TECH129076
10-20-2010 10:35 AM
Hopefully you have bptm log?
We need to see what "child pid 1888" was doing...
Otherwise you need to enable bptm log on media server, retry phase 2 and see what is logged in bptm log.
10-20-2010 07:59 PM
thank for the response
the attachment is the first half of the log
in my first post
the catelog job will "hang" on the begin read line.and you will find that the bptm log have these lines:
10:05:37.156 [3740.4080] <2> bptm: INITIATING (VERBOSE = 0): -rptdrv -jobid -1286899669 -jm
10:05:37.156 [3740.4080] <2> drivename_open: Called with Create 0, file IBM.ULTRIUM-TD4.000
10:05:37.156 [3740.4080] <2> drivename_checklock: Called
10:05:37.156 [3740.4080] <2> drivename_checklock: File is locked
10:05:37.156 [3740.4080] <2> report_drives: DRIVE = IBM.ULTRIUM-TD4.000 LOCK = TRUE CURTIME = 1287673537
10:05:37.156 [3740.4080] <2> drivename_close: Called for file IBM.ULTRIUM-TD4.000
10:05:37.156 [3740.4080] <2> report_drives: MODE = 0
10:05:37.156 [3740.4080] <2> report_drives: TIME = 1287672817
10:05:37.156 [3740.4080] <2> report_drives: MASTER = backup-server
10:05:37.156 [3740.4080] <2> report_drives: SR_KEY = 0 1
10:05:37.156 [3740.4080] <2> report_drives: PATH = {3,0,0,0}
10:05:37.156 [3740.4080] <2> report_drives: MEDIA = A00066
10:05:37.156 [3740.4080] <2> report_drives: REQID = 902
10:05:37.156 [3740.4080] <2> report_drives: ALOCID = 461
10:05:37.156 [3740.4080] <2> report_drives: RBID = {0F9596F4-E009-487E-9686-FC613C9AC5BE}
10:05:37.156 [3740.4080] <2> report_drives: PID = 3844
10:05:37.156 [3740.4080] <2> report_drives: FILE = C:\Program Files\Veritas\NetBackup\db\media\tpreq\drive_IBM.ULTRIUM-TD4.000
10:05:37.156 [3740.4080] <2> main: Sending [EXIT STATUS 0] to NBJM
10:05:37.156 [3740.4080] <2> bptm: EXITING with status 0 <----------
that is the place that the job hanging
10-21-2010 05:33 AM
this is the entire log that you need
thank for you help
10-21-2010 06:27 AM
Seems there is a disconnect between bptm processes 3844 and 1048.
bptm PID 3844 did the right thing - there are updates throughout bptm log confirming that data is read from tape:
09:54:38.359 [3844.3036] <4> read_backup: begin reading backup id backup-server_1281944821 (import), copy 1, fragment 1, from media id A00066 on drive IBM.ULTRIUM-TD4.000 (index 0)
10:14:39.968 [3844.3036] <2> read_data: Total Kbytes transferred 73899392
10:34:39.484 [3844.3036] <2> read_data: Total Kbytes transferred 147288512
10:54:39.578 [3844.3036] <2> read_data: Total Kbytes transferred 220887744
11:14:39.562 [3844.3036] <2> read_data: Total Kbytes transferred 294276864
11:34:39.578 [3844.3036] <2> read_data: Total Kbytes transferred 367635968
11:54:39.750 [3844.3036] <2> read_data: Total Kbytes transferred 440995072
12:14:39.531 [3844.3036] <2> read_data: Total Kbytes transferred 514474240
12:34:39.812 [3844.3036] <2> read_data: Total Kbytes transferred 588133504
12:54:39.781 [3844.3036] <2> read_data: Total Kbytes transferred 661162432
13:14:39.421 [3844.3036] <2> read_data: Total Kbytes transferred 734881728
13:16:20.109 [3844.3036] <2> read_data: read short block, bytes = 23552, remainder = 0
13:16:20.125 [3844.3036] <2> read_data: ReadFile returned FALSE, A tape access reached a filemark. (1101);bytes = 0
13:16:20.125 [3844.3036] <2> read_data: ReadFile detected EOM or EOF, bytes = 0
13:16:20.125 [3844.3036] <2> read_data: waited for empty buffer 384149 times, delayed 767930 times
13:16:20.125 [3844.3036] <2> read_data: Total Kbytes transferred 740921431
13:16:20.125 [3844.3036] <16> wait_for_sigcld: wait for child pid 1048 timeout
13:16:20.125 [3844.3036] <2> wait_for_sigcld: child 1048 still active, terminating
13:16:20.125 [3844.3036] <2> wait_for_sigcld: child 1048 still active, ignoring
13:16:20.125 [3844.3036] <2> mm_child_done: SIGCHLD: exit=82, pid=1048
PID 1048:
09:53:35.343 [3844.3036] <2> read_backup: import child process is pid 1048.2868
09:53:35.359 [1048.2868] <2> bptm: INITIATING (VERBOSE = 0): -import -pid 1080 -c backup-server -bt 1287672811 -b backup-server_1281944821 -hostname backup-server -L C:\Program..Files\Veritas\NetBackup\Logs\user_ops\BACKUP-SERVER\ix1540-1\logs\jbpImport2-20101021095330.log -ru root -rclnt backup-server -rclnthostname backup-server -everything -no_callback -connect_options 0x01010100 -jobid 902 -den 6 -mmflush 3844 0 65536 30 10 0 0 0 0 774592875 0 0
09:53:35.671 [1048.2868] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2057: service: bpjobd
09:53:35.671 [1048.2868] <2> logconnections: BPJOBD CONNECT FROM 192.168.204.208.1370 TO 192.168.204.208.13724
09:53:35.671 [1048.2868] <2> job_connect: Connected to the host backup-server contype 10 jobid <902> socket <1344>
09:53:35.671 [1048.2868] <2> job_connect: Connected on port 1370
09:53:35.671 [1048.2868] <2> setup_mm_child: [3844] child using 30 data buffers
09:53:35.671 [1048.2868] <2> setup_mm_child: [3844] child buffer size is 65536
09:53:35.671 [1048.2868] <2> setup_mm_child: [3844] buffer address = 0x13c0000, buf control = 0x15a0000, ReadyPtr = 0x15a02d0, res_cntl = 0x15a02d8
No more updates for PID 1048. Just the disconnect between 3844 & 1048:
13:16:20.125 [3844.3036] <16> wait_for_sigcld: wait for child pid 1048 timeout
13:16:20.125 [3844.3036] <2> wait_for_sigcld: child 1048 still active, terminating
13:16:20.125 [3844.3036] <2> wait_for_sigcld: child 1048 still active, ignoring
13:16:20.125 [3844.3036] <2> mm_child_done: SIGCHLD: exit=82, pid=1048
You never mentioned your O/S and NBU version?
Seems you will need to log a call with Symantec Support...
10-21-2010 06:57 AM
Oh...
my os is windows 2003 R2
netbackup version is 6.5.2A
what do you means call symantec support?send mail to symantec call for help?
10-21-2010 07:17 AM
Probably best to update NBU to 6.5.6 before logging a call with Symantec.
I found reference to a number of fixes that are related to imports (not exactly what we see here, but worth a try. Support would probably suggest the same).
http://www.symantec.com/business/support/index?page=content&id=TECH129076
10-21-2010 04:41 PM
10-21-2010 10:26 PM
i have increase as 3000 but still no effect
and i remember that last time i catelog the tape, it will have transmit rate on the GUI interface, but this time the GUI don't have any transmit rate, but this time only the time usage will increase.
10-21-2010 10:34 PM
I don't believe that any NBU timeout will have any effect. I suspect some sort of disconnect between NBU bptm processes. That is why I suggested upgrading to 6.5.6. If the problem still persists after the upgrade, open a support call with Symantec.
10-23-2010 03:10 AM
I think its basically due to some networt issue , Please monitor ping and findout any network drop is there or not ...
10-23-2010 06:01 AM
Rajesh - have another look at bptm log and Activity Monitor details - every bit of processing happens on the master (backup-server):
09:53:35.359 [1048.2868] <2> bptm: INITIATING (VERBOSE = 0): -import -pid 1080 -c backup-server -bt 1287672811 -b backup-server_1281944821 -hostname backup-server -L C:\Program..Files\Veritas\NetBackup\Logs\user_ops\BACKUP-SERVER\ix1540-1
The two bptm processes that are losing connection are both running on the same server (see full bptm log).
10-23-2010 02:07 PM
Initially i request you to update your NBU Server
First check and make sure the Tape Drive, Tape Cartridge and third the import procedure of Netbackup Server:-
Does your import is failing just ? and your backup/Restore are going good. Actually i want to confirm that your Tape Drive might not doing some problem after a specific time. If all things are going good then insert another tape cartridge and try to import any backup image. if import successful thn there might be some problem with the Tape Cartridge. or if fails thn there might be some problem with the import procedure.
10-25-2010 11:04 PM
after update the netbackup
the problem solved and i no need to do the phase 2 import and got the catelog
10-25-2010 11:19 PM
thats gr8. (I found sometimes my problem solved in at least using NBU 6.5.5)