cancel
Showing results for 
Search instead for 
Did you mean: 

Duplication jobs are stuck at "Waiting for positioning of media id"

Markus_Bonet
Level 3

We have an NetBackup 7.5.0.6 environment running on an RHEL5 Master Server and two RHEL5 Media Servers.

Each server has access to all 16 LTO4 tape drives connected via two different libraries.

On a regular basis I can encounter duplication jobs which seem to be stuck.

All resources have been granted and a proper tape has been mounted.

The file list contains the first image which should be duplicated in this batch and NetBackup claims that it's waiting for positioning the media.

This can take take hours until finally the duplication process starts and then proceeds at expected speed (350 GiB/h) with multiple images in the batch.

I know that positioning is not an issue as the same media and drives are used for B2T jobs which never show such behavior.

 

Did anybody encounter something like this?

Any hint where to look and what to check?

 

Thanks

Markus

12 REPLIES 12

Mark_Solutions
Level 6
Partner Accredited Certified

Looking at the image names that sit there for a long time .. could they be exchange GRT backups?

If so then it has to mount the exchnage backup image and catalog it fully before it even starts to write to tape - even though it has claimed the drive and tape.

If this is the case then reducing the fragment size of your disk storage unit can really help here - i always use 5000MB as a disk fragment size

Markus_Bonet
Level 3

No, there a no Exchange GRT backups on this environment.

SLP with duplication is used for many regular file system and Oracle archive log backups.

The disk fragment size is set to 524288 MB.

Mark_Solutions
Level 6
Partner Accredited Certified

OK - in that case it does seem odd - logs to check would be bptm and bpdm (netbackup/logs/bptm and bptm on each media server) and nbrb (vxlogview -o 118 on the Master)

 

 

Markus_Bonet
Level 3

I checked the bptm entries for a duplication, where a full medium occured during the process and it's stuck now for nearly two hours after the new medium has been loaded.

 

The log looks like this:

13:28:57.273 [2093] <2> io_position_for_write: position media id BM5040, copy 2, current number images = 355
13:28:57.274 [2093] <2> io_position_for_write: locating to absolute block number 1543756, copy 2
13:30:15.902 [2093] <2> io_position_for_write: locate block is done
13:30:16.992 [2093] <2> io_position_for_write: processing empty header, filenum = 356, bid = (empty_file), copy 2
13:30:16.992 [2093] <2> io_position_for_write: empty header found on BM5040, OK, copy 2
13:30:16.992 [2093] <2> io_ioctl: command (2)MTBSF 1 0x0 from (bptm.c.22463) on drive index 10
13:30:17.031 [2093] <2> io_ioctl: command (5)MTWEOF 1 0x0 from (bptm.c.22515) on drive index 10
13:30:22.325 [2093] <2> send_MDS_msg: MEDIADB 1 35835911 BM5040 4000340 *NULL* 6 1399121250 1399547721 1407575274 0 789921332 355 355 5 6 0 0 1024 0 1543756 0
13:30:22.422 [2093] <2> ConnectionCache::connectAndCache: Acquiring new connection for host q4de8nmaster01, query type 1
13:30:22.425 [2093] <2> logconnections: BPDBM CONNECT FROM 10.175.160.168.34578 TO 10.175.160.170.13721 fd = 20
13:30:22.425 [2093] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:871] Ignoring VxSS authentication 2 0x2
13:30:22.457 [2093] <2> db_end: Need to collect reply
13:30:22.462 [2093] <4> write_backup: begin writing backup id q4de8nsy409bk_1399506399, copy 2, fragment 1, to media id BM5040 on drive Drive011 (index 10)
13:30:22.462 [2093] <2> write_data: twin_index: 0 active: 1 dont_process: 0 wrote_backup_hdr: 0 finished_buff: 0 saved_cindex: -1 twin_is_disk 0 delay_brm: 0
13:30:22.462 [2093] <2> write_data: Total Kbytes transferred 0
13:30:22.468 [2093] <2> write_data: absolute block position prior to writing backup header(s) is 1543756, copy 2
13:30:22.468 [2093] <2> io_write_back_header: drive index 10, q4de8nsy409bk_1399506399, file num = 356, mpx_headers = 0, copy 2
13:30:22.523 [2093] <2> write_data: completed writing backup header, start writing data when first buffer is available, copy 2
13:30:22.536 [2093] <2> write_data: first write, twin_index: 0 cindex: 0 dont_process: 1 wrote_backup_hdr: 1 finished_buff: 0
13:30:22.536 [2093] <2> write_data: received first buffer (524288 bytes), begin writing data


For me positioning took place between 13:28:57.274 and 13:30:15.902.

First data buffer has been received but there is no progress since.

Mark_Solutions
Level 6
Partner Accredited Certified

What are you size and number data buffers that are in use?

They look to be 512kb (524288) which is quite large - even though you are using RHEL

If so those tapes would not be recoverable in a DR situation with a Windows media server (however unlikely that may be)

I take it you have no issues with restores form tape?

Was that a brand new tape? Could i see:

vmquery -m BM5040

if not as it seems report it as such and wondering if the block size could be the issue:

13:30:16.992 [2093] <2> io_position_for_write: empty header found on BM5040, OK, copy 2
13:30:16.992 [2093] <2> io_ioctl: command (2)MTBSF 1 0x0 from (bptm.c.22463) on drive index 10
13:30:17.031 [2093] <2> io_ioctl: command (5)MTWEOF 1 0x0 from (bptm.c.22515) on drive index 10

 

Markus_Bonet
Level 3

more NUMBER_DATA_BUFFERS*
::::::::::::::
NUMBER_DATA_BUFFERS
::::::::::::::
32
::::::::::::::
NUMBER_DATA_BUFFERS_DISK
::::::::::::::
64

 

more SIZE_DATA_BUFFERS*
::::::::::::::
SIZE_DATA_BUFFERS
::::::::::::::
524288
::::::::::::::
SIZE_DATA_BUFFERS_DISK
::::::::::::::
1048576

 

vmquery -m BM5040
================================================================================
media ID:              BM5040
media type:            1/2" cartridge tape (6)
barcode:               BM5040
media description:     ---
volume pool:           FSPOOL (6)
robot type:            TLD - Tape Library DLT (8)
robot number:          1
robot slot:            136
robot control host:    q4de8nmaster01
volume group:          000_00001_TLD
vault name:            ---
vault sent date:       ---
vault return date:     ---
vault slot:            ---
vault session id:      ---
vault container id:    -
created:               Fri 27 May 2011 09:20:55 AM CEST
assigned:              Sat 03 May 2014 02:47:30 PM CEST
last mounted:          Thu 08 May 2014 01:28:55 PM CEST
first mount:           Thu 15 Mar 2012 07:48:16 AM CET
expiration date:       ---
number of mounts:      534
max mounts allowed:    ---
status:                0x0

 

 

Restores are no problem.

This is part of a standard test every new environment muss succeed (restore from a B2T, restore from B2D and restore from B2D2T).

Mark_Solutions
Level 6
Partner Accredited Certified

OK - anything in the bptm log since 13:30 yesterday in relation to this - maybe once it actually gets going it will say something

What does an images on media report show for that tape?

Markus_Bonet
Level 3

Here is some output when it proceeded. There is much more, but these lines show how the duplication actually proceeded:

13:30:22.536 [2093] <2> write_data: received first buffer (524288 bytes), begin writing data
15:21:54.233 [2093] <2> ConnectionCache::connectAndCache: Acquiring new connection for host q4de8nmaster01, query type 1
15:21:54.234 [2093] <2> logconnections: BPDBM CONNECT FROM 10.175.160.168.43461 TO 10.175.160.170.13721 fd = 20
15:21:54.234 [2093] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:871] Ignoring VxSS authentication 2 0x2
15:21:54.238 [2093] <2> db_end: Need to collect reply
15:21:54.239 [2093] <4> report_throughput: VBRT 1 2093 1 1 Drive011 BM5040 0 1 0 200192  200192 (bptm.c.25547)
15:21:54.239 [2093] <2> write_data: Total Kbytes transferred 200192
17:19:21.595 [2093] <2> ConnectionCache::connectAndCache: Acquiring new connection for host q4de8nmaster01, query type 1
17:19:21.619 [2093] <2> logconnections: BPDBM CONNECT FROM 10.175.160.168.37924 TO 10.175.160.170.13721 fd = 20
17:19:21.619 [2093] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:871] Ignoring VxSS authentication 2 0x2
17:19:21.620 [2093] <2> db_end: Need to collect reply
17:19:21.621 [2093] <4> report_throughput: VBRT 1 2093 1 1 Drive011 BM5040 0 1 0 200192  200192 (bptm.c.25547)
17:19:21.621 [2093] <2> write_data: Total Kbytes transferred 400384
17:21:48.323 [2093] <2> write_data: writing block shorter than BUFF_SIZE, 200704 bytes
17:21:48.324 [2093] <2> write_data: writing short block, 200704 bytes, remainder 0
17:21:48.324 [2093] <2> write_data: waited for full buffer 779 times, delayed 898671 times
17:21:48.325 [2093] <2> write_data: Total Kbytes transferred 745156
17:21:48.325 [2093] <2> write_backup: write_data() returned, exit_status = 0, CINDEX = 0, TWIN_INDEX = 0, backup_status = 0
17:21:48.325 [2093] <2> io_ioctl: command (5)MTWEOF 1 0x81 from (bptm.c.20314) on drive index 10
17:21:48.330 [2093] <2> ConnectionCache::connectAndCache: Acquiring new connection for host q4de8nmaster01, query type 1
17:21:48.331 [2093] <2> logconnections: BPDBM CONNECT FROM 10.175.160.168.49953 TO 10.175.160.170.13721 fd = 20
17:21:48.331 [2093] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:871] Ignoring VxSS authentication 2 0x2
17:21:48.337 [2093] <2> db_end: Need to collect reply
17:21:48.338 [2093] <4> report_throughput: VBRT 1 2093 5 1 Drive011 BM5040 0 1 0 344772  344772 (bptm.c.20564)
17:21:48.338 [2093] <2> ConnectionCache::connectAndCache: Acquiring new connection for host q4de8nmaster01, query type 77
17:21:48.339 [2093] <2> logconnections: BPDBM CONNECT FROM 10.175.160.168.38527 TO 10.175.160.170.13721 fd = 20
17:21:48.339 [2093] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:871] Ignoring VxSS authentication 2 0x2
17:21:48.344 [2093] <2> db_end: Need to collect reply
17:21:48.503 [2093] <2> ConnectionCache::connectAndCache: Acquiring new connection for host q4de8nmaster01, query type 1
17:21:48.505 [2093] <2> logconnections: BPDBM CONNECT FROM 10.175.160.168.52654 TO 10.175.160.170.13721 fd = 20
17:21:48.505 [2093] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:871] Ignoring VxSS authentication 2 0x2
17:21:48.510 [2093] <2> db_end: Need to collect reply
17:21:48.511 [2093] <4> write_backup: successfully wrote backup id q4de8nsy409bk_1399506399, copy 2, fragment 1, 745156 Kbytes at 53.663 Kbytes/sec
17:21:49.045 [2093] <2> read_config_file: using 262144 value from /usr/openv/netbackup/NET_BUFFER_SZ
17:21:49.045 [2093] <2> io_set_recvbuf: setting receive network buffer to 262144 bytes
17:21:49.045 [2093] <2> create_data_socket: tir socket created, fd is 21
17:21:49.104 [2093] <2> write_data_tir: size of tir data received from read side is 6467575 bytes + 0 GB
17:21:49.108 [2093] <2> write_data_tir: INF - Begin writing True Image Recovery information for copy 2.
17:21:49.111 [2093] <2> write_data_tir: absolute block position prior to writing backup header(s) is 1545214, copy 2
17:21:49.111 [2093] <2> write_data_tir: block position check: actual 1545214, expected 1545214
17:21:49.111 [2093] <2> io_write_back_header: drive index 10, q4de8nsy409bk_1399506399, file num = 357, mpx_headers = 0, copy 2
17:21:49.302 [2093] <2> write_data_tir: writing first TIR block to media, bytes = 524288
17:21:49.394 [2093] <2> ConnectionCache::connectAndCache: Acquiring new connection for host q4de8nmaster01, query type 1
17:21:49.395 [2093] <2> logconnections: BPDBM CONNECT FROM 10.175.160.168.36531 TO 10.175.160.170.13721 fd = 17
17:21:49.395 [2093] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:871] Ignoring VxSS authentication 2 0x2
17:21:49.399 [2093] <2> db_end: Need to collect reply
17:21:49.400 [2093] <4> report_throughput: VBRT 1 2093 7 1 Drive011 BM5040 0 1 0 6316  6316 (bptm.c.27900)
17:21:49.400 [2093] <2> write_data_tir: Total Kbytes transferred 751471
17:21:49.400 [2093] <2> io_ioctl: command (5)MTWEOF 1 0x81 from (bptm.c.27930) on drive index 10
17:21:49.402 [2093] <2> write_data_tir: adding copy 2 fragment -1 (6316 Kbytes) for TIR data, filenum 357
17:21:49.402 [2093] <2> ConnectionCache::connectAndCache: Acquiring new connection for host q4de8nmaster01, query type 77
17:21:49.403 [2093] <2> logconnections: BPDBM CONNECT FROM 10.175.160.168.56945 TO 10.175.160.170.13721 fd = 17
17:21:49.403 [2093] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:871] Ignoring VxSS authentication 2 0x2
17:21:49.408 [2093] <2> db_end: Need to collect reply
17:21:49.566 [2093] <2> send_MDS_msg: MEDIADB 1 35835911 BM5040 4000340 *NULL* 6 1399121250 1399547721 1407575274 0 790672804 357 357 5 6 0 0 1024 0 1543756 0
17:21:49.800 [2093] <2> notify: executing - /usr/openv/netbackup/bin/backup_notify bptm q4de8nsy409bk_1399506399
17:21:49.831 [2093] <2> Media_dispatch_signal: calling child_wait (tmcommon.c:3963) delay 0 seconds
17:21:49.831 [2093] <2> write_backup: EXIT STATUS 0 1
17:21:49.832 [2093] <2> write_backup: Got bpduplicate acknowledgement status: 0, err: 0
17:21:49.932 [2093] <2> read_brm_msg: START BACKUP -jobid 3576234 -c q4de8ncoe39bk -b q4de8ncoe39bk_1399506401 -cl fsi_win_3m_auto_wed -sl incr -shm -blksize 1048576 -src_cn 1 -bt 1399506401 -st 4 -rl 5 -date -1 -ct 13 -tir -cn 2 -df 1 -eari 0 -use_vnetd -mtd -dv
17:21:49.953 [2093] <2> read_config_file: using 524288 value from /usr/openv/netbackup/db/config/SIZE_DATA_BUFFERS
17:21:49.953 [2093] <2> validate_parameters: DUP_BLKSIZE = 1048576, removable media bufsize = 524288
17:21:49.954 [2093] <2> ConnectionCache::connectAndCache: Acquiring new connection for host q4de8nmaster01, query type 55
17:21:49.955 [2093] <2> logconnections: BPDBM CONNECT FROM 10.175.160.168.39771 TO 10.175.160.170.13721 fd = 17
17:21:49.955 [2093] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:871] Ignoring VxSS authentication 2 0x2
17:21:49.961 [2093] <2> db_end: Need to collect reply
17:21:49.961 [2093] <2> ConnectionCache::connectAndCache: Acquiring new connection for host q4de8nmaster01, query type 81
17:21:49.962 [2093] <2> logconnections: BPDBM CONNECT FROM 10.175.160.168.50536 TO 10.175.160.170.13721 fd = 17
17:21:49.962 [2093] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:871] Ignoring VxSS authentication 2 0x2
17:21:50.107 [2093] <2> db_end: Need to collect reply
17:21:50.107 [2093] <2> ConnectionCache::connectAndCache: Acquiring new connection for host q4de8nmaster01, query type 243
17:21:50.108 [2093] <2> logconnections: BPDBM CONNECT FROM 10.175.160.168.59016 TO 10.175.160.170.13721 fd = 17
17:21:50.108 [2093] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:871] Ignoring VxSS authentication 2 0x2
17:21:50.113 [2093] <2> db_end: Need to collect reply
17:21:50.220 [2093] <2> ConnectionCache::connectAndCache: Acquiring new connection for host q4de8nmaster01, query type 81
17:21:50.221 [2093] <2> logconnections: BPDBM CONNECT FROM 10.175.160.168.60509 TO 10.175.160.170.13721 fd = 17
17:21:50.221 [2093] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:871] Ignoring VxSS authentication 2 0x2
17:21:50.365 [2093] <2> db_end: Need to collect reply
17:21:50.365 [2093] <2> ConnectionCache::connectAndCache: Acquiring new connection for host q4de8nmaster01, query type 1
17:21:50.366 [2093] <2> logconnections: BPDBM CONNECT FROM 10.175.160.168.41724 TO 10.175.160.170.13721 fd = 17
17:21:50.366 [2093] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:871] Ignoring VxSS authentication 2 0x2
17:21:50.371 [2093] <2> db_end: Need to collect reply
17:21:50.372 [2093] <4> report_client: VBRC 2 2093 1 q4de8ncoe39bk_1399506401 13 fsi_win_3m_auto_wed 4 incr 0 1 2
17:21:50.372 [2093] <2> write_backup: calling first time in write_backup (TRUE)
17:21:50.372 [2093] <2> create_data_socket: tmcommon.c.6876: IPC, TIRIPC, PID, blksize: IPC NULL /tmp/vnet-02093399562510372425000000003-SuFa6k 2093 524288
17:21:50.712 [2093] <2> setup_dup_shm: media_shm_info = 64 524288 570130448 6940 0 0 0 0 1
17:21:50.712 [2093] <2> setup_dup_shm: SHMID = 570130448, BUF_PTR = 0x0x2b42995ac000, BUF_CONTROL = 0x0x2b429b5ac000, ReadyPtr = 0x0x2b429b5ac600, MPX_RES_CNTL = 0x0x2b429b5ac608
17:21:50.712 [2093] <2> send_MDS_msg: MEDIADB 1 35835911 BM5040 4000340 *NULL* 6 1399121250 1399562510 1407575274 0 790672804 357 357 5 6 0 0 1024 0 1543756 0
17:21:50.790 [2093] <2> ConnectionCache::connectAndCache: Acquiring new connection for host q4de8nmaster01, query type 1
17:21:50.791 [2093] <2> logconnections: BPDBM CONNECT FROM 10.175.160.168.51017 TO 10.175.160.170.13721 fd = 20
17:21:50.792 [2093] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:871] Ignoring VxSS authentication 2 0x2
17:21:50.797 [2093] <2> db_end: Need to collect reply
17:21:50.797 [2093] <4> write_backup: begin writing backup id q4de8ncoe39bk_1399506401, copy 2, fragment 1, to media id BM5040 on drive Drive011 (index 10)
17:21:50.798 [2093] <2> write_data: twin_index: 0 active: 1 dont_process: 0 wrote_backup_hdr: 0 finished_buff: 0 saved_cindex: -1 twin_is_disk 0 delay_brm: 0
17:21:50.798 [2093] <2> write_data: Total Kbytes transferred 751471
17:21:50.801 [2093] <2> write_data: absolute block position prior to writing backup header(s) is 1545229, copy 2
17:21:50.801 [2093] <2> write_data: block position check: actual 1545229, expected 1545229
17:21:50.801 [2093] <2> io_write_back_header: drive index 10, q4de8ncoe39bk_1399506401, file num = 358, mpx_headers = 0, copy 2
17:21:50.803 [2093] <2> write_data: completed writing backup header, start writing data when first buffer is available, copy 2
17:21:50.954 [2093] <2> write_data: first write, twin_index: 0 cindex: 0 dont_process: 1 wrote_backup_hdr: 1 finished_buff: 0
17:21:50.954 [2093] <2> write_data: received first buffer (524288 bytes), begin writing data
17:22:04.799 [2093] <2> write_data: writing block shorter than BUFF_SIZE, 67584 bytes
17:22:04.799 [2093] <2> write_data: writing short block, 67584 bytes, remainder 0
17:22:04.799 [2093] <2> write_data: waited for full buffer 10 times, delayed 176 times
17:22:04.800 [2093] <2> write_data: Total Kbytes transferred 1465777
17:22:04.800 [2093] <2> write_backup: write_data() returned, exit_status = 0, CINDEX = 0, TWIN_INDEX = 0, backup_status = 0
17:22:04.800 [2093] <2> io_ioctl: command (5)MTWEOF 1 0x81 from (bptm.c.20314) on drive index 10
17:22:04.806 [2093] <2> ConnectionCache::connectAndCache: Acquiring new connection for host q4de8nmaster01, query type 1
17:22:04.807 [2093] <2> logconnections: BPDBM CONNECT FROM 10.175.160.168.58120 TO 10.175.160.170.13721 fd = 20
17:22:04.807 [2093] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:871] Ignoring VxSS authentication 2 0x2
17:22:04.812 [2093] <2> db_end: Need to collect reply
17:22:04.812 [2093] <4> report_throughput: VBRT 1 2093 5 1 Drive011 BM5040 0 1 0 714306  714306 (bptm.c.20564)
17:22:04.812 [2093] <2> ConnectionCache::connectAndCache: Acquiring new connection for host q4de8nmaster01, query type 77
17:22:04.813 [2093] <2> logconnections: BPDBM CONNECT FROM 10.175.160.168.45044 TO 10.175.160.170.13721 fd = 20
17:22:04.813 [2093] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:871] Ignoring VxSS authentication 2 0x2
17:22:04.819 [2093] <2> db_end: Need to collect reply
17:22:04.944 [2093] <2> ConnectionCache::connectAndCache: Acquiring new connection for host q4de8nmaster01, query type 1
17:22:04.945 [2093] <2> logconnections: BPDBM CONNECT FROM 10.175.160.168.57603 TO 10.175.160.170.13721 fd = 20
17:22:04.945 [2093] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:871] Ignoring VxSS authentication 2 0x2
17:22:04.950 [2093] <2> db_end: Need to collect reply
17:22:04.951 [2093] <4> write_backup: successfully wrote backup id q4de8ncoe39bk_1399506401, copy 2, fragment 1, 714306 Kbytes at 51589.340 Kbytes/sec
17:22:05.690 [2093] <2> read_config_file: using 262144 value from /usr/openv/netbackup/NET_BUFFER_SZ
17:22:05.690 [2093] <2> io_set_recvbuf: setting receive network buffer to 262144 bytes
17:22:05.690 [2093] <2> create_data_socket: tir socket created, fd is 21
17:22:05.749 [2093] <2> write_data_tir: size of tir data received from read side is 6799720 bytes + 0 GB
17:22:05.753 [2093] <2> write_data_tir: INF - Begin writing True Image Recovery information for copy 2.
17:22:05.758 [2093] <2> write_data_tir: absolute block position prior to writing backup header(s) is 1546627, copy 2
17:22:05.758 [2093] <2> write_data_tir: block position check: actual 1546627, expected 1546627
17:22:05.758 [2093] <2> io_write_back_header: drive index 10, q4de8ncoe39bk_1399506401, file num = 359, mpx_headers = 0, copy 2
17:22:07.659 [2093] <2> write_data_tir: writing first TIR block to media, bytes = 524288
17:22:08.346 [2093] <2> ConnectionCache::connectAndCache: Acquiring new connection for host q4de8nmaster01, query type 1
17:22:08.346 [2093] <2> logconnections: BPDBM CONNECT FROM 10.175.160.168.52722 TO 10.175.160.170.13721 fd = 17
17:22:08.346 [2093] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:871] Ignoring VxSS authentication 2 0x2
17:22:08.351 [2093] <2> db_end: Need to collect reply
17:22:08.352 [2093] <4> report_throughput: VBRT 1 2093 7 1 Drive011 BM5040 0 1 0 6641  6641 (bptm.c.27900)
17:22:08.352 [2093] <2> write_data_tir: Total Kbytes transferred 1472417
17:22:08.352 [2093] <2> io_ioctl: command (5)MTWEOF 1 0x81 from (bptm.c.27930) on drive index 10
17:22:08.354 [2093] <2> write_data_tir: adding copy 2 fragment -1 (6641 Kbytes) for TIR data, filenum 359
17:22:08.354 [2093] <2> ConnectionCache::connectAndCache: Acquiring new connection for host q4de8nmaster01, query type 77


Will look for the media later and post it here.

Mark_Solutions
Level 6
Partner Accredited Certified

OK - cant see anything other than my feeling yesterday - would be interested to see what images are supposed to be on that tape and if a verify on the oldest one works - just to be sure.

Markus_Bonet
Level 3

But there is 2 times a gap of 2 hours where nothing happened.What might be a reason for this?

How is the communication between bpdm and bptm organized? bptm will mount the tape and position it. How does bpdm get informed to actually deliver data to bptm?

Do you think that it is related somehow to somehting physical with the drives and cartdridges?
Honestly, I don't see a valid point here. If this would be true, why are B2T jobs not affected by this behavior?
 

Mark_Solutions
Level 6
Partner Accredited Certified

Really not sure what is causing it - it may be hardware or it may be a resource issue (nbrb on the Master or paged memory on the media servers)

You probably need to raise a case with Symantec and up the logging levels to try and find out what is actually happening during the 2 hour gaps - maybe each media server having 16 drives is just too much .. bear in mind that when you write straight to tape it has half the buffering processes that it will have when duplicating from disk to tape

Markus_Bonet
Level 3

For now, we have to live with the situation.

The server hardware is being replaced in the near future, so I will re-check the situation as soon we are on RHEL6 and new hardware.