β10-03-2012 03:34 AM
Hi,
My setup: Netbackup 7.1, a Dell LTO4 2 drive robot.
I'm practising restores of data, and I've noticed that NB seems to be unmounting, and then re-mounting the same tape over and over, rather than realising that the next bit of data needed is on the same tape, same robot, same drive. This causes the mount counters on that tape to increase, and puts unecassary ware and tear on the robot library. It also seems to add time to the job, because NB is waiting for the resource to free up.
Heres what I'm doing, I'm running Phase 1 and Phase 2 imports from offsite expired tapes. Some of the jobs on these tapes are small SQL backups.
In the job logs I see the below log, and this is just a small sample from one tapes Phase 2.
You can see it loading/unloading the same tape again and again.
I seem to remember from my Admin training, that there was some timer I could set, where the tape wouldn't get unloaded for xx seconds, but not sure if that was part of NB, or perhaps part of the tape drive system we were using in training. I know our own robot won't move the tape out of the drive unless NB instructs it to, so I'm guessing its potentially a NB setting I could tweak ?
03/10/2012 11:09:40 - started process bptm (8188)03/10/2012 11:09:40 - mounting UBI00903/10/2012 11:09:40 - Info bptm(pid=8188) INF - Waiting for mount of media id UBI009 on server ubiq-serv22.ubiquisys.local for reading.03/10/2012 11:09:40 - granted resource UBI00903/10/2012 11:09:40 - granted resource IBM.ULT3580-TD4.00103/10/2012 11:10:18 - mounted; mount time: 00:00:3803/10/2012 11:10:18 - Info bptm(pid=8188) UBI00903/10/2012 11:10:18 - Info bptm(pid=8188) INF - Waiting for positioning of media id UBI009 on server ubiq-serv22.ubiquisys.local for reading.03/10/2012 11:10:18 - positioning UBI009 to file 3703/10/2012 11:11:17 - positioned UBI009; position time: 00:00:5903/10/2012 11:11:17 - begin reading03/10/2012 11:11:28 - Info bptm(pid=8188) waited for empty buffer 1 times, delayed 1 times03/10/2012 11:11:28 - end reading; read time: 00:00:1103/10/2012 11:11:28 - Info tar32(pid=9532) done. status: 003/10/2012 11:11:29 - Info bptm(pid=8188) completed reading backup image03/10/2012 11:11:29 - Info bptm(pid=8188) EXITING with status 0 <----------03/10/2012 11:11:29 - Info bpbrm(pid=11076) validating image for client UBIQ-SERV103/10/2012 11:11:29 - Info bpbrm(pid=5828) ubiq-serv22.ubiquisys.local is the host to restore to03/10/2012 11:11:29 - Info bpbrm(pid=5828) reading file list from client03/10/2012 11:11:29 - Info bpbrm(pid=5828) starting tar32 on client03/10/2012 11:11:30 - Info tar32(pid=10648) Restore started03/10/2012 11:11:30 - Info bptm(pid=10936) start03/10/2012 11:11:30 - started process bptm (10936)03/10/2012 11:11:30 - Info bpdm(pid=10936) reading backup image03/10/2012 11:11:30 - Info bptm(pid=10936) using 30 data buffers03/10/2012 11:11:30 - Info bptm(pid=10936) spawning a child process03/10/2012 11:11:30 - Info bptm(pid=10936) child pid: 1060403/10/2012 11:11:30 - Info bptm(pid=10604) start03/10/2012 11:11:30 - started process bptm (10604)03/10/2012 11:11:30 - requesting resource UBI00903/10/2012 11:11:30 - awaiting resource UBI009 Reason: Media is in use, Media Server: N/A,Robot Number: NONE, Robot Type: NONE, Media ID: UBI009, Drive Name: N/A,Volume Pool: N/A, Storage Unit: N/A, Drive Scan Host: N/A03/10/2012 11:11:34 - Info tar32(pid=9532) done. status: 0: the requested operation was successfully completed03/10/2012 11:12:57 - Info bptm(pid=10936) Waiting for mount of media id UBI009 (copy 1) on server ubiq-serv22.ubiquisys.local.03/10/2012 11:12:57 - started process bptm (10936)03/10/2012 11:12:57 - mounting UBI00903/10/2012 11:12:57 - Info bptm(pid=10936) INF - Waiting for mount of media id UBI009 on server ubiq-serv22.ubiquisys.local for reading.03/10/2012 11:12:57 - granted resource UBI00903/10/2012 11:12:57 - granted resource IBM.ULT3580-TD4.00003/10/2012 11:13:37 - mounted; mount time: 00:00:4003/10/2012 11:13:37 - Info bptm(pid=10936) UBI00903/10/2012 11:13:37 - Info bptm(pid=10936) INF - Waiting for positioning of media id UBI009 on server ubiq-serv22.ubiquisys.local for reading.03/10/2012 11:13:37 - positioning UBI009 to file 3803/10/2012 11:14:37 - positioned UBI009; position time: 00:01:0003/10/2012 11:14:37 - begin reading03/10/2012 11:19:08 - Info bptm(pid=10936) waited for empty buffer 219 times, delayed 1271 times03/10/2012 11:19:08 - end reading; read time: 00:04:3103/10/2012 11:19:08 - positioning UBI009 to file 3903/10/2012 11:19:08 - positioned UBI009; position time: 00:00:0003/10/2012 11:19:08 - begin reading03/10/2012 11:22:59 - Info bptm(pid=10936) waited for empty buffer 20 times, delayed 1345 times03/10/2012 11:22:59 - end reading; read time: 00:03:5103/10/2012 11:22:59 - positioning UBI009 to file 4003/10/2012 11:22:59 - positioned UBI009; position time: 00:00:0003/10/2012 11:22:59 - begin reading03/10/2012 11:24:34 - Info bptm(pid=10936) waited for empty buffer 1049 times, delayed 3165 times03/10/2012 11:24:34 - end reading; read time: 00:01:3503/10/2012 11:24:34 - positioning UBI009 to file 4103/10/2012 11:24:34 - positioned UBI009; position time: 00:00:0003/10/2012 11:24:34 - begin reading03/10/2012 11:25:35 - Info bptm(pid=10936) waited for empty buffer 37 times, delayed 2438 times03/10/2012 11:25:35 - end reading; read time: 00:01:0103/10/2012 11:25:35 - positioning UBI009 to file 4203/10/2012 11:25:35 - positioned UBI009; position time: 00:00:0003/10/2012 11:25:35 - begin reading03/10/2012 11:25:50 - Info bptm(pid=10936) waited for empty buffer 101 times, delayed 484 times03/10/2012 11:25:50 - end reading; read time: 00:00:1503/10/2012 11:25:50 - Info tar32(pid=10648) done. status: 003/10/2012 11:25:50 - Info bptm(pid=10936) completed reading backup image03/10/2012 11:25:50 - Info bptm(pid=10936) EXITING with status 0 <----------03/10/2012 11:25:50 - Info bpbrm(pid=5828) validating image for client UBIQ-SERV103/10/2012 11:25:50 - Info bpbrm(pid=10880) ubiq-serv22.ubiquisys.local is the host to restore to03/10/2012 11:25:50 - Info bpbrm(pid=10880) reading file list from client03/10/2012 11:25:50 - Info bpbrm(pid=10880) starting tar32 on client03/10/2012 11:25:52 - Info tar32(pid=9608) Restore started03/10/2012 11:25:52 - Info bptm(pid=11032) start03/10/2012 11:25:52 - started process bptm (11032)03/10/2012 11:25:52 - Info bpdm(pid=11032) reading backup image03/10/2012 11:25:52 - Info bptm(pid=11032) using 30 data buffers03/10/2012 11:25:52 - Info bptm(pid=11032) spawning a child process03/10/2012 11:25:52 - Info bptm(pid=11032) child pid: 433603/10/2012 11:25:52 - Info bptm(pid=4336) start03/10/2012 11:25:52 - requesting resource UBI00903/10/2012 11:25:53 - started process bptm (4336)03/10/2012 11:25:53 - awaiting resource UBI009 Reason: Media is in use, Media Server: N/A,Robot Number: NONE, Robot Type: NONE, Media ID: UBI009, Drive Name: N/A,Volume Pool: N/A, Storage Unit: N/A, Drive Scan Host: N/A03/10/2012 11:25:55 - Info tar32(pid=10648) done. status: 0: the requested operation was successfully completed03/10/2012 11:26:50 - Info bptm(pid=11032) Waiting for mount of media id UBI009 (copy 1) on server ubiq-serv22.ubiquisys.local.03/10/2012 11:26:50 - started process bptm (11032)03/10/2012 11:26:50 - mounting UBI00903/10/2012 11:26:50 - Info bptm(pid=11032) INF - Waiting for mount of media id UBI009 on server ubiq-serv22.ubiquisys.local for reading.03/10/2012 11:26:50 - granted resource UBI00903/10/2012 11:26:50 - granted resource IBM.ULT3580-TD4.00103/10/2012 11:27:28 - mounted; mount time: 00:00:3803/10/2012 11:27:28 - Info bptm(pid=11032) UBI00903/10/2012 11:27:28 - Info bptm(pid=11032) INF - Waiting for positioning of media id UBI009 on server ubiq-serv22.ubiquisys.local for reading.03/10/2012 11:27:28 - positioning UBI009 to file 4303/10/2012 11:28:17 - positioned UBI009; position time: 00:00:4903/10/2012 11:28:17 - begin reading
β10-03-2012 03:57 AM
Try host properties > media
Media unmount delay
Regards,
martin
β10-03-2012 04:03 AM
Was the tape actually dismounted? Check bptm log for confirmation, If log folder does not exist on the media server, please create it. Although details say 'mounting' each time, chances are that the tape was not actually dismounted.
The default unmount delay is 3 minutes.
See Host Properties -> Media Server -> Media -> Media Unmount Delay.
β10-03-2012 08:23 AM
Hi, and thanks for the quick response.
Ok so the host properties (I've one server) are set to 180s. I was actually watching the tape drive web gui as it ran and it does look like its unmounting (I see the Unloading, then a few seconds later 'Loading' as the Drive status)
In the bptm log I see lines like:
β10-03-2012 08:40 AM
β10-05-2012 07:07 AM
Here you go. I've masked out the domain name, but otherwise untouched.
The phase 2 restore took nearly 8 hours, starting at 09:30 and finishing at 17:19
This is the whole days logs though.
From what I saw on the TL4000 web gui, it does eject the tapes and then re-load them.
β10-05-2012 08:03 AM
This is actually load balancing the drive usage which is the default behavoir for NetBackup
You will see that it uses one drive then unmounts and uses the next one
You would need to restrict the operation to using just one tape drive to prevent this behavoir, either by forcing it to use a Storage Unit with one tape drive or downing the other drives.
Hope this helps
β10-05-2012 08:17 AM
Hi Mark,
Thanks for the info. I'm running more tests next week so I'll try that trick.
I'd call that ineffecient usage there though. I mean its all one job thats being run, not seperate jobs so it should know its going to re-use the same tape. Having to down a perfectly good drive might impact other backups, for no reason really. For me at this time though its workable. I'll try the same tape next week and see how it improves the time.
β10-08-2012 02:10 AM
Hi Mark,
I'm not so sure this is whats happening now. I've tried reloading tape ubi009 again, and disabled one drive. In the job log I still say its waiting for media.. then mounting media etc, and in bptm theres still lines like:
β10-08-2012 07:22 PM
Marsh, Can you check the "nbrbutil -dump | grep media_ID" and see if this media is locked and not yet released to start the new process or if there is incosistency in the allocation which you can check using command "nbrbutil -reportinconsistentallocations".