cancel
Showing results forΒ 
Search instead forΒ 
Did you mean:Β 

Constant re-mounting of same tape

adrianmarsh
Level 4

Hi,

My setup:  Netbackup 7.1, a Dell LTO4 2 drive robot.

 

I'm practising restores of data, and I've noticed that NB seems to be unmounting, and then re-mounting the same tape over and over, rather than realising that the next bit of data needed is on the same tape, same robot, same drive. This causes the mount counters on that tape to increase, and puts unecassary ware and tear on the robot library. It also seems to add time to the job, because NB is waiting for the resource to free up.

Heres what I'm doing, I'm running Phase 1 and Phase 2 imports from offsite expired tapes.  Some of the jobs on these tapes are small SQL backups.

In the job logs I see the below log, and this is just a small sample from one tapes Phase 2.
You can see it loading/unloading the same tape again and again.

I seem to remember from my Admin training, that there was some timer I could set, where the tape wouldn't get unloaded for xx seconds, but not sure if that was part of NB, or perhaps part of the tape drive system we were using in training.  I know our own robot won't move the tape out of the drive unless NB instructs it to, so I'm guessing its potentially a NB setting I could tweak ?

 

 

03/10/2012 11:09:40 - started process bptm (8188)
03/10/2012 11:09:40 - mounting UBI009
03/10/2012 11:09:40 - Info bptm(pid=8188) INF - Waiting for mount of media id UBI009 on server ubiq-serv22.ubiquisys.local for reading.
03/10/2012 11:09:40 - granted resource UBI009
03/10/2012 11:09:40 - granted resource IBM.ULT3580-TD4.001
03/10/2012 11:10:18 - mounted; mount time: 00:00:38
03/10/2012 11:10:18 - Info bptm(pid=8188) UBI009            
03/10/2012 11:10:18 - Info bptm(pid=8188) INF - Waiting for positioning of media id UBI009 on server ubiq-serv22.ubiquisys.local for reading.
03/10/2012 11:10:18 - positioning UBI009 to file 37
03/10/2012 11:11:17 - positioned UBI009; position time: 00:00:59
03/10/2012 11:11:17 - begin reading
03/10/2012 11:11:28 - Info bptm(pid=8188) waited for empty buffer 1 times, delayed 1 times    
03/10/2012 11:11:28 - end reading; read time: 00:00:11
03/10/2012 11:11:28 - Info tar32(pid=9532) done. status: 0          
03/10/2012 11:11:29 - Info bptm(pid=8188) completed reading backup image         
03/10/2012 11:11:29 - Info bptm(pid=8188) EXITING with status 0 <----------        
03/10/2012 11:11:29 - Info bpbrm(pid=11076) validating image for client UBIQ-SERV1        
03/10/2012 11:11:29 - Info bpbrm(pid=5828) ubiq-serv22.ubiquisys.local is the host to restore to      
03/10/2012 11:11:29 - Info bpbrm(pid=5828) reading file list from client        
03/10/2012 11:11:29 - Info bpbrm(pid=5828) starting tar32 on client         
03/10/2012 11:11:30 - Info tar32(pid=10648) Restore started           
03/10/2012 11:11:30 - Info bptm(pid=10936) start            
03/10/2012 11:11:30 - started process bptm (10936)
03/10/2012 11:11:30 - Info bpdm(pid=10936) reading backup image          
03/10/2012 11:11:30 - Info bptm(pid=10936) using 30 data buffers         
03/10/2012 11:11:30 - Info bptm(pid=10936) spawning a child process         
03/10/2012 11:11:30 - Info bptm(pid=10936) child pid: 10604          
03/10/2012 11:11:30 - Info bptm(pid=10604) start            
03/10/2012 11:11:30 - started process bptm (10604)
03/10/2012 11:11:30 - requesting resource UBI009
03/10/2012 11:11:30 - awaiting resource UBI009 Reason: Media is in use, Media Server: N/A, 
Robot Number: NONE, Robot Type: NONE, Media ID: UBI009, Drive Name: N/A, 
Volume Pool: N/A, Storage Unit: N/A, Drive Scan Host: N/A
 
03/10/2012 11:11:34 - Info tar32(pid=9532) done. status: 0: the requested operation was successfully completed    
03/10/2012 11:12:57 - Info bptm(pid=10936) Waiting for mount of media id UBI009 (copy 1) on server ubiq-serv22.ubiquisys.local. 
03/10/2012 11:12:57 - started process bptm (10936)
03/10/2012 11:12:57 - mounting UBI009
03/10/2012 11:12:57 - Info bptm(pid=10936) INF - Waiting for mount of media id UBI009 on server ubiq-serv22.ubiquisys.local for reading.
03/10/2012 11:12:57 - granted resource UBI009
03/10/2012 11:12:57 - granted resource IBM.ULT3580-TD4.000
03/10/2012 11:13:37 - mounted; mount time: 00:00:40
03/10/2012 11:13:37 - Info bptm(pid=10936) UBI009            
03/10/2012 11:13:37 - Info bptm(pid=10936) INF - Waiting for positioning of media id UBI009 on server ubiq-serv22.ubiquisys.local for reading.
03/10/2012 11:13:37 - positioning UBI009 to file 38
03/10/2012 11:14:37 - positioned UBI009; position time: 00:01:00
03/10/2012 11:14:37 - begin reading
03/10/2012 11:19:08 - Info bptm(pid=10936) waited for empty buffer 219 times, delayed 1271 times    
03/10/2012 11:19:08 - end reading; read time: 00:04:31
03/10/2012 11:19:08 - positioning UBI009 to file 39
03/10/2012 11:19:08 - positioned UBI009; position time: 00:00:00
03/10/2012 11:19:08 - begin reading
03/10/2012 11:22:59 - Info bptm(pid=10936) waited for empty buffer 20 times, delayed 1345 times    
03/10/2012 11:22:59 - end reading; read time: 00:03:51
03/10/2012 11:22:59 - positioning UBI009 to file 40
03/10/2012 11:22:59 - positioned UBI009; position time: 00:00:00
03/10/2012 11:22:59 - begin reading
03/10/2012 11:24:34 - Info bptm(pid=10936) waited for empty buffer 1049 times, delayed 3165 times    
03/10/2012 11:24:34 - end reading; read time: 00:01:35
03/10/2012 11:24:34 - positioning UBI009 to file 41
03/10/2012 11:24:34 - positioned UBI009; position time: 00:00:00
03/10/2012 11:24:34 - begin reading
03/10/2012 11:25:35 - Info bptm(pid=10936) waited for empty buffer 37 times, delayed 2438 times    
03/10/2012 11:25:35 - end reading; read time: 00:01:01
03/10/2012 11:25:35 - positioning UBI009 to file 42
03/10/2012 11:25:35 - positioned UBI009; position time: 00:00:00
03/10/2012 11:25:35 - begin reading
03/10/2012 11:25:50 - Info bptm(pid=10936) waited for empty buffer 101 times, delayed 484 times    
03/10/2012 11:25:50 - end reading; read time: 00:00:15
03/10/2012 11:25:50 - Info tar32(pid=10648) done. status: 0          
03/10/2012 11:25:50 - Info bptm(pid=10936) completed reading backup image         
03/10/2012 11:25:50 - Info bptm(pid=10936) EXITING with status 0 <----------        
03/10/2012 11:25:50 - Info bpbrm(pid=5828) validating image for client UBIQ-SERV1        
03/10/2012 11:25:50 - Info bpbrm(pid=10880) ubiq-serv22.ubiquisys.local is the host to restore to      
03/10/2012 11:25:50 - Info bpbrm(pid=10880) reading file list from client        
03/10/2012 11:25:50 - Info bpbrm(pid=10880) starting tar32 on client         
03/10/2012 11:25:52 - Info tar32(pid=9608) Restore started           
03/10/2012 11:25:52 - Info bptm(pid=11032) start            
03/10/2012 11:25:52 - started process bptm (11032)
03/10/2012 11:25:52 - Info bpdm(pid=11032) reading backup image          
03/10/2012 11:25:52 - Info bptm(pid=11032) using 30 data buffers         
03/10/2012 11:25:52 - Info bptm(pid=11032) spawning a child process         
03/10/2012 11:25:52 - Info bptm(pid=11032) child pid: 4336          
03/10/2012 11:25:52 - Info bptm(pid=4336) start            
03/10/2012 11:25:52 - requesting resource UBI009
03/10/2012 11:25:53 - started process bptm (4336)
03/10/2012 11:25:53 - awaiting resource UBI009 Reason: Media is in use, Media Server: N/A, 
Robot Number: NONE, Robot Type: NONE, Media ID: UBI009, Drive Name: N/A, 
Volume Pool: N/A, Storage Unit: N/A, Drive Scan Host: N/A
 
03/10/2012 11:25:55 - Info tar32(pid=10648) done. status: 0: the requested operation was successfully completed    
03/10/2012 11:26:50 - Info bptm(pid=11032) Waiting for mount of media id UBI009 (copy 1) on server ubiq-serv22.ubiquisys.local. 
03/10/2012 11:26:50 - started process bptm (11032)
03/10/2012 11:26:50 - mounting UBI009
03/10/2012 11:26:50 - Info bptm(pid=11032) INF - Waiting for mount of media id UBI009 on server ubiq-serv22.ubiquisys.local for reading.
03/10/2012 11:26:50 - granted resource UBI009
03/10/2012 11:26:50 - granted resource IBM.ULT3580-TD4.001
03/10/2012 11:27:28 - mounted; mount time: 00:00:38
03/10/2012 11:27:28 - Info bptm(pid=11032) UBI009            
03/10/2012 11:27:28 - Info bptm(pid=11032) INF - Waiting for positioning of media id UBI009 on server ubiq-serv22.ubiquisys.local for reading.
03/10/2012 11:27:28 - positioning UBI009 to file 43
03/10/2012 11:28:17 - positioned UBI009; position time: 00:00:49
03/10/2012 11:28:17 - begin reading
9 REPLIES 9

mph999
Level 6
Employee Accredited

Try host properties > media 

Media unmount delay 

 

Regards,

 

martin

Marianne
Level 6
Partner    VIP    Accredited Certified

Was the tape actually dismounted? Check bptm log for confirmation, If log folder does not exist on the media server, please create it. Although details say 'mounting' each time, chances are that the tape was not actually dismounted.

The default unmount delay is 3 minutes.

See Host Properties -> Media Server -> Media -> Media Unmount Delay.

 

adrianmarsh
Level 4

Hi, and thanks for the quick response.

Ok so the host properties (I've one server) are set to 180s.   I was actually watching the tape drive web gui as it ran and it does look like its unmounting (I see the Unloading, then a few seconds later 'Loading' as the Drive status)

In the bptm log I see lines like:

 

16:13:32.965 [7632.8624] <2> read_backup_mount_media: media id UBI009 mounted on drive index 0
16:13:32.965 [7632.8624] <4> report_throughput: VBRT 1 7632 0 1 IBM.ULT3580-TD4.000 UBI009 0 1 0 0  0 (bptm.c.11658)
 
and
 
16:14:35.136 [10888.6120] <2> really_tpunmount: tpunmount'ing C:\Program Files\Veritas\NetBackup\db\media\tpreq\drive_IBM.ULT3580-TD4.000
16:14:35.937 [10888.6120] <2> tapelib: wait_for_ltid, UnMount, timeout 2400
 
My phase 2 (LTO4 tape that has 149 images on it) that I started at 09:30 is still running (16:19), and it says UBI009 has been mounted 62 times today, for this one p2 re-catalogue. I'm guessing I'll get to about 149 mounts before its done.

 

Marianne
Level 6
Partner    VIP    Accredited Certified
Please copy bptm log to bptm.txt and post here as File attachment. I don't think tape is actually dismounted all the time. We need full bptm to confirm.

adrianmarsh
Level 4

Here you go.  I've masked out the domain name, but otherwise untouched.

The phase 2 restore took nearly 8 hours, starting at 09:30 and finishing at 17:19

This is the whole days logs though.

From what I saw on the TL4000 web gui, it does eject the tapes and then re-load them.

Mark_Solutions
Level 6
Partner Accredited Certified

This is actually load balancing the drive usage which is the default behavoir for NetBackup

You will see that it uses one drive then unmounts and uses the next one

You would need to restrict the operation to using just one tape drive to prevent this behavoir, either by forcing it to use a Storage Unit with one tape drive or downing the other drives.

Hope this helps

adrianmarsh
Level 4

Hi Mark,

Thanks for the info. I'm running more tests next week so I'll try that trick.

I'd call that ineffecient usage there though. I mean its all one job thats being run, not seperate jobs so it should know its going to re-use the same tape. Having to down a perfectly good drive might impact other backups, for no reason really.  For me at this time though its workable.  I'll try the same tape next week and see how it improves the time.

adrianmarsh
Level 4

Hi Mark,

 

I'm not so sure this is whats happening now.  I've tried reloading tape ubi009 again, and disabled one drive. In the job log I still say its waiting for media.. then mounting media etc, and in bptm theres still lines like:

 

10:06:14.686 [2096.9068] <2> job_monitoring_exex: ACK disconnect
10:06:14.686 [2096.9068] <2> job_disconnect: Disconnected
10:06:14.686 [2096.9068] <2> bptm: Calling tpunmount for media UBI009
10:06:14.686 [2096.9068] <2> send_MDS_msg: MEDIA_DONE 0 38787 0 UBI009 4000137 180 {463C0942-6FDC-4961-9D0B-0321CA54C59B}
10:06:14.687 [2096.9068] <2> packageBptmResourceDoneMsg: msg (MEDIA_DONE 0 38787 0 UBI009 4000137 180 {463C0942-6FDC-4961-9D0B-0321CA54C59B})
10:06:14.687 [2096.9068] <2> packageBptmResourceDoneMsg: keyword MEDIA_DONE version 0 jobid 38787 copyNum 0 mediaId UBI009 mediaKey 4000137 unloadDelay 180 allocId {463C0942-6FDC-4961-9D0B-0321CA54C59B}
10:06:14.687 [2096.9068] <2> packageBptmResourceDoneMsg: returns 0
10:06:14.689 [2096.9068] <2> JobInst::sendIrmMsg: returning
 
Which suggests its still unloading the media.

Yogesh_Jadhav1
Level 5

Marsh, Can you check the "nbrbutil -dump | grep media_ID" and see if this media is locked and not yet released to start the new process or if there is incosistency in the allocation which you can check using command "nbrbutil -reportinconsistentallocations".