Downed drives - netbackup 5.1

sbecker
Level 2

My NetBackup server keeps downing drives. The drives pass all mechanical diagnostics and the library reports no errors. I have replaced one of the drives 4 times to no avail. I have run "vmglob -delete" and "vmglob -set_gdbhost {master host} -h {master host}", and then issued "tpautoconf -upgrade". From some of the errors, I get the feeling the NetBackup database is a bit confused about its inventory as well. Any suggestions appreciated.

 

9 REPLIES

Stefaan_M
Level 4
Partner

Hi,

 

have you tried to rerun the device config wizard?

Have you been testing with robtest to make sure that you can move a tape from a slot to the drive?

 

Stefaan.

sbecker
Level 2
Robtest is good on all the drives. I just downloaded the latest 5.1 patches. Hopefully, that will help with this issue.

J_H_Is_gone
Level 6

what errors are you getting?

 

check the serial number of the drives via the robot, then check the serial number via netbackup.

 

If you have misconfigured them, you will have issues.

Rakesh_Khandelw
Level 6

What kind of drives do you have, and how are they connected to the NetBackup server? Is SSO in use, by any chance? It can cause serious drive issues if it is not configured correctly.

You may want to check the firmware version on your drives as well and bring it up to the latest.

 

 

Run tpautoconf -report_disc

 

and if it reports any errors take the corrective action. Also, you may want to run robtest to confirm the robot drive number defined in NetBackup matches the physical drive number.

 

 

 

sbecker
Level 2

All the serial numbers match.

 

The drives are AIT-4s in a Spectra Logic 20k Gator. It's direct-attached through fiber. Until 2 weeks ago, this setup had run with very few problems for 5 years. One drive went bad and it's been downhill since.

 

tpautoconf -report_disc returns nothing. 

 

NetBackup keeps trying to mount tapes that it clearly shows are not in the robot inventory. I assume the volume db is scrambled. Here are some typical errors:

 

Nov 14 21:30:00 netbackup-master tl8d[14611]: sense_key = 0x02, asc = 0x3a, ascq = 0x00; retry = 0, errno = 42
Nov 14 21:35:38 netbackup-master tl8d[14611]: TL8(0) [14611] timed out after waiting 301 seconds for ready, drive 2
Nov 14 21:36:20 netbackup-master tl8d[7587]: Adding media ID A00404 to unmountable media list
Nov 14 21:36:20 netbackup-master tl8d[7587]: TL8(0) drive 2 (device 3) is being DOWNED, status: Unable to open drive
Nov 14 21:36:20 netbackup-master tl8d[7587]: Check integrity of the drive, drive path, and media
Nov 14 21:36:20 netbackup-master tl8d[7587]: Removing media ID A00404 from unmountable media list
Nov 14 21:48:50 netbackup-master ltid[7528]: Operator has UP'ed drive SONYSDX-900V3 (device 3)
Nov 14 21:50:04 netbackup-master tl8d[19826]: sense_key = 0x02, asc = 0x3a, ascq = 0x00; retry = 0, errno = 42
Nov 14 21:55:57 netbackup-master tl8d[19826]: TL8(0) [19826] timed out after waiting 315 seconds for ready, drive 2
Nov 14 21:56:38 netbackup-master tl8d[7587]: Adding media ID A00404 to unmountable media list
Nov 14 21:56:38 netbackup-master tl8d[7587]: TL8(0) drive 2 (device 3) is being DOWNED, status: Unable to open drive
Nov 14 21:56:38 netbackup-master tl8d[7587]: Check integrity of the drive, drive path, and media
Nov 14 21:56:38 netbackup-master tl8d[7587]: Removing media ID A00404 from unmountable media list
Nov 14 22:13:22 netbackup-master tl8d[20440]: sense_key = 0x02, asc = 0x3a, ascq = 0x00; retry = 0, errno = 42
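
The repeating pattern in a log like the one above can be summarized with a short script. This is only a minimal sketch, not a NetBackup tool: `decode_sense` and `summarize` are made-up names, and the regular expressions assume the exact syslog wording shown in this post. The lookup tables hold only the one code pair seen here; in the SCSI standard, sense key 0x02 is NOT READY and ASC/ASCQ 0x3A/0x00 is MEDIUM NOT PRESENT, meaning the drive reports that no tape is loaded.

```python
import re
from collections import Counter

# Only the codes that appear in this thread; extend from the SCSI tables as needed.
SENSE_KEYS = {0x02: "NOT READY"}
ASC_ASCQ = {(0x3A, 0x00): "MEDIUM NOT PRESENT"}

SENSE_RE = re.compile(
    r"sense_key = (0x[0-9a-fA-F]+), asc = (0x[0-9a-fA-F]+), ascq = (0x[0-9a-fA-F]+)"
)
DOWNED_RE = re.compile(r"(TL\w+\(\d+\)) drive (\d+) \(device \d+\) is being DOWNED")
MEDIA_RE = re.compile(r"Adding media ID (\S+) to unmountable media list")


def decode_sense(line):
    """Return (sense key text, ASC/ASCQ text) for a sense line, else None."""
    match = SENSE_RE.search(line)
    if match is None:
        return None
    key, asc, ascq = (int(value, 16) for value in match.groups())
    return (
        SENSE_KEYS.get(key, "unknown sense key"),
        ASC_ASCQ.get((asc, ascq), "unknown ASC/ASCQ"),
    )


def summarize(lines):
    """Count DOWNED events per (robot, drive), unmountable media IDs, and sense codes."""
    downed, media, senses = Counter(), Counter(), Counter()
    for line in lines:
        sense = decode_sense(line)
        if sense is not None:
            senses[sense] += 1
        match = DOWNED_RE.search(line)
        if match:
            downed[(match.group(1), int(match.group(2)))] += 1
        match = MEDIA_RE.search(line)
        if match:
            media[match.group(1)] += 1
    return downed, media, senses


# A few lines copied from the log in this post.
SAMPLE = [
    "Nov 14 21:30:00 netbackup-master tl8d[14611]: sense_key = 0x02, asc = 0x3a, ascq = 0x00; retry = 0, errno = 42",
    "Nov 14 21:36:20 netbackup-master tl8d[7587]: Adding media ID A00404 to unmountable media list",
    "Nov 14 21:36:20 netbackup-master tl8d[7587]: TL8(0) drive 2 (device 3) is being DOWNED, status: Unable to open drive",
    "Nov 14 21:56:38 netbackup-master tl8d[7587]: Adding media ID A00404 to unmountable media list",
    "Nov 14 21:56:38 netbackup-master tl8d[7587]: TL8(0) drive 2 (device 3) is being DOWNED, status: Unable to open drive",
]

downed, media, senses = summarize(SAMPLE)
print(downed, media, senses)
```

When the same media ID and the same drive keep recurring together with a "medium not present" sense code, one possible reading is that the requested tape never actually reaches the drive, which would point at the inventory rather than at the drive hardware.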

 

Andy_Welburn
Level 6
Have you tried re-running an inventory? Maybe tapes your library is expecting to be in particular slots aren't?

 

Dodgy tapes?

 

If the config is messed up, maybe the only recourse will be to delete the drives/library & re-add them.
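
The inventory check suggested above amounts to comparing two slot maps: what NetBackup's volume database expects and what the robot actually reports. A minimal sketch of that comparison follows. The dictionaries are made-up sample data (only the media ID A00404 comes from this thread), and `compare_inventory` is a hypothetical helper, not a NetBackup command; in practice both maps would have to be built from the volume database listing and from the robot's own inventory report.

```python
def compare_inventory(expected, actual):
    """Compare two slot -> media ID maps (empty slots are simply left out).

    expected: what the volume database believes is in the robot
    actual:   what the robot itself reports
    Returns three lists:
      missing    -> (media, expected_slot) not found anywhere in the robot
      moved      -> (media, expected_slot, actual_slot) found in another slot
      unexpected -> (media, actual_slot) in the robot but unknown to the database
    """
    actual_slot_of = {media: slot for slot, media in actual.items()}
    expected_media = set(expected.values())

    missing, moved = [], []
    for slot, media in sorted(expected.items()):
        if actual.get(slot) == media:
            continue  # database and robot agree on this slot
        if media in actual_slot_of:
            moved.append((media, slot, actual_slot_of[media]))
        else:
            missing.append((media, slot))

    unexpected = sorted(
        (media, slot) for slot, media in actual.items() if media not in expected_media
    )
    return missing, moved, unexpected


# Hypothetical sample data for illustration only.
expected = {1: "A00401", 2: "A00402", 3: "A00403", 4: "A00404"}
actual = {1: "A00401", 2: "A00403", 3: "A00402", 5: "A00999"}

missing, moved, unexpected = compare_inventory(expected, actual)
print("missing:", missing)
print("moved:", moved)
print("unexpected:", unexpected)
```

Any entry in the "missing" list is a tape that would be requested from a slot where the robot cannot find it, which would show up as mount timeouts like the ones in the log.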

J_H_Is_gone
Level 6

Sounds like an issue I had with a library a while ago.

 

Remember, when you inject tapes into the library and run an "update", you are having NetBackup match itself to what the library says it has, so mine ended up being a library issue.

 

We tried a reboot of the library itself, which forced it to re-inventory itself, then did an update to NetBackup again. This helped for a little while, but it got bad again.

 

We ended up getting them to come out and diagnose the whole library, and had to replace the "mother board" on it.

sbecker
Level 2
Looks like I fixed the problem. I patched NBU to MP7. I removed all drives and the robot with tpconfig, then added them all back exactly as they were before, and everything has been fine since. I also figured out that the catalog was configured to always use the same tape instead of the pool it had been assigned. I changed that to another tape and NBU is happy.

sudu_mlk
Level 5
Hello,

I am also facing the same issue.

In the event log I found:

Request for media ID EPM118 is being rejected because it cannot be found in designated slot

TLD(1) [4328] cannot open \\.\Changer1 (bus 0, target 0, lun 0): The requested resource is in use.   

TLD(1) [4328] Could not get current path from PnP name \\?\scsi#changer&ven_m4_data&prod_magfile&rev_3.04#4&2d7d12ab&0&000000#{53f56310-b6bf-11d0-94f2-00a0...}, using existing path \\.\Changer1

Removing media ID EPM118 from unmountable media list

TLD(0) [2816] timed out after waiting 672 seconds for ready, drive 1

Check integrity of the drive, drive path, and media

TLD(0) drive 1 (device 1) is being DOWNED, status: Unable to open drive

Can it be solved by patching to MP7?

Please help me

Thanks!!
SUDU