Tape Mismatch (Netbackup thinks tapes in drives are not part of library)

Georges_N
Level 4

Hi everyone

 

Been coming here for a while now for solutions, finally found a situation that doesn't seem to match others so here goes:

 

1 x IBM Tape Library

1 x Netbackup Master 7.1 running on Windows 2008 R2

 

What happens is this:

Every day we remove/add tapes - same process, around the same time. Lately, whenever we run an inventory to insert the new tapes, the slots chosen by Netbackup for them are slots already assigned to tapes that are currently being written in drives. It's as if Netbackup ignores the information about those tapes, just because they are in drives right now and not idle in their slots.

 

Tried:

Deleting all drives (and the robot) and reinstalling step-by-step in Netbackup

Updating firmware in IBM library for library itself and drives

Billions of inventories on both sides. (ok maybe just millions)

 

The result is the same:

I eject Tapes A B C and D.

I insert tapes E F G and H

EFGH tapes will be assigned to the following slots: Slot 1 2 3 and 4

But wait... those slots are assigned to tapes I J K and L and those tapes are being used by the drives... what gives?

Result: When said tapes are done being used, and ejected from the drives, netbackup suddenly remembers their existence and tries to put them back in their previously-assigned slot. The result is this:

TLD(0) cannot dismount drive 2, slot 164 already is full

Operator/EMM server has DOWN'ed drive IBM.DRIVE02 (device 2)

Along with a few of these sprinkled around:

TLD(0) expected barcode (XXXXXX) in slot 164, found barcode (YYYYYY)

So, the result is a downed drive, and a tape stuck in it (has to be ejected using the IBM Library interface)

Any ideas, whatsoever?

Thanks for reading!

1 ACCEPTED SOLUTION

Georges_N
Level 4

Ladies and gentlemen, it was the IBM TS3310 (3576) Firmware (which was installed by IBM to support the newer drives)

If someone else is googling this issue and finds this in the near future, upgrade to the TS3310 (3576) 640G.GS007 Library firmware, as well as the LTO5.D8D4 drive firmware.

 

Problem solved, thanks for the NBU help!

26 REPLIES 26

RamNagalla
Moderator
Partner    VIP    Certified

It looks like your master server is a victim of a media labeling problem.

Are the tapes that you insert new tapes with barcode labels, or old reused tapes?

How are you managing barcode rules in your master server environment?

Show us the detailed error messages that you are receiving.

Marianne
Moderator
Partner    VIP    Accredited Certified

Please tell us how exactly you are ejecting and inserting media in the robot?

What is the model name/number of your IBM robot?
Does your robot have a MAP (media access port)?

If you are using the eject facility in NBU, the robot as well as NBU will be updated.

If you are inserting the new tapes via the MAP and use NBU inventory to 'Empty MAP before updating', then NBU sends the message to the robot to move tapes into empty slots.
It is not NBU that decides where the tapes must go - it is the robot that makes this decision.

If you are using above method to eject and insert media and tapes are going into the wrong slots, there is something wrong with your robot. Log a call with your hardware vendor.

I have only ever seen these kinds of errors where operators were opening the robot door to insert tapes manually.

mph999
Level 6
Employee Accredited

I'm going to guess this is an IBM 3584 series library ...

How many times have I seen similar issues with this library? Lots and lots ...

The bad news is that NetBackup is not selecting the slots to put the tapes in, the library is doing that.  This almost certainly has nothing to do with NBU, and nothing can be done in NBU to resolve it.

From your wording :

"Lately, whenever we run an inventory to insert the new tapes, the slots chosen by netbackup to insert them"

I think you are adding the tapes into the MAP/ CAP.

If ...

1.  You are adding the tapes to the MAP

2.  You have powercycled the library, and then run an inventory in NBU

and after this the issue still remains ...

There is not much else you can do, apart from call IBM.

There are some settings on the IBM libraries that can cause havoc with NBU (or any other backup software), but I've personally only seen these config issues cause slots to be non-visible.

This TN:

http://www.symantec.com/docs/TECH169477

explains that library inventory issues are outside of NBU.  I appreciate that the TN does not cover your exact issue, so you will have to trust me.

The way it works is like this, despite what anyone else may say.

When running an inventory, NBU sends SCSI commands to the library to start the inventory/empty the CAP.

What happens after that, is decided only by the library.

What is seen in NBU (eg where the tapes end up) is only sent back from the library, NBU has no control over this at all.

Hope this helps,

Martin

LucSkywalker195
Level 4
Certified

Sounds about right. I think your library isn't doing its audit correctly, or you have barcode labels that don't match the internal tape label. Call IBM.

Georges_N
Level 4

Great input, thanks guys. I'm reading up on some IBM issues (it's a TS3310 Tape Library) but what has me baffled is my way of fixing it temporarily...

 

1) When doing an inventory after removing/adding tapes into Netbackup, I notice the inventory changes proposed are conflicting with each other:

Logically move media ID AAAAAA from slot 111 to standalone residence

Logically move media ID BBBBBB from slot 222 to standalone residence

Logically move media ID CCCCCC from standalone to slot 333

Logically move media ID DDDDDD from standalone to slot 111

Logically move media ID EEEEEE from standalone to slot 222

 

So, the first 2 lines suggest moving tapes from slots to outside the library, and the 3 next lines are new tapes I've added. The problem is that the 2 first tapes, A and B, are not in those slots, but are in fact in drives, currently being written. Netbackup ignores them and decides to assign their reserved slots (empty yes, but assigned to them) to new tapes. When those jobs finish, the tapes are ejected but the slots are assigned to new tapes, so netbackup downs the drive and leaves the tape in the drive. I have to manually select the tape in IBM and remove it, then assign it to a whole new slot.
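The conflict described above can be checked mechanically before accepting the proposed changes. This is only a minimal sketch, not anything NetBackup provides: it assumes you have the inventory preview as plain text lines in the format shown above, and that you have separately collected (for example from the library's own interface) which tapes are in drives and what their home slots are. The function name and the `mounted` dictionary are hypothetical.

```python
import re

# Matches the "standalone to slot N" lines from the inventory preview.
MOVE_IN = re.compile(r"Logically move media ID (\S+) from standalone to slot (\d+)")


def find_slot_conflicts(proposed_lines, home_slot_of_mounted):
    """Return (new_media_id, slot, mounted_media_id) for every proposed move
    into a slot that is the home slot of a tape currently in a drive.

    home_slot_of_mounted is a hypothetical input: a dict mapping the media ID
    of each tape in a drive to its home slot number.
    """
    owner_by_slot = {slot: media for media, slot in home_slot_of_mounted.items()}
    conflicts = []
    for line in proposed_lines:
        match = MOVE_IN.search(line)
        if not match:
            continue  # not a "move into the library" line
        media_id, slot = match.group(1), int(match.group(2))
        if slot in owner_by_slot:
            conflicts.append((media_id, slot, owner_by_slot[slot]))
    return conflicts


proposed = [
    "Logically move media ID AAAAAA from slot 111 to standalone residence",
    "Logically move media ID BBBBBB from slot 222 to standalone residence",
    "Logically move media ID CCCCCC from standalone to slot 333",
    "Logically move media ID DDDDDD from standalone to slot 111",
    "Logically move media ID EEEEEE from standalone to slot 222",
]
mounted = {"AAAAAA": 111, "BBBBBB": 222}  # tapes in drives -> home slots

for new_id, slot, owner in find_slot_conflicts(proposed, mounted):
    print(f"{new_id} -> slot {slot} conflicts with mounted tape {owner}")
```

With the example lines from this post, the two moves into slots 111 and 222 are flagged, which matches what I see when the drives later try to dismount.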

A bit more about my choice of words:

New Tape = tape that was previously ejected, 99.9% of the time sent to a storage facility outside the building for a month. Sorry, I didn't mean an actual new tape with new label - these are tapes Netbackup already knows and has already labelled.

Barcode rules are managed by Netbackup using our policies - doesn't seem to be the issue.

We always select "empty MAP" when running inventory. We manually eject tapes and send them outside the building, and then manually insert tapes. Inventory every time...

 

If it is in fact the IBM robot that chooses which slots to put them in, then again, wow. I'll have to run some tests later today/tomorrow and see how it behaves first-hand.

 

Thanks again for the replies, I will get back with more info soon!

Will_Restore
Level 6

>>The problem is that the 2 first tapes, A and B, are not in those slots, but are in fact in drives, currently being written. Netbackup ignores them and decides to assign their reserved slots (empty yes, but assigned to them) to new tapes.

 

As Martin states above "What is seen in NBU (eg where the tapes end up) is only sent back from the library, NBU has no control over this at all."

You will need to engage IBM support.  The library assigns slots for the media not NetBackup.  The library seems to be forgetting the media in the drives will return to respective slots.

mph999
Level 6
Employee Accredited

"If it is in fact the IBM robot that chooses which slots to put them in, then again, wow. I'll have to run some tests later today/tomorrow and see how it behaves first-hand."

It is, I promise you ...

Despite 'very' popular opinion, including that of certain hardware vendors, NetBackup actually has very, very little to do with tape drives/robots.

For a start, it does not write to or read from tape; the actual write/read to a tape is carried out by the OS.

Inventories, loading and unloading drives are simple SCSI commands sent to the library. What happens after the SCSI command is sent is 100% out of NBU's control.

Sure, if the NBU config is wrong this will cause issues, but the vast majority of tape and library issues are outside of NBU, and anything related to TAPE_ALERT or ASC/ASCQ errors is 100% outside of NBU (apart from perhaps the cleaning tape alert, if NBU is meant to be cleaning but has an incorrect config for the cleaning tapes).

M

Mark_Solutions
Level 6
Partner Accredited Certified

Just wanted to double double check how you load tapes (though it does sound like the infamous IBM library issue)

Do you only put tapes into the mail slot (load port or what ever you want to call it) or do the operators open the doors / magazines and insert tapes into empty slots?

I have had a lot of customers who loaded tapes into empty magazine slots, not realizing that if that empty slot belongs to a tape in a drive, the tape cannot then be unloaded, because its home slot is now occupied.

Just wanted to be sure of your process when loading tapes

 

Georges_N
Level 4

Actually we load tapes through the i/o door - the arm then loads them into a free slot (well, free as in empty but it looks like it's choosing the wrong slots)

 

Just an update - we're using Netbackup Vault now to do auto-ejects, which are fine. It's the loading part (and inventory) which looks like it's trying to load tapes into taken slots (the taken slots belong to tapes currently being written, every time).

 

 

mph999
Level 6
Employee Accredited

For the new tapes going back into the library via the I/O slots, I'm as sure as I can be that we don't control where they go.

How do I know this? There is an option (AUTO_UPDATE_ROBOT, I think) that causes the library to send a signal to NBU to run an inventory. E.g. you put tapes in, the library sticks them in slots, NBU automatically runs an inventory.

If NBU knew where to put the tapes, you wouldn't need this option, as no inventory would be required. The very existence of this option shows that an inventory is required after inserting tapes via the MAP, because at that point NBU has no idea about them.

I would agree that when we unload a tape from a drive, yes, then we would specify which slot, but not when moving tapes out of the MAP.

M

Georges_N
Level 4

Thanks for all the tips everyone - the library (IBM) works just fine on another system (CommVault) - this is a netbackup-only issue. It seems to happen exactly as is:

 

- XX amount of tapes in slots

- YY amount of tapes in drives

 

XX tapes are ok, YY tapes don't even exist according to netbackup, even though it sees them inside tape drives at this very moment.

Insert new tapes, inventory, and Netbackup will ignore the YY tapes and assign the new tapes to the slots previously used by YY. Chaos ensues once the YY tapes are done writing and are unloaded from the drives.

mph999
Level 6
Employee Accredited

Hmm, I will revisit this and do a bit of research, but I am very sure that we have no control over this.

However, the CommVault example suggests otherwise, I have to agree.

M

mph999
Level 6
Employee Accredited

Can you enable the robots log?

Create /usr/openv/volmgr/debug/robots

Add VERBOSE to /usr/openv/volmgr/vm.conf

Restart media manager service

stopltid

then ltid -v

Recreate the issue and post up the log
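
If it helps to script the first two steps, here is a minimal sketch in Python. It only creates the debug/robots directory and makes sure vm.conf contains a VERBOSE line; it does not restart ltid, which still has to be done by hand as above. The function name `prepare_robot_debug` is made up for illustration, and the volmgr directory is passed in as a parameter so the same code can be pointed at whatever install path applies. The demo runs against a throwaway temporary directory, not a real install.

```python
import tempfile
from pathlib import Path


def prepare_robot_debug(volmgr_dir):
    """Create debug/robots under volmgr_dir and ensure vm.conf has a VERBOSE line.

    Hypothetical helper: volmgr_dir would be e.g. /usr/openv/volmgr.
    Returns the path of the log directory.
    """
    volmgr = Path(volmgr_dir)
    log_dir = volmgr / "debug" / "robots"
    log_dir.mkdir(parents=True, exist_ok=True)

    vm_conf = volmgr / "vm.conf"
    text = vm_conf.read_text() if vm_conf.exists() else ""
    # Only append VERBOSE if no line already consists of exactly that word.
    if "VERBOSE" not in [line.strip() for line in text.splitlines()]:
        if text and not text.endswith("\n"):
            text += "\n"
        vm_conf.write_text(text + "VERBOSE\n")
    return log_dir


# Demo against a scratch directory so nothing real is touched.
with tempfile.TemporaryDirectory() as scratch:
    created = prepare_robot_debug(scratch)
    prepare_robot_debug(scratch)  # second call must not add a duplicate line
    print(created.is_dir(), (Path(scratch) / "vm.conf").read_text().count("VERBOSE"))
```

Passing the directory in keeps the sketch independent of platform, since the Windows install uses a different path.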

Many thanks,

Martin

mph999
Level 6
Employee Accredited

I've been doing some testing and need to look through the logs from the library.

In the meantime, I spoke with a colleague who is very, very knowledgeable about libraries.  He confirmed that on an 'empty MAP' inventory we leave it up to the library where it puts the tapes.

CommVault may do things differently - I have to say I have absolutely no idea.

The IM chat between me and my colleague went like this

23:59 'Me' - When you run an inventory on a library with tapes in the MAP and select the 'Empty MAP' checkbox, does NBU tell the library which slots to put the tapes in, or do we leave that up to the library to decide?

00:00 'Colleague' - library decides..

(I then explained the issue)

00:02 'Colleague' - that's a bug in library f/w

That's the best I can do for the moment, I'll look in more detail later.
 
Regards,
 
Martin

mph999
Level 6
Employee Accredited

George,

I sent you an email. Can you log a call and post the case number up here?

I will post details up here, but at the moment things are 'unclear' and I don't want to post 'half of the story', as it will confuse people reading this post.

Martin

Georges_N
Level 4

Thanks Martin - will do.

 

From the event viewer (examples):

 

TLD(0) expected barcode (E00063L5) in slot 73, found barcode (C00032L5        )

TLD(0) expected barcode (D00031L5) in slot 149, found barcode (A00116L5        )

TLD(0) expected barcode (C00068L5) in slot 169, found barcode (A00017L5        )
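
In case it is useful for anyone collecting these from the event log, here is a rough Python sketch that pulls the robot number, slot, and both barcodes out of lines in exactly the format above (including the trailing padding inside the second set of parentheses). The regular expression is based only on these three examples, so treat the pattern as an assumption rather than a description of every message TLD can emit.

```python
import re

# Pattern derived from the example messages above; trailing spaces inside
# the parentheses are tolerated and stripped.
MISMATCH = re.compile(
    r"TLD\((\d+)\) expected barcode \(([^\s)]+)\s*\) in slot (\d+), "
    r"found barcode \(([^\s)]+)\s*\)"
)


def parse_mismatches(lines):
    """Return one dict per barcode-mismatch line; other lines are skipped."""
    results = []
    for line in lines:
        match = MISMATCH.search(line)
        if match:
            results.append({
                "robot": int(match.group(1)),
                "expected": match.group(2),
                "slot": int(match.group(3)),
                "found": match.group(4),
            })
    return results


events = [
    "TLD(0) expected barcode (E00063L5) in slot 73, found barcode (C00032L5        )",
    "TLD(0) expected barcode (D00031L5) in slot 149, found barcode (A00116L5        )",
    "TLD(0) expected barcode (C00068L5) in slot 169, found barcode (A00017L5        )",
]

for event in parse_mismatches(events):
    print(f"slot {event['slot']}: expected {event['expected']}, found {event['found']}")
```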

Marianne
Moderator
Partner    VIP    Accredited Certified

Is someone opening the robot door and moving tapes manually?

Georges_N
Level 4

No, just the I/O when tapes are ejected (to send out) and new tapes are inserted.

mph999
Level 6
Employee Accredited

Hi George,

What I will need is the robots log at full verbose level. To do this, on the robot control host:

Create the directory \veritas\volmgr\debug\robots

Add VERBOSE to \veritas\volmgr\vm.conf

Create an empty file called \veritas\volmgr\ROBOT_DEBUG (make sure Windows doesn't add a suffix ...)

Stop/start NBU

Recreate the issue. It would be most helpful if you could make a note of the barcode of the tape in the slot.

What I have found so far is that ultimately the library is responsible for not putting a tape in a slot that belongs to a tape that is currently in a drive. However, it doesn't work quite the way I first thought. NBU will actually check whether a tape from a slot is in a drive, but I think it only does this if the library has informed NBU that there is a tape in a drive from that slot.

Of course, I have to consider that NBU has a 'bug', but if that were the case, I would expect to find multiple cases with this issue, and in a nutshell I can't. Considering how many people use libraries with NBU, you would expect to see this. At the end of the day it can only be confirmed with further investigation.

Please post the case number up here and, as per my email, ask the TSE who takes the case to escalate it to me, or at least inform them of my involvement.

Many thanks,

Martin