NetBackup 8.3 not writing to tapes

jpope
Level 3

We have an issue where, all of a sudden, our NetBackup 8.3.0.2 master/media server will not write to any tapes.

It fires off the job and says it has selected a tape to mount. I see the tape move in the library into the drive (I can see that in the web GUI also), but that is all that happens. It never writes anything to the tape. The tape sits idle in the tape drive and NetBackup just sits there waiting for the tape to be ready.

When using vmoprcmd it shows the tape drives, but does not show that there is a tape in them.

I have upgraded and downgraded the firmware on the tape drives and library, and it makes no difference. Still the same issue. I have checked the drivers and they are all up to date, and none of those have changed anyway.

I have removed Microsoft security patches to see if that makes a difference, but it does not.

It seems that either NetBackup is not reading what is returned from the library/tape drives, or the library/tape drives are not sending the load/mount completion back to NetBackup.

Any help would be greatly appreciated.

Thanks

29 REPLIES

@Nicolai 

I can manually put them in the drive using the web interface of the library.

The tapes stay in there and don't move until I eject them.

Nicolai
Moderator
Partner    VIP   

Hi @jpope 

Ok, that was worth a try.

My best advice from here is inspecting debug logs. vxlogview is a good command since it will report everything it can find.

https://www.veritas.com/support/en_US/article.100017099

E.g : #vxlogview -p 51216 -X "jobid=12345"

Best Regards
Nicolai

Hi @jpope 

Firstly have you inventoried the robot recently? 

And one suggestion (if the tapes are new and unused): change the media ID generation rules for the robot to use the first 6 characters of the barcode and not the default last 6 (I never understood why NetBackup did it this way). Delete the tapes from the GUI, then update the generation rule for the robot (under the advanced options when doing an inventory). Refer to one of the Server Admin Guides for details. You could also add something like "MEDIA_ID_BARCODE_CHARS = 0 8 1:2:3:4:5:6" to /usr/openv/volmgr/vm.conf to achieve the same result (the first digit is the robot number, the second is the number of characters in the barcode, and the last field lists the characters to use from the barcode).
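
To illustrate what that rule does, here is a minimal Python sketch, not NetBackup code. The function name and the barcode "ABC123L7" are made up for the example, and it only handles the plain "pick these positions" form of the rule. How NetBackup treats a barcode whose length does not match the rule is not covered here; this sketch simply raises an error.

```python
def media_id_from_barcode(barcode: str, rule: str) -> str:
    """Apply a simplified 'robot length pos:pos:...' rule to one barcode.

    Hypothetical helper for illustration only; positions are 1-based,
    as in the vm.conf example quoted in the post.
    """
    robot_number, barcode_length, positions = rule.split()
    if len(barcode) != int(barcode_length):
        # Assumption: this sketch refuses barcodes of a different length.
        raise ValueError(f"rule expects {barcode_length} characters, got {len(barcode)}")
    return "".join(barcode[int(p) - 1] for p in positions.split(":"))


RULE = "0 8 1:2:3:4:5:6"   # value part of the MEDIA_ID_BARCODE_CHARS entry
barcode = "ABC123L7"       # hypothetical 8-character barcode label

print(media_id_from_barcode(barcode, RULE))  # first six characters: ABC123
print(barcode[-6:])                          # default "last six" behaviour: C123L7
```

With the default last-six behaviour, the media type suffix on the label ends up inside the media ID, which is what the rule avoids.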

Cheers
David

mph999
Level 6
Employee Accredited

It would fail instantly Nicolai, the tape would immediately be ejected from the drive, as the drive knows which generation of tape it is from the onboard RFID chip.

mph999
Level 6
Employee Accredited

A tape is not 'mounted' just because it is in the drive (a very common misconception).

The following is the process ....

  1. NetBackup requests SCSI reservation for the drive
  2. NetBackup requests the tape to be mounted via the TLDD process on the media server
  3. TLDD passes the request to TLDCD on the robot control host; if the media server is not the robot control host, TLDD sends the request over the network.
  4. TLDCD on the robot control host sends a SCSI CDB 0xa5 MOVE MEDIUM command to the library
  5. The library moves the requested tape from the 'storage slot' to the tape drive
  6. NetBackup sends SCSI CDB 0x1d SEND DIAGNOSTIC to the drive; a valid response is required.
  7. NetBackup sends SCSI CDB 0x00 TEST UNIT READY over and over every few seconds; only once TUR finally returns 0x00 READY do we know the tape is actually physically and correctly loaded in the drive.
  8. AVRD reads the tape media header, which contains, amongst other things, the tape media ID; this must match the media ID that we requested, as a safety check that we have the right tape.

A failure of any of those steps will result in the tape not being mounted and available for use. For example, if step 6 or 7 fails, then although the tape is physically in the drive, NetBackup was not able to get confirmation of this, and so this would cause a 'robot load error', despite the fact the robot physically moved the tape into the drive.
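
The Test Unit Ready polling step can be sketched roughly as below. This is a simulation under stated assumptions, not how NetBackup implements it: `wait_for_ready` and `fake_drive` are made-up names, no real SCSI commands are sent, and the poll interval and attempt count are arbitrary.

```python
from typing import Callable

READY = 0x00            # status the post describes as "0x00 READY"
CHECK_CONDITION = 0x02  # SCSI status returned while the drive is not ready


def wait_for_ready(test_unit_ready: Callable[[], int],
                   attempts: int,
                   sleep: Callable[[float], None],
                   interval: float = 2.0) -> bool:
    """Poll a TUR-like callable until it reports READY or attempts run out."""
    for _ in range(attempts):
        if test_unit_ready() == READY:
            return True
        sleep(interval)
    return False


def fake_drive(not_ready_polls: int) -> Callable[[], int]:
    """Simulated drive: not ready for the first N polls, then READY."""
    statuses = iter([CHECK_CONDITION] * not_ready_polls)
    return lambda: next(statuses, READY)


no_sleep = lambda seconds: None  # skip real delays in this demo

# A healthy load: the drive becomes ready after a few polls.
print(wait_for_ready(fake_drive(3), attempts=10, sleep=no_sleep))

# The situation in this thread: the tape is in the drive but never goes ready.
print(wait_for_ready(lambda: CHECK_CONDITION, attempts=10, sleep=no_sleep))
```

The second call models a tape that is physically loaded but never confirmed: the loop gives up and the mount fails even though the robot did its job.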

We know 5 happens, as you have confirmed the tape is in the drive.

So the failure is between 6 and 8 (arguably, I guess, even 8 could be happening and not telling the rest of NBU, although I have never seen that happen in 14 years ....)

If 6 or 7 fails to return, I think you do get something in the robot's log ('timed out waiting for drive to become ready', or words to that effect), but proving it is very difficult, and impossible from NetBackup: you need either the library vendor to confirm whether the drive/robot received the CDB and whether it sent a response, or a SCSI analyzer.

8 may be easier. On Linux, I'd use strace on AVRD: you can clearly see it 'reading' the header, and of course if that happens, it must have completed 6 and 7.

On Windows, I guess you'd be looking at something like procmon to trace avrd when a tape is trying to mount.

Nicolai
Moderator
Partner    VIP   

Hi Martin, yes I know. The idea was to check whether the cartridges were somehow rejected by the drive (wrong type, RFID error) and then ended up on the "unmountable" list of tapes. Once we know the tape drives go into the ready state, we know to look at the issue from the OS side. If only mt -f /dev/rtmxxx status was available on Windows ....

StefanosM
Level 6
Partner    VIP    Accredited Certified

As ITDT failed to recognize the tape, this is not a NetBackup problem.

For me, you have to open a case with the library vendor.
Before that, check that the encryption setting on the library is correct.

So we have figured out the issue now.

When the tapes were inserted, they never checked what generation of tapes they were putting in.

We took one out last night and had a look ..... LTO8 tapes. Drives are LTO7. So correct, not a NetBackup issue at all, and in the great words of Homer Simpson - "D'OH!!!!!"
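
For anyone wanting to catch this kind of mismatch before loading a batch of tapes, here is a rough Python sketch. It assumes the barcode label ends with the usual LTO media identifier ("L7", "L8", and so on) and that a drive writes its own generation plus one generation back. The function names and the sample barcodes are made up, so check the drive vendor's compatibility matrix before relying on it.

```python
import re
from typing import Optional


def tape_generation(barcode: str) -> Optional[int]:
    """Return the LTO generation from a label suffix such as 'L8', else None."""
    match = re.search(r"L(\d)$", barcode)
    return int(match.group(1)) if match else None


def drive_can_write(drive_generation: int, barcode: str) -> bool:
    """Assumed rule: a drive writes its own generation and one generation back."""
    generation = tape_generation(barcode)
    if generation is None:
        return False  # unknown label, so treat as not usable in this sketch
    return drive_generation - 1 <= generation <= drive_generation


# Hypothetical barcodes checked against an LTO7 drive.
for label in ("ABC123L8", "ABC124L7", "ABC125L6"):
    print(label, drive_can_write(7, label))
```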

So new tapes should be arriving tomorrow and no more issues we hope.

Thanks all for jumping in to help out on this.

Just to add to this, the drives never spat the tapes out. They just stayed loaded until they were removed manually. So it is very odd that this happened, as I would also have expected them to eject the tapes if they were the wrong generation.

mph999
Level 6
Employee Accredited

I would report that back to the vendor, as I also thought the tape was ejected in such a case, which 'might' suggest a firmware issue.
There is also an 'unsupported tape format' tape alert which I would have thought was logged, and which NetBackup should have read when the tape was ejected; maybe this was logged in ...netbackup/db/media/errors on the media server.