01-25-2022 08:01 AM
We have an issue where, all of a sudden, our NetBackup 8.3.0.2 media/master server will not write to any tapes.
It fires off the job and says it has selected a tape to mount. I see the tape move in the library into the drive (I can see that in the web GUI as well), but that is all that happens. It never writes anything to the tape. The tape sits idle in the tape drive and NetBackup just sits there waiting for the tape to be ready.
When using vmoprcmd it shows the tape drives, but does not show that there is a tape in them.
I have upgraded and downgraded the firmware on the tape drives and library, and it makes no difference. Still the same issue. I have checked the drivers and they are all up to date, and none of those have changed anyway.
I have removed Microsoft security patches to see if that would make a difference, but it did not.
It seems that either NetBackup is not reading what is returned from the library/tape drives, or the library/tape drives are not sending the load/mount completion back to NetBackup.
Any help would be greatly appreciated.
Thanks
01-31-2022 07:10 AM
I can manually put them in the drive using the web interface of the library.
The tapes stay in there and don't move until I eject them.
01-31-2022 07:58 AM
Hi @jpope
Ok, that was worth a try.
My best advice from here is to inspect the debug logs. vxlogview is a good command since it will report everything it can find.
https://www.veritas.com/support/en_US/article.100017099
E.g : #vxlogview -p 51216 -X "jobid=12345"
Best Regards
Nicolai
01-31-2022 01:16 PM
Hi @jpope
Firstly, have you inventoried the robot recently?
And one suggestion (if the tapes are new and unused): change the media ID generation rules for the robot to use the first 6 characters of the barcode and not the default last 6 (I never understood why NetBackup did it this way). Delete the tapes from the GUI, then update the generation rule for the robot (under the advanced options while doing an inventory). Refer to one of the Server Admin Guides for details. You could also add something like "MEDIA_ID_BARCODE_CHARS = 0 8 1:2:3:4:5:6" to /usr/openv/volmgr/vm.conf to achieve the same result (the first digit is the robot number, the second is the number of characters in the barcode, and the last field lists the characters to use from the barcode).
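To make the rule format above concrete, here is a minimal Python sketch of how such a rule would pick characters out of a barcode. It is only an illustration of the three fields described in the post: the function name media_id_from_barcode and the sample barcode are made up, and it handles plain character positions only, not any other syntax the real vm.conf entry may accept.

```python
def media_id_from_barcode(rule: str, barcode: str) -> str:
    """Apply a 'robot length pos:pos:...' style rule to one barcode.

    Hypothetical helper for illustration; not part of NetBackup.
    """
    robot_number, barcode_length, positions = rule.split()
    if len(barcode) != int(barcode_length):
        raise ValueError(
            f"rule expects {barcode_length} characters, got {len(barcode)}"
        )
    # Positions in the rule are 1-based, Python indexes are 0-based.
    return "".join(barcode[int(p) - 1] for p in positions.split(":"))


FIRST_SIX = "0 8 1:2:3:4:5:6"   # rule quoted in the post
LAST_SIX = "0 8 3:4:5:6:7:8"    # equivalent of the default "last 6" behaviour

print(media_id_from_barcode(FIRST_SIX, "ABC123L7"))  # ABC123
print(media_id_from_barcode(LAST_SIX, "ABC123L7"))   # C123L7
```

With the sample barcode, the "first 6" rule keeps the volume serial and drops the media-type suffix, while the "last 6" rule drops the first two characters instead.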
Cheers
David
01-31-2022 01:44 PM
It would fail instantly Nicolai, the tape would immediately be ejected from the drive, as the drive knows which generation of tape it is from the onboard RFID chip.
01-31-2022 01:51 PM
A tape is not 'mounted' just because it is in the drive (a very common misconception).
The following is the process ....
CDB 0xa5 MOVE MEDIUM command to the library
CDB 0x1d SEND DIAGNOSTIC to the drive; a valid, correct response is required.
CDB 0x00 TEST UNIT READY, over and over every few seconds. Once TUR finally returns 0x00 READY, only then do we know the tape is actually physically and correctly loaded in the drive.
A failure of any of those steps will result in the tape not being mounted and available for use. For example, if step 5 or 6 fails, although the tape is physically in the drive, NetBackup was not able to get confirmation of this, and so this would cause a 'robot load error', despite the fact the robot physically moved the tape into the drive.
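The three commands named in the process above can be sketched as a small Python simulation. This is not NetBackup code and does no real SCSI I/O: send_cdb is a hypothetical callback standing in for "issue this opcode and report success", and the poll interval and timeout are arbitrary assumptions. It only shows why a tape can sit in the drive and still never count as mounted.

```python
import time
from typing import Callable

# SCSI opcodes quoted in the post.
MOVE_MEDIUM = 0xA5
SEND_DIAGNOSTIC = 0x1D
TEST_UNIT_READY = 0x00


def wait_for_mount(
    send_cdb: Callable[[int], bool],
    poll_seconds: float = 5.0,      # assumed value, not NetBackup's
    timeout_seconds: float = 300.0,  # assumed value, not NetBackup's
) -> str:
    """Walk the move / diagnostic / TUR sequence and report the outcome."""
    if not send_cdb(MOVE_MEDIUM):
        return "robot move failed"
    if not send_cdb(SEND_DIAGNOSTIC):
        return "drive diagnostic failed"
    # Poll TEST UNIT READY until the drive reports READY or we give up.
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        if send_cdb(TEST_UNIT_READY):
            return "mounted"
        time.sleep(poll_seconds)
    return "robot load error"


def stuck_drive(opcode: int) -> bool:
    # The tape is physically loaded, but the drive never reports READY.
    return opcode != TEST_UNIT_READY


print(wait_for_mount(stuck_drive, poll_seconds=0.01, timeout_seconds=0.05))
```

The simulated drive accepts the move and the diagnostic but never answers TUR with READY, so the sketch ends in a load error even though the "tape" is in the "drive".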
We know 5 happens, as you have confirmed the tape is in the drive.
So the failure is between 6 and 8 (arguably, I guess, even 8 could be happening and not telling the rest of NBU, although I have never seen that happen in 14 years ....)
If 6 or 7 fails to return, I think you do get something in the robots log ('timed out waiting for drive to become ready', or words to that effect), but proving it is very, very difficult, and impossible from NetBackup alone. You need either the library vendor to confirm whether the drive/robot received the CDB and whether it sent a response, or a SCSI analyzer.
7 may be easier - on Linux, I'd use strace on AVRD, you can clearly see it 'reading' the header, and of course if that happens, it must have completed 5 and 6.
On Windows, I guess you'd be looking at something like procmon to trace avrd when a tape is trying to mount.
02-01-2022 12:29 AM
Hi Martin, yes I know. The idea was to check whether the cartridges were somehow rejected by the drive (wrong type, RFID error) and then ended up on the "unmountable" list of tapes. Once we know the tape drives go into the ready state, we know to look at the issue from the OS side. If only mt -f /dev/rtmxxx status was available on Windows ....
02-01-2022 12:39 AM - edited 02-01-2022 01:30 AM
As ITDT failed to recognize the tape, this is not a NetBackup problem.
In my view, you need to open a case with the library vendor.
Before that, check that the encryption setting on the library is correct.
02-01-2022 06:12 AM
So we have figured out the issue now.
When the tapes were inserted, they never checked what generation of tapes they were putting in.
We took one out last night and had a look ..... LTO8 tapes. Drives are LTO7. So correct, not a NetBackup issue at all, and in the great words of Homer Simpson - "D'OH!!!!!"
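For anyone hitting the same thing, the general LTO rule of thumb can be written as a tiny Python check. This is a sketch under assumptions: drive_can_write is a made-up helper, it only models the usual "own generation plus one back" write rule, and it ignores special cases such as Type M media, so the vendor's compatibility matrix remains the authority.

```python
def drive_can_write(drive_gen: int, tape_gen: int) -> bool:
    """Rule of thumb: an LTO drive writes its own generation and one back.

    A drive never handles a cartridge newer than itself.
    Hypothetical helper for illustration only.
    """
    if tape_gen < 1:
        raise ValueError("LTO generations start at 1")
    return tape_gen in (drive_gen, drive_gen - 1)


print(drive_can_write(7, 8))  # False: LTO8 tape in an LTO7 drive
print(drive_can_write(7, 6))  # True: LTO6 tape in an LTO7 drive
```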
So new tapes should be arriving tomorrow and no more issues we hope.
Thanks all for jumping in to help out with this.
02-01-2022 06:14 AM
Just to add to this, the drives never spat the tapes out. They just stayed loaded until they were removed manually. So it is very odd indeed that this happened, as I would also have expected them to eject the tapes if they were the wrong generation.
02-02-2022 12:00 AM
I would report that back to the vendor, as I also thought the tape was ejected in such a case, which 'might' suggest a firmware issue.
There is also an 'unsupported tape format' tape alert, which I would have thought was logged and which NetBackup should have read when the tape was ejected. Maybe this was logged in ...netbackup/db/media/errors on the media server.