Problems with Oracle T10K tape drives not unloading

walker_paul
Level 3

We are running NBU 7.0.1 with Oracle T10000B tape drives in an SL8500 silo. Our environment consists of one master server and two media servers with the tape drives SAN attached to all three servers. We have 7 drives actually mapped to one media server, 6 to the other, and 2 to the master and are not using the shared storage option. Here is our problem:

On one media server, at seemingly random times, 5 of the 7 tape drives will get an unload request (we see it in the bptm report) but will never actually unload the tape. This leaves these drives in an ACTIVE control state in our device monitor, but with no recorded or external media label showing. When we look on our ACSLS server, the drives continue to show as loaded, and it will not let us dismount the tapes because it believes there is still an active job on them. NBU will eventually fail the backup job with a code 52 (timed out waiting to load tape), and then another backup job jumps on the drive and the cycle continues. This only happens on the one media server and only on the same 5 tape drives; the other two drives on the server never have an issue.

In order to get the drives working again, we have had to restart the NetBackup services on the media server, though at times we have had to restart the whole environment, and on a couple of occasions we actually had to reboot the server. Has anyone else experienced this? And if so, what could be the possible cause? We can't find any rhyme or reason to it. The zoning on the tape drives looks fine and we have seen no errors on the switch, although the media server error log is showing tons of adapter errors. If it were the fibre or the HBA, though, we would expect it to be happening on all 7 drives.

4 REPLIES 4

Marianne
Level 6
Partner    VIP    Accredited Certified

No easy solution here - you will have to enable logs and check NBU, OS system and ACSLS logs.

The most common problem is device config mismatch.

You need to have a VERBOSE entry in vm.conf on all media servers (restart NBU after adding the entry). Mount/dismount requests, as well as any other Media Manager activity and errors, will then be logged to syslog on Unix media servers and to the Event Viewer Application log on Windows media servers.
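A quick way to confirm the entry is in place on each media server is a small check like the sketch below. It assumes the usual Unix location /usr/openv/volmgr/vm.conf (adjust for your install or for Windows) and that the entry is a line containing only the word VERBOSE; the function name has_verbose is just an illustrative helper.

```python
from pathlib import Path

# Assumed default location on Unix media servers; adjust for your install.
VM_CONF = Path("/usr/openv/volmgr/vm.conf")


def has_verbose(conf_text: str) -> bool:
    """Return True if any line consists solely of the VERBOSE keyword."""
    return any(line.strip() == "VERBOSE" for line in conf_text.splitlines())


if __name__ == "__main__":
    if VM_CONF.exists():
        state = "present" if has_verbose(VM_CONF.read_text()) else "missing"
        print(f"{VM_CONF}: VERBOSE entry {state}")
    else:
        print(f"{VM_CONF} not found on this host")
```

Remember that the entry only takes effect after NBU has been restarted on that server.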

You also need bptm logs on all media servers.

ACSLS log that will record unload problems: /export/home/ACSSS/log/acsss_event.log
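To narrow that log down to the problem drives, something like the following sketch can help. It only does a plain keyword filter: the keywords "dismount" and "unload" and the drive id format are assumptions, since the exact message wording depends on your ACSLS version, so tune them to what your log actually contains.

```python
from pathlib import Path
from typing import Iterable, Iterator

# Path quoted in the post above.
EVENT_LOG = Path("/export/home/ACSSS/log/acsss_event.log")

# Assumed keywords; adjust to the wording your ACSLS version really logs.
KEYWORDS = ("dismount", "unload")


def matching_lines(lines: Iterable[str], drive_id: str = "") -> Iterator[str]:
    """Yield lines that mention a keyword and, if given, the drive id."""
    for line in lines:
        lowered = line.lower()
        if any(k in lowered for k in KEYWORDS) and drive_id in line:
            yield line.rstrip("\n")


if __name__ == "__main__":
    if EVENT_LOG.exists():
        with EVENT_LOG.open(errors="replace") as fh:
            # "0,0,1,2" is a placeholder drive id (acs,lsm,panel,drive).
            for hit in matching_lines(fh, drive_id="0,0,1,2"):
                print(hit)
```

Comparing the timestamps of the hits with the unload requests in the bptm log should show whether ACSLS ever received the dismount.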

walker_paul
Level 3

Thanks Marianne, I'll get with my SA and get all this set up.

Nicolai
Moderator
Partner    VIP   

Try dismounting the drive via ACSLS instead of rebooting the media server:

dismount 123456 0,0,1,2 force
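In that command, 123456 stands for the volume label and 0,0,1,2 for the drive id (ACS, LSM, panel, drive); both are placeholders for your own values. If you need to do this for several drives, a small helper like this sketch can build the lines to paste into cmd_proc. The function name dismount_command is only illustrative and is not part of ACSLS.

```python
def dismount_command(vol_id: str, acs: int, lsm: int, panel: int, drive: int,
                     force: bool = True) -> str:
    """Format an ACSLS cmd_proc dismount line like the one quoted above."""
    drive_id = f"{acs},{lsm},{panel},{drive}"
    parts = ["dismount", vol_id, drive_id]
    if force:
        parts.append("force")
    return " ".join(parts)


# Reproduces the example from the post.
print(dismount_command("123456", 0, 0, 1, 2))
```

The helper only formats text; it does not talk to ACSLS, so check each line before running it.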

Word about zoning:

Put the HBA and one drive in each zone. Zoning the HBA together with many drives in a single zone tends to cause issues when one tape drive misbehaves.

What OS are you using?
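As a rough illustration of that layout, the sketch below plans one zone per HBA and drive pair. All names and WWPNs are made-up placeholders, and the output is just a plan to compare against your switch configuration, not switch commands.

```python
from typing import Dict, List


def one_to_one_zones(hba_name: str, hba_wwpn: str,
                     drives: Dict[str, str]) -> Dict[str, List[str]]:
    """Return {zone_name: [hba_wwpn, drive_wwpn]} with one drive per zone."""
    return {
        f"{hba_name}_{drive_name}": [hba_wwpn, drive_wwpn]
        for drive_name, drive_wwpn in drives.items()
    }


# Hypothetical HBA and drive WWPNs, for illustration only.
plan = one_to_one_zones(
    "media1_hba0",
    "10:00:00:00:00:00:00:01",
    {"drv1": "50:00:00:00:00:00:00:11", "drv2": "50:00:00:00:00:00:00:12"},
)
for zone, members in plan.items():
    print(zone, members)
```

With 7 drives on one HBA this gives 7 small zones instead of one large one, so a misbehaving drive is less likely to disturb its neighbours.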

Fred2010
Level 6

Hey Paul,

We are using T10000A drives, but have had our share of problems with them in the past...

Be sure to have Oracle update the firmware of your drives, BUT...

also be sure to check in their support portal whether that particular firmware has known issues with NetBackup...

For instance:

I am currently running a 'special build' firmware that has specific fixes for NetBackup (not the one you mention, by the way): I am running RK.46.109 (whilst 1.46.109 was the latest firmware)

There have been quite a lot of changes in firmware of the T10000x drives, so keep 'em up to date :)

Hope this helps!

Fred