I'm totally new to NetBackup, with zero experience; it just got passed down to me. I have a bunch of backups failing. The robot is scheduled to eject out tapes every week. I've noticed that backups that failed will succeed, or complete as partial backups, after we do a tape exchange. I'm just curious why NetBackup does not spit out all the bad tapes that need to be replaced. There are only a few active tapes in the robot; the rest are Full or Frozen. I understand that there are several reasons NetBackup freezes media. How do I pick which ones to unfreeze, and what do I do with tapes that are marked Full? Some of the tapes were last written way back in 2015; do I eject those? Thanks.
NetBackup does not automatically eject frozen or suspended tapes. It is something you, as the NBU admin, have to do.
To unfreeze a tape, run:
bpmedia -unfreeze -m LABEL_NAME
There are scenarios where NBU will freeze a lot of media in one go, e.g. if you are trying to re-use media without writing a new magnetic label. Inspect how many times a tape has been mounted: if the count is very low you can unfreeze it; if it is high, bin the tape. Do not unfreeze all media at once. Release no more than a handful a day and monitor the number of status code 84 (media write error) failures.
Tape drives in bad shape may also lead to a high volume of frozen tapes.
On each media server, a file called errors is located in /usr/openv/netbackup/db/media. Inspect that file for tape drives, or times, where tapes were frozen in high volume.
This example is a snippet of the errors file. It shows 5 errors. Since the same tape drive (index 4) had errors on at least two tapes (701149 and 701376), it is safe to assume the tape drive is faulty.
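A rough Python sketch of that "handful a day" approach, in case it helps. It assumes you have already collected the mount count for each frozen tape yourself; the MAX_MOUNTS cut-off, the DAILY_LIMIT of five, and the media IDs are made-up illustration values, not NetBackup defaults. It only prints the bpmedia commands so you can review them; it does not run anything.

```python
# Hypothetical values for illustration; tune them for your own site.
MAX_MOUNTS = 50    # assumed cut-off for a "very low" mount count
DAILY_LIMIT = 5    # "no more than a handful a day"


def pick_unfreeze_candidates(mounts_by_media, max_mounts=MAX_MOUNTS, limit=DAILY_LIMIT):
    """Return up to `limit` media IDs, fewest mounts first, all at or below max_mounts."""
    low = [(count, media) for media, count in mounts_by_media.items()
           if count <= max_mounts]
    low.sort()  # fewest mounts first
    return [media for _count, media in low[:limit]]


# Made-up frozen media and their mount counts.
frozen = {"A00001": 3, "A00002": 950, "A00003": 12, "A00004": 7}

for media_id in pick_unfreeze_candidates(frozen):
    # Print the command for review rather than executing it.
    print(f"bpmedia -unfreeze -m {media_id}")
```

Tapes above the cut-off (A00002 in this made-up data) are left frozen; those are the ones to consider binning once their images expire.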
03/30/13 04:56:24 701149 4 POSITION_ERROR 0018-B-2F
03/30/13 05:01:09 701149 4 POSITION_ERROR 0018-B-2F
03/30/13 06:49:19 701149 4 POSITION_ERROR 0018-B-2F
03/30/13 06:53:56 701149 4 POSITION_ERROR 0018-B-2F
04/12/13 10:37:15 701376 4 POSITION_ERROR 0018-B-2F
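If the errors file is long, a small script can do the tallying. This is only a sketch in Python: it assumes each line is whitespace-separated with the media ID in the third column and the drive index in the fourth, which is how the snippet above reads. The function name is made up for the example.

```python
from collections import defaultdict

# The five lines from the snippet above.
SAMPLE = """\
03/30/13 04:56:24 701149 4 POSITION_ERROR 0018-B-2F
03/30/13 05:01:09 701149 4 POSITION_ERROR 0018-B-2F
03/30/13 06:49:19 701149 4 POSITION_ERROR 0018-B-2F
03/30/13 06:53:56 701149 4 POSITION_ERROR 0018-B-2F
04/12/13 10:37:15 701376 4 POSITION_ERROR 0018-B-2F
"""


def tapes_per_drive(text):
    """Map drive index -> set of media IDs that logged errors on that drive."""
    drives = defaultdict(set)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or truncated lines
        # Assumed layout: date, time, media ID, drive index, error type, ...
        media_id, drive_index = fields[2], fields[3]
        drives[drive_index].add(media_id)
    return dict(drives)


for drive, tapes in tapes_per_drive(SAMPLE).items():
    print(f"drive {drive}: {len(tapes)} tape(s) with errors: {sorted(tapes)}")
```

To run it against the real file, read /usr/openv/netbackup/db/media/errors on the media server and pass its contents in place of SAMPLE.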
Time to get the junior detective hat out!
You need to determine WHY these tapes froze.
Example - tape gets stuck in drive 1, NB tries to unload but does not realize tape remains.
Next tape NB tries to load will get frozen, because it failed to load!
This can freeze many tapes before NB takes the drive DOWN. And if you have "helpful" operators who put the drive back UP - you can freeze all your scratch tapes.
Example - Tape has data written, then has too many read/write errors - NB freezes this tape. You should keep it until it expires, then destroy it.
I have issues with my IBM LTO5 drives, they get head wear, then start throwing read/write errors - are the tapes bad? Not always, sometimes it means I need to replace the drive.
Check the robot -
many errors on one drive, various tapes - drive issue.
many errors on one tape, multiple drives - likely tape issue.
You can compare the NetBackup Problem report (filter for "TapeAlert") with what your robot has logged for errors, to see whether NetBackup is seeing the same thing.
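That rule of thumb can be written down as a first-pass filter. The Python sketch below takes (tape, drive) pairs that you have gathered from whatever error source you trust; the function name, the threshold of 2, and the sample IDs are all hypothetical. Treat the result as a list of suspects to investigate, not a verdict.

```python
from collections import defaultdict


def classify(errors, threshold=2):
    """errors: iterable of (media_id, drive) pairs.

    Returns (suspect_drives, suspect_tapes):
    - a drive is suspect if at least `threshold` different tapes failed on it
    - a tape is suspect if it failed on at least `threshold` different drives
    """
    tapes_on_drive = defaultdict(set)
    drives_for_tape = defaultdict(set)
    for media_id, drive in errors:
        tapes_on_drive[drive].add(media_id)
        drives_for_tape[media_id].add(drive)
    suspect_drives = sorted(d for d, t in tapes_on_drive.items() if len(t) >= threshold)
    suspect_tapes = sorted(m for m, d in drives_for_tape.items() if len(d) >= threshold)
    return suspect_drives, suspect_tapes


# Made-up data: three tapes fail on drive d1, and tape T9 fails on two drives.
sample = [("T1", "d1"), ("T2", "d1"), ("T3", "d1"), ("T9", "d2"), ("T9", "d3")]
drives, tapes = classify(sample)
print("suspect drives:", drives)
print("suspect tapes:", tapes)
```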
I notice the following in your post:
The robot is scheduled to eject out tapes every week.
I have not heard of any type of robot that can schedule ejects.
Maybe you have the NBU Vault option configured?
With this option, Vault Profiles are configured to select tapes based on what was backed up in a certain period and eject those.
This profile is then added to a NetBackup policy with a schedule to run as needed (once a week in your environment, by the looks of it).
You can go through the NetBackup Vault Administrator's Guide to see how the 'Choose Backups' tab is used.
You will notice that there is nothing about 'Full' or 'Frozen' tapes.
If you are NOT using Vault, then someone has probably written a script to select and eject media and scheduled the script to run once a week via the OS scheduler or a third-party scheduler.
Media Management (which tapes to eject, where to send them and when to bring them back) should be a documented IT policy.
This is not something that you should decide. Or the NBU community...