
Required Media is in use (Queued backup jobs)

bunsen82
Level 4

Hi,

This is my first post in the community, and I am by no means an NBU expert! The best/worst kind of user! :)

Anyhow, I have recently inherited our company's NBU environment, as the person managing it has left. I have a history in backup, but typically with disk-based solutions. This is my first real exposure to managing NBU.

Setup:

Windows 2012 Master server, using a Spectra Logic 950 tape library running NDMP level backups on the whole, with some agent/client level too.

I had some failed jobs recently and I am attempting to re-run them. There are 13 jobs in total. Only one is processing; the rest are in "queued" status and the state details are: Required Media is in use (Server name/Library blah)

These are from 13 different policies, all using the same client. Typically, the clients are able to process multiple jobs at once, so I assume this error is more of a media issue than a policy one.

When I look into the detailed status of the queued jobs, the following message is offered:

 awaiting resource gbwwsclobkp01-hcart3-robot-tld-0-backup.gbwwsbasifs.COMPANYDOMAIN Reason: Media is in use, Media Server: gbwwsclobkp01.COMPANYDOMAIN,
  Robot Number: 0, Robot Type: TLD, Media ID: N/A, Drive Name: N/A,
  Volume Pool: DataStore, Storage Unit: gbwwsclobkp01-hcart3-robot-tld-0-backup.gbwwsbasifs.COMPANYDOMAIN Drive Scan Host: N/A

Stating the obvious, I would expect the above error to actually specify a drive or tape that it is waiting for, but it is all N/A. The volume pools and library suggest that there are available tapes/space, but NBU does not agree, it seems!

Would someone be kind enough to help me troubleshoot this?

Thanks!

16 REPLIES

GeForce123
Level 5

Hello @bunsen82,

Storage Unit: gbwwsclobkp01-hcart3-robot-tld-0-backup.gbwwsbasifs.COMPANYDOMAIN

Is the storage unit that is configured for your drives. How many drives do you have?

Marianne
Level 6
Partner    VIP    Accredited Certified
It seems NBU thinks that this is the only available tape in the DataStore pool and is therefore queueing the other jobs. Or do you have the Max Partially Full setting configured for this pool?
Can you run available_media from cmd (in ...netbackup\bin\goodies), copy the text output and post it here?

We may need output from more commands but this will be a good start.

Hi Marianne,

Thanks for the reply. I don't believe the Max Partially Full setting is enabled, although I am not certain. When I look at the volume pools via the GUI, 0 is shown for Max Partially Full media for all volume pools.

I ran the command as requested, and it doesn't offer any output in terms of tapes. A further update to my post yesterday is that all subsequent backups failed due to lack of media. I include the output for your reference:

operable program or batch file.
media   media   robot   robot   robot   side/   ret    size     status
 ID     type    type      #     slot    face    level  KBytes
----------------------------------------------------------------------------

D:\Program Files\Veritas\NetBackup\bin>available_media.cmd
'D:\Program' is not recognized as an internal or external command,
operable program or batch file.
'D:\Program' is not recognized as an internal or external command,
operable program or batch file.
'D:\Program' is not recognized as an internal or external command,
operable program or batch file.
media   media   robot   robot   robot   side/   ret    size     status
 ID     type    type      #     slot    face    level  KBytes
----------------------------------------------------------------------------

D:\Program Files\Veritas\NetBackup\bin>

Marianne
Level 6
Partner    VIP    Accredited Certified

You need to change directory in cmd to "D:\Program Files\Veritas\NetBackup\bin\goodies".

If the command is still producing empty output, it may be because of a lock file when a previous attempt was interrupted. Depending on the number of tapes in the environment, the command takes a while to gather information.

There used to be a TN with the following info:

When the Available_Media report is executed, it creates a lock file to prevent a second attempt of this report from running at the same time. When the report is running and is interrupted by an 'Esc' key or 'CTRL-C' the lock file does not get released and prevents the Available_Media report from being run.  In order to get around this, look for a file in the %TEMP% directory in the following format:
AM.<current username>.lok.  
This file must be deleted or renamed so that the Available_Media report can be run again.
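
For anyone who would rather script this check, here is a minimal Python sketch of the step described above. It assumes the lock file is named exactly `AM.<current username>.lok` and sits directly in the `%TEMP%` directory, as the TN states. The function names `find_lock_file` and `set_aside_lock_file` are made up for this illustration and are not part of NetBackup.

```python
import getpass
import os
import tempfile
from pathlib import Path


def find_lock_file(temp_dir=None, username=None):
    """Return the path of a leftover available_media lock file, or None.

    Assumption: the file is called AM.<username>.lok and lives in %TEMP%.
    """
    temp_dir = Path(temp_dir or os.environ.get("TEMP") or tempfile.gettempdir())
    username = username or getpass.getuser()
    candidate = temp_dir / f"AM.{username}.lok"
    return candidate if candidate.is_file() else None


def set_aside_lock_file(lock_path):
    """Rename the lock file (instead of deleting it) so it can be inspected later."""
    renamed = lock_path.with_name(lock_path.name + ".old")
    lock_path.replace(renamed)
    return renamed


if __name__ == "__main__":
    lock = find_lock_file()
    if lock is None:
        print("No lock file found.")
    else:
        print("Renamed lock file to", set_aside_lock_file(lock))
```

Only run this when you are sure no available_media report is still running, since the lock file exists to prevent two reports at once.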
 
Hope this helps.

Thanks Marianne, I was able to execute the command. I have attached a txt output as there is a lot of information.

Marianne
Level 6
Partner    VIP    Accredited Certified

Most of the tapes in DataStore pool that are in the robot are FULL, e.g:

01679L	HCART3   TLD	  0	  41	 -	 5     2510213407	FULL
01680L	HCART3   TLD	  0	  198	 -	 5     2459675804	FULL

Some of them are not yet full (ACTIVE), mostly with retention level 5 or 9:

00016L	HCART3   TLD	  0	  86	 -	 5     8780384	ACTIVE
00146L	HCART3   TLD	  0	  138	 -	 5     7592864	ACTIVE
00373L	HCART3   NONE	  -	  -	 -	 9     9256097	ACTIVE
00375L	HCART3   NONE	  -	  -	 -	 9     28852857	ACTIVE

I can only guess that you probably have multiple media servers.

Bear the following in mind:
Media servers do not share media by default.
This means that once a tape is written by one media server, no other media server can append to it.
Media sharing can be enabled and is highly recommended.

NBU does not mix retention levels on the same media.
This can be enabled but is NOT recommended. (A FULL tape with both level 5 and 9 can never be reused when the level 5's expire as ALL images need to expire before a tape can be overwritten.)

There are lots of AVAILABLE tapes, but they are not in the robot:

00702L	HCART3   NONE	  -	  -	 -	 -     -	AVAILABLE
00707L	HCART3   NONE	  -	  -	 -	 -     -	AVAILABLE
00709L	HCART3   NONE	  -	  -	 -	 -     -	AVAILABLE
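
With a long report, counting by hand gets tedious. The following is a rough Python sketch that tallies rows like the ones quoted above by status and by whether they are in the robot. It assumes every tape row has exactly nine whitespace-separated fields in the column order of the report header, and that a robot type of NONE means the tape is not in the robot. The names `Tape`, `parse_rows` and `summarize` are invented for this example.

```python
from collections import Counter
from typing import NamedTuple, Optional


class Tape(NamedTuple):
    media_id: str
    media_type: str
    robot_type: str
    slot: Optional[int]
    retention: Optional[int]
    kbytes: Optional[int]
    status: str

    @property
    def in_robot(self):
        # Assumption: robot type NONE means the tape is outside the library.
        return self.robot_type != "NONE"


def _to_int(field):
    return None if field == "-" else int(field)


def parse_rows(text):
    """Parse tape rows, skipping headers, separators and anything unexpected."""
    tapes = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 9 or fields[0] == "media":
            continue
        media_id, media_type, robot_type, _robot_no, slot, _side, ret, size, status = fields
        try:
            tapes.append(Tape(media_id, media_type, robot_type,
                              _to_int(slot), _to_int(ret), _to_int(size), status))
        except ValueError:
            continue  # not a tape row after all
    return tapes


def summarize(tapes):
    """Count tapes per (status, in_robot) pair."""
    return Counter((t.status, t.in_robot) for t in tapes)


SAMPLE = """\
media   media   robot   robot   robot   side/   ret    size     status
 ID     type    type      #     slot    face    level  KBytes
----------------------------------------------------------------------------
01679L  HCART3  TLD  0  41  -  5  2510213407  FULL
00016L  HCART3  TLD  0  86  -  5  8780384     ACTIVE
00373L  HCART3  NONE -  -   -  9  9256097     ACTIVE
00702L  HCART3  NONE -  -   -  -  -           AVAILABLE
"""

if __name__ == "__main__":
    for (status, in_robot), count in sorted(summarize(parse_rows(SAMPLE)).items()):
        print(status, "in robot" if in_robot else "not in robot", count)
```

The sample rows are taken from the output in this thread. Feed the function the full text file instead to get totals for the whole environment.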

 

Anyhow - you need to have a look at the policy for which the jobs are queued. 
Check what the retention level is for the schedule.
Check the STU selected for the job and note the media server name.

For the ACTIVE (non-full) tapes in DataStore pool that are in the robot, check Media ownership in the Media GUI (Media owner could be a hidden column in Windows GUI).
Use Column Layout and 'sort' functionality to make life easier.
How many non-full tapes in the DataStore pool belong to this media server?
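
To make the checks above concrete, here is a small Python sketch of the selection rules as described in this post: the tape must be in the robot, an ACTIVE tape must match the job's retention level, and without media sharing it must be owned by the job's media server. Treating an AVAILABLE (unassigned) tape in the robot as usable by any server is my reading, not something NetBackup output confirms here. The `PoolTape` class, the `can_use` function and the tape IDs and owners in the sample are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PoolTape:
    media_id: str
    status: str               # "ACTIVE", "FULL" or "AVAILABLE"
    in_robot: bool
    retention: Optional[int]  # None for unassigned tapes
    owner: Optional[str]      # media owner, None for unassigned tapes


def can_use(tape, media_server, retention, media_sharing=False):
    """Return True if a job on media_server with this retention could write to the tape."""
    if not tape.in_robot:
        return False
    if tape.status == "AVAILABLE":
        return True   # assumption: unassigned tapes can be picked up by any server
    if tape.status != "ACTIVE":
        return False  # FULL tapes cannot be appended to
    if tape.retention != retention:
        return False  # retention levels are not mixed on one tape
    return media_sharing or tape.owner == media_server


# Hypothetical pool contents, for illustration only.
POOL = [
    PoolTape("TAPE01", "ACTIVE", True, 5, "gbwwsclobkp01"),
    PoolTape("TAPE02", "ACTIVE", True, 5, "backup.gbwwsbasifs"),
    PoolTape("TAPE03", "ACTIVE", True, 9, "gbwwsclobkp01"),
    PoolTape("TAPE04", "FULL", True, 5, "gbwwsclobkp01"),
    PoolTape("TAPE05", "AVAILABLE", False, None, None),
    PoolTape("TAPE06", "AVAILABLE", True, None, None),
]

if __name__ == "__main__":
    usable = [t.media_id for t in POOL if can_use(t, "gbwwsclobkp01", 5)]
    print("Usable for a level 5 job on gbwwsclobkp01:", usable)
```

If the list of usable tapes comes back short for the media server and retention level of the queued jobs, that would be consistent with the jobs waiting on media.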

You may want to read up about Unrestricted Media Sharing in NBU Admin Guide I.

PS:
It seems no notes were left behind with tape rotation / management procedures?

The queued jobs have a retention level of 5 (I haven't checked every single one, but I have checked most and all report 5).

Only have 1 media server (checking under host props/Media servers) and this is also the master server

The Media server is gbwwsclobkp01 and the storage unit is gbwwsclobkp01-hcart3-robot-tld-0-backup.gbwwsbasifs

Counting the active tapes in the DataStore pool that are assigned to a slot gives 19, which sounds correct as we have 19 drives in our tape library. However, according to the media owner column, these are split between backup.gbwwsbasifs.Domain and gbwwsclobkp01.domain.

I also noticed that when I look at the drives tab on activity monitor, only 1 is reporting as ready and most have a ready "no" status and a "RESTART" control. Is this related or am I chasing my tail?

On the back of the drive view, I noticed the following article. - https://www.veritas.com/support/en_US/article.000007415

Do you think it is worth trying? I have already rebooted a number of times, and removed and re-added the drives, as per the (minimal!) documentation passed to me.

 

Marianne
Level 6
Partner    VIP    Accredited Certified

Okay, so you have an NBU master/media server and an NDMP host that is configured with tape devices and doing its own backups.

The rules are the same - tapes owned by backup.gbwwsbasifs cannot be used by gbwwsclobkp01 and vice versa.

The 'Media is in use' message normally means that drives are available but there is not sufficient media.

The RESTART status of the drives means that Media Manager services need to be restarted. I am surprised that this status is still shown after a reboot.
See https://www.veritas.com/support/en_US/article.000026908

It might be a good idea to check if there are orphaned resource allocations for tapes.

Run 'nbrbutil -dump'
(in ...\netbackup\bin\admincmd) 
and compare the 'MDS Allocations' section with the following output:
vmoprcmd -d
(in ...\volmgr\bin)
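
If the two outputs are long, the comparison can be roughed out in Python. This sketch does not parse the real layout of either command. It simply pulls out anything that looks like a media ID and compares the two sets. The pattern (five digits followed by L) is an assumption based on the media IDs seen in this thread, and the function names are made up. Copy only the 'MDS Allocations' section into the first argument.

```python
import re

# Assumption: media IDs in this environment look like 01368L (five digits + "L").
MEDIA_ID = re.compile(r"\b\d{5}L\b")


def media_ids(text):
    """Return the set of media-ID-like tokens found in the text."""
    return set(MEDIA_ID.findall(text))


def compare_allocations(mds_text, drive_text):
    """Compare media IDs in the MDS allocations with those shown in the drive status."""
    allocated = media_ids(mds_text)
    in_drives = media_ids(drive_text)
    return {
        "matched": sorted(allocated & in_drives),
        "allocated_but_not_in_drive": sorted(allocated - in_drives),
        "in_drive_but_not_allocated": sorted(in_drives - allocated),
    }


if __name__ == "__main__":
    # Placeholder strings, NOT real command output.
    mds = "allocation for 01368L, allocation for 01400L"
    drives = "drive 3 has 01368L loaded"
    for label, ids in compare_allocations(mds, drives).items():
        print(label, ids)
```

Anything listed as allocated but not in a drive is a candidate orphaned allocation and is worth checking by hand before taking any action.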

 

Hi,

I have attached the outputs. From what I can see, the MDS allocation references media 01368L, when I compare to vmoprcmd -d I can see 01368L is in use, no other mentions of any other tapes. Could also be completely mis-interpreting!

 

Marianne
Level 6
Partner    VIP    Accredited Certified

Correct - only one media ID (01368L) is in use by NDMP server GBWWSCLOBKP01.

You need to confirm media server ownership for the other Datastore tapes that are ACTIVE and in the robot.

We saw in the available_media output that there are lots of FULL tapes in the robot and lots of tapes that are AVAILABLE but not in the robot.
Can you put some of those AVAILABLE tapes in the robot?

You will need some sort of tape rotation strategy regarding onsite (in the robot) and offsite (non-robotic) tapes. 

It also seems that a Scratch Pool was added at some point without new tapes being added to Scratch, but rather to specific pools. This means that expired level 5 tapes cannot be returned to Scratch (well, they cannot return as they were not in Scratch as a start...)

With all the Retention level 9 backups, your company must be going through LOTS of tapes, meaning that you need a safe storage location for all those Infinity tapes.

 

Hi Marianne,

Attached is an updated list of the ownership of the 19 active tapes from DataStore that are in the robot. I have put the media owner on the right.

Our tape library is located in our colo site, which I am attending tomorrow to load some further blank tapes. I don't seem to have the same tape move options via the web browser page of the tape library, so I am unsure if I can move any of the available tapes at this point.

We do a monthly export of our monthly/infinite tapes and, as I understand it, this is typically around 150 LTO6 tapes per month, sometimes more. I plan to load an additional 250 tapes tomorrow.

Do you think there is any use in trying the link I suggested earlier (https://www.veritas.com/support/en_US/article.000007415)? It does seem to fit the symptoms we are seeing, although it could be part of a wider issue.

Marianne
Level 6
Partner    VIP    Accredited Certified

I see... more than sufficient ACTIVE non-full tapes in the robot assigned to gbwwsclobkp01.

I do not see any evidence of the situation described in Article 000007415.

bptm log on the master/media server may also shed some light (the log folder does not exist by default and needs to be created - level 3 log should be fine).
 
Hopefully another pair of eyes will pick up something that I have overlooked.

My only remaining suggestion is to log a Support call with Veritas, but you are running an unsupported version....

 

Thanks Marianne, I have enabled the logging and will load new tapes tomorrow and see how things go.

If anyone else has anything to offer, I am all ears!

Marianne
Level 6
Partner    VIP    Accredited Certified

Curious to see bptm log...

I was thinking of one more thing.
Please check one or two of the tapes that should be used for attributes that could cause NBU not to consider them (like hardware expiration).

Run the following command from cmd (...netbackup\bin\admincmd):
nbemmcmd -listmedia -mediaid <media-id>

Hi Marianne,

 

Thanks for all your assistance with this. It seems that the issue was the tape library having a moment. A physical restart of the library followed by an inventory seemed to get things going again.

Thanks again!