
Netbackup thinks all drives are busy (when they aren't)

Alexander_Harri
Level 4

We have 2 storage units in our environment (1 for NDMP drives and 1 for non-NDMP drives). Both of them seem to be exhibiting the following problem. Even though both storage units have drives available, NetBackup insists that there are no available drives and queues the jobs. Then randomly (sometimes hours later), it will suddenly acknowledge the available drive and start writing to it.

Any ideas? I have a case open with Symantec, but I would like to see if there are any ideas on the forum.

system
Netbackup 6.5.2a
OS: RHEL 5 (64 bit)

9 REPLIES

Roobix_Cube
Level 5

More clarification, please.

 

Re: "both storage units have drives available" - how are you determining this? (e.g., are you certain you haven't exceeded your Max Concurrent Write Drives or Max Streams Per Drive?)

 

Re: "Netbackup insists that there are no available drives" - how are you determining this other than the queued jobs? (i.e., are jobs finishing with a Status 219? Are you getting certain error messages?)

 

 

Alexander_Harri
Level 4
I am able to determine that the drives are available because:

a) The GUI shows the drive as up and not active (the status reads as TLD)
b) Running the command line utility shows the same thing
c) If I look at the library, the drive is empty

What happens is that the jobs remain queued until the window closes and then error out with error 196 (backup window closed).

If I look at the job while it is queued it says the following: awaiting resource <Storage Unit name>. No drives are available
If I look at the activity monitor the state details says: Drives are in use in storage unit <storage unit name>

I know that the setup is correct because
a) this didn't happen before this week and I haven't changed anything (and I tried re-creating one of the storage units)
b) usually after a long time, the job starts (with no changes made by me).
Message Edited by Alexander Harris on 01-29-2009 07:39 PM

Roobix_Cube
Level 5

Nothing may have changed, which could be your problem, i.e., are you now running more than the STU can handle? Look in the Job Details window for the job that was queueing. Determine which storage unit was being used at the time of the queueing, which media server is associated with that storage unit, and what type of storage unit it is. Verify that the policy specifies the correct storage unit, or is it set to "any available"?

 

For the NDMP STU, use "set_ndmp_attr -verify <ndmp_host_name>" on the media server that is connecting to the NDMP host to verify the connection to the NDMP host with the proper credentials.
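With several NDMP hosts, the check above can be repeated per host. The following is only a rough sketch: it assumes set_ndmp_attr is on the PATH of the media server and that it exits non-zero when verification fails (not confirmed in this thread). The function name verify_ndmp_hosts and the host names in the usage line are made up for illustration.

```shell
#!/bin/sh
# Sketch: run "set_ndmp_attr -verify <host>" for each host given as an argument.
# ASSUMPTION: the command is on PATH (override with SET_NDMP_ATTR=/full/path)
# and returns a non-zero exit status when the verification fails.
SET_NDMP_ATTR="${SET_NDMP_ATTR:-set_ndmp_attr}"

verify_ndmp_hosts() {
    failed=0
    for host in "$@"; do
        # Output is discarded here; drop the redirect to see the details.
        if "$SET_NDMP_ATTR" -verify "$host" >/dev/null 2>&1; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    # 0 if every host verified, 1 if at least one failed.
    return "$failed"
}
```

Usage would look like "verify_ndmp_hosts filer1 filer2", run on the media server that connects to those hosts.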

 

Alexander_Harri
Level 4
Please re-read the post.

To further clarify:

This morning:

6:00 AM - All of the overnight incrementals time out (i.e., there are no jobs scheduled to run anymore). There is an available drive in the correct storage unit. It is up. The physical drive is empty. There are no jobs scheduled to write to this drive.

7:30 AM - The scheduled catalog backup, which uses the same storage unit, starts. It completes the first stage (staging the database). The second stage is still waiting for a drive. It used to work and I haven't changed anything (added new clients, etc.) since then.

And yes, I can connect to the NDMP hosts/get a list of drives using set_ndmp_attr

be_happy
Level 2
Certified
Hi Alexander,

I faced the same situation too: all the drives are up on the media server and all the Media Manager daemons are up too.
But when we kick off the backups, the jobs go into a queued state.

To resolve this issue and get the jobs into an active state, we have to make the master re-read the storage unit configuration files.

You can execute the commands below to achieve this:

bpschedreq -read_stunits 
bpschedreq -read_stu_config 

Hope it helps you,
Thanks.

Claudio_Veronez
Level 6
Partner Accredited
I have this problem every weekend.


My LOG and Archive policies are set to run hourly.


The problem is (I think) that on Saturday a lot of FILES backups start, and the elapsed time is usually over 5 hours, so that is 5 LOG and Archive runs.


The first ARCH runs and truncates the LOG; the next LOG will run but will not find anything.


I have to restart NetBackup to fix it:



/etc/init.d/netbackup stop
/etc/init.d/netbackup start




Omar_Villa
Level 6
Employee
What about the number of concurrent drives the STU is allowed to use?

Normand_Frenett
Level 3
Hi,

I have had the same problem since a drive failure. I have 2 tape drives.
To clear the problem I downgraded from NB 6.5.3.1 to NB 6.5. Unfortunately, after re-upgrading to NB 6.5.3 the problem came back...

ivighetto
Level 3
Maybe some file was left behind in /usr/openv/netbackup/db/media/drives?
My experience is that if I find a file there (something like drive_<DRIVE_NAME>), the related drive is not used.
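
A quick way to look for such leftovers is sketched below. It only lists files and changes nothing. The drive_* pattern is taken from the description above and may not match every release, and the function name list_drive_files is made up for illustration.

```shell
#!/bin/sh
# Sketch: list leftover per-drive files in the NetBackup media/drives directory.
# The default path is the one quoted in this thread.
# ASSUMPTION: the files are named drive_<DRIVE_NAME>, as described above.
list_drive_files() {
    dir="${1:-/usr/openv/netbackup/db/media/drives}"
    if [ ! -d "$dir" ]; then
        echo "directory not found: $dir" >&2
        return 2
    fi
    found=1
    for f in "$dir"/drive_*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$f" ] || continue
        # Print only the file name, one per line.
        basename "$f"
        found=0
    done
    # 0 if at least one file was listed, 1 if none were found.
    return "$found"
}
```

Calling list_drive_files with no argument checks the default directory. Whether it is safe to remove a file found there is not established in this thread, so the sketch deliberately stops at listing.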