Why Drives go down

Puffy
Level 6

Hi,

 

Are there any documents that explain why drives go down?


I have cleaned the drives and asked the tape library vendor to replace some of them, but I still see some drives go down occasionally.


Thanks

NBU 6.0 MP4

  

11 REPLIES

Puffy
Level 6

I have read some documentation suggesting that frozen tapes might cause drives to go down, which may be what is happening in my case.

Has anyone experienced frozen media causing drives to go down? What remedy did you use: changing the drives or the tapes?

What could be the cause?

Dion
Level 6
Certified

Before you start throwing drives and media away, take a look at your storage unit configurations. Make sure that you don't have the same tape drives configured in multiple storage units. If you do, there could be contention for the same drive by multiple backups. This will generate errors, put drives down, and freeze tapes.

 

There is some documentation regarding tapes being frozen, but a frozen tape doesn't always mean that the tape is bad. I would check your config first.

 

An example: Storage Unit 1 contains drive1, drive2 and drive3. Storage Unit 2 contains the same drive1, drive2 and drive3.

If you have one policy that backs up to Storage Unit 1 and another that backs up to Storage Unit 2, the backups from each could keep swapping drives until they generate enough errors for the drives to be downed, as NetBackup "sees" these as physically different drives.
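
One way to look for the kind of overlap described above is to write down which drives each storage unit can reach and search that list for repeats. The following is a minimal sketch, assuming a hand-maintained text file with one `<storage_unit> <drive_name>` pair per line. The file format and the function name `find_shared_drives` are hypothetical; NetBackup does not produce this file.

```shell
#!/bin/sh
# find_shared_drives FILE
# FILE is a hand-maintained list (hypothetical format), one pair per line:
#   <storage_unit> <drive_name>
# Prints each drive that is listed under more than one storage unit,
# followed by the storage units that list it.
find_shared_drives() {
    awk '
        NF >= 2 && !seen[$1 SUBSEP $2]++ {   # skip blank lines and repeated pairs
            count[$2]++
            units[$2] = (units[$2] == "" ? $1 : units[$2] "," $1)
        }
        END {
            for (d in count)
                if (count[d] > 1) print d, units[d]
        }
    ' "$1" | sort
}
```

For example, a file listing drive1 under both stu1 and stu2 would produce the line `drive1 stu1,stu2`. An empty result means no drive appears under two storage units in the list you wrote down.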

 

Good luck

Puffy
Level 6

We only have 7 drives, and I have checked that all policies use the same storage unit.

But one thing I noticed is that we have several policies running at the same time against the same client. Could this cause the problem I am facing? In fact, only 4 drives keep going down. Whenever I bring them up, they go down again shortly afterwards.

Message Edited by Puffy on 09-22-2008 12:51 AM

Kevin_Good
Level 5
Certified

I would check the output of "tpconfig -d" and "scan" and make sure all the tape drives still match on device path and serial number.
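
To make that check repeatable, it can help to save the output each time and compare it with the previous run, so that a changed path or serial number stands out. This is a minimal sketch, assuming both commands are on your PATH. The `snapshot` function and the `SNAPDIR` location are made up for illustration and are not part of NetBackup.

```shell
#!/bin/sh
# SNAPDIR is a hypothetical location for the saved output; any writable
# directory will do.
SNAPDIR="${SNAPDIR:-/var/tmp/nbu_snapshots}"

# snapshot LABEL COMMAND [ARGS...]
# Runs COMMAND, saves its output under LABEL, and reports whether the
# output differs from the previous run saved under the same LABEL.
snapshot() {
    label="$1"; shift
    mkdir -p "$SNAPDIR" || return 2
    new="$SNAPDIR/$label.new"
    old="$SNAPDIR/$label.last"
    "$@" > "$new" 2>&1
    if [ ! -f "$old" ]; then
        echo "$label: first snapshot saved"
    elif diff "$old" "$new" > "$SNAPDIR/$label.diff"; then
        echo "$label: unchanged"
    else
        echo "$label: CHANGED, see $SNAPDIR/$label.diff"
    fi
    mv "$new" "$old"
}

# Example calls, using the two commands named in the reply above:
#   snapshot tpconfig tpconfig -d
#   snapshot scan scan
```

Any difference in the output is reported, so a change may be a drive status rather than a path or serial number. Read the diff file before drawing conclusions.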

 

 

 

J_H_Is_gone
Level 6

I want to question Dion on what he said:

"

An example Storage Unit 1 (contains drive1 drive2 and drive3).  Storage Unit 2 (contains the same drive1 drive2 and drive3).

If you have one policy that backs up to Storage Unit 1 and another that backs up to Storage Unit two, the backups from each could possibly keep swapping drives until they generate enough errors to be downed as NetBackup "sees" these as physically different drives."

 

How can you possibly create a storage unit and tell it which tape drives to use? (You can tell it how many, but I don't know of any way to say which tape drives.)

 

I have many storage units that could be requesting the same 8 tape drives, but that just means that storage unit 1 gets what it needs, say 5 drives, and storage unit 2 can request 8 but only gets 3 because that is all that is left.

 

As to Puffy's question: I would make sure those 4 drives are configured correctly. It does not hurt to delete and re-add them to make sure the serial numbers match. Sometimes when you have a tape drive removed because it does not work, and a week later you have another one replaced, the vendor may put back the repaired drive you had the first time. This gets things messed up because the robot thinks that serial number is already present and is now trying to see it in two places. Deleting and re-adding the tape drives is the first step.

zippy
Level 6

9.9999 times out of 10 the drive is going bad.

Replace it.

Omar_Villa
Level 6
Employee

Run:

 

vmoprcmd -h <media server> -autoconfig -t

vmoprcmd -h <media server> -shmdrive | awk '{print $38, $39, $31}' | sort -n

vmglob -listall -b | grep -i <drive serial number>

 

Compare the three databases: they must match on serial number, path and drive name. If they do not, you will need to update your config with tpconfig -update -drive <index> -path <path>
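
The comparison itself can be scripted once the three outputs have been reduced to a common form. This is a minimal sketch, assuming you have already extracted one `serial path name` line per drive from each source into three plain text files. The extraction step, the file layout and the function name `report_mismatches` are assumptions for illustration; the raw command output is not parsed here.

```shell
#!/bin/sh
# report_mismatches FILE1 FILE2 FILE3
# Each file holds one line per drive (hypothetical, hand-extracted format):
#   <serial> <path> <drive_name>
# Prints every line that is NOT present in all three files, sorted so that
# conflicting entries for the same serial number end up next to each other.
report_mismatches() {
    for f in "$1" "$2" "$3"; do
        # Collapse runs of whitespace and drop duplicate lines within one file.
        awk 'NF { $1 = $1; print }' "$f" | sort -u
    done | sort | uniq -c | awk '$1 != 3 { $1 = ""; sub(/^ /, ""); print }'
}
```

No output means the three files agree. If one file records a different path for a serial number, both versions of that drive's line are printed, which shows which entry to correct with the tpconfig -update command above.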

 

Hope this helps. Regards.

Wile_E__Coyote
Level 4
The logs will tell you why.

Dion
Level 6
Certified
Hi

When I first took over my current environment, they were having major issues with drives going down randomly. After watching the system over a couple of nights, I noticed that some jobs that had already started on a drive would be intercepted by a queued job that requested the same drive through a different storage unit and got access. The original job would stay in a running state, but in the logs you could see that it suddenly had no resources to use, so it would keep polling. When the second job did something, e.g. changed tapes, the first job would take control and continue backing up. After this had gone on for a while, the drive would be downed and/or the tapes would be frozen.

After changing the configuration to only having tape drives of the same type in the same library belonging to a single storage unit, none of the drives went down again.

As far as my understanding goes, by logically configuring the same drive into multiple storage units you are telling NetBackup that they are "physically" separate drives, which can lead to contention in resource allocation.

J.Hinchcliffe, in your situation, where you need certain jobs to have access to only a certain number of drives, you could configure multiple storage units, e.g. one with 5 drives and one with 3 drives. From there you could configure a storage unit group that includes both storage units. The policies that may use all 8 drives could be configured to use the storage unit group, and the policies that you want to limit to 3 drives could be pointed directly at the storage unit configured with the three drives.

Jim-m
Level 5

If these are Windows media servers:

 

Reference:  http://seer.entsupport.symantec.com/docs/274991.htm    Technote.

Issue: The Removable Storage service will interfere with VERITAS NetBackup (tm) and cause various issues with tape drives and tape libraries. Tape drives can go down, and there may be issues when incorrect tapes are loaded into the tape drives.

Additional Information:
The service called Removable Storage needs to be set to disabled. If the service is set to Automatic or Manual it will be either running or will start running. This service will attempt to manage the tape library or tape drives and can cause potential tape drive or tape library issues. It is recommended that the service be disabled on all servers that can see the tape drives and tape library. This is especially critical in a SAN environment when there are multiple servers that could be zoned to see the tape drives and tape library.

 

J_H_Is_gone
Level 6

Dion, my point was that I have never had this problem.

I have 8 tape drives on the server, and I have a couple of storage units: one with 3 drives (it starts first, and I want to make sure it gets all the drives it needs) and one with all 8 drives (it starts later).

The one with 8 drives comes in, sees that only 5 drives are available, and ONLY USES 5 DRIVES. It does not try to get the other three. When the first policy is done and gives up those three drives, the second one can take them and use all 8.
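
The numbers in this description can be worked through in a few lines. This is only a sketch of the arithmetic as described in this post (a request receives the smaller of what it asks for and what is free). It is not NetBackup's actual allocation logic, and `grant_drives` is a made-up helper.

```shell
#!/bin/sh
# grant_drives REQUESTED FREE
# Prints how many drives a request receives: never more than are free.
grant_drives() {
    if [ "$1" -lt "$2" ]; then echo "$1"; else echo "$2"; fi
}

total=8
first=$(grant_drives 3 "$total")               # first storage unit asks for 3
second=$(grant_drives 8 $((total - first)))    # second asks for 8, only 5 are free
echo "first=$first second=$second"             # prints: first=3 second=5

# Once the first policy finishes, all 8 drives are open to the second one.
second=$(grant_drives 8 "$total")
echo "second=$second"                          # prints: second=8
```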

 

I have never seen this issue of policy storage units fighting over tape drives.

 

Maybe Jim-m is correct: I have Removable Storage NOT running, and that may be why I never have this issue.