drives paths going down

shahriar_sadm
Level 6

Hi Dear,

Recently I have had many queued tasks. I checked the Device Monitor, and many drive paths for media server x are in the DOWN-TLD state.

I either run the Device Configuration Wizard for media server x or manually click Up Path, but after some days a different path goes down again. It does not depend on a specific media server or tape drive.

Thanks 

5 REPLIES

mph999
Level 6
Employee Accredited

You need to look in the bptm log to see why the drive is going down.

Additionally, the /usr/openv/netbackup/db/media/errors file on each media server may hold some clues.

 

Sujay24
Level 4
Employee

Kindly check the following locations on the media server:

In Windows Media server:-

Application logs

In Unix/Linux Media server:-

Check /var/adm/messages

These locations will show why the drive is going down. You can also enable bptm logs on the specific media server and check why it is going down.

If there are any TapeAlerts, contact the hardware vendor for further assistance.

Marianne
Level 6
Partner    VIP    Accredited Certified

There is honestly no need to run the Device Config wizard, unless you see signs that the OS has lost connectivity to the drives or that drive paths have changed at OS-level.

There are lots of possible reasons why drives are being downed. 
3 media write errors on a particular drive in 12 hours will down the drive.
Hardware errors (TapeAlerts generated by drive firmware) will down the drive.
Robotic load errors will down a drive.
Manual tape movement within a robot (e.g. operator opening robot to insert tapes in empty slots while tapes are in the drives) will cause a drive to be downed as the tape cannot be returned to its 'home slot'. 
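
The "3 media write errors in 12 hours" rule can be checked by hand against the media errors file mentioned earlier in this thread. Below is a minimal Python sketch, not a NetBackup tool. It assumes each line of /usr/openv/netbackup/db/media/errors has the layout `date time media_id drive_index error_type drive_name`; verify that against your own file first. The sample lines, media IDs and drive names are invented for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed line layout (check your own errors file before relying on this):
#   MM/DD/YY HH:MM:SS <media_id> <drive_index> <error_type> <drive_name>
# The sample data below is made up.
SAMPLE = """\
05/10/16 01:10:00 A00001 2 WRITE_ERROR DRIVE_A
05/10/16 04:30:00 A00002 2 WRITE_ERROR DRIVE_A
05/10/16 09:45:00 A00003 2 WRITE_ERROR DRIVE_A
05/10/16 02:00:00 A00004 5 WRITE_ERROR DRIVE_B
05/11/16 20:00:00 A00005 5 WRITE_ERROR DRIVE_B
05/11/16 21:00:00 A00006 5 POSITION_ERROR DRIVE_B
"""


def drives_over_threshold(text, limit=3, window_hours=12):
    """Return drive names with at least `limit` write errors inside the window."""
    window = timedelta(hours=window_hours)
    per_drive = defaultdict(list)
    for line in text.splitlines():
        parts = line.split()
        # Skip short lines and anything that is not a write error.
        if len(parts) < 6 or parts[4] != "WRITE_ERROR":
            continue
        try:
            stamp = datetime.strptime(parts[0] + " " + parts[1],
                                      "%m/%d/%y %H:%M:%S")
        except ValueError:
            continue  # line does not match the assumed layout
        per_drive[parts[5]].append(stamp)

    flagged = []
    for drive, stamps in per_drive.items():
        stamps.sort()
        # Slide over each run of `limit` consecutive errors.
        for i in range(len(stamps) - limit + 1):
            if stamps[i + limit - 1] - stamps[i] <= window:
                flagged.append(drive)
                break
    return sorted(flagged)


if __name__ == "__main__":
    print(drives_over_threshold(SAMPLE))
```

To try it on a real media server, read the errors file into a string and pass it to `drives_over_threshold` in place of `SAMPLE`.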

There are other reasons as well. As Martin said - bptm log is a good place to start.
You need to create a bptm log directory on each media server, because the drive is downed by whichever media server most recently experienced a problem with it.

Another good place is /var/log/messages, BUT you need to add a VERBOSE entry to vm.conf and then restart ltid. You need to do this on each media server.
This will send additional hardware-related activities/errors to the messages file.

When the media server DOWNs a drive, you will find the reason in the messages file.

PS:
There are quite a number of similar posts under 'Related discussions' on the right-hand side of this page. 

mph999
Level 6
Employee Accredited

You can also create these empty files to get more logging info in messages:

 

/usr/openv/volmgr/DRIVE_DEBUG

/usr/openv/volmgr/ROBOT_DEBUG
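
The logging steps suggested in this thread (bptm log directory, VERBOSE in vm.conf, the two empty debug files) can be collected into one small helper. This is a rough Python sketch, not an official procedure. The function name `enable_device_logging` and its `root` parameter are made up for illustration, and the paths /usr/openv/netbackup/logs/bptm and /usr/openv/volmgr/vm.conf are assumed to be the usual Unix/Linux defaults. It does not restart ltid, which still has to be done separately on each media server.

```python
import tempfile
from pathlib import Path


def enable_device_logging(root="/usr/openv"):
    """Create the bptm log directory and debug touch files, and add VERBOSE to vm.conf.

    `root` is a hypothetical parameter so the sketch can be tried against
    a scratch directory before touching a real media server.
    """
    root = Path(root)
    volmgr = root / "volmgr"
    bptm_dir = root / "netbackup" / "logs" / "bptm"

    volmgr.mkdir(parents=True, exist_ok=True)
    bptm_dir.mkdir(parents=True, exist_ok=True)

    # Empty files, as suggested in the post above.
    for name in ("DRIVE_DEBUG", "ROBOT_DEBUG"):
        (volmgr / name).touch(exist_ok=True)

    # Append VERBOSE to vm.conf only if it is not already there.
    vm_conf = volmgr / "vm.conf"
    text = vm_conf.read_text() if vm_conf.exists() else ""
    if "VERBOSE" not in [line.strip() for line in text.splitlines()]:
        with vm_conf.open("a") as fh:
            if text and not text.endswith("\n"):
                fh.write("\n")
            fh.write("VERBOSE\n")

    # ltid must be restarted afterwards for VERBOSE to take effect.
    return volmgr, bptm_dir


if __name__ == "__main__":
    # Dry run against a temporary directory instead of /usr/openv.
    with tempfile.TemporaryDirectory() as scratch:
        volmgr, bptm_dir = enable_device_logging(scratch)
        print(sorted(p.name for p in volmgr.iterdir()), bptm_dir.is_dir())
```

Running it twice is harmless: the directories and touch files are left alone and VERBOSE is not added a second time.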

Jim-90
Level 6

Are the media server(s) being rebooted to solve the problem? If they are, the reboots could be the cause of the problem. I suggest you check the HBA static binding of the tape drives' I/O paths, especially if the tape drives are shared between media servers.

If there is no HBA static binding and you have multiple tape drives shared between media servers, it is best to rebuild the device configuration through the Device Configuration Wizard every time you reboot any media server. It doesn't take that long.

Every time a machine is rebooted, there is a chance that the order of the device files/paths of the tape drives will change.

How to detect the problem: see attachment.

All the serial numbers for a given tape drive should be the same. If they are not, then what NBU thinks is tape drive #2 is actually multiple tape drives when it should be only one.
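
The serial number comparison can be sketched in a few lines of Python. This is only an illustration: the media server names, drive names and serial numbers below are hypothetical, and the sketch assumes you have already collected, for each media server, the serial number it reports behind each configured drive.

```python
from collections import defaultdict

# Hypothetical inventory: (media_server, drive_name, serial_number).
# Replace with the values gathered from your own media servers.
INVENTORY = [
    ("media1", "Drive002", "SN-1002"),
    ("media2", "Drive002", "SN-1002"),
    ("media3", "Drive002", "SN-1005"),
    ("media1", "Drive003", "SN-1003"),
    ("media2", "Drive003", "SN-1003"),
]


def mismatched_drives(rows):
    """Return drives whose serial number is not the same on every media server."""
    serials = defaultdict(dict)
    for server, drive, serial in rows:
        serials[drive][server] = serial
    return {
        drive: by_server
        for drive, by_server in serials.items()
        if len(set(by_server.values())) > 1
    }


if __name__ == "__main__":
    for drive, by_server in mismatched_drives(INVENTORY).items():
        print(drive, by_server)
```

Any drive this reports is one where at least one media server's path points at a different physical drive, which is the situation described above.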