Troubleshoot Drive Down issue

Question

What basic troubleshooting steps do we take to fix a drive-down issue?
There are many scenarios:
* Some drives are down on a media server but others are up.
* SSO drives on ACSLS are down.
* Drives mostly give error 84, even with new, good tapes.
* VTL drives go down.
Which logs do we need to check for a drive-down issue, and how and why should we reconfigure drives?

rajeshthink · Accepted Answer

Currently I have a VTL drive issue. I have done everything and have mostly reconfigured the drives.
But the drives are still going down. I checked /var/adm/messages and found the following:
May 24 19:26:22 <Master_server> ltid[1413]: [ID 942091 daemon.error] Operator/EMM server has DOWN'ed drive HP.ULTRIUM4-SCSI.001 (device 26)
I don't see any issue with the drive in tpautoconf -report_disc.

marianne · Answer

Please mention the OS on your media servers. NBU relies on the OS for I/O, so troubleshooting should start there. To increase NBU Media Manager logging, add a VERBOSE entry to vm.conf on ALL media servers in your environment and restart NBU. vm.conf is located in:
	Unix: /usr/openv/volmgr
	Windows: install-path\veritas\volmgr
	Also create a bptm log folder on all media servers under netbackup/logs.
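For a Unix media server, the two steps above could be scripted roughly like this. It is only a sketch: the function name `enable_verbose_logging` is made up for illustration, the default paths are the standard Unix locations mentioned in this thread, and the NBU restart is deliberately left out.

```shell
#!/bin/sh
# Hypothetical helper (not part of NetBackup): add VERBOSE to vm.conf
# and create the bptm log folder. Both paths can be overridden.
enable_verbose_logging() {
    volmgr_dir="${1:-/usr/openv/volmgr}"
    logs_dir="${2:-/usr/openv/netbackup/logs}"
    conf="$volmgr_dir/vm.conf"

    # Add VERBOSE only if no line consisting solely of VERBOSE exists yet.
    if ! grep -qx 'VERBOSE' "$conf" 2>/dev/null; then
        # Make sure an existing file ends with a newline before appending.
        if [ -s "$conf" ] && [ -n "$(tail -c 1 "$conf")" ]; then
            echo >> "$conf"
        fi
        echo 'VERBOSE' >> "$conf"
    fi

    # bptm only writes a debug log if its folder exists.
    mkdir -p "$logs_dir/bptm"
}

# Usage on a media server (as root), followed by an NBU restart:
#   enable_verbose_logging
```

Running it twice is harmless, since the VERBOSE line is only appended when it is missing.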
In the meantime, please run the Tape Logs report for the last 24 hours. Filter to exclude INFO type severity.
	Save the report as a text file and post it here as an attachment.
**** EDIT ****
Typing too slow when using Connect Mobile 
Martin's list is the most comprehensive and will catch errors at all possible levels.
In all honesty, I have always been able to find the reason for DOWN drives with just the VERBOSE entry in vm.conf.
	The exact reason for DOWNing a drive will be logged in the Windows Event Viewer Application log, and device errors in the System log. On Unix media servers these errors will be logged to syslog (e.g. /var/adm/messages on Solaris, /var/log/messages on Linux).
It is normal for only the media server experiencing the problem to DOWN a drive while it remains UP on others. That is why troubleshooting must start at the OS level on the media server.
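As a rough illustration of checking syslog on a Unix media server, something like the following could summarise which drives ltid has DOWN'ed and how often. The function name `list_downed_drives` is hypothetical, and the pattern assumes the message wording shown in the ltid line quoted earlier in this thread; other NBU versions may word it differently.

```shell
#!/bin/sh
# Hypothetical helper: print "<drive name> <count>" for every drive that
# ltid reported as DOWN'ed in the given syslog file.
list_downed_drives() {
    # Only look at ltid lines, pull out the drive name that follows
    # "has DOWN'ed drive", then count occurrences per drive.
    sed -n "/ltid/s/.*has DOWN'ed drive \([^ ]*\).*/\1/p" "$1" \
        | sort | uniq -c | awk '{print $2, $1}'
}

# Usage:
#   list_downed_drives /var/adm/messages    # Solaris
#   list_downed_drives /var/log/messages    # Linux
```

The device errors logged by the OS just before each DOWN entry are usually the more interesting part, so treat this as a starting point for finding the right timestamps.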

mph999 · Answer

Add VERBOSE to vm.conf, then create the following log directories:

mkdir /usr/openv/netbackup/logs/bptm
mkdir /usr/openv/netbackup/logs/bpbrm
mkdir /usr/openv/volmgr/debug/robots
mkdir /usr/openv/volmgr/debug/daemon
mkdir /usr/openv/volmgr/debug/ltid
mkdir /usr/openv/volmgr/debug/oprd
mkdir /usr/openv/volmgr/debug/reqlib
mkdir /usr/openv/volmgr/debug/tpcommand
mkdir /usr/openv/volmgr/debug/acssi   # ACS only
mkdir /usr/openv/volmgr/debug/acsd    # ACS only

Create the following empty files:

touch /usr/openv/volmgr/DRIVE_DEBUG
touch /usr/openv/volmgr/ROBOT_DEBUG
touch /usr/openv/volmgr/AVRD_DEBUG
touch /usr/openv/volmgr/SSO_DEBUG
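
If you have to repeat this on several media servers, the directories and touch files listed above could be created in one go with a small wrapper like the one below. This is only a sketch: `setup_debug_logging` is a made-up name, the base path defaults to /usr/openv as in the commands above, and the optional second argument `acs` is an assumption added here to make the ACS directories optional.

```shell
#!/bin/sh
# Hypothetical helper: create the debug log directories and touch files
# from the list above. Pass "acs" as second argument for ACS robots.
setup_debug_logging() {
    base="${1:-/usr/openv}"

    for d in netbackup/logs/bptm netbackup/logs/bpbrm \
             volmgr/debug/robots volmgr/debug/daemon volmgr/debug/ltid \
             volmgr/debug/oprd volmgr/debug/reqlib volmgr/debug/tpcommand; do
        mkdir -p "$base/$d"
    done

    # ACS-specific logs are only useful when an ACS robot is configured.
    if [ "${2:-}" = "acs" ]; then
        mkdir -p "$base/volmgr/debug/acssi" "$base/volmgr/debug/acsd"
    fi

    # Empty touch files that switch on the extra debug output.
    for f in DRIVE_DEBUG ROBOT_DEBUG AVRD_DEBUG SSO_DEBUG; do
        touch "$base/volmgr/$f"
    done
}

# Usage (as root on the media server):
#   setup_debug_logging /usr/openv acs
```

Using `mkdir -p` means the script can be re-run safely if some of the directories already exist.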

Also collect the system messages log.

http://www.symantec.com/docs/TECH169477

Martin

mph999 · Answer

I missed a bit ...

To reconfigure devices:

http://www.symantec.com/docs/TECH125956

It is difficult to know when to reconfigure devices. Sometimes it is obvious: for example, you may see a 'missing path' because things at the OS level have changed.

Sometimes we can look through every log and find no answers, only to discover that, for no reason you will ever explain, a re-config of the devices fixes the issue ... This, however, is fairly rare compared to the other possible causes.

Generally (depending on the complexity of the system) I will reconfigure the drives at the beginning of a call, just to eliminate this as a possible cause; we then know for sure the config is fine. If I do this, I'll delete the 'old' devices first.

The thing to remember with drive (and robot) issues is that 95% of these calls are not caused by NBU. Tape/drive issues should not automatically be blamed on NBU, which is usually what happens.

Why ...

The OS performs most of the operations (e.g. read, write, positioning). Therefore, although not impossible, it is very, very unlikely that these sorts of issues have anything to do with NBU.

Off the top of my head, I can only think of three 'NBU' issues relating to drives/libraries that are actually 'our' fault, and all of these are fixed in the latest NBU version.

Martin

mph999 · Answer

You need to create the logs I explained before.
Martin
