mph999
Level 6
Employee Accredited

 

Hi All,
 
As promised, here is a copy of the tperr.sh script, to assist with investigating tape/drive issues in NetBackup.
 
Note: please rename the script from tperr.txt to tperr.sh.
 
It is quite difficult to explain how to use tperr.sh, so I will give examples.
 
NOTE.
 
The most important part :
 
SYMANTEC DO NOT SUPPORT THIS SCRIPT
 
This is written for Solaris only. It was tested on Solaris 10 with NBU 6.5 - 7.1, though it should be OK at 7.5
I have no plans to rewrite this for any other OS, or, in any other 'language' 
It is not designed to be 'pretty' - it is designed to be functional
I have tested the script to the best of my ability - as far as I know it works
If you plan to run this script on your server, check it first - if you are not happy with it, DO NOT run it.
 
I have seen no issues with the script, BUT ...
 
It loads the drive and media list into an associative array. If you have many thousands of media this will be quite big, and I don't know what will happen.  If in doubt, use the -n option
 
 
About the script:
 
It was written to help determine whether media freezing errors are caused by the drives or the tapes (or both).
It works using statistics on the /usr/openv/netbackup/db/media/errors file - you compare the results of each tape/drive against the others, so the more entries in the media/errors file, the greater the accuracy.
In other words - if you only have a few lines, the results are not likely to be useful.  If you have hundreds of lines, the results have been shown to be very accurate.
 
It does not give magic answers, it gives details that you can use, to come to your own conclusions.
 
The errors file can have incorrect lines. For example, a line may be missing the media details but contain the drive details.  Such lines will give incorrect output.
To work around this, tperr.sh checks that the tape / drive on each line exists in EMM - if not, the line is ignored.
 
If the errors file is copied to another (non-netbackup) server the script will filter out every line (as none will be found) - use -n to turn off this check.
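 
To illustrate the idea of the check (this is a rough sketch, not the code in tperr.sh, and it does not query EMM), the filter below keeps only lines whose media ID and drive name appear in two plain-text lists. The function name filter_known and the two list files are hypothetical, and the six-field line layout (date, time, media ID, drive index, error type, drive name) is an assumption inferred from the sample line shown later in this post.

```shell
#!/bin/sh
# Sketch only, not taken from tperr.sh.
# ASSUMPTION: a valid errors-file line has six whitespace-separated fields:
#   date  time  media_id  drive_index  error_type  drive_name
# ASSUMPTION: the known media IDs and drive names have been written to two
# plain files, one name per line (hypothetical; tperr.sh itself checks EMM).
# On Solaris, nawk may be needed in place of awk.
filter_known() {
    # $1 = file of known media IDs, $2 = file of known drive names,
    # $3 = errors file
    awk -v mediafile="$1" -v drivefile="$2" '
    BEGIN {
        # Load both lists into associative arrays for fast lookup
        while ((getline line < mediafile) > 0) media[line] = 1
        while ((getline line < drivefile) > 0) drive[line] = 1
    }
    # Print a line only if it is complete and both names are known
    NF == 6 && ($3 in media) && ($6 in drive)' "$3"
}
```

Usage would be along the lines of: filter_known media.list drive.list /usr/openv/netbackup/db/media/errors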
 
 
The 'help' info can be displayed using tperr.sh -h
 
List of valid options to use on tperr.sh:
tperr.sh -a
tperr.sh -a -f /tmp/file
tperr.sh -d <drivename>
tperr.sh -d <drivename> -u
tperr.sh -d <drivename> -f <alternate file>
tperr.sh -d <drivename> -f <alternate file>  -u
tperr.sh -t <media id>
tperr.sh -t <media id> -u
tperr.sh -t <media id> -f <alternate file>
tperr.sh -t <media id> -f <alternate file>  -u
 
Description of options:
[-a] - General summary of media/drive errors
[-f <filename>] - Specify alternate location of errors file.  This can be copied from another server
[-d <drivename>] - For any tape that had an error in <drivename>, shows the other drives that these tapes had an error in
[-t <media id>] - For any drive that had an error with <media id>, shows the other tapes that the drive had an error with
[-u] - Used with either the -d or -t options to limit the output to only the <drivename> or <media id> specified
[-n] - Do not validate each line of the errors file.  This is required if the errors file is from another NBU environment
[-m <mail address>] - Send output to <mail address>.  Can be used with any option.
[-l] - Set up media manager logs and increase verbose levels
[-L] - Collect media manager logs
 
 
For the -m option to work, the Solaris mailx command must be working.
It can be tested with a command such as  mailx -s "Test mail" <email address> </dev/null
 
 
Basic use - tperr.sh -a
 
Errors File exists ....
R0TP00 has had errors in 2 different drives   (Total occurrences (errors) of this volume is 3)
R0TP01 has had errors in 2 different drives   (Total occurrences (errors) of this volume is 14)
R0TP02 has had errors in 2 different drives   (Total occurrences (errors) of this volume is 10)
 
 
HP.DAT72X6.000 has had errors with 3 different tapes   (Total occurrences (errors) for this drive is 19)
HP.C5713A.000 has had errors with 3 different tapes   (Total occurrences (errors) for this drive is 8)
 
 
Here we can easily spot trends, for example, if a media has errors in many more drives than other media, or it has many more errors than the others.
In this output, tape R0TP01 has the highest number of errors among the tapes (14), and drive HP.DAT72X6.000 has the highest number among the drives (19).
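 
The per-tape half of this summary can be approximated with a few lines of awk. This is only a sketch of the counting, not the code in tperr.sh: the function name summarise_media is hypothetical, and the six-field line layout (date, time, media ID, drive index, error type, drive name) is an assumption inferred from the sample line shown later in this post. It prints one line per tape: media ID, number of different drives, total errors.

```shell
#!/bin/sh
# Sketch only, not taken from tperr.sh.
# ASSUMPTION: a valid errors-file line has six whitespace-separated fields:
#   date  time  media_id  drive_index  error_type  drive_name
# On Solaris, nawk may be needed in place of awk.
summarise_media() {
    # $1 = path to the errors file
    awk 'NF == 6 {
        total[$3]++                  # every error counts towards the tape total
        key = $3 SUBSEP $6
        if (!(key in seen)) {        # first time this tape/drive pair is seen
            seen[key] = 1
            drives[$3]++
        }
    }
    END {
        for (m in total)
            printf "%s %d %d\n", m, drives[m], total[m]
    }' "$1" | sort
}
```

Usage would be along the lines of: summarise_media /usr/openv/netbackup/db/media/errors
Swapping $3 and $6 gives the equivalent per-drive figures.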
 
 
tperr.sh -d HP.DAT72X6.000
 
This output shows, for any tape that had an error in the <drive>, which other drives that tape also had an error in.   The numbers in ( )s are the total error count.
 
 
 The tapes that had an error in drive HP.DAT72X6.000, also had errors in the following other drives ...
 
 
Media - R0TP00
  Drive - HP.C5713A.000 (       2 )
  Drive - HP.DAT72X6.000 (       1 )
Media - R0TP01
  Drive - HP.C5713A.000 (       2 )
  Drive - HP.DAT72X6.000 (      12 )
Media - R0TP02
  Drive - HP.C5713A.000 (       4 )
  Drive - HP.DAT72X6.000 (       6 )
 
 
Here we see that the other drive (HP.C5713A.000) has lower error counts for two of the three tapes (R0TP01: 2 against 12, R0TP02: 4 against 6).  This may indicate that the issue is with drive HP.DAT72X6.000
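 
The cross-reference behind this kind of output can be sketched as follows. Again, this is an illustration and not the code in tperr.sh: drive_xref is a hypothetical name, and the six-field line layout (date, time, media ID, drive index, error type, drive name) is an assumption. For every tape that had an error in the given drive, it prints the tape, each drive it had errors in, and the count.

```shell
#!/bin/sh
# Sketch only, not taken from tperr.sh.
# ASSUMPTION: a valid errors-file line has six whitespace-separated fields:
#   date  time  media_id  drive_index  error_type  drive_name
# On Solaris, nawk may be needed in place of awk.
drive_xref() {
    # $1 = drive name, $2 = path to the errors file
    awk -v target="$1" 'NF == 6 {
        count[$3 " " $6]++                        # errors per tape/drive pair
        if (($6 "") == (target "")) hit[$3] = 1   # tape errored in the target drive
    }
    END {
        for (k in count) {
            split(k, p, " ")                      # p[1] = tape, p[2] = drive
            if (p[1] in hit)
                printf "%s %s %d\n", p[1], p[2], count[k]
        }
    }' "$2" | sort
}
```

Usage would be along the lines of: drive_xref HP.DAT72X6.000 /usr/openv/netbackup/db/media/errors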
 
 
We can also do the opposite, specify a tape ...
 
tperr.sh -t R0TP01
 
 
 The drives that had an error with media R0TP01 also had errors with the following other media  ...
 
NetBackup Drive - HP.C5713A.000
  R0TP00 (       2 )
  R0TP01 (       2 )
  R0TP02 (       4 )
NetBackup Drive - HP.DAT72X6.000
  R0TP00 (       1 )
  R0TP01 (      12 )
  R0TP02 (       6 )
 
 
 
With either the -t or -d options, we can use -u to limit the results to only the tape or drive specified ...  e.g.
 
 
tperr.sh -t R0TP01 -u
Unique file option
Errors File exists ....
 
 
The drives that had an error with media R0TP01 also had errors with the following other media  ...
 
NetBackup Drive - HP.C5713A.000
  R0TP01 (       2 )
NetBackup Drive - HP.DAT72X6.000
  R0TP01 (      12 )
 
As we see - this limits the output to just the media 'R0TP01'.  This is the way to determine exactly which drives the errors occurred in for any media.
 
-u can also be used with -d
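 
For a single tape, the per-drive counts can also be cross-checked by hand. The sketch below is not the code in tperr.sh: media_by_drive is a hypothetical name, and the six-field line layout (date, time, media ID, drive index, error type, drive name) is an assumption. It prints each drive the tape had errors in, with the count.

```shell
#!/bin/sh
# Sketch only, not taken from tperr.sh.
# ASSUMPTION: a valid errors-file line has six whitespace-separated fields:
#   date  time  media_id  drive_index  error_type  drive_name
# On Solaris, nawk may be needed in place of awk.
media_by_drive() {
    # $1 = media ID, $2 = path to the errors file
    # The "" concatenation forces a string comparison, so all-digit
    # media IDs are not compared as numbers.
    awk -v id="$1" 'NF == 6 && ($3 "") == (id "") { n[$6]++ }
    END {
        for (d in n)
            printf "%s %d\n", d, n[d]
    }' "$2" | sort
}
```

Usage would be along the lines of: media_by_drive R0TP01 /usr/openv/netbackup/db/media/errors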
 
 
By default, the script looks for the errors file in the default location (/usr/openv/netbackup/db/media/errors).  If this is moved, or the errors file is copied to another server, the -f option can be used to read in this alternate file, for example:  tperr.sh -a -f /tmp/file
 
 
As mentioned, if there are 'bad' lines in the errors file and the -n option is used to disable checking against EMM, incorrect output will be seen.
 
For example.
 
 
tperr.sh -a -n
 
Errors File exists ....
R0TP00 has had errors in 2 different drives   (Total occurrences (errors) of this volume is 3)
R0TP01 has had errors in 2 different drives   (Total occurrences (errors) of this volume is 14)
R0TP02 has had errors in 2 different drives   (Total occurrences (errors) of this volume is 10)
R0TP03 has had errors in 2 different drives   (Total occurrences (errors) of this volume is 22)
1 has had errors in 1 different drives   (Total occurrences (errors) of this volume is 14
10
1)
 
 
has had errors with 1 different tapes   (Total occurrences (errors) for this drive is )
HP.DAT72X6.000 has had errors with 4 different tapes   (Total occurrences (errors) for this drive is 31)
HP.C5713A.000 has had errors with 4 different tapes   (Total occurrences (errors) for this drive is 18)
 
 
The ONLY way to remove this 'bad' output is to either not use the -n option, or remove the bad lines from the media/errors file.
 
In this example, the bad output was caused by this 'incorrect' line, which is missing the media entry.
 
02/24/12 15:00:07 1 READ_ERROR HP.C5713A.000
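 
If -n has to be used, lines like this can be located beforehand with a simple field count. This is only a sketch and not part of tperr.sh: find_bad_lines is a hypothetical name, and it assumes a complete line has six fields (date, time, media ID, drive index, error type, drive name), so the line above, with five, is reported along with its line number.

```shell
#!/bin/sh
# Sketch only, not taken from tperr.sh.
# ASSUMPTION: a valid errors-file line has six whitespace-separated fields:
#   date  time  media_id  drive_index  error_type  drive_name
# On Solaris, nawk may be needed in place of awk.
find_bad_lines() {
    # $1 = path to the errors file
    # Report non-empty lines that do not have exactly six fields
    awk 'NF > 0 && NF != 6 { printf "%d: %s\n", NR, $0 }' "$1"
}
```

Usage would be along the lines of: find_bad_lines /tmp/file
Work on a copy of the errors file, and check each reported line by eye before removing anything.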
 
The -l option will set up the common media manager logs.  Please read the comment in the script regarding this option as it 'edits' bp.conf.
The -L option will copy the logs to a common directory.  (/usr/openv/netbackup/logs/tperr_logs).
Comments
Harpreet_Singh1
Level 3

Hi,

I have tried to run it, but it did not work for me.

It has just created a directory in /tmp/tperr_symantec_29024.

<<SunOS harpux103 5.10 Generic_142900-14>>.

With Regards.

Harpreet Singh

mph999
Level 6
Employee Accredited

Hi Harpreet - What was the issue?

Provided there are valid lines in the errors file (/usr/openv/netbackup/db/media/errors) it should work.

It looks in the default location. If you have copied the errors file to a separate location, you specify this with -f.

Example :

tperr.sh -f /path/to/errorsfile -a 

If the server you are running it on is not a NetBackup master or media server, use -n.

Martin

mph999
Level 6
Employee Accredited

I've sent you an email

Version history
Last update:
‎03-02-2012 01:34 AM
Updated by: