How do I determine a media should not be used again?

NB-OPS
Level 4

Hi,

 

I get periodic media errors that cause the tape drives to go down. How do I determine that I should not use a piece of media anymore? For example, I just got:

Jul 13 12:36:17 buphost2 scsi: [ID 107833 kern.warning] WARNING: /pci@400/pci@0/pci@d/SUNW,qlc@0/fp@0,0/st@w500308c09f1a109d,0 (st10):
Jul 13 12:36:17 buphost2        Error for Command: space                   Error Level: Fatal
Jul 13 12:36:17 buphost2 scsi: [ID 107833 kern.notice]  Requested Block: 1                         Error Block: 1
Jul 13 12:36:17 buphost2 scsi: [ID 107833 kern.notice]  Vendor: HP                                 Serial Number:
Jul 13 12:36:17 buphost2 scsi: [ID 107833 kern.notice]  Sense Key: Media Error
Jul 13 12:36:17 buphost2 scsi: [ID 107833 kern.notice]  ASC: 0x14 (recorded entity not found), ASCQ: 0x0, FRU: 0x0

 

It seems to me it is a very serious error on the media. It's marked frozen by NBU.

 

1. Is it smart enough to mark that error block as bad so that the tape can continue to be used if I unfreeze it?

2. Is it correct to assume that, even after the media has expired, it should not be used again? Basically, once it has expired, should I throw it away?

 

Thanks,

 

 

1 ACCEPTED SOLUTION

Accepted Solutions

mph999
Level 6
Employee Accredited

Here is the script

https://www-secure.symantec.com/connect/downloads/tperrsh-script-solaris-only

Also see

Media information - Comment:15 Jul 2013 : Link

https://www-secure.symantec.com/connect/forums/logging-detail-information-about-unrecoverablerecoverable-error

https://www-secure.symantec.com/connect/forums/media-diagnostic-tools


2 REPLIES

mph999
Level 6
Employee Accredited

Yes, that is serious enough to retire the tape.

There is no easy way, without specialist software, to spot media before they fail. 

You could keep an eye on the ...netbackup/db/media/errors file on each media server and count the number of times a tape or drive appears.

However, it is not exact. A few errors are perfectly fine, though not the one you have shown above.

Some entries in that file are not really errors, e.g. drive needs cleaning. Then there is the question of how many errors are too many: 5, 15, 25?

Unfortunately, it comes down to experience.
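
As a rough illustration of the counting approach described above (this is not the tperr.sh script), here is a minimal Python sketch. It assumes each line of the errors file is whitespace-separated as date, time, media ID, drive index, error type, drive name; that layout, the field positions, and the threshold are assumptions to check against your own file. The function names are made up for this example.

```python
import sys
from collections import Counter

# ASSUMPTION: whitespace-separated fields laid out as
#   date time media_id drive_index error_type drive_name
# Adjust these indexes if your errors file looks different.
MEDIA_FIELD = 2
TYPE_FIELD = 4
DRIVE_FIELD = 5


def count_errors(lines, ignored_types=frozenset()):
    """Count how often each media ID and each drive appears.

    ignored_types lets you skip entries that are not real faults
    (for example, a 'drive needs cleaning' style entry).
    """
    media = Counter()
    drives = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) <= DRIVE_FIELD:
            continue  # blank or short line, nothing to count
        if fields[TYPE_FIELD] in ignored_types:
            continue
        media[fields[MEDIA_FIELD]] += 1
        drives[fields[DRIVE_FIELD]] += 1
    return media, drives


def suspects(counts, threshold):
    """Return the names seen at least `threshold` times, sorted."""
    return sorted(name for name, n in counts.items() if n >= threshold)


if __name__ == "__main__":
    # Usage: python count_errors.py <path-to-errors-file> [threshold]
    path = sys.argv[1]
    threshold = int(sys.argv[2]) if len(sys.argv) > 2 else 5  # arbitrary default
    with open(path) as handle:
        media, drives = count_errors(handle)
    print("media at or over threshold:", suspects(media, threshold))
    print("drives at or over threshold:", suspects(drives, threshold))
```

A tape that shows errors across several drives points at the tape, while a drive that shows errors across many tapes points at the drive. The threshold is still a judgment call, and a single fatal media error like the one in the question can be enough on its own.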

Google for tperr.sh 

This is a script I wrote, available on Connect. It runs on Solaris and provides a level of analysis of the errors files.

 
