No log of tape being frozen

Daryl_Kinnaird
Level 4

NetBackup master server on HP-UX 11.23 IA-64 running NBU 6.5.5.

I have a tape (700528) that was frozen after getting 3 write errors within 12 hrs. However, none of the logs I checked show a message stating that the media was frozen and why.

 

output from /usr/openv/netbackup/db/error  for tape in question.

 1307587095 1 388 16 verstke3 4218141 0 0 cis004zil bptm cannot write image to media id 700528, drive index 13, I/O error
1307587103 1 386 8 verstke3 0 0 0 *NULL* bptm TapeAlert Code: 0x03, Type: Warning, Flag: HARD ERROR, from drive STKT9940B1 (index 13), Media Id 700528
1307587103 1 386 16 verstke3 0 0 0 *NULL* bptm TapeAlert Code: 0x04, Type: Critical, Flag: MEDIA, from drive STKT9940B1 (index 13), Media Id 700528 

 

output from bptm log for tape in question.

 19:38:15.361 [12954] <16> write_data: cannot write image to media id 700528, drive index 13, I/O error
19:38:23.396 [18408] <8> process_tapealert: TapeAlert Code: 0x03, Type: Warning, Flag: HARD ERROR, from drive STKT9940B1 (index 13), Media Id 700528
19:38:23.398 [18408] <16> process_tapealert: TapeAlert Code: 0x04, Type: Critical, Flag: MEDIA, from drive STKT9940B1 (index 13), Media Id 700528 

output from /usr/openv/netbackup/db/media/errors for tape in question

06/08/11 19:38:15 700528 13 WRITE_ERROR STKT9940B1
06/08/11 19:38:23 700528 13 TAPE_ALERT STKT9940B1 0x30000000 0x00000000

0x30000000 0x00000000 corresponds to the two TapeAlert flags reported in the bptm log and the error log: 0x03 (HARD ERROR) and 0x04 (MEDIA).
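For anyone who wants to decode those two hex words programmatically, here is a minimal Python sketch. It assumes the two words form a 64-bit mask in which TapeAlert flag 1 is the most significant bit of the first word. That layout is inferred only from the single example above (0x30000000 matching flags 0x03 and 0x04), so treat it as an assumption rather than documented behaviour. The function name decode_tapealert is made up for illustration.

```python
def decode_tapealert(word_hi, word_lo):
    """Return the TapeAlert flag numbers set in the two 32-bit hex words.

    Assumption: flag 1 is the most significant bit of the first word and
    flag 64 is the least significant bit of the second word.
    """
    bits = (int(word_hi, 16) << 32) | int(word_lo, 16)
    return [n for n in range(1, 65) if bits & (1 << (64 - n))]


# Values taken from the TAPE_ALERT line in the media errors file above.
flags = decode_tapealert("0x30000000", "0x00000000")
print([hex(f) for f in flags])  # ['0x3', '0x4']
```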

 

Status of tape from ADMIN GUI

 700528    700528    HCART2    ACS    6/8/2011 9:28    10/5/2008 9:46    6/8/2011 18:59    6/8/2011 19:00    Frozen   

I know the available_media command will show the frozen tapes as well, but it does not show why they were frozen, which is what I am looking for. I need this so I can keep track of potentially bad media and get them replaced.
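Until the freeze reason shows up in a log, one way to keep track of suspect media is to apply the "3 errors within 12 hours" rule to the media errors file directly. The Python sketch below assumes the line layout shown above (date, time, media id, drive index, error type, drive name) and assumes that every entry other than TAPE_ALERT counts toward the threshold. The names parse_line and freeze_candidates are hypothetical, and this is not how NetBackup itself decides.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def parse_line(line):
    """Split one media errors line into (timestamp, media_id, error_type)."""
    parts = line.split()
    if len(parts) < 6:
        return None
    try:
        when = datetime.strptime(parts[0] + " " + parts[1], "%m/%d/%y %H:%M:%S")
    except ValueError:
        return None  # skip lines that do not start with a date and time
    return when, parts[2], parts[4]


def freeze_candidates(lines, threshold=3, window=timedelta(hours=12)):
    """Return {media_id: [error types]} for media with `threshold` errors in `window`."""
    by_media = defaultdict(list)
    for line in lines:
        rec = parse_line(line)
        # Assumption: TAPE_ALERT entries are not counted as errors here.
        if rec and rec[2] != "TAPE_ALERT":
            by_media[rec[1]].append((rec[0], rec[2]))
    result = {}
    for media_id, events in by_media.items():
        events.sort()
        for i in range(len(events) - threshold + 1):
            chunk = events[i:i + threshold]
            if chunk[-1][0] - chunk[0][0] <= window:
                result[media_id] = [kind for _, kind in chunk]
                break
    return result


# The two lines quoted above for tape 700528.
sample = [
    "06/08/11 19:38:15 700528 13 WRITE_ERROR STKT9940B1",
    "06/08/11 19:38:23 700528 13 TAPE_ALERT STKT9940B1 0x30000000 0x00000000",
]
print(freeze_candidates(sample))
```

With only those two lines the result is empty, because a single WRITE_ERROR does not reach the threshold. To scan the real file, pass the lines of /usr/openv/netbackup/db/media/errors instead of the sample list.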

1 ACCEPTED SOLUTION

Marianne
Moderator
Partner    VIP    Accredited Certified

I have noticed that the logging behaviour changed in NBU 6.x. Up to and including NBU 5.1, I could simply run 'bperror -media -hoursago |grep -i freez' daily on the master and catch all tapes being frozen on all media servers.

bperror/Tape Logs report will still log errors like these:

FREEZING  media id B02176, it is write protected and cannot be used for backups

FREEZING  media id B02696, it is write protected and cannot be used for backups

FREEZING  media id B02970, too many data blocks written, check tape/driver block size configuration

These tapes were also frozen, but the Media Log report contains ONLY the TapeAlert (the TapeAlert code is supposed to tell us that the tape has been frozen - as per the TapeAlert codes in Admin Guide II):

TapeAlert  Code: 0x08, Type: Warning, Flag: NOT DATA GRADE, from drive IBM.ULTRIUM-TD2.005 (index 5), Media Id BO1826

TapeAlert  Code: 0x0e, Type: Critical, Flag: UNREC. MECH. CARTRIDGE FAILURE, from drive IBM.ULTRIUM-TD2.011 (index 14), Media Id B94543

TapeAlert  Code: 0x04, Type: Critical, Flag: MEDIA, from drive IBM.ULTRIUM-TD2.011 (index 14), Media Id B04829
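Since the freeze can now show up either as a FREEZING message or only as a TapeAlert, a daily check has to look for both. The following Python sketch does that over saved report lines. The set of codes is limited to the three TapeAlert codes quoted in this post (0x04, 0x08, 0x0e); the complete list of codes that freeze media is in the Admin Guide II table and is not reproduced here. The name freeze_reasons is hypothetical.

```python
import re

FREEZING_RE = re.compile(r"FREEZING\s+media id (\w+),\s*(.*)")
TAPEALERT_RE = re.compile(
    r"TapeAlert\s+Code: (0x[0-9a-fA-F]+), Type: (\w+), Flag: (.+?), "
    r"from drive (\S+) \(index (\d+)\), Media Id (\w+)"
)

# Assumption: only the codes quoted in this thread; extend from Admin Guide II.
FREEZE_CODES = {0x04, 0x08, 0x0E}


def freeze_reasons(lines):
    """Return (media_id, reason) pairs for lines that indicate a freeze."""
    found = []
    for line in lines:
        m = FREEZING_RE.search(line)
        if m:
            found.append((m.group(1), m.group(2).strip()))
            continue
        m = TAPEALERT_RE.search(line)
        if m and int(m.group(1), 16) in FREEZE_CODES:
            reason = "TapeAlert %s (%s)" % (m.group(1), m.group(3))
            found.append((m.group(6), reason))
    return found


# Lines copied from the examples in this thread.
report = [
    "FREEZING  media id B02176, it is write protected and cannot be used for backups",
    "TapeAlert  Code: 0x04, Type: Critical, Flag: MEDIA, from drive IBM.ULTRIUM-TD2.011 (index 14), Media Id B04829",
]
for media_id, reason in freeze_reasons(report):
    print(media_id, reason)
```

The regular expressions are matched against the message text only, so the same function should work on lines from the Media Log report or from the bptm log, as long as the message wording matches the examples above.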

7 REPLIES

Marianne
Moderator
Partner    VIP    Accredited Certified

The TapeAlert is causing the media freeze.

Please see NBU Admin Guide II under Using TapeAlert:

p. 192:

A set of TapeAlert conditions are defined that can cause the media in use to be frozen. An additional set of conditions are defined that can cause a drive to be downed. Table 3-13 on page 192 describes the TapeAlert codes.

Andy_Welburn
Level 6

"... media was frozen and why"

The fact that NetBackup is "FREEZING media id XXXXXX, etc etc etc" should be in the media server's bptm log. Maybe the logging level needs increasing? Also, have you checked the OS logs?

A few TechNotes for NB7.x that are still relevant:

Logs for troubleshooting frozen media
http://www.symantec.com/business/support/index?page=content&id=HOWTO33178

Frozen media troubleshooting considerations
http://www.symantec.com/business/support/index?page=content&id=HOWTO33061

About conditions that cause media to freeze
http://www.symantec.com/business/support/index?page=content&id=HOWTO33062

Daryl_Kinnaird
Level 4

I understand why the tape was frozen: it had 3 write errors within the default 12 hr window. What I don't see in the logs, OS or NetBackup, is a statement saying the tape was frozen.

Another example: when I checked frozen tapes today, one (701922) had been frozen on the 12th because it had 3 POSITION ERRORS. I checked the log files (bptm, errors, syslog.log) and can see the errors in the bptm log and in the errors log file, but again there is no message saying that the tape is being frozen because of the errors.

mph999
Level 6
Employee Accredited

The process that freezes a tape or downs a drive is ltid.

The log is

/usr/openv/volmgr/debug/ltid (create the directory)

To set verbose, add

VERBOSE

to the /usr/openv/volmgr/vm.conf file. There is no number after VERBOSE, just the word.

I would create these to set up a reasonable log collection for media manager operations:

mkdir /usr/openv/volmgr/debug/ltid

 

mkdir /usr/openv/volmgr/debug/tpcommand

 

mkdir /usr/openv/volmgr/debug/robots

Create these empty files:

/usr/openv/volmgr/DRIVE_DEBUG
/usr/openv/volmgr/ROBOT_DEBUG

Restart ltid:

stopltid

ltid -v

The touch files will increase the drive and robot messages in the system log.

 

Martin

Daryl_Kinnaird
Level 4

I added VERBOSE to the vm.conf file and created the empty DRIVE_DEBUG and ROBOT_DEBUG files. ltid will be stopped and restarted during our weekly maintenance on Thursday.

J_H_Is_gone
Level 6

If you have NOM, you can set up an alert to tell you when a tape is frozen.