cancel
Showing results for 
Search instead for 
Did you mean: 

Media frozen, but that was not the media in tape drive...

Laurent_J
Level 3

Hi to all,

This night, my backup policies ran and terminated, 3 policies, starting a the same time (23:30) but with a strange thing in logs :

 

log1 :


9/10/2010 23:40:36 - requesting resource bsfe20163-LTO4
19/10/2010 23:40:36 - requesting resource bsfe20163.NBU_CLIENT.MAXJOBS.sav-prod
19/10/2010 23:40:36 - requesting resource bsfe20163.NBU_POLICY.MAXJOBS.Sauvegarde_Production
19/10/2010 23:40:36 - granted resource bsfe20163.NBU_CLIENT.MAXJOBS.sav-prod
19/10/2010 23:40:36 - granted resource bsfe20163.NBU_POLICY.MAXJOBS.Sauvegarde_Production
19/10/2010 23:40:36 - granted resource VEND-1
19/10/2010 23:40:36 - granted resource bsfe20163-LTO4
19/10/2010 23:40:36 - estimated 13298089 Kbytes needed
19/10/2010 23:40:45 - connecting
19/10/2010 23:40:46 - connected; connect time: 00:00:01
20/10/2010 00:32:36 - granted resource MARD-2
20/10/2010 00:32:36 - granted resource IBM.ULTRIUM-HH4.000
20/10/2010 00:32:36 - granted resource bsfe20163-LTO4
20/10/2010 00:32:38 - mounting MARD-2
20/10/2010 00:32:40 - mounted; mount time: 00:00:02
20/10/2010 00:32:40 - positioning MARD-2 to file 1
20/10/2010 00:32:43 - positioned MARD-2; position time: 00:00:03
20/10/2010 00:32:43 - begin writing
20/10/2010 02:50:32 - end writing; write time: 02:17:49
the requested operation was successfully completed(0)


The tape drive is standalone, obviously noone change the tape in the drive, i personnaly put the tape in the drive, and this was MARD-2, not VEND-1...50 minutes later, the tape drive granted the good tape...

 

But look, this is log2 :


19/10/2010 23:30:02 - requesting resource bsfe20163-LTO4
19/10/2010 23:30:02 - requesting resource bsfe20163.NBU_CLIENT.MAXJOBS.sav-prod
19/10/2010 23:30:02 - requesting resource bsfe20163.NBU_POLICY.MAXJOBS.Sauvegarde_Archivage
19/10/2010 23:30:03 - granted resource bsfe20163.NBU_CLIENT.MAXJOBS.sav-prod
19/10/2010 23:30:03 - granted resource bsfe20163.NBU_POLICY.MAXJOBS.Sauvegarde_Archivage
19/10/2010 23:30:03 - granted resource LUND-2
19/10/2010 23:30:03 - granted resource IBM.ULTRIUM-HH4.000
19/10/2010 23:30:03 - granted resource bsfe20163-LTO4
19/10/2010 23:30:04 - estimated 28615585 Kbytes needed
19/10/2010 23:30:04 - started process bpbrm (2112)
19/10/2010 23:30:16 - Error bptm(pid=1924) FREEZING media id LUND-2, Medium identifiers do not match    
19/10/2010 23:30:16 - Warning bptm(pid=1924) media id LUND-2 load operation reported an error     
19/10/2010 23:30:16 - current media LUND-2 complete, requesting next resource Any
19/10/2010 23:30:19 - mounting LUND-2
19/10/2010 23:30:21 - connecting
19/10/2010 23:30:22 - connected; connect time: 00:00:01
19/10/2010 23:30:26 - current media -- complete, awaiting next media Any A pending request has been generated for this resource request.
     Operator action may be required. Pending Action: No action.,
     Media ID: VEND-1, Barcode: --------, Density: hcart, Access Mode: Write,
     Action Drive Name: N/A, Action Media Server: N/A, Robot Number: N/A, Robot Type: NONE,
     Volume Group: CNPT, Action Acs: N/A, Action Lsm: N/A
    
20/10/2010 00:32:36 - granted resource MARD-2
20/10/2010 00:32:36 - granted resource IBM.ULTRIUM-HH4.000
20/10/2010 00:32:36 - granted resource bsfe20163-LTO4
20/10/2010 00:32:38 - mounting MARD-2
20/10/2010 00:32:40 - mounted; mount time: 00:00:02
20/10/2010 00:32:40 - positioning MARD-2 to file 1
20/10/2010 00:32:43 - positioned MARD-2; position time: 00:00:03
20/10/2010 00:32:43 - begin writing
20/10/2010 05:44:01 - end writing; write time: 05:11:18
the requested operation was successfully completed(0)


The tape id drive is, this time, seen as LUND-2, and an error is generated (i guess the tape drive reported the SN of MARD-2 tape, in the bptm log the SN mismatch). The tape LUND-2 is frozen.....50 minutes later, at the same time that the previous log, the tape drive granted the good tape...

 

And now the log3 :


19/10/2010 23:30:02 - requesting resource bsfe20163-LTO4
19/10/2010 23:30:02 - requesting resource bsfe20163.NBU_CLIENT.MAXJOBS.bsfe20163
19/10/2010 23:30:02 - requesting resource bsfe20163.NBU_POLICY.MAXJOBS.Sauvegarde_VCB_et_Catalogue
19/10/2010 23:30:15 - granted resource bsfe20163.NBU_CLIENT.MAXJOBS.bsfe20163
19/10/2010 23:30:15 - granted resource bsfe20163.NBU_POLICY.MAXJOBS.Sauvegarde_VCB_et_Catalogue
19/10/2010 23:30:15 - granted resource LUND-2
19/10/2010 23:30:15 - granted resource IBM.ULTRIUM-HH4.000
19/10/2010 23:30:15 - granted resource bsfe20163-LTO4
19/10/2010 23:30:16 - estimated 194110959 Kbytes needed
19/10/2010 23:30:16 - current media LUND-2 complete, requesting next resource Any
19/10/2010 23:30:16 - started
19/10/2010 23:30:19 - mounting LUND-2
19/10/2010 23:30:26 - current media -- complete, awaiting next media Any A pending request has been generated for this resource request.
     Operator action may be required. Pending Action: No action.,
     Media ID: VEND-1, Barcode: --------, Density: hcart, Access Mode: Write,
     Action Drive Name: N/A, Action Media Server: N/A, Robot Number: N/A, Robot Type: NONE,
     Volume Group: CNPT, Action Acs: N/A, Action Lsm: N/A
    
19/10/2010 23:30:32 - connecting
19/10/2010 23:30:32 - connected; connect time: 00:00:00
20/10/2010 00:32:36 - granted resource MARD-2
20/10/2010 00:32:36 - granted resource IBM.ULTRIUM-HH4.000
20/10/2010 00:32:36 - granted resource bsfe20163-LTO4
20/10/2010 00:32:38 - mounting MARD-2
20/10/2010 00:32:40 - mounted; mount time: 00:00:02
20/10/2010 00:32:40 - positioning MARD-2 to file 1
20/10/2010 00:32:43 - positioned MARD-2; position time: 00:00:03
20/10/2010 00:32:43 - begin writing
20/10/2010 02:24:46 - end writing; write time: 01:52:03
the requested operation was successfully completed(0)


idem, tape drive reported LUND-2, frozen in log2 so tape complete...waiting....50 minutes later, tape drive reported the good tape in drive...

 

After that, all backup ran, copy, and terminated with no errors in admin console ! But only in admin console, because "bpstart" script in sauvegarde_production (log2) was not launched...but bpend script was !

Somes ideas please ? For example, how to force the tape drive to say to the media manager what tape is in drive ?

 

Have a good day,

10 REPLIES 10

Andy_Welburn
Level 6

changed recently?

Have seen similar issues whenthe drives have been configured incorrectly e.g. what NetBackup sees as drive 1 is actually drive 2 so the software labels written to the tape then no longer match the barcode labels & NB will use this to verify the correct tape is loaded - otherwise it freezes the tape.

See if this T/N can help:

Steps to verify device configuration using robtest.
http://www.symantec.com/business/support/index?page=content&id=TECH29201

Laurent_J
Level 3

Hi andy, thank you for your answer.

This is not a new configuration, and no hardware modifications were made recently (no H/W mods were made, at all...).

I looked your T/N but the tape drive is a standalone one, i can't use robtest in this configuration :)

Andy_Welburn
Level 6

Sorry, missed that bit!

***EDIT***

All I can suggest is load the frozen tape & read the label on it?

If there's no data on the tape, maybe worth re-labelling it?

You've probably already seen this one(!):

How to perform Media Management with standalone Drives in NetBackup 5.x and 6.x.
http://www.symantec.com/business/support/index?page=content&id=TECH56903

Marianne
Level 6
Partner    VIP    Accredited Certified

Please enable bptm log for future troubleshooting.

Also:

Add this line to <install-path>\veritas\volmgr\vm.conf

VERBOSE

Restart NBU Device Management service.

Check Event Viewer Application log for errors if this problem occurs again.

Laurent_J
Level 3

I have an explanation of all tapes names indicated in logs :

First backup to run, log2 - 23:30:

Tape drive says tape in drive is LUND-2, NB grants this tape, but reports SN is MARD-2 one. NB freezes LUND-2 because, obviously, SN mismatchs.

 

Second backup to run, log3 - 23:30:

NB grants the tape LUND-2, but in the same time NB freeze it. LUND-2 is in state complete, and backup wait for another one...

 

third backup to run, log1 - 23:40:

10 minutes after LUND-2 was frozen, backup3 launchs, and request directly anther tape (because NB knows that tape reported in drive is frozen). NB grants another tape, VEND-1 (which was tape for the previous backup) and assign drive to mount this one. As a standalone one, if any tape of the same pool is mounted in tapedrive (and was not frozen..) the tape will be granted in place of VEND-1.

 

00:32:

tapedrive reports another tape in drive, MARD-2, and all backup start...

 

What do you think of this explanation ?

And if this is correct, my problem could be the tapedrive that do not correctly refresh tape SN/label on tape insertion, or do not communicate correctly with Media Manager ?

And why, 52 minutes later, tapedrive indicated good tape in drive, is there a "refresh" setting for that in NBU ?

 

PS : Sorry for my english, i hope you will read (or decrypt...) what i wanted to say :D

Marianne
Level 6
Partner    VIP    Accredited Certified

I FULLY understand your explanation. We need bptm log to see what is happening at tape management level - activity monitor does contain all relevant details. bptm log will also tell us if the internal media id (contained in the tape header) is read correctly.

bptm log as well as VERBOSE level media manager log (logged to Event Viewer application log) is needed for further troubleshooting.

 

It is also recommended to SUSPEND media that you eject from the tape drive. This way NBU will not ask for a previous tape to be loaded.

Laurent_J
Level 3

Ho, i didn't notice Mariane post, sorry !

 

I did what you said.

In the same time, i made a log file of 2 logs files from bptm process, between 10/19 23:30 and 10/20 06:00.

I attached it in this post, if you want to dig in ;)

Marianne
Level 6
Partner    VIP    Accredited Certified

Was the tape drive replaced at any point in time?

 

Extract from bptm log just before media is FROZEN:

23:30:16.683 [1924.2956] <4> manage_drive_attributes: expected serial number [AA0PC8KPX4], reported [AA0PC8L1DK]

 

The problem could also be related to tape drive firmware version.

Have a look at this TN: http://www.symantec.com/business/support/index?page=content&id=TECH51486

I know the TN mentions DLT, but it seems the same applies to your tape drive.

Verify tested driver and firmware versions in the Hardware Compatibility Guide:

http://www.symantec.com/business/support/index?page=content&id=TECH59978

Laurent_J
Level 3

No, the tape drive was not replaced, but some OS updates from Microsoft were installed yesterday.

Firmware of the tape drive is up to date, but the driver was Microsoft one, and not in the compatibility list. I downloaded and installed the manufacturer driver in the compatibility list, and re-configured the tape drive.

Hope this can be my solution, thanks for your help !

If something happen in the next few days i will post in this thread.

Have a good day :)

MarcoV
Level 3
Partner Accredited Certified

Hi,

Double check the mentioned S/N pointed out by Marianne. 

Did the server reboot after the M$ patches? Maybe the Tape Symbolic Names swapped after installing the M$ patches / the reboot.

Having supported tape drivers is always nice.. ;-))
Maybe therefore the M$ driver is not supported due to inconsistent behavior...

Regards,