05-30-2017 05:13 AM
Hello ALL,
Netbackup 7.1
Redhat 2.6
We are suffering from the following issue:
When we try to restore from tape, the restore stops with this error:
cannot read media header, may not be NetBackup media or is corrupted (172)
We've tried the steps in this link:
https://www.veritas.com/support/en_US/article.TECH18639
and then with this one
(Following this link, we couldn't get the block size:
bpmedialist -mcontents -m E978L5
cannot read media header, may not be NetBackup media or is corrupted)
Do you have any suggestions?
BR,
Turgun
05-30-2017 05:20 AM
05-30-2017 05:26 AM
05-30-2017 05:33 AM - edited 05-30-2017 05:41 AM
I assume that you are trying to restore from a tape drive of the same vendor. If not, and you know which drive the backup used, try to use that drive for the restore.
Check whether you can read the tape and whether the tape header is OK.
05-30-2017 05:59 AM
No, I couldn't read the media header:
# /usr/openv/volmgr/bin/tpreq -m E978L5 -d dlt -p SN_ArchiveSMSC2 -f /tmp/292
# mt -f /tmp/292 rewind
# dd if=/tmp/292 bs=1024 | od -cx
0000000 \0 \0 \0 002 g e o s m s c 2 _ 1 4 4
0000 0200 6567 736f 736d 3263 315f 3434
0000020 9 1 2 0 7 2 7 \0 \0 \0 \0 \0 \0 \0 \0 \0
3139 3032 3237 0037 0000 0000 0000 0000
0000040 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
0000 0000 0000 0000 0000 0000 0000 0000
*
0000200 \0 \0 \0 \0 V _ 323 327 177 377 377 377 \0 \0 \0 \t
0000 0000 5f56 d7d3 ff7f ffff 0000 0900
0000220 \0 \0 \0 001 \0 \0 005 240 \0 004 \0 \0 \0 \0 \0 \0
0000 0100 0000 a005 0400 0000 0000 0000
0000240 E 9 7 8 L 5 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
3945 3837 354c 0000 0000 0000 0000 0000
0000260 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 001 \0 003 \0 \0
0000 0000 0000 0000 0000 0100 0300 0000
0000300 \0 \0 \0 \0 \0 \0 \0 \0 T h I s I s
0000 0000 0000 0000 6854 7349 4920 2073
0000320 A B P B a C k U p H e A d
2041 5042 4220 4361 556b 2070 6548 6441
0000340 E r \0 S N _ A r c h i v e _ S M
7245 5300 5f4e 7241 6863 7669 5f65 4d53
0000360 S C 2 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
4353 0032 0000 0000 0000 0000 0000 0000
0000400 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
0000 0000 0000 0000 0000 0000 0000 0000
*
0000540 \0 \0 \0 \0 O p e n m i n d _ A r c
0000 0000 704f 6e65 696d 646e 415f 6372
0000560 h \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
0068 0000 0000 0000 0000 0000 0000 0000
0000600 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
0000 0000 0000 0000 0000 0000 0000 0000
*
dd: reading `/tmp/292': Cannot allocate memory
1+0 records in
1+0 records out
1024 bytes (1.0 kB) copied, 0.0234027 s, 43.8 kB/s
0002000
05-30-2017 06:35 AM
It looks like something has overwritten the tape. I hope @mph999 can assist me.
If we look at the records:
This seems to be a NetBackup header; it says "BP BaCkUp HeAdEr":
0000320 A B P B a C k U p H e A d
0000340 E r \0 S N _ A r c h i v e _ S M
But hey, there is an Openmind archive here. Does Openmind ring a bell?
0000540 \0 \0 \0 \0 O p e n m i n d _ A r c
05-30-2017 06:43 AM
@Nicolai, do you mean that O p e n m i n d _ A r c overwrote the tape?
05-30-2017 06:45 AM
Yes, it does not look like a Netbackup header.
05-30-2017 07:06 AM
Is there any standard for the media header?
Where is the error?
0000540 \0 \0 \0 \0 O p e n m i n d _ A r c
Is it because of the \0 \0 \0 \0?
I ask because we have SN_Openmind policies:
# bppllist | grep Openmind
SN_Openmind
SN_Openmind_Full
Br,
Turgun
05-30-2017 07:12 AM
This media is used by this policy: SN_Archive_SMSC2
The name of the schedule is Openmind_Arch
# bppllist SN_Archive_SMSC2 -U
:
Schedule: Openmind_Arch
Type: User Archive
:
:
05-30-2017 10:52 AM
Did someone call me :0)
Tapes, my absolute favorite topic ...
Yep, that header looks wrong, it should look like this:
0000000 V O L 1 M 0 0 0 0 2 \0 \0 \0 \0 \0 \0
0000020 \0 \0 \0 \0 \0 \0 \0 001 \0 \0 \0 024 Y - t 203
0000040 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
0000060 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \n \0 \0 \0 001
0000100 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 004 \0 \0 \0 \0 \0
0000120 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
*
0000160 T h I s I s A B P t A p
0000200 E h E a D e r \0 \0 \0 \0 \0 \0 \0 \0
0000220 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
... something appears to have overwritten it.
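The comparison above can be turned into a quick check on a dumped first record. This is only a minimal sketch: it assumes the first 1024-byte record has already been saved to a file with dd, the function name classify_header is hypothetical, and the "VOL1" and "BaCkUp" markers are taken purely from the two dumps in this thread, so treat them as hints and not as a format specification.

```shell
# classify_header FILE
# FILE is assumed to hold the first record read from the tape.
classify_header() {
    file="$1"
    # The good sample above starts with the label "VOL1" followed by the media ID.
    if [ "$(head -c 4 "$file" | tr -d '\0')" = "VOL1" ]; then
        echo "media header"
    # The damaged tape's first record carries the backup header marker instead.
    elif LC_ALL=C grep -a -q 'BaCkUp' "$file"; then
        echo "backup header"
    else
        echo "unknown"
    fi
}
```

For the tape in this thread the check would be expected to report "backup header", which matches what the od dump shows at record 1.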
05-30-2017 12:33 PM
05-30-2017 11:17 PM
Not via NBU, as it will refuse to even mount the tape.
You could try to dd the data off the tape. Depending on where the logical eot (end-of-tape) marker is, though, the drive itself will refuse to read past that point, even if there is further data on the tape. E.g. think of a full NBU tape that you then relabel: the data is still there, you just can't get past the eot mark that was written after the new header.
This is where data recovery specialists come in: they have tape drives with modified firmware that can read past an eot mark and get at data that is still on the tape. The problem with this is that there is no promise of success, and it's usually very, very expensive.
05-30-2017 11:29 PM
Ok, Thanks @mph999
In which cases would the media header be overwritten?
This cartridge was in tape, and we are not using any backup software other than NetBackup.
We need to avoid similar issues in the future.
Br,
Turgun
05-31-2017 12:02 AM
We cannot really answer this question. NBU didn't overwrite the tape; I know this because the NBU backup header is still there (at least some of it). In fact NBU won't overwrite tapes, because it looks for the empty header to confirm the tape is positioned correctly. Issues can still happen if external SCSI resets cause the tape to rewind mid-backup: this is invisible to NBU and the operating system, but causes the media and backup headers to be overwritten (unless the overwrite is <1k, which is not likely). It is usually caused by devices sharing a tape drive (e.g. an NBU media server and an NDMP filer) being set to use different types of SCSI reservation.
If you run scsi_command -map on the tape like this (move the tape into a drive first, as with od -c):
/usr/openv/volmgr/bin/scsi_command -map -f /dev/rmt/0cbn
You can read the tape and get an idea of what is on it (easier to read than od -c)
root@womble 22110840 $ scsi_command -map -f /dev/rmt/0cbn
00000000: file 1: record 1: size 1024: NBU MEDIA header (TAPE03)
00000001: file 1: eof after 1 records: 1024 bytes <<<<<<<<<<<<<<<<<<< Media header size 1024 bytes
00000002: file 2: record 1: size 1024: NBU BACKUP header <<<<<<<<<< Backup header, also 1024 bytes
backup_id womble_1496216120: frag 1: file 1: copy 1
expiration 1497425720: retention 1: block_size 65536
flags 0x0: mpx_headers 0: resume_count 0: media TAPE03
00000003: file 2: record 2: size 32768 <<<<<<<<<<< Backup data (was a tiny backup)
00000004: file 2: eof after 2 records: 33792 bytes <<<<<<< (= 32768 + 1024 = backup data + backup header)
00000005: file 3: record 1: size 1024: NBU EMPTY header (file 2) <<<<< (NBU empty header)
00000006: file 3: eof after 1 records: 1024 bytes
eot <<<<< end-of-tape mark (logical)
Given that we still see the backup header on your tape, whatever splatted the media header was 1024 bytes or less in size.
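On a healthy tape the first record in the map is the media header, so one quick way to spot the problem is to pull out the type of file 1, record 1. The following is a rough sketch only: first_record_type is a hypothetical helper, it assumes the map output has the line layout shown above, and it assumes scsi_command writes the map to standard output.

```shell
# first_record_type: read scsi_command -map output on stdin and print
# the description of file 1, record 1 (the text after "size N: ").
first_record_type() {
    awk '/file 1: record 1:/ { sub(/.*size [0-9]+: /, ""); print; exit }'
}

# Sample line copied from the map output above.
printf '%s\n' '00000000: file 1: record 1: size 1024: NBU MEDIA header (TAPE03)' | first_record_type
```

Against a real drive it could be used as `scsi_command -map -f /dev/rmt/0cbn | first_record_type`. Anything other than "NBU MEDIA header (...)" in the result suggests the media header is missing or damaged.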
05-31-2017 02:06 AM
The output of scsi_command -map -f on our media is:
# scsi_command -map -f /tmp/292
00000000: file 1: record 1: size 1024: NBU BACKUP header
backup_id geosmsc2_1449120727: frag 1: file 1440: copy 1
expiration 2147483647: retention 9: block_size 262144
flags 0x0: mpx_headers 0: resume_count 0: media E978L5
00000001: file 1: record 2: size 262144
00001801: file 1: eof after 1801 records: 471860224 bytes
00001802: file 2: record 1: size 1024: NBU EMPTY header (file 1441)
00001803: file 2: eof after 1 records: 1024 bytes
eot
05-31-2017 03:13 AM
OK ... so the media header is completely missing
00000000: file 1: record 1: size 1024: NBU BACKUP header
backup_id geosmsc2_1449120727: frag 1: file 1440: copy 1
expiration 2147483647: retention 9: block_size 262144
flags 0x0: mpx_headers 0: resume_count 0: media E978L5
00000001: file 1: record 2: size 262144
00001801: file 1: eof after 1801 records: 471860224 bytes
00001802: file 2: record 1: size 1024: NBU EMPTY header (file 1441)
00001803: file 2: eof after 1 records: 1024 bytes
eot
I was hoping it might show some details about what is on the tape where the media header should be (what that dd | od -c was showing), but sadly not, so I can't really tell any more from this, apart from the fact that it doesn't look like you had a SCSI rewind event, as that tends to overwrite both headers.
There is only one image on the tape, so the header could have become damaged when it was first written (though the drive should have detected this, as it reads back what is written to validate it; the read head is positioned after the write head), or some time afterwards if the tape was manually mounted. The image dates back to 2015, so I doubt you have any logs from back then.
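The age of the image can be read straight from the backup ID, because the part after the last underscore is a Unix timestamp. A small sketch, assuming GNU date (as found on Red Hat); backup_id_date is a hypothetical helper name.

```shell
# backup_id_date BACKUP_ID
# Strip everything up to the last "_" and print the remaining
# epoch seconds as a UTC date (requires GNU date for "-d @N").
backup_id_date() {
    ctime="${1##*_}"
    date -u -d "@${ctime}" '+%Y-%m-%d %H:%M:%S UTC'
}

backup_id_date geosmsc2_1449120727   # prints 2015-12-03 05:32:07 UTC
```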
This is a long shot, but you could try the following (use the non-rewinding device node for the drive, otherwise the tape rewinds each time a command closes the device):
mt -f /dev/tapedevice rewind
mt -f /dev/tapedevice fsr 1 (should position the tape in front of file 1 record 2, the start of the data)
dd if=/dev/tapedevice bs=262144 count=1801 >/tmp/myfile.tar (count = 471860224/262144, rounded up)
/usr/openv/netbackup/bin/tar xvf /tmp/myfile.tar
(you might need to play with the count value)
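As a sanity check on the count value, the numbers from the map output can be worked through in the shell. This is just arithmetic on figures already shown in the thread; data_records is a hypothetical helper name, and the split of file 1 into one 1024-byte backup header plus full-size data records is an assumption based on that map output.

```shell
# data_records TOTAL_BYTES HEADER_BYTES BLOCK_SIZE
# Number of data records in the tape file once the backup header is excluded.
data_records() {
    echo $(( ($1 - $2) / $3 ))
}

data_records 471860224 1024 262144   # prints 1800
```

So file 1 holds 1 header record plus 1800 data records, which agrees with "eof after 1801 records". After skipping the header with fsr 1, only 1800 records remain to be read; a count of 1801 should still be harmless because dd stops when it reaches the filemark.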
05-31-2017 08:03 AM
Are you 100% sure you don't have two tapes with the same label? That could explain the scenario ...
06-01-2017 02:03 AM
Yes, I am sure.
We don't duplicate labels, as it is against our backup rules.