VHD Backup2Tape turns Tape Offline

bitformer
Level 3

Hi,

We have an issue with a BE 15 installation on a 2012 R2 Hyper-V host. The server has an internal LTO-6 tape drive and a NAS attached as an iSCSI target.

After weeks of normal operation, the backup of a 2,500 GB file server (3 VHD files) one day started taking the tape drive offline, which cancels the job. I cannot bring the drive online again by unchecking the Offline checkbox, but I can bring the tape device back online by restarting all BE services in BE Service Manager. The failure always happens after about 250 GB of that VM's backup, so probably in the same VHD and maybe at the same bit/block.

On the other hand, the backup to local disk storage (iSCSI target / NAS) with the same GRT job runs smoothly. I can even duplicate the backup set of that specific VM from the local iSCSI storage to the tape drive without any problem. All hardware checks by the vendor of the server/tape drive were successful and showed no errors.

The only indication from the day it stopped working is some defrag errors in the VM's event log a few hours before the first backup failure.

Does anyone have an idea? A bit error in the VHD? Update to BE 16? Duplicate disk IDs? A limitation of the LTO file system or of the VHD size when backing up directly to tape?

Obviously I can work around the problem by duplicating the VM from the local backup to tape.

Installation:

Hyper-V Host with 5 VMs (50-2500GB each) (all 2012 R2 latest Patchlevel)

internal LTO-6 Drive SAS

local B2D Storage (iSCSI Target from NAS)

BE 15 FP5 with HF: 115844, 116031, 116730, 126145

best regards

Torsten

8 REPLIES

pkh
Moderator
   VIP    Certified

Perhaps the tape is full.   Did you check whether there is an outstanding alert asking you to insert an overwritable tape into the tape drive?

Hi pkh,

The media set is configured with 4 days for overwrite/append. Tapes are changed daily, so each tape is always 6+ days old when it is reused.

There is a daily inventory job 15 minutes before the daily backup tasks making sure the tape is correctly recognized.

The jobs are defined as "append or overwrite, if no appendable media is available" (2nd checkbox).

There are also no outstanding alerts with a tape change request.

best regards

Torsten

Larry_Fine
Moderator
   VIP   

When BE takes the tape drive offline, it should throw some alerts and put some entries in the adamm.log file. Do you see any alerts? Can you post the adamm.log file?

What SAS HBA/controller are you using for the tape drive?  Hopefully it is not a RAID controller, unless it is one of the few certified ones.  see https://www.veritas.com/support/en_US/article.000038683

pkh
Moderator
   VIP    Certified

Hyper-V Host with 5 VMs (50-2500GB each)

2500GB = 2.5TB, which is the native uncompressed capacity of an LTO-6 tape. To back up all 5 of your VMs, you must be using compression. Compression varies with the type of data, so you may have exceeded the capacity of your tape.

There are also no outstanding alerts with tape change request.

Note that these alerts go away when the job is cancelled or BE services are restarted.  You would have to check for such alerts when the problem occurs.

- Controller is a H240 SAS HBA

- hardware compression was and is activated

- a copy of all the data from disk to tape consumes 2.02 TB on a 2.38 TB tape. The compression ratio is 1.3:1, so there is space left (300+ GB)

- the error occurs after 255GB on an empty tape (overwritable / not appendable) always with the same VHD

- there was no media request in the warning history after 2017/11/23.

- I tested the VM on its own and was watching live when the tape went offline. There was no media request, and I knew when it would happen

- there is a Tape0 and a Tape1 (Tape0 was replaced due to a hardware error a few months ago and is still listed as "deactivated"); Tape1 is the only tape device present
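
As a rough check of the figures above, the arithmetic can be written out. This is only a sketch: the numbers are the ones reported in this post, the variable names are made up for illustration, and it assumes decimal units (1 TB = 1000 GB).

```python
# Figures reported in the post; decimal units (1 TB = 1000 GB) are an assumption.
tape_capacity_tb = 2.38   # capacity shown for the tape
full_copy_tb = 2.02       # space used by a copy of all data from disk to tape
failure_point_gb = 255    # where the job takes the drive offline

# Space left on the tape after a complete copy of all VMs.
remaining_gb = round((tape_capacity_tb - full_copy_tb) * 1000)

# How far into the tape the job gets before the drive goes offline.
failure_fraction = failure_point_gb / (tape_capacity_tb * 1000)

print(f"space left after a full copy: about {remaining_gb} GB")
print(f"job fails at about {failure_fraction:.0%} of the tape")
```

With the job failing this early on an empty tape, running out of tape space does not look like the cause.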

ADAMM LOG attached

best regards

Larry_Fine
Moderator
   VIP   

Is the H240 set to Simple mode, rather than RAID mode?  It appears to be a dual mode card and only one mode would support tape.  https://support.hpe.com/hpsc/doc/public/display?docId=emr_na-c04496620

These entries from the adamm.log appear to be either a SAS communications failure or the tape device simply vanishing. But I agree that your description of the issue and its apparent reproducibility make me suspicious of the data as well. Unfortunately, I personally know of no data that could cause this sort of scenario.

[37756] 11/23/17 10:13:02.718 DeviceIo: 00:02:01:00 - Refresh handle on "\\.\Tape1", SCSI cmd 16, new handle ffffffff, error 2
[37756] 11/23/17 10:13:02.718 DeviceIo: FindVanishedDevice: SwitchOver Adapter Info NULL for pathIde=2
[37756] 11/23/17 10:13:02.718 DeviceIo: Failed to find new device for "\\.\Tape1"
[37756] 11/23/17 10:13:02.718 DeviceIo: 00:02:01:00 - Refresh handle on "\\.\Tape1", SCSI cmd 1a, new handle ffffffff, error 2
[37756] 11/23/17 10:13:02.718 DeviceIo: FindVanishedDevice: SwitchOver Adapter Info NULL for pathIde=2
[37756] 11/23/17 10:13:02.718 DeviceIo: Failed to find new device for "\\.\Tape1"
[37756] 11/23/17 10:13:03.219 DeviceIo: 00:02:01:00 - Refresh handle on "\\.\Tape1", SCSI cmd 00, new handle ffffffff, error 2
[37756] 11/23/17 10:13:03.219 DeviceIo: FindVanishedDevice: SwitchOver Adapter Info NULL for pathIde=2
[37756] 11/23/17 10:13:03.219 DeviceIo: Failed to find new device for "\\.\Tape1"
[37756] 11/23/17 10:13:03.720 PvlDrive::DisableAccess() - ReserveDevice failed, offline device
       Drive = 1012 "Bandlaufwerke 0002"
       ERROR = 0x0000001F (ERROR_GEN_FAILURE)

[37756] 11/23/17 10:13:03.735 PvlDrive::UpdateOnlineState()
       Drive = 1012 "Bandlaufwerke 0002"
       ERROR = The device is offline!
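
For anyone who wants to scan a longer adamm.log for the same pattern, here is a minimal Python sketch. It assumes the line layout seen in the excerpt above and assumes that the "SCSI cmd" values are hexadecimal SCSI opcodes; the function name and the opcode table are illustrative and not part of Backup Exec. Win32 error 2 is ERROR_FILE_NOT_FOUND, and a handle of ffffffff is an invalid handle, which fits a device that has disappeared from the OS.

```python
import re

# Layout assumed from the excerpt above:
# [pid] MM/DD/YY HH:MM:SS.mmm DeviceIo: <address> - Refresh handle on "<device>", ...
REFRESH_RE = re.compile(
    r'^\[(?P<pid>\d+)\]\s+(?P<date>\d{2}/\d{2}/\d{2})\s+'
    r'(?P<time>\d{2}:\d{2}:\d{2}\.\d{3})\s+DeviceIo:\s+\S+\s+-\s+'
    r'Refresh handle on "(?P<device>[^"]+)",\s+'
    r'SCSI cmd (?P<cmd>[0-9a-fA-F]+),\s+'
    r'new handle (?P<handle>[0-9a-fA-F]+),\s+error (?P<error>\d+)'
)

# Standard SCSI opcodes, assuming the log prints them in hex.
SCSI_OPCODES = {"00": "TEST UNIT READY", "16": "RESERVE(6)", "1a": "MODE SENSE(6)"}


def find_refresh_failures(lines):
    """Return one dict per 'Refresh handle' line that ended with an invalid handle."""
    failures = []
    for line in lines:
        m = REFRESH_RE.match(line.strip())
        if m and m.group("handle").lower() == "ffffffff":
            cmd = m.group("cmd").lower()
            failures.append({
                "time": m.group("time"),
                "device": m.group("device"),
                "opcode": SCSI_OPCODES.get(cmd, "unknown (0x" + cmd + ")"),
                "win32_error": int(m.group("error")),
            })
    return failures


SAMPLE = r'''
[37756] 11/23/17 10:13:02.718 DeviceIo: 00:02:01:00 - Refresh handle on "\\.\Tape1", SCSI cmd 16, new handle ffffffff, error 2
[37756] 11/23/17 10:13:02.718 DeviceIo: Failed to find new device for "\\.\Tape1"
[37756] 11/23/17 10:13:02.718 DeviceIo: 00:02:01:00 - Refresh handle on "\\.\Tape1", SCSI cmd 1a, new handle ffffffff, error 2
[37756] 11/23/17 10:13:03.219 DeviceIo: 00:02:01:00 - Refresh handle on "\\.\Tape1", SCSI cmd 00, new handle ffffffff, error 2
'''

failures = find_refresh_failures(SAMPLE.splitlines())
for f in failures:
    print(f["time"], f["device"], f["opcode"], "Win32 error", f["win32_error"])
```

To run it against a real log, replace `SAMPLE.splitlines()` with the lines read from the adamm.log file.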

- I will check that mode as soon as possible. Since the backup worked well until 2017/10/22 and the customer did no maintenance during that time, I have no clue how the mode could have changed

- @pkh: I did a tape wipe + inventory before running a test job (single VM as Hyper-V backup), and the job terminated at exactly 254GB, as in all the tests before, with the same symptoms

The controller reports (message translated from German): "835 - Smart HBA H240 in slot 4 is in HBA mode. All physical drives attached to this controller are directly accessible to the operating system."

- so HBA mode is activated