
Backup Exec 2012 Offlining HP Tape Drive Randomly

Matt_Freestyle
Level 4

Hi All. First post here and I am really, really hoping for some help.

I am working on an environment that appears to have the fairly common problem of Backup Exec offlining an HP Autoloader at seemingly random times. Three weeks ago another tech fixed this issue by disabling the 4 HP services that Symantec recommends disabling, and it worked fine. The other night, the issue reappeared despite nothing having changed - weird.

Here is the setup:

Server 2K8R2 BE 2012 SP1
HP Ultrium1760 DRV tape device
HP 1/8 G2 Autoloader
HP P212 SAS controller

Here is the (comprehensive) list of things I / we have tried to no avail:

1 - Replaced all hardware at a very early stage of the problem (has been ongoing for months now)
2 - Ensured Backup Exec's own drivers are being used for the device
3 - Set DB Maintenance to run way outside of the backup routine
4 - Un-installed and re-installed tape drive + autoloader
5 - Disabled HP services as recommended by Symantec
6 - Attempted to back up to disk - this works
7 - Opened 3 separate cases with Symantec, none of which fully resolved the issue, as it recurred each time
8 - Checked the ADAMM.log file and found this error just before the device offlined: 

[4608] 08/02/12 01:15:04.389 DeviceIo: 04:07:00:00 - Device error 1167 on "\\.\Tape0", SCSI cmd 4d, 1 total errors
[4608] 08/02/12 01:15:13.063 PvlDrive::DisableAccess() - ReserveDevice failed, offline device
Drive = 1033 "Tape drive 0001"
ERROR = 0x0000001F (ERROR_GEN_FAILURE)

9 - Tried running SGMON with verbose device and media logging enabled - this didn't go well. The device didn't error, but none of the backup jobs completed.
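
For anyone wanting to check their own ADAMM.log for the same pattern as in item 8, here is a minimal Python sketch. It assumes the log lines look like the two quoted above; the regular expressions and the `scan_adamm` helper are illustrative and not part of Backup Exec, and other log lines may well be formatted differently. The Win32 names in the lookup table are the standard ones for those codes.

```python
import re

# Patterns modelled only on the ADAMM.log lines quoted in item 8.
DEVICE_ERROR = re.compile(
    r'^\[(?P<tid>\d+)\]\s+(?P<date>\d{2}/\d{2}/\d{2})\s+(?P<time>[\d:.]+)\s+'
    r'DeviceIo:.*Device error (?P<code>\d+) on "(?P<device>[^"]+)", '
    r'SCSI cmd (?P<cmd>[0-9a-fA-F]{2})'
)
OFFLINE = re.compile(r'ReserveDevice failed, offline device')

# Standard Win32 error names for the two codes that appear in the excerpt.
WIN32_NAMES = {31: "ERROR_GEN_FAILURE", 1167: "ERROR_DEVICE_NOT_CONNECTED"}


def scan_adamm(lines):
    """Return device-error and offline events found in ADAMM.log lines."""
    events = []
    for line in lines:
        match = DEVICE_ERROR.search(line)
        if match:
            code = int(match["code"])
            events.append({
                "kind": "device_error",
                "date": match["date"],
                "time": match["time"],
                "device": match["device"],
                "code": code,
                "code_name": WIN32_NAMES.get(code, "unknown"),
                "scsi_cmd": int(match["cmd"], 16),  # 0x4d is the SCSI LOG SENSE opcode
            })
        elif OFFLINE.search(line):
            events.append({"kind": "offline", "line": line.strip()})
    return events


if __name__ == "__main__":
    sample = [
        r'[4608] 08/02/12 01:15:04.389 DeviceIo: 04:07:00:00 - Device error 1167 on "\\.\Tape0", SCSI cmd 4d, 1 total errors',
        '[4608] 08/02/12 01:15:13.063 PvlDrive::DisableAccess() - ReserveDevice failed, offline device',
    ]
    for event in scan_adamm(sample):
        print(event)
```

To run it against a real log, pass the open file object to `scan_adamm` in place of the sample list. A device error shortly followed by an offline event is the sequence described in item 8.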

The only thing I think we haven't tried is running tracer.exe after seeing the device go offline.

I read in Symantec's documentation that they don't support SAS controllers with RAID enabled, and the P212 is one of those controllers. However, the same document has a list of tested controllers and the P212 is on that list, so I'd be a little cheesed if Symantec said it was a compatibility problem despite having a document that says it should work.

I've also checked all the usual places (event log, BE job / device logs etc.) for more info and there really isn't much to go on. HP's testing tools always come back with passes when run. I'm 99% sure that this isn't a hardware issue, but I have nowhere left to go with it.

Could someone please help me out? 

P.S. Just as an FYI - this Backup Exec instance came from an upgrade of 2010; it wasn't a brand new install. Don't know if that matters.

Thanks in advance,

Matt

31 REPLIES

Matt_Freestyle
Level 4

Hi Guys

HP have replaced the tape drive within the autoloader along with the cables. Let's see how it goes. See you all again in a few weeks, I suspect!

 

Cheers,

 

Matt

Matt_Freestyle
Level 4

Hi All

 

Okay - so the replacement drive didn't work! HP are continuing to look at the case, however, and have asked me to make the following changes:

"Random backup failures when HP StorageWorks Ultrium Tape Drive is connected to a LSI based Host Bus Adapter and Storport driver version installed on the system is later than 5.2.3790.3959, due to Insight Manager Storage Agent timeout if the driver returns SCSI status BUSY and Storport driver retries the command unlimited times. In most of the cases, the tape drive will be discovered properly by the Operative System and will work fine when tested with HP Library And Tape Tools. Even if all possible polling to the tape drive is already prevented, the drive will fail backups randomly. System Event Log will not show any data that can be related to a drive or HBA failure (Event IDs 7, 9, 11 or 15).

Issue could appear on both Microsoft Windows Server 2003/2008, both 32bit and 64bit editions and with HP Insight Management Agents installed.

 

Solution

1. Click on Run, type regedit.

2. Open the path HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum\SCSI\[device identifier name for the tape LTO-3 tape device]\[numeric device instance id for the LTO-3 tape device]\Device Parameters\

3. Right Click Device Parameters, click New> Key and rename it as Storport.

4. Right Click Storport key> New> DWORD and rename it as BusyRetryCount and the value should be set to 250 decimal.

5. Exit regedit.

6. Reboot the server.

 

This registry settings are documented on Microsoft public article http://support.microsoft.com/kb/932755 under "Update to modify the behavior of the BUSY status and the Task Set Full status in the Storport driver", in our case, the default value of 20 has been changed to 250."
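
For reference, the registry change HP describes could also be prepared as a .reg file rather than typed into regedit by hand. The Python sketch below only builds the file text. The `storport_reg_file` helper is hypothetical, and the device identifier and instance ID are parameters you must look up under Enum\SCSI on your own server; the placeholder strings in the demo are not real values. Review the output before importing it, since writing under the Enum key may need elevated permissions.

```python
def storport_reg_file(device_id: str, instance_id: str,
                      busy_retry_count: int = 250) -> str:
    """Build .reg text that sets Storport\\BusyRetryCount for one SCSI device.

    device_id and instance_id are the two key names found under
    HKLM\\SYSTEM\\CurrentControlSet\\Enum\\SCSI for the tape drive.
    """
    if not 0 <= busy_retry_count <= 0xFFFFFFFF:
        raise ValueError("BusyRetryCount must fit in a DWORD")
    key = (
        "HKEY_LOCAL_MACHINE\\SYSTEM\\CurrentControlSet\\Enum\\SCSI\\"
        f"{device_id}\\{instance_id}\\Device Parameters\\Storport"
    )
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{key}]\n"
        # .reg files store DWORD values as eight hex digits; 250 decimal is 0xfa.
        f'"BusyRetryCount"=dword:{busy_retry_count:08x}\n'
    )


if __name__ == "__main__":
    # Placeholder names only; substitute the real key names from your server.
    print(storport_reg_file("<device identifier>", "<device instance id>"))
```

The default of 250 mirrors the value in HP's instructions. As in step 6, a reboot is still needed after the value is in place.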

 

I will let everyone know if this fixes the issue or not. 

AmadeoD
Not applicable

Has this ticket finally been resolved with the Windows registry Storport change?

Matt_Freestyle
Level 4

Hi all,

Nope - still not fixed. HP haven't got back to me for around 4 days now. I think they are losing interest in the case, despite it clearly being some kind of hardware fault.

My money is now on the P212 SAS card being the cause of the issue. I am going to push for a replacement card if possible and then go from there, I think.

As always I will keep this post updated because I am determined to get this effin problem fixed!

Matt

Daniel_Pazout
Level 2

Hi all,

I am facing the same problem.

The hardware is

HP X1600G2 24TB StorageWorks Server

HP MSL4048 with 1 x HP LTO5 SAS drive (all on current firmware)

Dedicated P212 SAS Controller (current firmware)

I summoned HP support on this and sent them my logs....

They responded: "The controller is not showing hardware issues, but it is not a good choice for backup. It should be a controller without RAID functionality. We will search for an effective solution..."

Well, the P212 is recommended for this tape library but not for backup... interesting.

Matt_Freestyle
Level 4

Hi Daniel,

Interesting! HP are suggesting to me that it could be an issue with the motherboard on the StorageWorks X1600 and that they may have to escalate to the ProLiant team to resolve it.

Unfortunately, my client hasn't got back to me on whether they want this to happen, and, typically, the backup has been working okay for the last few days.

Please do let us know how your issue progresses.

Thanks

Matt

Katsuki_Okamura
Level 2
Employee

I recently resolved an issue with almost the same configuration and the same error, and then updated TECH61192.

"An unknown error occurred" may occur when backing up to tape on a server with the HP Server Management Agents Software installed.
http://www.symantec.com/business/support/index?&page=content&id=TECH61192&locale=en_US

The issue was not resolved by steps 1 to 6 and step 8 in TECH61192.
Finally, the HP Server Support team suggested uninstalling "HP StorageWorks VDS Hardware Providers for MSA Disk Arrays", and that resolved the issue. I added this as step 7 in TECH61192.
Sorry, but I don't know how to uninstall the VDS Hardware Providers. Please contact HP server support for help with uninstalling them.

Please NOTE that Symantec does not support connecting a tape library to a RAID controller without the hardware vendor's recommendation. If HP does not currently recommend using the RAID controller, you may need to purchase a standalone SAS controller.
 

Daniel_Pazout
Level 2

Hi Matt,

Yesterday I did the following steps:

1) I uninstalled "HP StorageWorks VDS Hardware Providers for MSA Disk Arrays" as suggested by Katsuki and waited until the end of business to restart.

2) Later, HP contacted me with a replacement controller, just to be sure that the controller is not defective. I picked it up and installed it in the server.

3) The firmware was pretty outdated, so I updated it with:

http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareDescription.jsp?lang=en&cc=US&swItem=MTX-9f...

Now I will wait 14 days, and if the error does not reappear, I'll reinstall HP StorageWorks VDS Hardware Providers for MSA Disk Arrays to see whether it was the controller or the VDS providers.

 

 

Daniel_Pazout
Level 2

Hi all,

From my tests, the cause truly is "HP StorageWorks VDS Hardware Providers for MSA Disk Arrays". Three days after installing it again, the error was back.

Daniel

Wes_Miller
Level 4

Run "tapeinst.exe" from the root of your Backup Exec directory, then:

1 - Check "Use Symantec tape drivers for all supported devices"
2 - Check "Delete entries for tape devices that are unavailable, removed, or turned off"
3 - Check "Use Plug and Play Drivers" (FYI: USB devices are NOT supported)
4 - Click Next through the remaining screens, then Finish

This worked for me. It addresses the known issue where a tape drive that has been physically removed or swapped still appears in BE. It also covers the case where the drive cannot be deleted because it is "in use". It is not necessary to edit the DB unless this doesn't work, which is not likely.

CraigV
Moderator
Partner    VIP    Accredited

Hi Wes,

 

This is a very old post...no need to drag it back up again. The OP never bothered to respond at all.

Thanks!

Matt_Freestyle
Level 4

Hi All,

Sorry to drag this up from the grave. Also, sorry for not responding. The notification of replies started going into my junk for some reason!

Okay, so I never did get the issue resolved on the existing hardware. In the end, HP diagnosed a "low level hardware issue". We shipped the customer an old server, whacked a new P212 RAID controller into it and connected the library up to it. So far, so good. It's been working for over a month now.

It seems the most likely cause of these SCSI reservation errors is, as HP say, a low level hardware issue.

Thanks,

Matt.