05-28-2012 08:12 AM
OS: Windows Server 2008 R2 Std SP1
Software: Backup Exec 2010 R3
HBA: LSI SAS 3442E (Flashed as initiator target)
Library: Overland ArcVault 48 with two HP SAS drives
I intermittently get the following error:
Type | Category | Message | Time Alert Received | Job Name | Device Name | Server Name | Source |
---|---|---|---|---|---|---|---|
Error | Device Error | The drive hardware is offline. | 27/05/2012 18:52:15 | | HP2 | CURIE | Device |
05-28-2012 08:25 AM
Is this a RAID card?
Refer: http://www.symantec.com/docs/TECH70907
From the TN: "As documented on the Backup Exec Hardware Compatibility Lists, Host-Bus-Adapters featuring RAID are generally not supported or recommended."
When I googled this HBA model, I found: LSISAS3442E-R. 3Gb/s SAS Eight-Port Host Bus Adapter with Integrated RAID.
If you are not sure about this, check the OEM documentation or contact your vendor.
Thanks....
05-28-2012 08:39 AM
It can be a RAID card but it is flashed with target firmware so it has no RAID functionality.
I have used the same card successfully for the past three years and it is the HBA recommended to me by Overland.
05-28-2012 08:53 AM
"I have used the same card successfully for the past three years"
Successfully with Backup Exec?
Overland may recommend it, but it is always better to go by the Backup Exec compatibility list when using BE.
Do you have any other compatible HBA card to test with?
05-29-2012 01:01 AM
Yes, with Backup Exec.
I don't have another card to test with.
05-29-2012 01:23 AM
Hi Relentim,
Have you tried to uninstall the library completely from both Windows and BE? Try the following:
1. Uninstall the library from BE, & shut it down. Disconnect the cables from the server.
2. Uninstall the library from Windows Device Manager, and restart the server.
3. Allow the server to boot into Windows, and make sure the library doesn't show up.
4. Shut the server down, reconnect the library and start it up. Allow the library to complete its initialization, and then start up the server.
5. Make sure you're running the latest Symantec DDI package, and then install the Symantec drivers. Make sure the robotics shows up as Unknown Medium Changer.
If this doesn't work, the next best thing is to log a call with Symantec (assuming you have support in place with them!), and then see what they say. However, they MIGHT point out the RAID controller...
Thanks!
05-29-2012 10:09 AM
[1136] 05/27/12 18:52:15.015 PvlDrive::DisableAccess() - ReserveDevice failed, offline device
Since this is a SAS environment, there should be no reservation conflicts of the kind that can occur in an FC environment. I therefore suspect a hardware or communication issue that causes the Reserve command to fail.
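To find every occurrence of this failure in a longer log, a short script can help. The following is a minimal sketch in Python that assumes each line follows the same layout as the single adamm.log line quoted above (bracketed number, MM/DD/YY timestamp, source, message). Other lines in a real log may be formatted differently, and the names `parse_line`, `reserve_failures` and `bracket_id` are made up for illustration.

```python
import re
from datetime import datetime

# Layout inferred from the one quoted line only; this is an assumption,
# not a documented adamm.log format.
LINE_RE = re.compile(
    r"^\[(?P<bracket_id>\d+)\]\s+"
    r"(?P<stamp>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<source>\S+)\s+-\s+(?P<message>.*)$"
)


def parse_line(line):
    """Return a dict for a line matching the assumed layout, else None."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return {
        # Meaning of the bracketed number is not stated in the thread.
        "bracket_id": int(match.group("bracket_id")),
        # Month-first date, as in the quoted line (05/27/12).
        "time": datetime.strptime(match.group("stamp"), "%m/%d/%y %H:%M:%S.%f"),
        "source": match.group("source"),
        "message": match.group("message"),
    }


def reserve_failures(lines):
    """Collect parsed entries whose message reports a failed ReserveDevice."""
    entries = (parse_line(line) for line in lines)
    return [e for e in entries if e and "ReserveDevice failed" in e["message"]]


sample = ("[1136] 05/27/12 18:52:15.015 PvlDrive::DisableAccess() - "
          "ReserveDevice failed, offline device")

for entry in reserve_failures([sample]):
    print(entry["time"].isoformat(), entry["source"], entry["message"])
```

Passing the lines of the real log file instead of `[sample]` would list each reservation failure with its timestamp, which makes it easier to compare against controller or System event log entries.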
05-31-2012 01:54 AM
Thanks for your advice. I tried this but experienced the issue again last night.
05-31-2012 02:26 AM
...are you using the Symantec drivers for the device? I don't know Overland, but if there is a robotics changer involved, does it show up as Unknown Medium Changer in Device Manager? If not, update the drivers to this and try again...
05-31-2012 02:36 AM
Yes, drivers are Symantec and the library is listed as Unknown Medium Changer in Device Manager.
06-02-2012 07:44 PM
If you've been using this HBA successfully for three years (and nothing's changed) then you are probably dealing with a hardware issue. If you can reproduce the problem with the smallest of backups, then capturing it with tracer.exe would be very helpful. This is a software-based hardware analyzer that ships with BE.
Here's a Technote that will help guide you to the root of the problem, using tracer.exe:
http://www.symantec.com/business/support/index?page=content&id=TECH49432
If you get stuck, then post the output of tracer here and we'll take a look at it for you.
06-12-2012 08:28 AM
I cannot reproduce the issue on a small backup; I get hours of successful backups before it occurs.
06-12-2012 12:40 PM
...and your AV isn't possibly blocking BE's services at all during the backup?
06-13-2012 12:57 AM
I have no AV on the backup server.
06-13-2012 01:01 AM
...OK, so try this then: remove one of the drives and run your jobs to the remaining one... see if this takes the library offline. It doesn't have to be a big backup. Once done, swap the drives around and then repeat.
06-15-2012 01:33 AM
I disabled drive 2; my nightly backup ran for 1:33 and then failed with drive 1 going offline.
I then enabled drive 2 and retried the backup; drive 2 also failed after 0:35.
I spoke to LSI and they recommended a new driver. I'll try this over the weekend and report back.
Thanks for your help.
06-15-2012 06:59 AM
Why did the drives go offline? Was it the same reservation error?
06-15-2012 07:28 AM
Yes, both were reservation errors.
This time there was also a controller error at around the same time as the second event. That is why I contacted LSI.
I have attached the adamm.log, the drives dropped offline at 14/06/12 23:36 and 15/06/12 00:14. The controller error is at 15/06/12 00:10. Other errors in the log may have occurred during troubleshooting.
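To make the timing easier to see, here is a rough Python sketch that orders the three timestamps given above and prints the gap between consecutive events. It uses only the times quoted in this post (day-first dates, minute precision), not the attached log itself, and the event labels and the `gaps` helper are illustrative.

```python
from datetime import datetime

FMT = "%d/%m/%y %H:%M"  # day-first, as the times are written in the post

# Times copied from the post; labels are informal descriptions.
events = [
    ("first drive offline", "14/06/12 23:36"),
    ("second drive offline", "15/06/12 00:14"),
    ("controller error", "15/06/12 00:10"),
]


def gaps(items):
    """Sort events by time and return (earlier, later, minutes apart) tuples."""
    timed = sorted((datetime.strptime(stamp, FMT), label) for label, stamp in items)
    result = []
    for (t_prev, l_prev), (t_next, l_next) in zip(timed, timed[1:]):
        minutes = int((t_next - t_prev).total_seconds() // 60)
        result.append((l_prev, l_next, minutes))
    return result


for earlier, later, minutes in gaps(events):
    print(f"{earlier} -> {later}: {minutes} min")
# first drive offline -> controller error: 34 min
# controller error -> second drive offline: 4 min
```

On these figures the controller error comes about four minutes before the second drive drops offline.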
06-20-2012 07:09 PM
Sounds like you're dealing with a faulting controller.
08-09-2012 09:15 AM
So I finally fixed my issue! I purchased an LSI SAS 9200-8e and have had no issues since.
I think upgrading to Windows Server 2008 triggered the problem. There appears to be an incompatibility between the LSI SAS 3442E / HP SC44Ge cards and Windows Server 2008. Some people have fixed it with registry edits, see posts below.
Thanks for all your input.