E00084F9 - Communications Failure Bex12.5 -> Exchange Agent on local server.

JJPera
Level 3
Hi all,

I'm having a problem with my Exchange backups and am struggling to find a fix, so I thought I'd throw it out here for some thoughts.
I have a number of jobs queued to back up different storage groups from the Exchange information store. A random amount of time into the 1st job (or, on rare occasions, after the 1st job completes and during the 2nd), the job fails with the following message:
Error category    : Resource Errors
Error             : e00084f9 - A communications failure has occurred between the Backup Exec job engine and the remote agent.
and then all following jobs fail with:
Error category    : Job Errors
Error             : e0000f16 - The operation failed to acquire the minimum number of drives and media needed.

The Setup:
Server: W2k8 Std x64 SP2
Bex: 12.5 SP3
Exchange 2007
HP Ultrium 1840 stand alone drive, sat in an external rack caddy,

Bex Server, Remote Agent, and Exchange Agent all installed on the system, and the drive is attached directly to the server, so no network traffic outside of server is involved.

I've tried new SCSI Cables, new drivers (both HP and Symantec), a different but same model ultrium drive both mounted and unmounted in the external caddy, running the jobs manually, and all combinations thereof, and probably half a dozen other things I can't think of at the moment.
The job fails with the same error anywhere from 800 MB into the task all the way through to 9 minutes and 40+ GB backed up.

I've followed the suggestions for using "all network interfaces" in the job options, and AOFO is not enabled.


However... attaching an old SDLT2 Quantum drive works. It is connected in exactly the same way, with the same job and the same settings, just a different device selected.
It takes a lot longer, and multiple tapes have to be used to get all of the data backed up, so this is not ideal.
As I've tried 2 different Ultrium drives and both have the same issue, I'd think it's something software/driver related, but I've no idea what.

Heeeelp? :)

CraigV
Moderator
Partner    VIP    Accredited
Hi JJ,

Check out these links:

http://seer.entsupport.symantec.com/docs/290960.htm - refers to no appendable media!

http://seer.entsupport.symantec.com/docs/321611.htm - refers to incorrect media set being selected.

What I would also suggest from my side is to load up the latest service pack for your version of Backup Exec. More often than not, the SP would address issues that you might be experiencing.

Laters!

JJPera
Level 3
Hi Craig,

Thanks for the response and links.

Already on the latest SP (3), and have also downloaded the latest driver pack.

The e0000f16 errors only happen after the 1st job has fallen over with e00084f9. If I then run a job that has "overwrite media" selected to reset the append flags, and rerun the job that failed with e0000f16, it starts to back up just fine... until it falls over sometime later with e00084f9.
So it looks like the 1st of the 2 links above addresses that issue, but the root cause, the communications failure, is what I'm trying to get around...
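
For anyone triaging a pile of failed jobs the same way, here is a rough Python sketch of that reasoning. It assumes the job log lines look like the "Error : eXXXXXXX - ..." lines quoted in the first post, and it treats e0000f16 as a follow-on symptom, which is only what this thread suggests. The function names are made up for illustration and are not part of Backup Exec.

```python
import re

# Matches lines such as:
#   "Error             : e00084f9 - A communications failure has occurred ..."
# It does not match the "Error category : ..." lines.
ERROR_LINE = re.compile(r"^Error\s*:\s*(e[0-9a-f]{7})\s*-\s*(.*)$", re.IGNORECASE)

# Assumption taken from this thread: e0000f16 only shows up after an earlier failure.
FOLLOW_ON_CODE = "e0000f16"


def extract_error_codes(log_text):
    """Return the error codes found in the log text, in order of appearance."""
    codes = []
    for line in log_text.splitlines():
        match = ERROR_LINE.match(line.strip())
        if match:
            codes.append(match.group(1).lower())
    return codes


def first_root_failure(codes):
    """Return the first code that is not the follow-on media error, or None."""
    for code in codes:
        if code != FOLLOW_ON_CODE:
            return code
    return None
```

Feeding it the two error blocks from the first post should point at e00084f9 as the one to chase, rather than the e0000f16 failures that follow it.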

It's just strange that with exactly the same setup apart from the tape drive the older SDLT runs fine, but the LTO4 falls over after completing some of the job.

James

CraigV
Moderator
Partner    VIP    Accredited
Hi JJ...

Just something I thought about. When you replace the SDLT with the LTO4, do you run the Device Configuration Wizard from BEWS to redetect the drive, and are your jobs pointed to that drive? Try that and see if this sorts it out...

Laters!

JJPera
Level 3
Hi Craig

Thanks for the extra thoughts on this, as I've still failed to find a solution so far.

I'm not sure if it's the device config wizard that I've run, but as part of the troubleshooting I have removed all references to the LTO4 device from BEWS (it was called HP2) and re-added it (now called HP1), and all jobs now point explicitly to the HP1 drive.

So that I can carry on testing the failure and still get a good backup, both the SDLT and the LTO4 are now connected to the server. The SDLT jobs run without fail but overflow the tape and get finished in the morning; I've been playing with the LTO4 jobs during the day after the SDLT has done its thing.

The ticket I have open with Symantec at the moment has pointed me back to the hardware vendor, as "Hard Errors" are showing in the stats for the drive.
HP have replaced the drive (this is the 3rd one I'm now on) and the problem still persists.

I guess it's back on the phone to HP to say their new drive still does not work, and see what they say...

James



CraigV
Moderator
Partner    VIP    Accredited
Mmm...disconnect the SDLT, restart the services so that it's not detected in BEWS and try running a job to the LTO4...maybe having 2 drives connected is causing the issue, and BEWS THINKS it needs a license for the second drive...worth a shot.

JJPera
Level 3
Unfortunately the 2nd drive has only been added again since the original failed.
And wouldn't BEWS complain straight away if it was getting confused? That is not the case here: it will do anything from 900 MB to 50 GB before throwing the error up.
B2D works fine.

Just off the phone to HP who have run the diagnostics on the hardware and it's come back with a clean bill of health.

I'm thinking drivers falling over somewhere, but no idea where or why.

CraigV
Moderator
Partner    VIP    Accredited
Mmm...try HP's drivers for the device and see if that sorts it out. That's a weird error though...

JJPera
Level 3
Well,

I'd already tried HP's drivers but...

It looks like we may have a solution :)

The combination of the LTO4 + HP SC11Xe SCSI HBA (LSI U320 2000 Series) + a random HP service (I'm not sure what it does) seems to have been the cause.

Stopping the "HP WMI Storage Providers" service has now allowed me to get 5 jobs to complete this morning, where before it was failing on the 1st.
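
If anyone wants to script this before a backup window, here is a minimal Python sketch. It assumes the service's display name is exactly "HP WMI Storage Providers" as shown in the Services console, and it shells out to the standard Windows `net stop` command, which needs an elevated prompt. The function names are just for illustration.

```python
import subprocess
import sys

# Display name as quoted in this thread; check it in services.msc on your own box.
SERVICE_DISPLAY_NAME = "HP WMI Storage Providers"


def build_stop_command(display_name=SERVICE_DISPLAY_NAME):
    """Build the 'net stop' command line as an argument list."""
    return ["net", "stop", display_name]


def stop_service(display_name=SERVICE_DISPLAY_NAME):
    """Try to stop the service; return True if 'net stop' reported success."""
    if sys.platform != "win32":
        raise RuntimeError("'net stop' is only available on Windows")
    result = subprocess.run(
        build_stop_command(display_name),
        capture_output=True,
        text=True,
    )
    # A non-zero exit code can also mean the service was already stopped.
    return result.returncode == 0
```

Stopping a service this way does not change its startup type, so it may come back after a reboot.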

Thanks to the support guys at Symantec for helping track this down, and I've no idea why HP didn't suggest it, as it's their product :s...

I hope if anyone else has a problem like this, the information above will be useful so they don't have to spend many weeks trying to track things down with random troubleshooting steps.

Cheers for the suggestions Craig, agreed it was a strange one.

James