bengine application error

Guy_Braeckmans
Level 3
Hi,

Using BE 10 on Windows Server 2003 Enterprise with SP1. Every few days the BE engine stops and the event log shows event ID 4097 with the following description:

The application, D:\Backup Exec\bengine.exe, generated an application error The error occurred on 05/25/2005 @ 01:04:23.957 The exception generated was c0000005 at address 7C8327FB (ntdll!RtlInitializeCriticalSection)

or

The application, D:\Backup Exec\bengine.exe, generated an application error The error occurred on 05/12/2005 @ 23:13:53.190 The exception generated was c0000005 at address 7C8224B2 (ntdll!ExpInterlockedPopEntrySListFault)

or

The application, D:\Backup Exec\bengine.exe, generated an application error The error occurred on 05/11/2005 @ 01:49:22.388 The exception generated was c0000005 at address 7C35042B (MSVCR71!wcscpy)

We have SP1 of BE installed and hotfix 16 and 17. This is happening on 2 machines since we installed V10 (9.1 before that). As a consequence the rest of the backup jobs are not done and we have no valid backups of that day.
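
For anyone collecting these Dr. Watson entries from several machines, a small parser can help tally which modules the faults land in. This is only a sketch: the regular expression is an assumption based on the three descriptions quoted above, and `parse_drwatson` is a made-up helper name, not something that ships with Backup Exec or Windows.

```python
import re
from collections import Counter

# Pattern inferred from the three event 4097 descriptions quoted above.
# Other Windows versions may word the message differently (assumption).
DRWATSON_RE = re.compile(
    r"The application, (?P<app>.+?), generated an application error "
    r"The error occurred on (?P<date>\d{2}/\d{2}/\d{4}) @ (?P<time>[\d:.]+) "
    r"The exception generated was (?P<code>[0-9a-fA-F]+) "
    r"at address (?P<address>[0-9A-Fa-f]+) \((?P<symbol>[^)]+)\)"
)

def parse_drwatson(description):
    """Return the fields of one description as a dict, or None if it does not match."""
    m = DRWATSON_RE.search(description)
    return m.groupdict() if m else None

samples = [
    r"The application, D:\Backup Exec\bengine.exe, generated an application error "
    r"The error occurred on 05/25/2005 @ 01:04:23.957 The exception generated was "
    r"c0000005 at address 7C8327FB (ntdll!RtlInitializeCriticalSection)",
    r"The application, D:\Backup Exec\bengine.exe, generated an application error "
    r"The error occurred on 05/12/2005 @ 23:13:53.190 The exception generated was "
    r"c0000005 at address 7C8224B2 (ntdll!ExpInterlockedPopEntrySListFault)",
    r"The application, D:\Backup Exec\bengine.exe, generated an application error "
    r"The error occurred on 05/11/2005 @ 01:49:22.388 The exception generated was "
    r"c0000005 at address 7C35042B (MSVCR71!wcscpy)",
]

parsed = [parse_drwatson(s) for s in samples]

# The part before "!" is the module name; count faults per module.
modules = Counter(p["symbol"].split("!")[0] for p in parsed)
print(modules)
```

All three entries carry the same exception code (c0000005, an access violation) but land in different functions, which is why grouping by module rather than by address seems more useful here.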

Janice_Fielder
Level 3
I removed the password protection for my tapes on 10/27/05 and have not had the problem since I removed it.

Janice

paro
Level 3
Partner
Hello everybody.

I'm also experiencing errors with bengine.exe at a customer's site, where the engine has suddenly begun to hang for some reason.

I have tried everything written in this thread (MDAC, password protection, etc.) and several other things too, but the problem persists and I can't seem to track down its origin because it fails completely at random.

I have been running the newest version of BE (10.1, build 5629, no SP/hotfix) since the beginning, and it seems the big difference between my customer and you guys is that we're experiencing this problem when backing up to disk.

The server is a Windows 2003 SP1 machine.

Is it possible to get any more suggestions on how to solve this problem?

Here is one typical event info describing the problem from the application log:

Event Type:Error
Event Source:Application Error
Event Category:(100)
Event ID:1000
Date:2006-01-03
Time:00:13:50
User:N/A
Computer:
Description:
Faulting application bengine.exe, version 10.1.5629.0, faulting module ImageBasedFHPlugin.dll, version 10.1.5629.0, fault address 0x00006ef4.

and followed by DrWatson himself one second later:

Event Type:Information
Event Source:DrWatson
Event Category:None
Event ID:4097
Date:2006-01-03
Time:00:13:51
User:N/A
Computer:RIKSSRV03
Description:
The application, C:\Program Files\VERITAS\Backup Exec\NT\bengine.exe, generated an application error The error occurred on 01/03/2006 @ 00:13:51.452 The exception generated was c0000005 at address 011A6EF4 (ImageBasedFHPlugin).
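
A side note on the two addresses in these events, as a small worked calculation. It assumes that event 1000 reports the fault address as an offset inside the faulting module, while Dr. Watson reports the absolute address in the process; under that assumption the difference is the address where the module was loaded.

```python
# Values copied from the two events above.
module_offset = 0x00006EF4      # "fault address" from the Application Error event (ID 1000)
absolute_address = 0x011A6EF4   # "at address" from the Dr. Watson event (ID 4097)

# Assumption: offset + load base = absolute address.
load_base = absolute_address - module_offset
print(hex(load_base))
```

The result is aligned on a 64 KB boundary, as a DLL load address would be, which fits the idea that both events describe the same crash inside ImageBasedFHPlugin.dll.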

Help IS appreciated

// Patrik

Miha_Zgajnar
Not applicable
I removed password protection on 10.12.2005 and the bengine application error occurred again on 14.1.2006. That means password protection is not the reason for the bengine application error.

Miha

Janice_Fielder
Level 3
It does appear that this does not fix it for everyone :( So far I haven't seen it again. I did move to the "d" version 3 weeks ago and also went from Windows 2000 to 2003 on that server.

Janice

paro
Level 3
Partner
Hello all,

I reinstalled 10.1 and the backups are doing fine now. I can't see why a reinstallation would fix this issue because my earlier installation also was brand new, but hey, it's working now and that's what's important.

So, sadly enough for all you guys who have been struggling with the same problems as I did, a reinstallation is all I can recommend. But remember that there are a lot of people who have tried reinstalling without getting rid of the problem.

Oh well, I hope that you will all get the appropriate attention from Symantec/Veritas and that they will come up with a resolution or workaround.

Also, if you're considering reinstalling and want to transfer all your policies, media sets and so on, doing this is quite tricky and time consuming.

Good luck!

/Patrik

Richard_De_Riso
Level 2
I'm running BE version 10.0 rev 5484, which has been service packed and hotfixed and is completely up to date.
I'm getting the following error:

Faulting application bengine.exe, version 10.0.5484.30, faulting module catcommon.dll, version 10.0.5484.0, fault address 0x0000f7f1.


Unlike a lot of people in the thread, I am able to back up consistently without problems. However, when it comes to doing a restore from those backups, it's very hit and miss. Well, to be honest, it's more miss than hit.

Help Appreciated.

Richard

Haim_Possek
Not applicable
I tried to restore a file yesterday from the disk backup file. I got exactly the same error

Faulting application bengine.exe, version 10.0.5484.30, faulting module catcommon.dll, version 10.0.5484.0, fault address 0x0000f7f1.

The service stopped and had to be restarted. After that I successfully restored the needed file. Luckily no backup job was running at the time.
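
To check whether two reports really describe the same crash, the "Faulting application" descriptions can be reduced to a comparable signature. A minimal sketch follows; the pattern is an assumption based on the event 1000 descriptions quoted in this thread, and `crash_signature` is a hypothetical helper name.

```python
import re

# Pattern inferred from the "Faulting application ..." descriptions quoted in this thread.
FAULT_RE = re.compile(
    r"Faulting application (?P<app>[^,\s]+), version (?P<app_version>[\d.]+), "
    r"faulting module (?P<module>[^,\s]+), version (?P<module_version>[\d.]+), "
    r"fault address (?P<offset>0x[0-9a-fA-F]+)"
)

def crash_signature(description):
    """Return (module, module version, offset) for one description, or None if it does not match."""
    m = FAULT_RE.search(description)
    if m is None:
        return None
    return (m.group("module").lower(), m.group("module_version"), int(m.group("offset"), 16))

richard = ("Faulting application bengine.exe, version 10.0.5484.30, faulting module "
           "catcommon.dll, version 10.0.5484.0, fault address 0x0000f7f1.")
haim = ("Faulting application bengine.exe, version 10.0.5484.30, faulting module "
        "catcommon.dll, version 10.0.5484.0, fault address 0x0000f7f1.")
patrik = ("Faulting application bengine.exe, version 10.1.5629.0, faulting module "
          "ImageBasedFHPlugin.dll, version 10.1.5629.0, fault address 0x00006ef4.")

print(crash_signature(richard) == crash_signature(haim))    # same module, version and offset
print(crash_signature(richard) == crash_signature(patrik))  # different module
```

The two restore failures share module, version and offset, so they are very likely the same fault in catcommon.dll, while the disk-backup crash reported earlier is in a different module.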

Richard_De_Riso
Level 2
Hello,

I find that with a small restore, say anything under 1 MB, you may get away with it. Anything over that size just keeps failing, which isn't good. Imagine someone deletes their user folder with 1 GB worth of data!!!

Is Veritas looking into this?? Is there going to be a patch released for it soon?

Or is it a case of dropping Veritas and moving to ARCserve? This is just not good enough. This thread has been going on for some time now and a solution to the problem has still not been found. People have reinstalled with varied success, but the problem is still there!

Richard

Malcolm_Ison
Level 3
This problem has been driving me and my boss mad.

Our backups are just NOT reliable anymore - the backup keeps failing randomly. The Job History states "Recovered", but the job has failed.

Windows 2003 (no SP)
We are using Build 5520 SP1 - latest drivers also.
Our tape device is an HP Ultrium2 (LTO)

The server hosting the device, is an HP Proliant 380 G3 - the jobs failed when the tape device was hung off a RAID controller, and it still fails NOW on a dedicated SCSI card!!!

--------------------------------------------------

In the BE Job Log for the failed job, it gives an error when you try to view it (XML) and just shows an incomplete .TXT file:

This is the error when trying to view the job log:

"Unexpected end of input.

152: D:\Program Files\VERITAS\Backup Exec\NT\Data\BEX_FAPNODEA_00033.xml"

----------------------------------

In the Job History it states:

Device name : HP 1
Target name : All Devices (FAPNODEA)
Media set name : Media Set 1
Error category : Job Errors
Error : e00081d9 - The Backup Exec job engine system service is not responding.
For additional information regarding this error refer to link V-79-57344-33241

--------------------------------------

In the Windows event viewer there is an entry stating:

Faulting application bengine.exe, version 10.0.5520.12, faulting module msvcr71.dll, version 7.10.3052.4, fault address 0x00010428.
--------------------------------------------------------

Our jobs DO have password protection on - this might be the reason....?

--------------------------------------------------

I'm going to try version 10d to see if that helps!

-----------------------------------------------------

I have an open call with Veritas\Symantec support - but am very disappointed in their support generally. Very basic skills and hard to understand.

At the moment I have had to rename the Catalog folder and re-run the backup job to see if this works - this wastes so much time if it doesn't solve the problem.

I doubt it will help, as I've already completely reinstalled the software because of this same problem!!!!

Yet again, another day goes by without a backup!

I just hope our building doesn't burn down at the weekend!!!

Cheers for listening!

Frustrated Malcolm!

Guy_Braeckmans
Level 3
Malcolm,

They had me do the same stuff you mentioned when I first started this thread in May 2005: rename the catalog folder, etc. The only thing this does is make you lose time. Support at Veritas/Symantec is worthless. I started having this problem when I upgraded from version 9.1 to version 10. I removed 9.1 from the server first, before installing V10. The builds that followed this first V10 build did not solve the problem (no, not even 10.1). As I already mentioned in this thread, not once in almost a year did I get any input from higher-level support, and I refuse to open a call and pay even more for a defective product. It seems to me we can call this a defective product, as it does not do what it was made for: back up computers.

Guy

Malcolm_Ison
Level 3
We are in an annual support agreement. I will be discussing this lack of support with our account manager in the hope that this is NOTICED at a higher level.

Malcolm_Ison
Level 3
Ok - the problem is caused by having password protection turned on.

We turned the option off last week, and we've since had over a week's worth of solid backups!

As a test, we turned password protection back on last night, and the backup failed once again.

Now that we know what's causing it, we need to get Veritas to issue a patch.

johan_holmstrom
Level 4
I've had the same problem as well. I turned on SGmon and captured this:

beserver: 03/16/06 10:18:27 -1 Client Rundown Disconnected 'xxx1BU01':0x116b4d8 <-------------- This is where the BE Engine stopped
beserver: 03/16/06 10:19:50 -1 SQLLog(18):AgeSession m_threadMap: SessionThreadID:770, CurrentThreadID:5f8
beserver: 03/16/06 10:19:50 -1 SQLLog(19):AgeSession m_threadMap: SessionThreadID:7b4, CurrentThreadID:5f8
beserver: 03/16/06 10:19:57 -1 Client requested key. <-----------------This is where I started the service again
beserver: 03/16/06 10:19:57 -1 Client 'xxx1BU01' connected('','CASTELLUM\xxxxxxx'): 0x116b4d8
bengine: 03/16/06 10:19:57 DeviceManager thread starting
bengine: 03/16/06 10:19:57 ImageBasedFHPlugin::ImageBasedFHPlugin() - Entering.
bengine: 03/16/06 10:19:57 ImageBasedFHPlugin::ImageBasedFHPlugin() - Exiting.
bengine: 03/16/06 10:19:57 DeviceManager thread ready
bengine: 03/16/06 10:19:57 DeviceManager: going to sleep for 900000 msecs
bengine: 03/16/06 10:19:57 ImageBasedFHPlugin::init() - Entering.
bengine: 03/16/06 10:19:57 ImageBasedFHPlugin::init() - Parameters:
bengine: 03/16/06 10:19:57 ImageBasedFHPlugin::init() - Exiting.
bengine: 03/16/06 10:19:57 LegacyBEPlugin::LegacyBEPlugin() - Entering.
bengine: 03/16/06 10:19:57 LegacyBEPlugin::LegacyBEPlugin() - Exiting.
bengine: 03/16/06 10:19:57 LegacyBEPlugin::init() - Entering.
bengine: 03/16/06 10:19:57 LegacyBEPlugin::init() - Parameters:
bengine: 03/16/06 10:19:57 LegacyBEPlugin::init() - Exiting.
bengine: 03/16/06 10:19:57 BEImage::BEImage() - Entering.
bengine: 03/16/06 10:19:57 BEImage::BEImage() - Exiting.
bengine: 03/16/06 10:19:57 BEImage::init() - Entering.
bengine: 03/16/06 10:19:57 BEImage::init() - Parameters:
bengine: 03/16/06 10:19:57 BEImage::init() - Exiting.
bengine: 03/16/06 10:19:57 FS_InitFileSys
bengine: 03/16/06 10:19:57 Initializing FSs
bengine: 03/16/06 10:19:57 Initializing SNMP traps
bengine: 03/16/06 10:19:57 Catalog path: D:\SrvProg\VBE\Catalogs\
bengine: 03/16/06 10:19:57 Data path: D:\SrvProg\VBE\Data\
bengine: 03/16/06 10:19:57 Log file root: BEX
bengine: 03/16/06 10:19:57 Software Edition: SOFTWARE_EDITION_DATACENTER_MULTI
bengine: 03/16/06 10:19:57 NSE: 0
bengine: 03/16/06 10:19:57 Authorized Agents:
bengine: 03/16/06 10:19:57 AUTH_AGENT_UNIX
bengine: 03/16/06 10:19:57 AUTH_AGENT_MAC
bengine: 03/16/06 10:19:57 AUTH_AGENT_NT
bengine: 03/16/06 10:19:58 AUTH_AGENT_XCH
bengine: 03/16/06 10:19:58 AUTH_AGENT_ESE
bengine: 03/16/06 10:19:58 AUTH_AGENT_NETWARE
bengine: 03/16/06 10:19:58 WinNT information: 5.2 (3790) SP 1.0 "Service Pack 1"
bengine: 03/16/06 10:19:58 WinNT ProductType: 3 "ServerNT"
bengine: 03/16/06 10:19:58 WinNT ProductSuite: 0x0110
bengine: 03/16/06 10:19:58 Terminal Server
bengine: 03/16/06 10:19:58 Server listening on RPC endpoint 'BackupExecJobRunner'
bengine: 03/16/06 10:19:58 Running... 'Q' to stop
beserver: 03/16/06 10:20:02 -1 Client Rundown Disconnected 'xxx1BU01':0x116b4d8 <------------------Here it stopped again


I then removed the catalog folder and restarted all the services. The difference I saw is that with the old catalog in place, my backup server connected and disconnected with hex code 0x116b4d8.
With the new catalog, the backup server connects with hex code 0x1189eb0.
Here is the log file with the new catalog:


beserver: 03/16/06 10:28:36 -1 Client 'xxx1BU01' connected('','CASTELLUM\xxxxxxxx'): 0x1189eb0
bengine: 03/16/06 10:28:36 DeviceManager thread starting
bengine: 03/16/06 10:28:36 ImageBasedFHPlugin::ImageBasedFHPlugin() - Entering.
bengine: 03/16/06 10:28:36 ImageBasedFHPlugin::ImageBasedFHPlugin() - Exiting.
bengine: 03/16/06 10:28:36 ImageBasedFHPlugin::init() - Entering.
bengine: 03/16/06 10:28:36 ImageBasedFHPlugin::init() - Parameters:
bengine: 03/16/06 10:28:36 ImageBasedFHPlugin::init() - Exiting.
bengine: 03/16/06 10:28:36 LegacyBEPlugin::LegacyBEPlugin() - Entering.
bengine: 03/16/06 10:28:36 LegacyBEPlugin::LegacyBEPlugin() - Exiting.
bengine: 03/16/06 10:28:36 LegacyBEPlugin::init() - Entering.
bengine: 03/16/06 10:28:36 LegacyBEPlugin::init() - Parameters:
bengine: 03/16/06 10:28:36 LegacyBEPlugin::init() - Exiting.
bengine: 03/16/06 10:28:36 BEImage::BEImage() - Entering.
bengine: 03/16/06 10:28:36 BEImage::BEImage() - Exiting.
bengine: 03/16/06 10:28:36 BEImage::init() - Entering.
bengine: 03/16/06 10:28:36 BEImage::init() - Parameters:
bengine: 03/16/06 10:28:36 BEImage::init() - Exiting.
bengine: 03/16/06 10:28:36 DeviceManager thread ready
bengine: 03/16/06 10:28:36 DeviceManager: going to sleep for 900000 msecs
bengine: 03/16/06 10:28:36 FS_InitFileSys
bengine: 03/16/06 10:28:36 Initializing FSs
bengine: 03/16/06 10:28:36 Initializing SNMP traps
bengine: 03/16/06 10:28:36 Catalog path: D:\SrvProg\VBE\Catalogs\
bengine: 03/16/06 10:28:36 Data path: D:\SrvProg\VBE\Data\
bengine: 03/16/06 10:28:36 Log file root: BEX
bengine: 03/16/06 10:28:36 Software Edition: SOFTWARE_EDITION_DATACENTER_MULTI
bengine: 03/16/06 10:28:36 NSE: 0
bengine: 03/16/06 10:28:36 Authorized Agents:
bengine: 03/16/06 10:28:36 AUTH_AGENT_UNIX
bengine: 03/16/06 10:28:36 AUTH_AGENT_MAC
bengine: 03/16/06 10:28:36 AUTH_AGENT_NT
bengine: 03/16/06 10:28:36 AUTH_AGENT_XCH
bengine: 03/16/06 10:28:36 AUTH_AGENT_ESE
bengine: 03/16/06 10:28:36 AUTH_AGENT_NETWARE
bengine: 03/16/06 10:28:36 WinNT information: 5.2 (3790) SP 1.0 "Service Pack 1"
bengine: 03/16/06 10:28:36 WinNT ProductType: 3 "ServerNT"
bengine: 03/16/06 10:28:36 WinNT ProductSuite: 0x0110
bengine: 03/16/06 10:28:36 Terminal Server
bengine: 03/16/06 10:28:36 Server listening on RPC endpoint 'BackupExecJobRunner'
bengine: 03/16/06 10:28:36 Running... 'Q' to stop
bengine: 03/16/06 10:28:36 JobStatus thread starting <--------------This is where it used to disconnect
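
For longer SGmon captures it may help to measure how long the engine stays up before the disconnect. The sketch below assumes the line layout seen in the captures above (process name, date as MM/DD/YY, time, message) and treats "Client Rundown Disconnected" as the stop marker, as annotated in the first capture. `seconds_until_disconnect` is a hypothetical helper, not part of SGmon.

```python
import re
from datetime import datetime

# Layout inferred from the captures above: "<process>: MM/DD/YY HH:MM:SS <message>".
LINE_RE = re.compile(
    r"^(?P<proc>\w+): (?P<stamp>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (?P<msg>.*)$"
)

def parse_line(line):
    """Split one SGmon line into (process, timestamp, message); None if the layout differs."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    when = datetime.strptime(m.group("stamp"), "%m/%d/%y %H:%M:%S")
    return m.group("proc"), when, m.group("msg")

def seconds_until_disconnect(lines):
    """Seconds between the engine's "Running..." line and the next rundown disconnect."""
    started = None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        proc, when, msg = parsed
        if proc == "bengine" and "Running..." in msg:
            started = when
        elif started is not None and "Client Rundown Disconnected" in msg:
            return (when - started).total_seconds()
    return None

# A few lines taken from the first capture (old catalog).
capture = [
    "beserver: 03/16/06 10:19:57 -1 Client requested key.",
    "bengine: 03/16/06 10:19:58 Server listening on RPC endpoint 'BackupExecJobRunner'",
    "bengine: 03/16/06 10:19:58 Running... 'Q' to stop",
    "beserver: 03/16/06 10:20:02 -1 Client Rundown Disconnected 'xxx1BU01':0x116b4d8",
]
print(seconds_until_disconnect(capture))
```

With the old catalog, the engine lasted only a few seconds after reporting that it was running. With the new catalog the capture ends at the JobStatus thread starting and shows no disconnect, so the function would return None for it.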

Malcolm_Ison
Level 3
Check this article out - it admits there's a problem.

http://seer.support.veritas.com/docs/276242.htm

However, neither service packing nor upgrading to 10d helped us. We've had to disable the password protection feature to get good backups.

I do wonder if it has to do with the HP Insight agents or something else specific to HP\Compaq servers.

It would be good to know whether people who aren't using HP hardware still suffer from the issue... OR whether it's people using HP Ultrium LTO drives (we are using HP Ultrium LTO 1, 2 and 3).

Jason_Eisel
Level 4
We are having the same problem too. We have already upgraded to 10.1 with all applicable hotfixes and service packs. The media is not password-protected. The error seems to happen each night around 5:00pm, and all jobs running within 5-10 minutes of the error fail. I'm very hesitant about calling Veritas support, because I had the same error under 10.0. The tech said that upgrading to 10.1 should solve the problem; well... it didn't.

Does anybody know how to contact an engineer or software developer at Veritas? It seems the technicians monitoring these forums are sending everybody in circles. About 80% of the technotes and instructional pages are located in Backup Exec help.

Peter_Ludwig
Level 6
Jason Eisel,

At what time do you have the BE database backup scheduled? Just an idea, not based on knowledge...

greetings
Peter

Tom_Dudek
Level 3
Like everyone else here, I too am having this problem. I have my own thread on the subject, but after 3 months of worthless advice the thread got dropped into the abyss: it was moved to the un-moderated Backup Exec for Netware forums because I mentioned that one of the servers I am backing up is a Netware file server. This is despite the fact that the media server is a Windows 2003 box running Backup Exec 10d.

Anyway, I wanted to add that I have never used password protection on my tapes and I still get the error. I strongly believe that the problem is linked to the actual elapsed time of the backups. I never get the error when backing up very small amounts of data! The error only started recently, after 6 months of successful backups. I have recently tried running all my backups without the verify process at the end. Turning off verify has shaved around an hour and a half off my backups. They used to take 7 hours 30 minutes or so and now complete in just over 6 hours. Interestingly, since I turned off verify, ALL the backups for the last 3 weeks have completed successfully. As soon as I turn verify back on, the total backup time goes back over 7.5 hours and the Job Engine fails before the backup completes. It is almost as if the job engine cannot run past the 7 hour 30 minute mark without falling over!

I suppose I could fragment my one big backup job into a few smaller ones. This may make the engine keep running. I did see a thread a while ago that talked about doing this for restoring jobs in situations where large restores caused the job service to fall over.
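
As a rough way to plan such a split, the selections could be packed into jobs that each stay under a chosen time budget. The sketch below is purely illustrative: the selection names, the per-selection minutes and the 4 hour budget are made-up numbers, and `split_into_jobs` is a hypothetical helper that has nothing to do with how Backup Exec schedules jobs.

```python
def split_into_jobs(estimates, budget_minutes):
    """Greedy first-fit packing, largest selections first, so each job stays within the budget."""
    jobs = []
    for name, minutes in sorted(estimates.items(), key=lambda kv: kv[1], reverse=True):
        if minutes > budget_minutes:
            raise ValueError(f"{name} alone exceeds the budget of {budget_minutes} minutes")
        for job in jobs:
            # Put the selection into the first job that still has room for it.
            if job["minutes"] + minutes <= budget_minutes:
                job["selections"].append(name)
                job["minutes"] += minutes
                break
        else:
            # No existing job has room, so start a new one.
            jobs.append({"selections": [name], "minutes": minutes})
    return jobs

# Hypothetical estimates (minutes, including verify) adding up to 7 hours 30 minutes.
estimates = {"fileserver": 200, "netware": 150, "exchange": 70, "system_state": 30}

jobs = split_into_jobs(estimates, budget_minutes=240)
for job in jobs:
    print(job["minutes"], job["selections"])
```

With these made-up numbers the 7.5 hour job becomes two jobs of under 4 hours each. Whether that actually keeps the engine alive depends on the run-time theory above being right, which nobody in the thread has confirmed.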

At the end of the day, all this is just a workaround for the core problem... a flaky Backup Exec Job Engine service.