Agent stops almost randomly and backup fails (9.1)

Roland_Lockhart
Level 3
Hi
 
We have BE 9.1
on a Server 2000 SP3 machine
 
Running against an Exabyte VXA 2 tape drive with a robotic changer
 
Just recently the backups have been failing, sometimes on a local disk, other times across the network to other machines.
 
the error is a00084f8 - The network connection to the Backup Exec Remote Agent has been lost
 
 
The Event Viewer (eventvwr) logs show error event ID 34113, with the same message as above.
 
At around the same time, the Backup Exec log files show the following:
 
 914 16/04/2007 2:49:40: ERROR: ndmpcSendRequest->connection error
 914 16/04/2007 2:49:40: ERROR: ndmpSendRequest failed:
 914 16/04/2007 2:49:40: NDMPEngine: NDMP control connection lost.
 7a0 16/04/2007 2:59:47: DeviceManager: timeout event fired
 7a0 16/04/2007 2:59:47: DeviceManager: processing pending requests
 7a0 16/04/2007 2:59:47: DeviceManager: going to sleep for 900000 msecs
 914 16/04/2007 3:04:40: NDMPEngine::MessagePumpAndWaitForResults(): TF_NDMPGetResult() timer elapsed!
 914 16/04/2007 3:04:40: ERROR: ndmpcSendRequest->connection error
 914 16/04/2007 3:04:40: ERROR: ndmpSendRequest failed:
 914 16/04/2007 3:06:54: WriteEndSet( 1 ) returning 0
 914 16/04/2007 3:06:56: WriteEndSet( 1 ) returning 0
 914 16/04/2007 3:06:56: WriteEndSet( 0 ) returning 0
 914 16/04/2007 3:06:56: HARDWARE COMPRESSION ===> Setting compression off.
 914 16/04/2007 3:07:29: TF_CloseSet
 914 16/04/2007 3:07:29: NDMPEngine::Run(): ProcessBSD() returned -568
 914 16/04/2007 3:07:29: TAPEALERT: Get TapeAlert Flags Return Code = 0X0
 914 16/04/2007 3:07:29: TAPEALERT: TapeAlert Device Flag  = 0X0
 914 16/04/2007 3:07:29: TAPEALERT: TapeAlert Changer Flag = 0X0
 914 16/04/2007 3:07:30: TAPEALERT: Get TapeAlert Flags Return Code = 0X0
 914 16/04/2007 3:07:30: TAPEALERT: Get TapeAlert Flags Return Code = 0X0
 914 16/04/2007 3:07:30: TF_FreeDriveContext( 1B09408 )
 914 16/04/2007 3:07:30: TF_FreeTapeBuffers: from 2 to 0 buffers
 914 16/04/2007 3:07:30: FreeFormatEnv( cur_fmt=0 )
 7a0 16/04/2007 3:10:00: DeviceManager: dismount event fired
 7a0 16/04/2007 3:10:00: DeviceManager: processing pending requests
 7a0 16/04/2007 3:10:00: DeviceManager: going to sleep for 900000 msecs
 914 16/04/2007 3:10:00: Job thread terminating
 7a0 16/04/2007 3:25:00: DeviceManager: timeout event fired
 
(Note the NDMP error)
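
For anyone sifting through a longer debug log, here is a minimal Python sketch that pulls out the NDMP error lines with their timestamps. It assumes the line layout seen in the excerpt above (hex thread ID, dd/mm/yyyy date, time, then the message); the names `parse_line` and `ndmp_errors` are made up for this example and are not part of Backup Exec.

```python
import re
from datetime import datetime

# Assumed layout, taken from the excerpt above:
#   <hex thread id> <dd/mm/yyyy> <h:mm:ss>: <message>
LINE_RE = re.compile(
    r"^\s*(?P<thread>[0-9a-fA-F]+)\s+"
    r"(?P<stamp>\d{1,2}/\d{1,2}/\d{4}\s+\d{1,2}:\d{2}:\d{2}):\s*"
    r"(?P<msg>.*)$"
)


def parse_line(line):
    """Return (thread, timestamp, message), or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    # Collapse any run of whitespace between date and time to a single space.
    stamp_text = " ".join(m.group("stamp").split())
    stamp = datetime.strptime(stamp_text, "%d/%m/%Y %H:%M:%S")
    return m.group("thread"), stamp, m.group("msg")


def ndmp_errors(lines):
    """Keep only lines flagged ERROR or reporting a lost connection."""
    hits = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        _thread, _stamp, msg = parsed
        if "ERROR" in msg or "connection lost" in msg:
            hits.append(parsed)
    return hits


SAMPLE = [
    " 914 16/04/2007 2:49:40: ERROR: ndmpcSendRequest->connection error",
    " 914 16/04/2007 2:49:40: ERROR: ndmpSendRequest failed:",
    " 914 16/04/2007 2:49:40: NDMPEngine: NDMP control connection lost.",
    " 7a0 16/04/2007 2:59:47: DeviceManager: timeout event fired",
]

if __name__ == "__main__":
    for thread, stamp, msg in ndmp_errors(SAMPLE):
        print(thread, stamp.isoformat(), msg)
```

Comparing the first error timestamp across several failed runs is one way to check whether the job dies at roughly the same point each time.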
 
 
This has started happening recently, even though this machine has been up and working perfectly for months, with NO changes made by us.
 
To try to fix this I have patched it up to SP5 for 9.1, with no change.
 
I have run a repair using the install media and repatched, again with no change.
 
This machine is a hard working file server that exhibits no other problems.
 
Has anyone come across this before? A Google search reveals little. Some help from the Symantec crew would be really good.
 
 
 
1 REPLY

Roland_Lockhart
Level 3
An update on this
Since no-one ever seems to close one of these problems with a solution, I thought I'd contribute some things I discovered.
 
I repaired the installation of 9.1 and patched it to SP5, to no avail. I turned off the AV, defragged the disk, made sure there was no spurious network traffic, assigned a fixed network interface and set the network speed to autodetect. None of these things made a speck of difference. What I finally noticed was that the backups failed at approximately the same point. It must therefore be a file.
 
I calculated the volume of data backed up before the failure and traced it down to a specific directory. I made a backup of that folder elsewhere, and then deleted it. Hooray, the backup made it past that point. Now it's still stopping with the same agent disconnection error, but a bigger volume is being backed up before it does. By a process of elimination I will solve this problem, but I am sorely disappointed that a file problem makes the backup stall, and I find it amazing that no-one else has encountered and solved this problem with 9.1 before. I have to say, until the backups are returned to their fully functional state I won't be convinced that the file problem is the actual cause. This has been a month of fret and pain, and it's still not fixed. I look at the features and bug fixes in ver 10 and 11 and I don't think an upgrade will fix this.
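
For anyone wanting to repeat the elimination step, here is a rough Python sketch of the idea: walk the backup selection, add up file sizes, and report the file sitting at the byte count where the job died. It assumes the backup visits files in roughly the same sorted, top-down order as this walk, which I have not confirmed for Backup Exec, so treat the result as a hint at the right directory rather than an exact answer. The names `walk_files` and `file_at_offset` are made up for this example.

```python
import os


def walk_files(root):
    """Yield file paths under root in a stable, sorted, top-down order."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # sort in place so os.walk descends in sorted order
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)


def file_at_offset(root, offset_bytes):
    """Return (path, bytes_before) for the file covering offset_bytes.

    bytes_before is the cumulative size of all files visited before it.
    Returns (None, total_bytes) if the tree is smaller than offset_bytes.
    """
    total = 0
    for path in walk_files(root):
        try:
            size = os.path.getsize(path)
        except OSError:
            # Unreadable files are themselves worth a closer look; skip here.
            continue
        if total + size > offset_bytes:
            return path, total
        total += size
    return None, total
```

Usage would be along the lines of `file_at_offset(r"D:\shares", bytes_backed_up_at_failure)`, where both the path and the byte count are placeholders taken from your own job log.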
 
Anyway, an upgrade will need to go through test before it's pushed to prod.
 
I hope this helps the next hapless 9.1 user with agent problems.
 
RKL 23/4/07