BEWS 9.1 Backup fails with "lost connection", then the services die.

photonnzfx
Level 3
When backing up certain directories the agent loses connection to the backup server and the services running on the backup server die. All subsequent backups fail.

Things to note:
The file server head unit runs Linux with the Unix agent installed. Directories accessed by Windows-based computers back up fine. Directories accessed by Mac computers seem to be the problem. We've not been able to find any illegal characters. Retrospect seems to back up these directories with no problems. BEWS 9.1 fails every time (at random places in the directory structure).
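
The illegal-character check mentioned above can be scripted. The following is only a minimal sketch, not the method actually used here: it assumes Python 3 is available on the Linux file server, and the function name `find_suspect_names` and the choice of what counts as "suspect" (control bytes, names that are not valid UTF-8, leading or trailing whitespace) are assumptions made for illustration.

```python
import os


def find_suspect_names(root):
    """Yield (path, reasons) for every file or directory name under root
    that contains control bytes, is not valid UTF-8, or has leading or
    trailing whitespace. The criteria are illustrative assumptions."""
    # Walk with bytes paths so names that are not valid UTF-8 are seen as-is.
    for dirpath, dirnames, filenames in os.walk(os.fsencode(root)):
        for name in dirnames + filenames:
            reasons = []
            # ASCII control bytes (0x00-0x1F) and DEL (0x7F).
            if any(b < 0x20 or b == 0x7F for b in name):
                reasons.append("control character")
            try:
                name.decode("utf-8")
            except UnicodeDecodeError:
                reasons.append("not valid UTF-8")
            # bytes.strip() removes ASCII whitespace at both ends.
            if name != name.strip():
                reasons.append("leading/trailing whitespace")
            if reasons:
                yield os.fsdecode(os.path.join(dirpath, name)), reasons
```

A call such as `for path, reasons in find_suspect_names("/hc_nz"): print(repr(path), reasons)` prints each hit with `repr()`, so invisible characters show up as escapes instead of disappearing in the terminal.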

When I try to view the log file for the failed job, Backup Exec cannot create the XML document. Here is the text; you can see problems with how it's reading the directory structure.

Directories like:
directory /hc_nz//hc_2d/hc_hydra/hc_h049//t//
Å—hake/grade and its subdirectories

Should read:
directory /hc_nz/hc_2d/hc_hydra/hc_h049/shake/grade/

======================================================================
Job server: BACKUPSERVER
Job name: 2D Scripts Incremental
Job started: Friday, 4 February 2005 at 4:00:03 a.m.
Job type: Backup
Job Log: BEX00196.xml
======================================================================

Drive and media information from media mount:
Robotic Library Name: OVERLAND 1
Drive Name: HP 1
Slot: 7
Media Label: NBN626L2
Media GUID: {EFD8A3AC-02DC-4CAD-BF87-201D25850C30}
Overwrite Protected Until: 31/12/9999 12:00:00 a.m.
Appendable Until: 31/12/9999 12:00:00 a.m.
Targeted Media Set Name: 2d_scripts
======================================================================
Media operation - append.
Hardware compression enabled.
======================================================================
FELAFELFELAFEL/p_drive
Family Name: "Media created 4/02/2005 4:00:03 a.m."
Backup of "FELAFEL/p_drive "
Backup set #1 on storage media #1
Backup set description: "2D Scripts Incremental"
Backup Type: FULL - Back Up Files - Allow incrementals and differentials using modified time
Backup started on 4/02/2005 at 4:00:45 a.m..
Directory not found. Can not backup directory /hc_nz//2d/hc_hind/hc_0237/script//
hake/grade and its subdirectories.
Directory not found. Can not backup directory /hc_nz//hc_2d/hc_hydra/hc_h049//t//
Å—hake/grade and its subdirectories.
Directory not found. Can not backup directory /hc_nz//2d/hc_hydra/hc_h059-c1//t//
Å—t/shake/grade and its subdirectories.
Directory not found. Can not backup directory /hc_nz//73-mockup_v004.psd//d/hk///

žshake/grade and its subdirectories.
The network connection to the Backup Exec Remote Agent has been lost. Please check for network errors.
A timeout occurred waiting for completion of media server data processing
Backup completed on 4/02/2005 at 4:42:51 a.m..
Backed up 1151 files in 1103 directories.
4 items were skipped.
Processed 11,285,057,037 bytes in 42 minutes and 6 seconds.
Throughput rate: 256 MB/min
----------------------------------------------------------------------
======================================================================
Job ended: Friday, 4 February 2005 at 4:44:18 a.m.
Job completion status: Failed
======================================================================

Ameet_Thakkar
Level 6
To troubleshoot the problem, please refer to the steps in the technote below.
Title: "A timeout occurred waiting for data from the agent during operation shutdown" or ". Check for network errors" (a00084f8 HEX or e00084f8 HEX) is reported during a backup operation
http://seer.support.veritas.com/docs/258159.htm
------------------------------------------------------------------------

As another step, while carrying out each of the backups of C: and D:, please make sure that SGMON is running in the background.
Keep SGMON running while starting the remote agent, or set SGmon enabled to 1 under HKEY_LOCAL_MACHINE\SOFTWARE\VERITAS\Backup Exec\Debug while starting the remote agent (steps to start SGMON are mentioned in point number 2 below).
Please let us know the outcome of the backup jobs once you have started SGMON before running the backup operation. Do you still come across any error messages, or does the remote agent service still hang?

To isolate the problem, please run a test backup to a Backup to Disk folder and check the result.
For more information on Backup to Disk folders, see the VERITAS Backup Exec (tm) 9.1 for Windows Servers Administrator's Guide (English): http://seer.support.veritas.com/docs/266190.htm
(Refer to page 149, "Using Backup to Disk folders and files".)
------------------------------------------------------------------------

In addition to the above steps, please refer to the following technote and let us know whether your problem is related to it.

Title: A timeout error occurs when using Backup Exec 9.1 to back up Microsoft Windows 2003 Shadow Copy Components and non-Windows remote agents in the same operation, and the non-Windows agents occur first in the selection list.

http://seer.support.veritas.com/docs/270467.htm

We hope this helps. If the problem persists, please let us know and reply with the error message.

photonnzfx
Level 3
Thanks for your suggestions Ameet,

I'll address your suggestions in order:

1) Your first suggestion is not related to my problem. The server being backed up is not a Windows box (it is Linux). The ports needed for the agent are not being blocked or otherwise used, the network device drivers are up to date and functioning correctly, the drive is not fragmented, and the network link settings are correct. The open file option is not available to me. I would expect BEWS not to crash when it encounters a locked file. In any case, the failure is not specific to one directory or file.

2) I am not backing up C: or D:, and again this is not a Windows machine being backed up. It is a remote Linux server running a RAID array connected over fiber. I looked up SGMON on Google and found an application not related to Veritas at all. What do you expect me to get from using this utility?
I've tried to isolate the problem by splitting the backup into two parts. What I found (as I mentioned in my original message) is that directories accessed by Windows machines are backed up just fine. Directories or files that are accessed by Apple computers (OS X) seem to crash the Backup Exec services running on the backup server. This is not specific to one file or directory, and the file or directory where it fails seems to change at random. What is very consistent is that only the area touched by Apple (OS X) machines is affected. I tried to back up a select portion of files from the Apple (OS X) directories to a Backup to Disk folder and encountered the same failure.

3) Your last suggestion does not apply, as this Linux server is the only selection in the list, and there is only one directory (and all subdirectories) selected. It is a very simple selection list:
SERVER/p_drive/hc_nz/hc_2d/*.* /SUBDIR

I have the recovery mode of the Backup Exec services set to restart the services after failure, so later scheduled backup jobs now complete rather than fail. However, the directories that are failing are critical to back up, so I've taken to running a secondary backup system that backs up the directories in question with no problems. You might be interested to know it's a competitor's product called Retrospect. It's much less expensive and has worked flawlessly :\

photonnzfx
Level 3
Here is the latest log for the failing job:

Job Log for Felafel_array0_2d_nightly_incremental
-----------------------------------------------------------
Completed status: Failed
Job Information
Job server: BACKUPSERVER
Job name: Felafel_array0_2d_nightly_incremental
Job started: Tuesday, 15 February 2005 at 4:00:05 a.m.
Job type: Backup
Job Log: BEX00230.xml
__________
Device and Media Information
Robotic Library Name: OVERLAND 1
Drive Name: HP 1
Slot: 2
Media Label: NBN621L2
Media GUID: {302DF431-2E26-4B7D-A2DF-4AEFA94613A4}
Overwrite Protected Until: 31/12/9999 12:00:00 a.m.
Appendable Until: 31/12/9999 12:00:00 a.m.
Targeted Media Set Name: Felafel_array0_2d_nightly_incremental
__________
Job Operation - Backup
Backup Options
Media operation - append.
Hardware compression enabled.
__________
Server - FELAFEL
__________
Set Information - FELAFEL/p_drive
Backup Set Information
Family Name: "Media created 20/01/2005 4:00:02 a.m."
Backup of "FELAFEL/p_drive "
Backup set #2 on storage media #1
Backup set description: "Felafel_array0_2d_nightly_incremental"
Backup Type: INCREMENTAL - Using modified time
__________
Backup started on 15/02/2005 at 4:00:59 a.m..
__________
Backup Set Detail Information
Robotic Library Name: OVERLAND 1
Drive Name: HP 1
Slot: 10
Media Label: NBN629L2
Media GUID: {D57EBB5F-95E1-41D5-A992-9E8E55205564}
Overwrite Protected Until: 31/12/9999 12:00:00 a.m.
Appendable Until: 31/12/9999 12:00:00 a.m.
Targeted Media Set Name: Felafel_array0_2d_nightly_incremental
Backup set #2 on storage media #2
The network connection to the Backup Exec Remote Agent has been lost. Please check for network errors.
A timeout occurred waiting for completion of media server data processing
__________
Backup completed on 15/02/2005 at 1:00:08 p.m..
__________
Backup Set Summary
Backed up 156712 files in 3583 directories.
Processed 207,651,626,228 bytes in 8 hours, 57 minutes, and 35 seconds.
Throughput rate: 368 MB/min
__________
Job Completion Status
Job ended: Tuesday, 15 February 2005 at 1:01:19 p.m.
Completed status: Failed
Final error: 0xa00084f8 - The network connection to the Backup Exec Remote Agent has been lost. Please check for network errors.
__________
Final error category: Resource Errors
__________
Errors
Click an error below to locate it in the job log
Backup - FELAFEL/p_drive A timeout occurred waiting for completion of media server data processing

Vidyaj__Patneka
Level 6
Hi,

We suggest that you disable the Auto Update of Daylight Saving option in the time zone settings, then run the backup job and check the results.

We hope this helps.

photonnzfx
Level 3
Could you be a little more specific?

There are three possible places to make this change. I'll assume you're not talking about our time server, so that leaves the backup server and the remote computer.

Should I make this change on the computer running BEWS, or on the remote computer being backed up?

Can you explain how this could affect an application that tries to process nightly?

Thanks,

photonnzfx
Level 3
Well, I'm back to report mixed results. I performed a re-install/upgrade of MDAC as suggested in another post. I also ran the BEUtility, attached to the local media server and performed a "repair database", as well as a "compact database". Then I did as you suggested and unchecked the "Automatically adjust clock for daylight savings changes" checkbox in the "Date and time Properties" window of the BEWS computer.

I don't like making so many changes while troubleshooting a problem, but I was desperate to get a quick fix.

The results were mixed, as I said. The communications failure problem seems to have been solved, but the backup still results in a "Failed" message due to several "Directory not found" errors. The report cannot be produced in XML, probably due to the control characters that seem to mangle the directory paths of the selections in error.
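
One way to test the control-character theory is to check the plain-text log for characters that XML 1.0 does not allow, since a single such character is enough to make an XML document ill-formed. The sketch below is an illustration in Python, not part of Backup Exec; the function name `xml_illegal_chars` is hypothetical, and the character ranges are taken from the XML 1.0 `Char` production.

```python
import re

# Anything outside the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_XML_ILLEGAL = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def xml_illegal_chars(text):
    """Return (offset, code point) pairs for every character in text
    that cannot appear in an XML 1.0 document."""
    return [
        (m.start(), "U+%04X" % ord(m.group()))
        for m in _XML_ILLEGAL.finditer(text)
    ]
```

Feeding each "Directory not found" line of the text log through `xml_illegal_chars` would show whether the mangled paths contain such characters and at which offsets.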

I've confirmed the selections in text mode; they appear correctly in the selection list.
I've confirmed they are valid directories.
I've confirmed the proper permissions are in place.

Here is a job log from one of the failed jobs. They are consistently failing in the same place.

======================================================================
Job server: BACKUPSERVER
Job name: 2D Scripts Incremental
Job started: Sunday, 27 February 2005 at 3:00:00 a.m.
Job type: Backup
Job Log: BEX00294.xml
======================================================================

Drive and media information from media mount:
Robotic Library Name: OVERLAND 1
Drive Name: HP 1
Slot: 10
Media Label: NBN629L2
Media GUID: {D57EBB5F-95E1-41D5-A992-9E8E55205564}
Overwrite Protected Until: 31/12/9999 12:00:00 a.m.
Appendable Until: 31/12/9999 12:00:00 a.m.
Targeted Media Set Name: 2d_scripts
======================================================================
Media operation - append.
Hardware compression enabled.
======================================================================
FELAFELFELAFEL/p_drive
Family Name: "Media created 26/02/2005 3:00:05 a.m."
Backup of "FELAFEL/p_drive "
Backup set #2 on storage media #1
Backup set description: "2D Scripts Incremental"
Backup Type: FULL - Back Up Files - Allow incrementals and differentials using modified time
Backup started on 27/02/2005 at 3:01:04 a.m..
Directory not found. Can not backup directory /hc_nz//2d/hc_boar/hc_1451/script//shake/grade and its subdirectories.
Directory not found. Can not backup directory /hc_nz//hc_2d/hc_cliff/hc_1442/pt//
shake/grade and its subdirectories.
Directory not found. Can not backup directory /hc_nz//hc_2d/hc_goddess/hc_0054///

ƓĈshake/grade and its subdirectories.
Directory not found. Can not backup directory /hc_nz//hc_2d/hc_hydra/hc_h049//t//
żhake/grade and its subdirectories.
Directory not found. Can not backup directory /hc_nz//2d/hc_hydra/hc_h059-c1//t//
żt/shake/grade and its subdirectories.
Directory not found. Can not backup directory /hc_nz//2d/hc_mountain/hc_1757/////

ƓĈ씰shake/grade and its subdirectories.
Directory not found. Can not backup directory /hc_nz//73-mockup_v004.psd//d/hk///
Ɔshake/grade and its subdirectories.
Backup completed on 27/02/2005 at 3:30:16 a.m..
Backed up 1965 files in 1808 directories.
7 items were skipped.
Processed 13,430,220,938 bytes in 29 minutes and 12 seconds.
Throughput rate: 439 MB/min
----------------------------------------------------------------------
======================================================================
======================================================================
FELAFEL/p_drive
Verify of "FELAFEL/p_drive "
Backup set #2 on storage media #1
Backup set description: "2D Scripts Incremental"
Verify started on 27/02/2005 at 3:32:16 a.m..
Verify completed on 27/02/2005 at 3:40:09 a.m..
Verified 1965 files in 1808 directories.
0 files were different.
Processed 13,430,220,938 bytes in 7 minutes and 53 seconds.
Throughput rate: 1625 MB/min
----------------------------------------------------------------------
======================================================================
Job ended: Sunday, 27 February 2005 at 3:40:53 a.m.
Job completion status: Failed
======================================================================

Cheers,

photonnzfx
Level 3
hello... anyone from Veritas wish to answer my previous question? Should I start a new thread?

photonnzfx
Level 3
Ok, what the heck is up here? What do I have to do to get support from Veritas? I've half a mind to send the product back and ask for a refund. The fact that I have to post questions to a forum and *hope* for an answer is insulting.

I expect that support should be forthcoming within the first year of the product purchase!

I've done everything asked of me by the moderators, now I need some answers please.

Amruta_Purandar
Level 6
Hello,

We apologize for the delay in our response.

You are backing up a remote Linux server and Mac machines. Please refer to the Software Compatibility List to ensure that your versions are supported.

VERITAS Backup Exec (tm) 9.1 for Windows Servers - Software Compatibility List (SCL)
http://support.veritas.com/docs/261695

Please specify which versions are posing problems.

You also need to install the Unix agent as well as the Mac agent to protect remote machines.
If you have already done so, please let us know the versions you are backing up.

NOTE: If we do not hear from you within two business days, this post will be "assumed answered" and archived.

Sheetal_Risbood
Level 6
Archiving the post as per our previous reply