Backup job making no progress

Artegic
Level 6

BE2012 SP4 on W2K8 R2 backing up a bunch of ESXi5 VMs to deduplication storage on DAS with linked duplication to disk. Schedule is weekly full and daily incremental.

Because of BE's limitations, one of the VMs running W2K8 R2 cannot be backed up through AVVI and is backed up through RAWS instead. Normally the full backup of that machine takes between five and eight hours. This morning I found the full backup job for that server still running but not making any progress. Elapsed time is 02:11:52:00 (i.e. two and a half days) and steadily increasing. Byte count is 401.093.217.634 and hasn't changed in the last three hours. Job rate is slowly decreasing as a result. There are no error or warning messages in the Backup Exec console, and nothing that looks related in the event logs.

I guess my only course of action is to cancel the job and start it again. I would however like to find out what caused that malfunction and how to avoid it in the future. Any thoughts?
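
The figures above can be checked with a little arithmetic. The sketch below converts the elapsed time to seconds and shows why the job rate keeps falling while the byte count stays fixed. It assumes the console's job rate is a cumulative average (bytes divided by elapsed time), which the thread does not confirm; the function names are made up for illustration.

```python
def elapsed_seconds(stamp: str) -> int:
    """Convert a DD:HH:MM:SS elapsed-time string to seconds."""
    days, hours, minutes, seconds = (int(part) for part in stamp.split(":"))
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def average_rate(byte_count: int, seconds: int) -> float:
    """Cumulative average in bytes per second (assumed rate definition)."""
    return byte_count / seconds


# "401.093.217.634" as shown in the console (German thousands separators).
BYTE_COUNT = 401_093_217_634

now = elapsed_seconds("02:11:52:00")
three_hours_ago = now - 3 * 3600

print(f"elapsed: {now} s = {now / 86400:.2f} days")
print(f"average rate three hours ago: {average_rate(BYTE_COUNT, three_hours_ago):,.0f} B/s")
print(f"average rate now:             {average_rate(BYTE_COUNT, now):,.0f} B/s")
```

With a constant byte count the average can only shrink as the elapsed time grows, which matches the slowly decreasing job rate described above.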

1 ACCEPTED SOLUTION

Accepted Solutions

CraigV
Moderator
Partner    VIP    Accredited

Hi,

 

This has been around for ages. I've had this before with BE 12.5 and BE 2010 Rx. I guess if you had enabled monitoring of the job you'd know, as the logs would show what the job was waiting for.

Preventing it is another story... I've never seen any guidance on a job that has started but makes no progress in byte count.

Thanks!


9 REPLIES 9

Artegic
Level 6

Hi CraigV,

Thanks for the answer. Meanwhile, another backup job (a self-backup of the media server) ran without a hitch while the hanging one didn't budge. After that, I requested an abort of the job. It has been sitting in the "abort waiting" state for an hour now. I guess it's services restart time again.

Regards,
Tilman

Artegic
Level 6

Restarting the services needed two attempts; Job Engine refused to quit on the first one. At least I didn't have to kill it through Task Manager this time.

The job log of the hanging job ends with

WARNUNG: "\\ZAGDC01.ARTEGIC.LOCAL\E:\SILO\Software\OS\Windows 8 Prof\de_windows_8_x64_dvd_915409.iso" ist beschädigt. Dateiprüfung nicht möglich.

(WARNING: "\\ZAGDC01.ARTEGIC.LOCAL\E:\SILO\Software\OS\Windows 8 Prof\de_windows_8_x64_dvd_915409.iso" is damaged. File check not possible.)

The named file is pretty big (3.4 GB) but definitely not damaged, and has been backed up on previous runs without complaint.

The job terminated in state "Recovered" with error code E0008821: "The job was recovered by the start of the Backup Exec RPC service. No user action required", immediately followed by another one terminating with error code 80004005: "Fixed point restart cannot be applied to the newly started backup job. The backup job will not be performed." Cute.

I have now started the job manually. Curious what will happen once it gets to the "damaged" file.

 

lmosla
Level 6

Hello,

Are there any errors in Windows Events during the job? Try recreating the job.  

 

VJware
Level 6
Employee Accredited Certified

Based on the warning above, it looks like the 'backup' portion completed and it was stuck on the 'verify' portion. Expand the job log and check if it took an acceptable time for the 'backup' to complete or not.

Additionally, is AOFO enabled in the backup job? Are any AV scans running on the remote server? Check the remote server's Event Viewer, specifically for disk events. This may well be a one-time occurrence. It is a bit difficult to investigate the cause on a forum; if it recurs, I would recommend logging a formal support case.
 

Artegic
Level 6

On second try the job ran without a problem.

No errors or warnings in the event log for either job on either server.

Judging from the hanging job's log, the hang definitely happened during backup.

AOFO option "use snapshot method" is enabled, "Snapshot provider" is set to "Automatic - allow selection of snapshot provider by VSS", option "process logical media for the backup one after another" is enabled, option "activate fixed point restart" is disabled.

Of course there are virus scanners running on all involved servers. You can hardly run a Windows machine responsibly without one, can you?

It's not the first time I've encountered that kind of incident, but it's rare enough that I see no chance of getting anywhere with Symantec support, so I'll refrain from opening a support case.

If there's any way of preparing for the next occurrence so that I can get some information on what is going on there, I'll be glad to hear it.
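
One way to prepare would be to catch the stall early rather than two and a half days in. Below is a minimal sketch of a stall check over periodic byte-count readings. How the readings are collected is left open (noted from the console by hand or gathered by some script); `stalled_since` and the sample format are hypothetical and not part of Backup Exec.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple

# One reading: when it was taken and the job's byte count at that moment.
Sample = Tuple[datetime, int]


def stalled_since(samples: Iterable[Sample], threshold: timedelta) -> Optional[datetime]:
    """Return the time of the last byte-count change if the count has been
    flat for at least `threshold`, otherwise None.

    Samples are assumed to be in chronological order.
    """
    last_change = None  # time at which the current byte count was first seen
    last_bytes = None
    latest = None       # time of the most recent reading
    for when, count in samples:
        if last_bytes is None or count != last_bytes:
            last_bytes = count
            last_change = when
        latest = when
    if latest is None:
        return None  # no readings at all
    if latest - last_change >= threshold:
        return last_change
    return None


# Example: the count stops changing after the second reading.
start = datetime(2013, 1, 1, 6, 0)
readings = [
    (start, 100),
    (start + timedelta(hours=1), 200),
    (start + timedelta(hours=2), 200),
    (start + timedelta(hours=3), 200),
]
print(stalled_since(readings, timedelta(hours=2)))
```

A threshold well above the time a single large file normally takes would avoid false alarms. Once a stall is flagged, that is the moment to collect logs on both servers, before the job is cancelled.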

Artegic
Level 6

Btw I think the "is damaged, file check not possible" message I posted from the job log is an artefact caused by the job being killed while it was backing up that file, and has nothing to do with the actual problem. I also think the problem struck that particular file by chance. Intuitively I would suspect a network glitch in combination with insufficient recovery mechanisms, but of course that's hard to confirm or disprove without information on Backup Exec's network communication protocols.
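
To gather some evidence for or against the network-glitch suspicion, the media server could log whether it can reach the remote agent during the backup window. The sketch below is a plain TCP connect probe. The port is an assumption (TCP 10000 is commonly cited for the remote agent; verify it for your installation), and `probe` and `log_line` are illustrative names. A probe like this would only reveal an outright loss of connectivity, not a stalled session on an already established connection.

```python
import socket
from datetime import datetime
from typing import Optional


def probe(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def log_line(host: str, port: int, ok: bool, when: Optional[datetime] = None) -> str:
    """Format one timestamped log entry for a probe result."""
    when = when or datetime.now()
    status = "reachable" if ok else "UNREACHABLE"
    return f"{when:%Y-%m-%d %H:%M:%S} {host}:{port} {status}"


# Usage (assumed port, host name taken from the job log above):
#   ok = probe("ZAGDC01.ARTEGIC.LOCAL", 10000)
#   print(log_line("ZAGDC01.ARTEGIC.LOCAL", 10000, ok))
```

Run once a minute from a scheduled task with the output appended to a file, this would give a timeline to compare against the moment the byte count stops moving.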

Artegic
Level 6

I hesitate to mark this thread as "solved" when the "solution" is "This has been around for ages and nobody knows what to do about it." But for the purposes of this forum, I guess that's the correct way to close the topic.

CraigV
Moderator
Partner    VIP    Accredited

...best bet is to do the following:

1. Check the Known Issues section and see if this is listed, and then vote it up;

2. Put in an Idea to get this sorted out.

Thanks!