Forum Discussion

Ahmad54
Level 3
12 years ago

Backup Job Fails When Run From Schedule, Succeeds When Run Manually

Hi

I have a problem with an Oracle job. The job ran smoothly for the last couple of months. However, after recovering from a long, catastrophic power failure, we noticed that Backup Exec had a problem with the Oracle job. The rest of the jobs are fine and complete successfully with no errors.

I tried to edit the backup selection and noticed that the Oracle instance was unavailable and displayed a message about wrong credentials. I retyped the username and password (sys), but it didn't accept them, so I restarted the remote agent service, and that worked: the Oracle instance status is now Open, and I was able to browse the tablespaces and archived logs. I then waited for the next scheduled backup. The job started, ran for about 90 minutes, then failed.

I went to the job log and found three different errors. The first two errors are under Set Information - \\oracle\D: and they are as follows:

The first one is

WARNING: "\\oracle\D:\folder1\file1.ext" is a corrupt file. This file cannot verify.

Unable to open the item \\oracle\D:\folder1\file2 - skipped

Unable to open the item \\oracle\D:\folder1\file3 - skipped

..

etc

Directory folder1\ was not found, or could not be accessed.

None of the files or subdirectories contained within will be backed up.

V-79-57344-33967 - The directory or file was not found, or could not be accessed.

The second one is

V-79-57344-65033 - Directory not found. Cannot backup directory \folder2 and its subdirectories.

V-79-57344-65033 - Directory not found. Cannot backup directory \folder3 and its subdirectories.

V-79-57344-65033 - Directory not found. Cannot backup directory \folder4 and its subdirectories.

...

etc

The third error is located right after the Set Information - \\oracle\D: and before the Oracle-Win::\\oracle\instanceid in the job log, and it is as follows:

V-79-57344-34108 - An unexpected error occurred when cleaning up snapshot volumes. Confirm that all snapped volumes are correctly resynchronized with the original volumes.

The job log, however, doesn't show any error for \\oracle\C:, Oracle-Win::\\oracle\instanceid, or \\oracle\System State.

I noticed that the Event Viewer on the Oracle server shows the following errors/warnings (please notice the timestamps):

EventViewer-System-Warning-24.jpg

 

then

EventViewer-System-Error-35.jpg

then, in the Event Viewer Application log:

EventViewer-Application-Error-57481.jpg

I searched all over the Symantec TechNotes, and that didn't help.

So I tried running the job manually to monitor it, and to my surprise, it ran successfully without a single error or warning.

I waited until the next scheduled run and it failed again. I ran the job manually once more, and again it succeeded.

Why does the job fail with the errors above when it runs on schedule, yet complete successfully when it is run manually?

Any help is much appreciated.

 

  • SOLVED!

    At first, the database admin didn't agree to install that patch on the Oracle server (!!!), so I thought of scheduling the job to run at a different hour.

    I went to Backup Exec to review the job history one more time, and to my surprise I found that the job RAN SUCCESSFULLY. So I headed to the Oracle server to check the Event Viewer: no event 24 or 35. I then checked the D: partition, which houses the database files. It used to have about 8% free disk space, which is about 32GB, but the database admin has since deleted/moved some unused folders and the free disk space is now over 150GB.

    So to sum it up:

    1. The issue was about free disk space, not high disk I/O. 
    2. The job completes successfully if scheduled to run at a different hour, so something on the Oracle server (probably some kind of script) must temporarily consume disk space at a specific hour (around 8PM), which is when the backup job is scheduled to run.

    Thanks everyone for your support!
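
For anyone who wants to confirm a temporary free-space dip like the one described above, here is a minimal sketch in Python. It is an assumption on my part, not something that was run in this thread: it simply logs free space on a volume at fixed intervals so the numbers around the scheduled hour can be compared with the rest of the day. The function names, the log file name, and the sampling interval are all hypothetical; only the D: volume comes from the post.

```python
import shutil
import time
from datetime import datetime


def free_space_report(path):
    """Return (free GB, free percent) for the volume that contains path."""
    usage = shutil.disk_usage(path)
    free_gb = usage.free / 1024 ** 3
    free_pct = 100 * usage.free / usage.total if usage.total else 0.0
    return free_gb, free_pct


def format_sample(path, free_gb, free_pct, now=None):
    """Format one log line; 'now' can be passed in to make output repeatable."""
    stamp = now or datetime.now()
    return f"{stamp:%Y-%m-%d %H:%M:%S} {path} free={free_gb:.1f} GB ({free_pct:.1f}%)"


def log_free_space(path, samples, interval_seconds, logfile):
    """Append 'samples' free-space readings to logfile, one per interval."""
    for i in range(samples):
        free_gb, free_pct = free_space_report(path)
        with open(logfile, "a", encoding="utf-8") as fh:
            fh.write(format_sample(path, free_gb, free_pct) + "\n")
        if i < samples - 1:
            time.sleep(interval_seconds)


if __name__ == "__main__":
    # Hypothetical settings: sample D: once a minute for three hours,
    # started shortly before the scheduled backup window.
    log_free_space("D:\\", samples=180, interval_seconds=60,
                   logfile="free_space.log")
```

If the log shows free space dropping sharply around the time the scheduled job starts and recovering afterwards, that would support the free-space explanation; if it stays flat, something else is going on at that hour.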

  • ...the reason I suggested this is that I've had it happen personally, and have also seen it repeatedly on the forums. A job and selection list are essentially entries in the BEDB. Either repairing the BEDB using BEutility.exe, or simply creating a new backup job/policy and selection list, should fix this 99% of the time.

    Corruption does happen, but no 2 situations would be the same.

    Thanks!

  • Hello,

    The new job failed when it was re-scheduled to run at the same time as the old job, so maybe Armit Alok's suggestion is correct. To confirm, I scheduled the new job to run at a different time, and it completed successfully as expected. Now, to reproduce the faulty scenario, I have rescheduled it to run at the same time as the old job again. I'll wait for the outcome and post the results.

    If it fails, then something must be going wrong at the hour when the job runs, either on the backup server or on the server being backed up.

    If you've seen this issue before, any tip is much appreciated.

    Thank you!

  • => Ensure that no shadow copy jobs are running on any of the volumes of the Exchange server when the backup kicks in

     => Install or reinstall Microsoft patch 940349 on the remote server

    http://support.microsoft.com/kb/940349  -  Availability of a Volume Shadow Copy Service (VSS) update rollup package for Windows Server 2003 to resolve some VSS snapshot issues.

     

    This patch is only for Windows Server 2003.

     

     

  • After scheduling the new job to run at the same time as the old one, it failed again.

    I checked the Event Viewer on the Oracle server, and it generated event IDs 24 and 35 each time the job failed.

    So there must be something happening on the Oracle server that prevents the job from completing. 

    I know I can simply change the job parameter to run at a different time, but I'm going to give Kunal.Mudliyar's suggestion a try. I'm going to install KB940349 and see the results.

    Thank you all for your posts so far!
