Backup Job Fails When Run From Schedule, Succeeds When Run Manually
Hi
I have a problem with an Oracle job. The job ran smoothly for the last couple of months; however, after recovering from a long, catastrophic power failure, we noticed that Backup Exec had a problem with the Oracle job. The rest of the jobs are fine and complete successfully with no errors.
I tried to edit the backup selection and noticed that the Oracle instance was unavailable and displayed a message about wrong credentials. I retyped the username and password (sys), but it didn't accept them. So I thought of restarting the remote agent service, and that worked: the Oracle instance status is now Open, and I was able to browse the tablespaces and archived logs. I then waited until the next scheduled backup; the job started, ran for about 90 minutes, then failed.
I went to the job log and found three different errors. The first two are under Set Information \\oracle\D: and they are as follows:
The first one is
WARNING: "\\oracle\D:\folder1\file1.ext" is a corrupt file. This file cannot verify.
Unable to open the item \\oracle\D:\folder1\file2 - skipped
Unable to open the item \\oracle\D:\folder1\file3 - skipped
..
etc
Directory folder1\ was not found, or could not be accessed.
None of the files or subdirectories contained within will be backed up.
V-79-57344-33967 - The directory or file was not found, or could not be accessed.
The second one is
V-79-57344-65033 - Directory not found. Cannot backup directory \folder2 and its subdirectories.
V-79-57344-65033 - Directory not found. Cannot backup directory \folder3 and its subdirectories.
V-79-57344-65033 - Directory not found. Cannot backup directory \folder4 and its subdirectories.
...
etc
The third error is located right after the Set Information - \\oracle\D: and before the Oracle-Win::\\oracle\instanceid in the job log, and it is as follows:
V-79-57344-34108 - An unexpected error occurred when cleaning up snapshot volumes. Confirm that all snapped volumes are correctly resynchronized with the original volumes.
The job log, however, doesn't show any error for the \\oracle\C:, Oracle-Win::\\oracle\instanceid, or \\oracle\System?State.
I noticed that the Event Viewer on the Oracle server shows the following errors/warning (please notice the timestamp):
then
then in the event viewer application category:
I searched all over the Symantec TechNotes and that didn't help.
So I ran the job manually to monitor it, and to my surprise, it completed successfully without a single error or warning.
I waited until the next scheduled run and it failed again. I ran the job manually once more, and again it succeeded.
Why does the job fail with the errors above when it runs on schedule, yet complete successfully when it is run manually?
Any help is much appreciated.
SOLVED!
At first, the database admin didn't agree to install that patch on the Oracle server (!!!), so I thought of scheduling the job to run at a different hour.
I went to Backup Exec to review the job history one more time, and to my surprise I found that the job RAN SUCCESSFULLY. So I headed to the Oracle server to check the Event Viewer: no event 24 or 35. I went to the D: partition, which houses the database files. It used to have about 8% free disk space, which is about 32GB, but the database admin has since deleted/moved some unused folders, and the free disk space is now over 150GB.
So to sum it up:
- The issue was about free disk space, not high disk I/O.
- The job completes successfully if scheduled to run at a different hour, so there must be some kind of script that runs on the Oracle server at a specific hour (around 8PM) and temporarily consumes disk space, and this is when the backup job is scheduled to run.
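For anyone who wants to confirm a similar suspicion, here is a rough Python sketch that logs free space on a volume at intervals, so a temporary dip around the scheduled hour would show up. It is not part of Backup Exec. The 15% threshold, the sample count, and the interval are assumptions picked for illustration, and the function names are made up. The 400GB total used in the comments is only what "8% is about 32GB" implies.

```python
import shutil
import time
from datetime import datetime

GB = 1024 ** 3


def free_fraction(total_bytes, free_bytes):
    """Return free space as a fraction of total capacity."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return free_bytes / total_bytes


def is_low(total_bytes, free_bytes, min_fraction=0.15):
    """True when free space is below min_fraction (an assumed threshold)."""
    return free_fraction(total_bytes, free_bytes) < min_fraction


def log_free_space(path, samples, interval_seconds, min_fraction=0.15):
    """Print one timestamped free-space reading per interval."""
    for i in range(samples):
        usage = shutil.disk_usage(path)  # named tuple: total, used, free
        flag = "LOW" if is_low(usage.total, usage.free, min_fraction) else "ok"
        print(
            f"{datetime.now():%Y-%m-%d %H:%M:%S} {path} "
            f"free={usage.free / GB:.1f}GB "
            f"({free_fraction(usage.total, usage.free):.1%}) {flag}"
        )
        if i < samples - 1:
            time.sleep(interval_seconds)


if __name__ == "__main__":
    # Hypothetical run: sample D: every 5 minutes for about an hour.
    log_free_space("D:\\", samples=12, interval_seconds=300)
```

With the figures from this thread, 32GB free on an implied 400GB volume is 8% and would be flagged, while 150GB free is 37.5% and would not. Starting the script a little before 8PM should show whether something on the server eats disk space at that hour.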
Thanks everyone for your support!