Dear all,
I'm trying to maintain a steady backup routine on a client's network. There is a single server acting as file, SQL and Exchange server, and it needs to be backed up every night. The specs are:
* 2 x 160 GB S-ATA HD (Raid 1)
* P4 630 (2 x 3.2 GHz)
* 2 GB DDR2-Ram
* GBit-NIC
* OS: Windows Server 2003 SBS Edition (32 bit)
* Sony StorStation SDX-570V AITi200-ST (AIT-2 Turbo S-ATA)
According to:
http://support.veritas.com/docs/282249 (Hardware compatibility list, page 20, bottom),
the tape device used is supported by BackupExec 11d.
Notes:
The backup setup was deployed as-is about half a year ago.
I'm using a localized (German) version of the software, so the error messages are translated and might not be exactly the same as in the original version.
What is supposed to happen:
The backup job should back up some selected directories on the C: partition, as well as the Exchange Server/Active Directory and everything related to it.
There's an SQL server running as well, but it is stopped before and restarted after the backup job (I don't use the SQL server agent). This setup seems to run okay for a few days.
What actually happens:
Usually once per week, sometimes more often, sometimes less, the customer complains about not being able to eject the tape from the drive in order to insert the next weekday's cartridge. The status mails I receive from the BackupExec software state that the job failed, the reason being "hardware error". When I log on remotely to check the status of BackupExec, the job is usually still being executed, but I have no means of either aborting it (the job state changes to "aborting..." and stays there indefinitely) or ejecting the media (since the job still seems to be active). The only way of getting on with the daily backup routine is to eject the tape by pushing the emergency eject trigger on the tape drive with a needle and then rebooting the server; otherwise the job will hang until the end of days, blocking all subsequent jobs.
Other symptoms:
During the backup window (between 21:00 and 01:30), the Windows server reports heavy processor usage (usually 2-4 status mails per day). This could also be due to stopping and restarting the SQL server, but I'm not sure.
The errors don't seem to be tied to a specific weekday; the last three failures were on a Tuesday, a Wednesday and a Thursday (in different weeks).
Things I already tried:
* Replacing two tapes with new ones, since I thought the media could be defective
* Applying all the latest patches via LiveUpdate (was quite a hassle, too, but that's another story)
* Reconfiguring the backup job in every manner possible
* Reinstalling tape drivers
* Excluding holidays from the schedule, although tapes are set to overwrite and are not automatically ejected
I'm pretty much out of ideas on what to do next. I can add more information from the event logs etc. to this post once I have access to the machine in question again, but in the meantime, maybe one of you has a tip for me.
Thanks in advance,
Stefan