Disk to Disk jobs failing

david_chaika_2 · ‎08-26-2005

BE 10.0.5520
Replaced a previous version 9.
server: HP ML330-G3, IDE controller built into motherboard, System on drive C:, data on D: 150GB drive;

Six servers are backed up, one job for each one, jobs run successively, with timeout settings so they don't overlap. Server S1 backs up to folder D:\S1, server S2 to folder D:\S2, etc.

I get one clean backup if I initialize the IDE array, prepare and format the drive, rebuild the backup folders, and repoint the jobs.

The backup will run and then at irregular times we start getting errors. First comes a single Error, Event ID 9, "The device, \Device\Scsi\LsiCsb61, did not respond within the timeout period." It is followed by many Error events with Event ID 11, "The driver detected a controller error on \Device\Harddisk1". All these events slow down the backups, causing them to run into their timeout setting or fail altogether after several hours with 0 bytes backed up.

I have tried replacing the motherboard and replacing the drive. The controller version is the same as on an ML330-G3 in another office that is using an earlier build of BE 10.0 (484 or something like that) and runs D2D backups fine.

Last night the first two jobs ran OK. The errors started towards the end of the third job, i.e., after nearly 4 hours of backup. The job started at 9:30 and the first error came at 9:57. It finally cancelled after timing out at 10:25 (cancel is set to start at 10:10), having backed up 7 of the 8GB on the server. The previous two nights the job completed successfully around 10:03. The next job started at 10:15 while the 9:30 job was still trying to cancel, and it overran its timeout setting too after backing up only 3GB. The next two jobs backed up zero bytes and ended in Failed. The job after that backed up zero bytes and ended in Canceled, timed out. Once those disk1 errors begin they never stop as long as jobs are running, regardless of whether a job starts while another is still running or not.

Finally, a backup to tape ran successfully (tape in this backup server received files over the network directly from a different server).

This D2D backup procedure worked fine for about 6 months on a different identical server but has never worked consistently on the current server. I have been trying things for a month. This week I got two good backups in a row and thought I had fixed it by having a batch file delete all the .BKF files and defrag the D: drive.
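
For anyone wanting to try the same cleanup step, here is a minimal sketch of it in Python rather than as a batch file. It is not the batch file used above. The function names are made up for illustration, the folder layout (D:\S1, D:\S2, ...) is taken from the description earlier in this post, and the defrag step assumes the standard Windows defrag command-line utility is on the PATH.

```python
import subprocess
from pathlib import Path


def delete_bkf_files(backup_root):
    """Delete every .bkf file under backup_root and return the deleted paths.

    Hypothetical helper: backup_root would be the D: drive (or one of the
    per-server folders such as D:\\S1) on the backup server.
    """
    deleted = []
    # Materialize the listing first so files are not removed mid-iteration.
    for path in list(Path(backup_root).rglob("*")):
        # Compare case-insensitively: files may be named .BKF or .bkf.
        if path.is_file() and path.suffix.lower() == ".bkf":
            path.unlink()
            deleted.append(path)
    return deleted


def defrag_drive(drive="D:"):
    """Run the Windows defrag utility on one drive (Windows only).

    Assumes defrag.exe is available; returns its exit code.
    """
    return subprocess.run(["defrag", drive], check=False).returncode
```

Scheduled before the first job of the night, it would be called as delete_bkf_files("D:\\") followed by defrag_drive("D:"). Note that this throws away every previous backup set on the drive, so it only makes sense where the disk copies are disposable.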

Any ideas?

priya_khire · ‎08-29-2005

Hello,

It seems the problem only occurs on the backup-to-disk jobs. What error do you get when the job fails?

Refer to the following technote which might provide an insight into the problem and help resolve it:

Title: When performing a backup operation, a device "request to space to end of data" or "semaphore timeout" or a "SCSI bus timeout" error is generated.
http://support.veritas.com/docs/190598

Also check the following:

How to resolve SCSI bus timeouts
http://support.veritas.com/docs/191158

If the issue persists, also look for errors in the event viewer. Hope this information helps. Please reply if the problem persists.

Note: If we do not receive your reply within two business days, this post will be marked ‘assumed answered’ and moved to the ‘answered questions’ pool.

Regards.

david_chaika_2 · ‎11-07-2005

Moving BE to another server removed the issue.
