Forum Discussion

Mary_Land
Level 3
14 years ago

Backup to Disk - Extremely slow verify (BE 2010 R2)

Hello everyone,

I am seeing extremely slow verify rates when doing backup-to-disk.

 

Media server:

-Hardware:  HP ProLiant DL360 G5 (2 x Xeon 5140, 18 GB RAM, Smart Array P400i with Battery-Backed Write Cache and brand-new battery, 4 x HP 600GB SAS-2 6Gbps 10,000 rpm drives in a RAID 5 array, 512 KB strip size => 1.5 MB stripe, 2 logical drives, HP SCSI HBA which is ATTO ExpressPCI UL5D LP)

-OS:  Windows Server 2008 R2 Standard, all updates up to and including February 2011

-Tape drive:  Tandberg 820, SCSI

 

Backup to Disk target:

-Local logical drive on media server mapped to drive letter D, 1.5 TB, NTFS file system, 64K allocation unit size

Backup Exec version:  2010 R2 (13.0 Rev. 4164), with Service Pack 1 and Hotfix 150096 installed

Backup to Disk Folder settings:

-Maximum size for backup-to-disk files:  4 GB (also tested with 1 GB)

-Allocate maximum size for backup-to-disk files:  YES

-Maximum number of backup sets per backup-to-disk file:  100 (application default)

-Concurrent:  1

Results of running B2D Compatibility Test Tool (B2DTest_64.exe) on B2D target:  ALL PASSED (I examined the log in C:\ProgramData\Symantec\BackupExec\Logs\B2DTest.log and saw that all 16 tests passed.)

 

Media sets:

-Weekly:  13-day overwrite protection, 1 hour append (just a formality, since there's no option to set it lower)

-Daily:  6-day overwrite protection, 1 hour append (same as above)

 

Remote server being backed up:

-Hardware:  HP ProLiant DL380 G7 (1 x Xeon 5620, 36 GB RAM, Smart Array P410i with 512MB Flash-Backed Write Cache, 8 x HP 300GB SAS-2 6Gbps 10,000 rpm drives in 2 arrays - RAID 1 for OS, RAID 5 for data, 2 logical drives)

-OS:  Windows Server 2008 R2 Standard, all updates up to and including February 2011

-RAWS (x64) which was push-installed from BE media server

 

Network:  Both the media server and the server being backed up have Gigabit NICs, and both are connected to the same HP ProCurve 2910al switch (Layer 2/3 managed switch). Both NICs and their ports on the switch are forced to 1000, no auto-negotiation.

 

I originally set up a policy to run a Full backup (Copy files) every Friday evening, backing up the entire remote server described above with verify; a duplicate job linked to this Full backup job then runs immediately afterwards to duplicate the sets to tape (LTO Ultrium 3).  Monday through Thursday, a recurring job runs to back up the Working Set (changed today).

When I was test-running the Working Set job last week (to disk), it took some 3 hours to back up/verify 24 GB.  I examined the job history and found that the verify speed was the culprit, as slow as 282 MB/min.  The rate for the various sets varied between 261 MB/min and 1152 MB/min.

In terms of backup speed, I thought the network was the bottleneck, so I tested by copying a set of 11 files (6.42 GB in total: the 3.3 GB DVD ISO for BE 2010 R2 plus the split 1of4, 2of4, etc. files) from a Windows file share on the server being backed up to the media server (I stored them in the same folder as the BKF files).  Speed was ~6600 MB/min.  Then I created a test backup job in Backup Exec with AOFO (VSS) and no compression; that job ran at a 3988 MB/min backup rate (and decreasing), but a mere 378 MB/min for verify.  Yes, I know verify is performed locally; still, the 6600 MB/min copy over SMB v1 (I disabled SMB v2 for testing) vs. the 3988 MB/min backup rate shows that something is wrong, probably with BE.
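To put the quoted rates on a common scale, here is a small sketch that converts them from MB/min to MB/s and compares the two network transfers with the Gigabit line rate. The 125 MB/s ceiling is an assumption (1000 Mbit/s with no protocol overhead, 1 MB taken as 10^6 bytes), and the verify rate is listed only in MB/s because verify runs locally and does not cross the network.

```python
# Rates quoted in the post, in MB/min.
network_rates = {"SMB file copy": 6600, "BE backup": 3988}
local_rates = {"BE verify": 378}

# Assumption: ideal Gigabit Ethernet line rate, ignoring all protocol overhead.
GIGABIT_MB_PER_S = 1000 / 8  # 125 MB/s


def to_mb_per_s(mb_per_min):
    """Convert a rate in MB/min to MB/s."""
    return mb_per_min / 60


for name, rate in network_rates.items():
    mb_s = to_mb_per_s(rate)
    share = mb_s / GIGABIT_MB_PER_S
    print(f"{name}: {mb_s:.1f} MB/s ({share:.0%} of ideal Gigabit line rate)")

for name, rate in local_rates.items():
    print(f"{name}: {to_mb_per_s(rate):.1f} MB/s (local disk read, no network)")
```

On these figures the file copy is close to what a Gigabit link can carry, the backup job uses roughly half of it, and the verify pass runs at only a few MB/s.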

Now, in terms of verify speed, I know that BE attempts to read the sets from the BKF files and calculate checksums to compare with stored checksums.  I know that RAID 5 is slow, but it can't be that slow.  Something is seriously wrong with BE if the verify speed is 378 MB/min average.
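One way to separate the disk from BE is to time a plain sequential read-plus-checksum of a BKF file outside Backup Exec. The sketch below is only an approximation of that idea: it uses CRC32 from Python's standard library, which is an assumption and not necessarily the checksum BE uses, and the function name `measure_read_rate` and the 1 MiB chunk size are illustrative choices. The file path is passed on the command line.

```python
import sys
import time
import zlib


def measure_read_rate(path, chunk_size=1024 * 1024):
    """Read a file sequentially, checksum it, and return (bytes, crc32, MB/min)."""
    crc = 0
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            crc = zlib.crc32(chunk, crc)  # running CRC32 over all chunks
            total += len(chunk)
    # Guard against a zero elapsed time on very small files.
    elapsed = max(time.perf_counter() - start, 1e-9)
    mb_per_min = total / (1024 * 1024) / elapsed * 60
    return total, crc, mb_per_min


if __name__ == "__main__":
    if len(sys.argv) > 1:
        size, crc, rate = measure_read_rate(sys.argv[1])
        print(f"{size} bytes, CRC32 {crc:08x}, {rate:.0f} MB/min")
    else:
        print("usage: pass the path of a file to read")
```

Run it against a BKF file that has not been read or written recently; otherwise Windows may serve part of it from the file cache and the rate will look better than the disk really is. If this reads far faster than 378 MB/min, the disk subsystem is unlikely to be the limit during verify.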

During verify operations, I checked the CPU and memory usage on the media server, and there's no spike whatsoever, CPU barely even hit 1%.

I have run both ATTO Disk Benchmark Tool and IOMeter to benchmark the RAID and can't see anything wrong from the results, so it can't be the disk subsystem.

At this point, I'm so frustrated that I don't even know what else to do, short of NOT verifying.  I have just rebuilt the media server from scratch and tested again, with pretty much the same results.

If anyone could chime in and point me in the right direction, it would be much appreciated.
