br_nar
15 years ago, Level 3
Exchange 2007 non-GRT backup is 22 times faster than GRT backup
I'm using the following:
- Backup Exec 12.5 for Windows with Service Pack 3 installed (incl. the latest hotfixes)
- Windows Server 2008 Service Pack 2
- All servers have Gigabit NICs connected to Gigabit switches
- Exchange 2007 Service Pack 2 + Rollup 1
- The Exchange Information Store is about 10GB in size
An Exchange non-GRT backup has an average throughput of 1817MB/min (and takes only 6 minutes to complete).
I find the performance of the GRT backup (which is 22 times slower) unacceptable.
While investigating the cause, I noticed the following behaviour of Backup Exec:
- A snapshot is created using the Microsoft Software Shadow Copy provider 1.0 (Version 1.0.0.7).
- The data in the snapshot is accessed on the Exchange server by the BEREMOTE.EXE process.
- The BEREMOTE.EXE process on the Exchange server sends the data over a TCP connection to the BEREMOTE.EXE process on the Backup server.
  The data is sent in chunks of 64KB, each of which takes about 1.2 milliseconds to send.
  The client then has to wait about 40 milliseconds for a response from the server before it can send the next 64KB chunk.
  Since Ethernet has an MTU of 1500 bytes, each chunk has to be sent as several smaller packets (44 x 1460 data bytes).
  The 44 packets are all sent within 1.2 milliseconds, and then it takes 40 milliseconds (and sometimes even more) before the next chunk can be sent.
- The Backup Server writes the data to files in an IMGxxxxxxx folder, which is a subfolder of the Backup-to-disk location.
  The files in the IMGxxxxxxx folder are a 1:1 replica of the snapshot data.
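As a rough sanity check, the timing figures above can be turned into an average throughput. This is only a sketch: the helper name is hypothetical, the 1.2 ms and 40 ms values are the approximate ones from the trace, and 1 MB is assumed to be 1,000,000 bytes (Backup Exec may count differently).

```python
# Approximate figures from the network trace described in this post.
SEGMENT_BYTES = 1460        # TCP payload per Ethernet frame
SEGMENTS_PER_BURST = 44     # packets sent back to back
BURST_SECONDS = 0.0012      # time to send one burst
STALL_SECONDS = 0.040       # wait before the next burst may start


def effective_mb_per_min(burst_bytes, burst_seconds, stall_seconds):
    """Average throughput in MB/min for one send-then-wait cycle (hypothetical helper)."""
    bytes_per_second = burst_bytes / (burst_seconds + stall_seconds)
    return bytes_per_second * 60 / 1_000_000


burst_bytes = SEGMENT_BYTES * SEGMENTS_PER_BURST   # 64240 bytes per burst
grt_rate = effective_mb_per_min(burst_bytes, BURST_SECONDS, STALL_SECONDS)
slowdown = 1817 / grt_rate                         # versus the non-GRT average

print(f"GRT estimate: {grt_rate:.1f} MB/min, about {slowdown:.1f}x slower than non-GRT")
```

Under these assumptions the estimate comes out at a little over 90 MB/min, roughly 20 times slower than the non-GRT rate, which is in line with the measured factor of 22 given that the wait is sometimes longer than 40 milliseconds.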
The slowness is due to a 'bug' in BEREMOTE.EXE.
I verified this by accessing the above snapshot using the MKLINK.EXE command (see http://blogs.msdn.com/adioltean/archive/2008/02/28/a-simple-way-to-access-shadow-copies-in-vista.aspx for details).
Copying the same data over the network to the same hard disk location using the Windows built-in COPY (or XCOPY) command proceeded at a rate of about 250 Mbps over the SMB protocol.
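To compare the SMB copy rate with the MB/min figures that Backup Exec reports, the units need converting. A minimal sketch, assuming Mbps means 1,000,000 bits per second and 1 MB is 1,000,000 bytes (the function name is hypothetical):

```python
def mbps_to_mb_per_min(megabits_per_second):
    """Convert a network rate in Mbit/s to MB/min (8 bits per byte, 60 s per minute)."""
    return megabits_per_second / 8 * 60


copy_rate = mbps_to_mb_per_min(250)
print(copy_rate)  # 1875.0
```

So a plain file copy of the snapshot runs at about the same rate as the non-GRT backup (1817 MB/min), which suggests that neither the network nor the disks are the limiting factor.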
For the NON-GRT backup the 'TCP Receive Window' always remains 64240 bytes and therefore the client keeps sending data.
For the GRT-enabled backup:
- The 'TCP Receive Window' decreases with every TCP ACK the server sends, until it is too small and the client has to stop sending.
- After the described delay of +/- 40 milliseconds, the server sends a 'TCP Window Update' to the client (resetting the TCP Receive Window to 64240 bytes), after which the client immediately starts sending its data to the server again.
This behaviour has several disadvantages:
- The network is kept busy for an extended period, so the risk of job failure due to network interruptions increases accordingly.
- The remote agent on the client is kept busy for an extended period.
- The snapshot is kept online for an extended period, which puts additional load on a heavily used server and consumes additional disk space.
Why can't the client keep sending its data? In other words, why is the window size decreased for the GRT backup, while it is not decreased for the non-GRT backup?
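The window behaviour described for the GRT backup can be modelled roughly as follows. This is a simplified sketch, not how BEREMOTE.EXE actually works: it assumes the receiver does not drain its buffer at all during a burst, that every stall lasts exactly 40 milliseconds, and that the store is 10 GB counted in binary units. Both function names are hypothetical.

```python
import math

WINDOW = 64240    # initial TCP receive window seen in the trace
SEGMENT = 1460    # TCP payload bytes per packet


def advertised_windows(window=WINDOW, segment=SEGMENT):
    """Window advertised after each segment if the receiver never drains its buffer."""
    values = []
    while window >= segment:
        window -= segment
        values.append(window)
    return values


def estimated_minutes(total_bytes, burst_seconds=0.0012, stall_seconds=0.040,
                      window=WINDOW):
    """Transfer time when every full window is followed by a fixed stall."""
    bursts = math.ceil(total_bytes / window)
    stalls = bursts - 1   # no wait is needed after the final burst
    return (bursts * burst_seconds + stalls * stall_seconds) / 60


trace = advertised_windows()
print(len(trace), trace[0], trace[-1])   # 44 62780 0

minutes = estimated_minutes(10 * 1024**3)
print(f"about {minutes:.0f} minutes for a 10 GB store")
```

With these assumptions the window reaches zero after exactly 44 segments, and the stalls alone add up to nearly two hours for a 10 GB store, compared with 6 minutes for the non-GRT backup.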
It seems to me that Backup Exec only processes the data once everything has been transferred, because it creates an IMGxxxxxx\vdb_date_time subfolder with some additional data in it only after the transfer has finished. However, I'm not sure about this.
If Backup Exec is not processing the contents on the fly: why is the transfer taking so long?
If Backup Exec does process the data on the fly: wouldn't it be better to process the contents AFTER all data has been transferred? This design would solve the above disadvantages, and it would make more sense if the job details showed: 6 minutes required to transfer + 114 minutes required to extract individual messages.
In case a technician is interested and the behaviour cannot be reproduced: I have network traces (Wireshark) of the above behaviour.
Kind regards,
br_nar