
Backup Exec 2010 R2 VMware Agent Performance?

Jellyman_4eva
Level 3
We have a brand new Backup Exec 2010 R2 server (DL360 G7, 2 x 2.6 GHz six-core CPUs, 12GB RAM) which we are looking to use for all backups; however, we seem to have a performance issue when using the VMware agent. We are pointing Backup Exec at the vCenter server on a brand new VMware implementation (all 4.1 hosts, well specced). The SAN in use is a LeftHand iSCSI SAN.

If I create a job to back up a single VM (2008 R2 64-bit, with the RAWS agent installed after VMware Tools) which has a 20GB VMDK with approx 9GB in use, I get the following results:


NBD transport mode: job throughput of 1197 MB/min.
SAN transport mode: 1567 MB/min.

I can understand the NBD job being slow, because the service console in ESX 4.0 and 4.1 is apparently capped or throttled, but I do not understand why SAN transport mode is also slow.

I have tried backing up to both a disk folder (on the iSCSI SAN) and a tape, and the results are the same. For reference, the tape drive is an LTO-4 1760. It just seems that for some reason Backup Exec is not driving the iSCSI network hard enough; I see utilization of around 35% tops on the iSCSI NIC. If I run iometer on the backup server I can drive it to 110 MB/s and max out the connection, so the link itself seems to be OK.

As a test I have backed up data from a file server VM using the RAWS agent straight to tape: just a simple job selecting an entire volume, not using AOFO (thus treating it like a physical server). Usage on the production NIC bounces around (probably due to different file sizes) but sits predominantly around 60-70%, and the throughput after 54GB of 250GB is 3.5GB a minute?! Is there something I am missing...
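
For anyone wanting to sanity-check these figures, here is a rough conversion of the quoted job rates into MB/s and into utilization of the ~110 MB/s ceiling iometer gave me on the iSCSI link (just a quick Python sketch; the rates are the ones from this post):

    # Convert Backup Exec job rates (MB/min) to MB/s and to a percentage of
    # the ~110 MB/s a single GbE iSCSI link delivered in iometer.
    LINK_CEILING_MB_S = 110.0  # measured with iometer on the media server

    jobs = {
        "AVVI, NBD transport": 1197,
        "AVVI, SAN transport": 1567,
        "RAWS agent to tape (mid-job)": 3500,
    }

    for name, mb_per_min in jobs.items():
        mb_per_s = mb_per_min / 60.0
        pct = 100.0 * mb_per_s / LINK_CEILING_MB_S
        print(f"{name:30s} {mb_per_s:6.1f} MB/s  (~{pct:.0f}% of the link)")

Even the fastest of these is nowhere near the raw bandwidth the link can deliver, which is why it looks to me like the bottleneck is the agent rather than the network.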

RahulG
Level 6
Employee
If you perform the backup using the RAWS agent, the backup goes over your local network and not through the SAN.

Jellyman_4eva
Level 3
Yes I agree with this...

The issue I am having is why Backup Exec is so much quicker going through the RAWS agent than through the VMware agent...

That's the key question I am trying to answer. I waited for the backup to tape to finish and the final throughput was 3954 MB/min.

I then performed the same backup to an iSCSI B2D folder and the throughput was slower: 3000 MB/min...

But this backup from the RAWS agent is still considerably faster than the VMware agent in either NBD or SAN transport mode. The iSCSI NIC utilization bounces around 45-50% (Especially during verify)... so again it seems the SAN can handle more...

There seems to be an issue with the VMware agent and speed... and it is this I am trying to fix.

ZeRoC00L
Level 6
Partner Accredited
It depends on your SAN. I have a customer with a brand new HP P2000 G3 SAN connected via 8Gb Fibre Channel, and VMware agent backups run at almost 10,000 MB/min!

teiva-boy
Level 6
Are you using MPIO?  If not, you should be!  And there is a whole lot of tuning that needs to be done for iSCSI to perform well.  One of these days I need to do a write-up on that; there is so much badly configured iSCSI out there it's unnerving.

Jellyman_4eva
Level 3
Hi, we are using MPIO but it is LeftHand MPIO (The only one supported by LeftHand) and basically it is only active/passive...

This is going to sound bad, but to be honest, based on my backup tests above it is the VMware agent that is slowing the backups down, as a RAWS job is so much quicker...

To be honest I am probably going to ditch Backup Exec, because there seems to be very little help and nearly everything I have attempted has been met with frustration, followed by googling, followed by a link to some obscure knowledge base article describing something that should really just be in the manual...

Jim_S
Level 4

I have similar results with the SAN transport method from a DL380 G7 with 12GB of RAM connected via GigE to our NetApp SAN.  MPIO doesn't really even enter into it, as the AVVI SAN transport backup consumes at most 30% of the GigE link.

chrisal
Level 2

I have similar results with 2010 R2 and AVVI in SAN transport mode. I have tried the same test from both our HP EVA4000 4Gb FC SAN and our Dell EqualLogic PS6000 1Gb iSCSI SAN.

Speed with the AVVI agent in SAN transport mode via the HP is about 1.5 GB/min. Backing up data from an NTFS LUN mounted on the media server runs at about 6 GB/min. The AVVI agent does not seem to drive the disk subsystem very hard, and it's certainly not significantly faster than backing up the same server over the network.

I have the same results from the Dell iSCSI unit: the AVVI agent is roughly 4 to 5 times slower than the SAN can go when backing up an NTFS volume/LUN mounted on the media server.

I have a case open with Symantec and will let you know how it goes....

We are using vCenter 4.1 and ESXi4.1

Does anyone else have the same problems with AVVI in SAN transport mode? 

James_Winslow
Level 5

Hi Chris,

Does your BE management server connect to the SAN directly or through a switch? I'd like to know whether the management server also has at least one NIC on the same IP subnet as the SAN network.

Jim_S
Level 4

My Backup Exec media server connects directly to the iSCSI SAN switch with a single 1Gb link.  However, as others have already stated in this thread, it does not appear that BE pushes the I/O at all.

teiva-boy
Level 6

I can't believe folks here would blame the backup vendor for not pushing a link.  It's pretty absurd, come to think of it.  It's all in the storage setup and configuration.  You think your fancy SAN is a turnkey Ferrari of storage?  Think again.

iSCSI is an inefficient protocol.  It needs MPIO (round-robin or weighted queue) to achieve any real throughput; active/passive will not cut it.

Jumbo frames HAVE to be enabled to get more data per packet sent (see the quick end-to-end check at the bottom of this post).

Networking hardware is extremely important to the mix.  A number of switches have shared backplanes with insufficient buffers to handle storage traffic properly.

Most switches are misconfigured out of the box: no dedicated VLANs, STP enabled, and storm control enabled.

VMware also needs to be tweaked from its defaults to leverage MPIO; there are a number of timeout values that get reduced to about 1/30th of the default in order to "force" it to use the redundant links for added throughput.

 

As you can see, out of the box will not cut it.  Buying an expensive SAN by itself will do nothing.  You need someone with the expertise and knowledge to identify where the bottlenecks are, what needs to be changed, and how to test your changes.
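
On the jumbo frame point above, a quick way to confirm that 9000-byte frames actually pass end-to-end from a Windows media server to the storage is a do-not-fragment ping with an 8972-byte payload. A rough Python sketch (the target address is just a placeholder for one of your own SAN-facing interfaces):

    import subprocess

    # Placeholder: put the SAN-facing IP of one of your storage interfaces here.
    TARGET = "192.168.100.10"

    # A 9000-byte MTU minus 20 bytes of IP header and 8 bytes of ICMP header
    # leaves an 8972-byte payload.  Windows ping: -f sets Don't Fragment,
    # -l sets the payload size, -n sets the number of echoes.
    result = subprocess.run(
        ["ping", "-f", "-l", "8972", "-n", "4", TARGET],
        capture_output=True, text=True,
    )
    print(result.stdout)

    if "needs to be fragmented" in result.stdout:
        print("Jumbo frames are NOT making it end-to-end on this path.")
    elif "Reply from" in result.stdout:
        print("8972-byte DF pings succeeded; jumbo frames look OK on this path.")
    else:
        print("No reply at all; check basic connectivity first.")

If that ping fails anywhere along the path (media server NIC, switch port, storage port), jumbo frames are not actually in play no matter what the individual devices claim to be set to.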

Jim_S
Level 4

Well, considering that I do have dedicated SAN switches that are properly configured and tuned, and that they give great performance for everything BUT Backup Exec 2010, I'm inclined to believe that it is Backup Exec's fault, not my switch or my SAN or anything else.

Prior to using Backup Exec 2010 R2 I used Backup Exec 11d and VCB with the exact same SAN and SAN switches.  VCB could saturate my iSCSI SAN and got 85% utilization of a 1Gb link.  SAN transport mode in Backup Exec 2010 R2 consumes at most 30% of a single 1Gb link.  And VCB was installed on a complete pile-of-junk server compared to the 2010 media server I have now...  There doesn't really seem to be any reason to use MPIO on the Backup Exec media server itself, considering that it can't even come close to saturating half of a 1Gb link.

teiva-boy
Level 6

The operation of VCB and vStorage is entirely different: different file movement and file access patterns, not to mention more overhead in the vStorage API due to added functionality such as changed block tracking, snapshot improvements, thin provisioning features, etc.  VCB is what I equate to a brute-force method; vStorage is more elegant, but also more cautious (i.e. slower).

 

EDIT:  You should try VCB on BE2010 and see if the performance returns.  

chrisal
Level 2

Teiva-boy, we have exactly the same results from both our iSCSI SAN and our FC SAN. Both environments are configured in accordance with vendor recommendations, and both show a 3-4x performance deficit when using the AVVI agent in SAN transport mode. I'm not pointing the finger at Backup Exec specifically, but since VMware moved away from VCB to their Data Recovery (VDR) product we don't really have anything to compare it to.

Judging by the posts here, it seems others are having similar performance problems. I am using Backup Exec 2010 R2 with vCenter 4.1 and ESXi 4.1, with all the latest patches/updates applied.

We can easily max out both our iSCSI and FC disk subsystems just by doing a straight data copy of an NTFS volume on the same SAN(s). This is why I and the other posters are surprised that the AVVI agent in SAN transport mode doesn't seem to max out the disk subsystem and gives 3-4 times less throughput during a SAN transport backup.

Maybe the new vStorage APIs are just not that quick, but it surprises me that vStorage could be the bottleneck, since once the initial snapshot is complete the media server is talking directly to the VMFS volume on the SAN. For testing purposes I have eliminated GRT by disabling it, and I've even tried backing up a powered-down VM. There is no significant difference in speed with any of the tests I performed, RAWS agent installed or not.

I am following this up with Symantec support, so I'm not expecting an answer from the forums, just seeking others' opinions/results, which is one of the benefits of having these discussion areas.

Thanks

Chris 

chrisal
Level 2

Hi,

We have a BE 2010 R2 central admin server which is a VM, but we run the AVVI backups from a separate physical media server (so we can use SAN transport mode). The media server has a direct connection to both the iSCSI and the FC SAN switches. The iSCSI SAN has its own dedicated 1Gb/s switch with jumbo frames, flow control, etc. enabled. The media server has two NICs/HBAs to each SAN with MPIO enabled.

Thanks

Chris

yossi_katz
Level 3
Partner

Guys, any news about this issue?

teiva-boy
Level 6

On a LeftHand iSCSI SAN I got around 1200-1500 MB/min.  On an HP EVA 4200 (or perhaps 4400) I was getting over 4000 MB/min on FC.  MPIO was not working on the LeftHand unit, and the FC was 8Gb, I believe.

I was also only doing single-VM jobs, and thus not able to really leverage MPIO to its potential (you need to schedule multiple concurrent jobs in BE to make MPIO work, because each job is a single stream).  But the speed difference between LeftHand and EVA was quite staggering.

This was un-tuned, un-tweaked right out of the box for one of the larger HP resellers in Northern California and their lab integration datacenter.  

Customers of mine with EqualLogic that have more than two members in a group with fast disk are able to get greater than 2000 MB/min, up to 4000 MB/min in some cases, but that customer has some six members in a group.

There is an interesting setting in vCenter for iSCSI that sets a timeout for when to use multiple paths.  I've found that if it is overlooked, one link gets used much more than the other.  It's something no one bothers to check, and it would be good for any VMware administrator to look at the network throughput of their iSCSI paths and compare traffic across the links.  Is it evenly distributed, or is one link handling the majority of the traffic?  Hence the slow iSCSI speeds?
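
On the media-server side, a rough way to see whether the iSCSI NICs are actually sharing the load during a backup is to sample the per-NIC counters while a job runs. A sketch, assuming the third-party psutil package is installed; on the ESX host itself you would look at esxtop or the vCenter performance charts instead:

    import time
    import psutil  # third-party package: pip install psutil

    SAMPLE_SECONDS = 10

    # Snapshot the per-NIC byte counters, wait, then snapshot again and diff.
    before = psutil.net_io_counters(pernic=True)
    time.sleep(SAMPLE_SECONDS)
    after = psutil.net_io_counters(pernic=True)

    print(f"Per-NIC receive rate over {SAMPLE_SECONDS}s:")
    for nic, stats in after.items():
        rx_bytes = stats.bytes_recv - before[nic].bytes_recv
        mb_per_s = rx_bytes / SAMPLE_SECONDS / (1024 * 1024)
        print(f"  {nic:30s} {mb_per_s:7.2f} MB/s")

Run it while a SAN transport job is going: if one iSCSI interface is carrying nearly all of the traffic, the paths are not balanced.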

BTW, 1500 MB/min is what I get with NBD in many cases.  Are you sure BE is not falling back to NBD because something in the SAN transport config is missing?

Phoenix_Hawk
Level 2

Actually, it's pretty easy to answer.

If you use iSCSI storage with a GBit uplink and you use the VMware method, what actually happens is that the VMware API grants you shared access to the datastore in use, and that's exactly the problem. First, there is a lot of error correction done by the API. Second, the VM itself keeps running, so the guest still needs access to its own resources even while the snapshot functionality is in use.

It's as simple as that. And accusing the iSCSI protocol of being lame and a piece of junk shows me there is a lack of understanding.

iSCSI (especially at 10GBit) will definitely displace Fibre Channel environments sooner or later. Sure, iSCSI has a lot of overhead... BUT you can configure jumbo frames (and you should do this on the backup server too), which reduces the percentage of header data a lot.
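
To put a rough number on that (a back-of-the-envelope sketch that only counts Ethernet framing plus the IPv4 and TCP headers, and ignores the iSCSI PDU header and TCP options):

    # Per-frame overhead for a full-sized TCP segment at standard vs. jumbo MTU.
    # Ethernet: 14 B header + 4 B FCS + 20 B preamble/inter-frame gap on the wire;
    # IPv4 header: 20 B; TCP header: 20 B.
    ETH_OVERHEAD = 14 + 4 + 20
    IP_TCP_HEADERS = 20 + 20

    for mtu in (1500, 9000):
        payload = mtu - IP_TCP_HEADERS
        on_wire = mtu + ETH_OVERHEAD
        overhead_pct = 100.0 * (on_wire - payload) / on_wire
        print(f"MTU {mtu}: {payload} bytes of payload per frame, ~{overhead_pct:.1f}% overhead")

Going from a 1500-byte to a 9000-byte MTU drops the per-frame overhead from roughly 5% to under 1%, and it also cuts the number of frames the backup server has to process by roughly a factor of six.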

But to answer some of the questions:

The speed is slower because the connection is shared. You cannot improve it by simply adding multipath I/O, because only a few vendors support it AND it is only usable with Enterprise Plus. The problem is that each connection is based on the initiator/target principle, so you have one dedicated connection that has to be shared by every instance.

What helps a bit is to run concurrent backup jobs (backup-to-disk is the preferred way); for example, running four jobs simultaneously can use the 1GBit uplink of your backup server at 100%. Even if a single job does not get more than 2GB/min, you can easily run three or four jobs at around 1.5GB/min each concurrently, adding up to roughly 6GB/min.
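
As a quick sanity check on those numbers (a sketch that treats 1 GB/min as 1000 MB/min and assumes roughly 110 MB/s of usable bandwidth on a 1GBit link):

    # Aggregate throughput of N concurrent jobs vs. what one GbE link can carry.
    PER_JOB_MB_MIN = 1500        # ~1.5 GB/min per job, as above
    LINK_CEILING_MB_S = 110.0    # roughly the usable payload rate of 1 GbE

    for jobs in (1, 2, 3, 4):
        total_mb_min = jobs * PER_JOB_MB_MIN
        total_mb_s = total_mb_min / 60.0
        pct = 100.0 * total_mb_s / LINK_CEILING_MB_S
        print(f"{jobs} job(s): {total_mb_min:5d} MB/min = {total_mb_s:5.1f} MB/s "
              f"(~{pct:.0f}% of the link)")

One job leaves most of the link idle; by the time you run four jobs in parallel you are close to what the uplink can physically carry, which is why concurrency matters more here than the speed of any single stream.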

Depending on the data you back up, it might be faster or slower. The API indexes all data (files, databases, mail) during the backup process (that's why you still need RAWS for recovery and GRT), and all of this affects how much data is transferred during the backup.

If you use Fibre Channel at 8GBit or iSCSI at 10GBit you get different results, as already stated here.

Oh, and why is a file-based RAWS backup of a VM faster than NBD/SAN? That should be understandable by now: the access to the data is not shared at the storage level. The VM has 100% of its hypervisor-assigned resources and can handle the data all by itself, so you get almost the full GBit connection.

JohnnyLeung
Level 3

Hi Friends

We also have a similar problem with BE 2010 R3. The backup speed using the AVVI agent (1xxx MB/min) looks much slower than using the normal LAN (4xxx MB/min).

We connected the BE server to the SAN storage through the SAN switch, and the SAN presents a disk to the BE server for B2D backups.

It doesn't make sense: the SAN connection is 8Gb fibre, which the LAN does not have. We assumed a backup through the AVVI agent (going over the SAN) would be much faster than a traditional LAN backup.

Do you have any idea how it can be improved? Many thanks.

Johnny

teiva-boy
Level 6

Johnny, make a new thread please with your particular setup listed in detail.