BE 16 SAN transport poor performance

sysyl
Level 3

Hi,

I have had poor performance for a long time with BE 16 on Windows Server 2012 R2. I back up my VMs via SAN transport. The VMs are on an EMC Unity (latest firmware) and the B2D disk is an HP P2000 (latest firmware, RAID 50 with 22 disks). The backup is extremely long: the full backup (weekend) completes after 25 hours for 5.84 TB. The catalog job that runs after the backup takes "only" 2.5 hours. After all that, the duplicate to LTO7 is also very long; its duration varies, but it can also take 25 hours. The total backup window is too large and the incremental backups are blocked during this time. The Unity, P2000 and LTO are all attached via 8 Gb FC (QLogic cards). The BE version is BE 16 FP1. The problems fixed in FP2 are not relevant to my usage.

I updated everything on the backup server: BIOS, HBA firmware, drivers. I also tried changing the block size and the chunk size of the P2000, and turning off the antivirus. When I monitor a job, the task on vCenter completes quickly and then I can see the data transfer begin on the B2D. I monitor the throughput with Windows Performance Monitor: BE seems to wait for long periods, and when it is working, the throughput is well below the maximum I can reach with a simple file transfer. Only 64 MB/s max... but with activity at 99%...
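As a quick sanity check on the figures above, here is a minimal sketch that computes the average throughput implied by a 5.84 TB full backup finishing in 25 hours. It assumes decimal units (1 TB = 10^12 bytes); BE may report binary units, which would shift the result by a few percent. The function name is only illustrative.

```python
# Average throughput implied by the full backup figures quoted in the post.
TB = 10**12  # assumption: decimal terabytes; binary units would give ~5% more
MB = 10**6

def average_throughput_mb_s(size_bytes: float, hours: float) -> float:
    """Return the mean transfer rate in MB/s for a job of the given size and duration."""
    return size_bytes / (hours * 3600) / MB

full_backup = average_throughput_mb_s(5.84 * TB, 25)
print(f"{full_backup:.1f} MB/s")  # prints "64.9 MB/s"
```

The average over the whole job works out to roughly the same value as the 64 MB/s peak seen in Performance Monitor, which suggests the job sits at that ceiling most of the time rather than stalling for long stretches.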

OK, the Unity has to serve all the VMs during the backup, so its resources are not only for BE, but the problem is the same for the duplicate job, where only the P2000 is involved.

Has anyone had the same experience?

5 REPLIES

VirgilDobos
Moderator
Partner    VIP    Accredited Certified

Hi mate,

I suspect the slowness may be related to the HP P2000 storage or the deduplication pool configuration.

Have you tried running the backup using the LAN transport? Are you getting better performance?

--Virgil

I just launched a test via NBD.

For my test via SAN, I also tried the PingUS software, which changes some registry keys, but nothing improved. To be precise, there is one FC port for each device (P2000, Unity and LTO). The P2000 and LTO are directly attached and the Unity is attached via FC switches. Everything is up to date, there are no problems on the FC switches, and there are no performance problems for the ESX hosts (also attached to the FC switches). The CPU never goes above 5% and the memory (32 GB) is used at 10% at most.

I monitored the backup of a small VM (27 GB) via NBD again and I have a question. Why does the throughput seem to change in steps of 32 MB? At one moment it is 32 MB, and then it peaks at 64 MB. It looks like there is a throughput negotiation that doesn't work well.

I don't use deduplication; I don't have the license for that option. Just the Enterprise option for CASO.

The P2000 can write a 64 GB file at more than 470 MB/s when I test with nbperfchk, so its performance is good enough. And it's just a backup array. Yes, it's an entry-level array, but 64 MB/s!!! Or maybe BE uses very small I/Os that flood the array, but that's not a good choice if this is the case...

The NBD backup just finished and it took more time, but I only have a 1 Gb/s LAN on the backup server. See also the throughput during the verify job.

See the capture in this post and in my first post. I also posted a capture of the job results: the first two jobs via SAN and the last one via NBD.
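To put the rates mentioned above in perspective, this rough sketch estimates how long the 5.84 TB full backup would take at each of them: the 64 MB/s seen with BE, the raw line rate of a 1 Gb/s LAN, and the 470 MB/s measured with nbperfchk. It assumes decimal units and a perfectly sustained rate; real NBD throughput over 1 Gb/s Ethernet will be lower than the raw 125 MB/s because of protocol overhead. The helper name and labels are hypothetical.

```python
# Hours needed to move the 5.84 TB full backup at various sustained rates.
TB = 10**12  # assumption: decimal units
MB = 10**6

def hours_to_copy(size_bytes: float, rate_mb_s: float) -> float:
    """Return the time in hours to transfer size_bytes at a constant rate in MB/s."""
    return size_bytes / (rate_mb_s * MB) / 3600

rates_mb_s = {
    "observed with BE": 64,
    "1 Gb/s LAN, raw line rate": 1e9 / 8 / MB,  # 125 MB/s before protocol overhead
    "nbperfchk write to P2000": 470,
}

estimates = {label: hours_to_copy(5.84 * TB, rate) for label, rate in rates_mb_s.items()}
for label, hours in estimates.items():
    print(f"{label}: {hours:.1f} h")
```

Under these assumptions the estimates come out at about 25.3 h, 13.0 h and 3.5 h respectively, so the B2D target alone should allow a window several times shorter than the one observed.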

VirgilDobos
Moderator
Partner    VIP    Accredited Certified

Hi mate,

I reviewed the details you shared and would suggest logging a case and having Support investigate this.

--Virgil

I opened a case and am waiting for a response.

I also just tested the same backup via SAN to another B2D (an old VNX 5300, but more powerful than the P2000; I turned it on only for this test, so there was no other activity on it). The result is the same.

See the capture.

After long research and many tests, the bottleneck turned out to be the EMC Unity array that hosts the VMs. In the end, I tested backing up a VM that had been moved to another array (an old EMC VNX) and the throughput was 5 times better. The P2000, our B2D, wasn't the problem.

I will try to buy some additional disks to improve the throughput on our Unity...

As for the LTO problem, changing the block size and cache settings cut the duplication time in half.