VM Backups over SAN is slow......or about right??

Toddman214
Level 6

Windows Server 2008 R2 master and media servers

NetBackup 7.5.0.4

 

Hi all,

 

I know tuning has been beaten to death on these forums, and I've read other forum posts, tuning articles, etc, so I guess I'm missing something here, OR is all well, and this is about all I can expect?

When I run VM backups to my DataDomain, I'm getting backup rates of anywhere from 17 to 29 MB/s in an environment of about 1300 VMs. I've ensured that my test policy does NOT allow "try san, then NBD", to rule out the possibility that NBD is taking over. The jobs always show LAN as the transport type, but that's more an error in NetBackup and won't change whether it's running over SAN or NBD. I checked with one of the storage folks here, and the ESX host that my VMware backup host/media server points to is zoned to the DataDomain. My touch files are set as follows:

NUMBER_DATA_BUFFERS_DISK = (does not exist, so is defaulting to 128)

SIZE_DATA_BUFFERS_DISK = 1048576
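
As a rough aid for reasoning about these two touch files, the sketch below estimates the shared memory the data buffers would claim. It assumes the commonly cited rule of thumb that each job uses about NUMBER_DATA_BUFFERS_DISK x SIZE_DATA_BUFFERS_DISK bytes; the function name and the job count of 40 are illustrative choices, not NetBackup settings, and 128 is simply the buffer count this post assumes.

```python
# Rough estimate of shared memory used by NetBackup disk data buffers.
# Assumption (rule of thumb, not taken from this thread):
#   memory per job ~= NUMBER_DATA_BUFFERS_DISK * SIZE_DATA_BUFFERS_DISK

def buffer_memory_bytes(number_of_buffers, buffer_size, concurrent_jobs=1):
    """Return the estimated bytes of buffer memory for the given job count."""
    return number_of_buffers * buffer_size * concurrent_jobs

SIZE_DATA_BUFFERS_DISK = 1048576   # value from the touch file above
ASSUMED_NUMBER_OF_BUFFERS = 128    # the count the post assumes is in effect
ILLUSTRATIVE_JOBS = 40             # hypothetical number of concurrent jobs

per_job = buffer_memory_bytes(ASSUMED_NUMBER_OF_BUFFERS, SIZE_DATA_BUFFERS_DISK)
total = buffer_memory_bytes(
    ASSUMED_NUMBER_OF_BUFFERS, SIZE_DATA_BUFFERS_DISK, ILLUSTRATIVE_JOBS
)

print(per_job // 2**20, "MiB per job")
print(total // 2**20, "MiB for", ILLUSTRATIVE_JOBS, "concurrent jobs")
```

The buffer count actually in effect is best confirmed from the bptm lines in the job details ("using N data buffers") rather than assumed.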

What kind of real-world speeds are you all seeing with VM backups (not referring to Accelerator or anything like that)? I'm mainly curious, but if we look to be significantly slow, I'll certainly accept input as well. I really need to get my backup windows down, and VM backups are one of my biggest offenders.

 

Thank you. If I can provide more info, just let me know.

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions

RamNagalla
Moderator
Partner VIP Certified

Yes, the bottleneck is between the media server and the DD.

The network is not good enough.

Is the IP that you used to connect to the DD on a 10 Gb network or a 1 Gb network?

The iperf test is purely a network test with no involvement of NetBackup, so you need to improve the network speed first before expecting good backup speed.

Work with the network team to improve it.

 

 

And yes, vCenter 5.1 is fully supported with 7.5.0.5.

See the release notes for 7.5.0.5:

https://www-secure.symantec.com/connect/forums/netbackup-7505-netbackup-75-maintenance-release-5-now-available

 


6 REPLIES

Mark_Solutions
Level 6
Partner Accredited Certified

How many VMs do you back up at a time?

One per datastore is supposed to be best.

RamNagalla
Moderator
Partner VIP Certified

I have VMs running at 80 to 100 MBps over the SAN to DD.

How many VMs are you running at a time?

Could you post the detailed status of the backup and snapshot jobs?

What is the vCenter version?

What storage box (vendor/model) provides the LUNs to ESXi?

Is multipathing available between the storage and the media server?

Are the backup host and the media server the same machine?

 

It may be worth isolating the speed issue between the backup host and the DD.

What is the backup speed for the backup host's own data (standard MS-Windows policy type) to the DD?

 

Toddman214
Level 6

Mark,

The storage unit is set to a maximum of 40 concurrent backup jobs. The "limit jobs per policy" attribute on the policy itself is unchecked. This particular policy handles backups of VMs in our 5.1 environment, which currently has 34 datastores.

 

 

Nagalla,

Please see my response above to Mark about the number of VMs backing up at a time. The other answers are below in asterisks.

***********Parent job detail status - *******************

2/17/2014 11:51:14 AM - Info nbjm(pid=17128) starting backup job (jobid=494119) for client PDC00ETST101W, policy Z_test, schedule E-Test 
2/17/2014 11:51:14 AM - Info nbjm(pid=17128) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=494119, request id:{52268BAD-7AB1-4181-8EEE-0C15A97F0220}) 
2/17/2014 11:51:14 AM - requesting resource PDCDD_SU_1
2/17/2014 11:51:14 AM - requesting resource pdc00nbua801w.ohlogistics.com.NBU_CLIENT.MAXJOBS.PDC00ETST101W
2/17/2014 11:51:14 AM - requesting resource pdc00nbua801w.ohlogistics.com.NBU_POLICY.MAXJOBS.Z_test
2/17/2014 11:51:14 AM - granted resource pdc00nbua801w.ohlogistics.com.NBU_CLIENT.MAXJOBS.PDC00ETST101W
2/17/2014 11:51:14 AM - granted resource pdc00nbua801w.ohlogistics.com.NBU_POLICY.MAXJOBS.Z_test
2/17/2014 11:51:14 AM - granted resource MediaID=@aaaae;DiskVolume=PDCDisk2;DiskPool=PDCDD_DP;Path=PDCDisk2;StorageServer=pdc00ddma901;MediaS...
2/17/2014 11:51:14 AM - granted resource PDCDD_SU_1
2/17/2014 11:51:14 AM - estimated 0 Kbytes needed
2/17/2014 11:51:14 AM - begin Parent Job
2/17/2014 11:51:14 AM - begin VMware, Start Notify Script
2/17/2014 11:51:14 AM - Info RUNCMD(pid=17584) started           
2/17/2014 11:51:14 AM - Info RUNCMD(pid=17584) exiting with status: 0        
Status 0
2/17/2014 11:51:14 AM - end VMware, Start Notify Script; elapsed time: 00:00:00
2/17/2014 11:51:14 AM - begin VMware, Step By Condition
Status 0
2/17/2014 11:51:14 AM - end VMware, Step By Condition; elapsed time: 00:00:00
2/17/2014 11:51:14 AM - begin VMware, Read File List
Status 0
2/17/2014 11:51:14 AM - end VMware, Read File List; elapsed time: 00:00:00
2/17/2014 11:51:14 AM - begin VMware, Create Snapshot
2/17/2014 11:51:14 AM - started
2/17/2014 11:51:15 AM - Info bpbrm(pid=4936) PDC00ETST101W is the host to backup data from    
2/17/2014 11:51:15 AM - Info bpbrm(pid=4936) reading file list from client       
2/17/2014 11:51:15 AM - Info bpbrm(pid=4936) start bpfis on client        
2/17/2014 11:51:15 AM - Info bpbrm(pid=4936) Starting create snapshot processing        
2/17/2014 11:51:15 AM - Info bpfis(pid=4400) Backup started          
2/17/2014 11:51:15 AM - started process bpbrm (4936)
2/17/2014 11:51:18 AM - snapshot backup of client PDC00ETST101W using method VMware_v2
2/17/2014 11:52:34 AM - end writing
Status 0
2/17/2014 11:52:34 AM - end VMware, Create Snapshot; elapsed time: 00:01:20
2/17/2014 11:52:34 AM - begin VMware, Policy Execution Manager Preprocessed
Status 0
2/17/2014 12:01:33 PM - end VMware, Policy Execution Manager Preprocessed; elapsed time: 00:08:59
2/17/2014 12:01:33 PM - begin VMware, Validate Image
Status 0
2/17/2014 12:01:33 PM - end VMware, Validate Image; elapsed time: 00:00:00
2/17/2014 12:01:33 PM - begin VMware, Delete Snapshot
2/17/2014 12:01:36 PM - Info bpbrm(pid=3916) Starting delete snapshot processing        
2/17/2014 12:01:36 PM - Info bpfis(pid=0) Snapshot will not be deleted       
2/17/2014 12:01:36 PM - started process bpbrm (3916)
2/17/2014 12:01:40 PM - Info bpfis(pid=4572) Backup started          
2/17/2014 12:02:59 PM - end writing
Status 0
2/17/2014 12:02:59 PM - end VMware, Delete Snapshot; elapsed time: 00:01:26
Status 0
2/17/2014 12:02:59 PM - end Parent Job; elapsed time: 00:11:45
the requested operation was successfully completed(0)

 

 

***************Backup job detail status******************

 

2/17/2014 11:52:34 AM - Info nbjm(pid=17128) starting backup job (jobid=494123) for client PDC00ETST101W, policy Z_test, schedule E-Test 
2/17/2014 11:52:34 AM - estimated 0 Kbytes needed
2/17/2014 11:52:34 AM - Info nbjm(pid=17128) started backup (backupid=PDC00ETST101W_1392659554) job for client PDC00ETST101W, policy Z_test, schedule E-Test on storage unit PDCDD_SU_1
2/17/2014 11:52:36 AM - Info bpbrm(pid=7608) PDC00ETST101W is the host to backup data from    
2/17/2014 11:52:36 AM - Info bpbrm(pid=7608) reading file list from client       
2/17/2014 11:52:36 AM - Info bpbrm(pid=7608) starting bpbkar32 on client        
2/17/2014 11:52:36 AM - started process bpbrm (7608)
2/17/2014 11:52:37 AM - Info bpbkar32(pid=9664) Backup started          
2/17/2014 11:52:37 AM - Info bptm(pid=6964) start           
2/17/2014 11:52:37 AM - Info bptm(pid=6964) using 1048576 data buffer size       
2/17/2014 11:52:37 AM - Info bptm(pid=6964) setting receive network buffer to 1048576 bytes     
2/17/2014 11:52:37 AM - Info bptm(pid=6964) using 64 data buffers        
2/17/2014 11:52:37 AM - connecting
2/17/2014 11:52:37 AM - connected; connect time: 00:00:00
2/17/2014 11:52:41 AM - Info bptm(pid=6964) start backup          
2/17/2014 11:52:44 AM - begin writing
2/17/2014 12:01:16 PM - Info bptm(pid=6964) waited for full buffer 856 times, delayed 2908 times   
2/17/2014 12:01:16 PM - Info bpbkar32(pid=9664) bpbkar waited 2105 times for empty buffer, delayed 5346 times.  
2/17/2014 12:01:29 PM - Info bptm(pid=6964) EXITING with status 0 <----------       
2/17/2014 12:01:29 PM - Info bpbrm(pid=7608) validating image for client PDC00ETST101W       
2/17/2014 12:01:33 PM - end writing; write time: 00:08:49
the requested operation was successfully completed(0)

 

*****VCenter version is 5.1.0******

 

*****This particular media server is also the backup host.****

 

*****PowerPath is installed on the media server, but it has come to light that even though the policy is set to SAN, and the jobs complete, it may actually not be backing up over SAN at all, so multipathing might be a moot point. ;-( ****

*****Backup speed for ms-windows types to DD range from about 210KB/s up to 57MB/s*****
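
The two buffer counters in the backup job details above can be compared with a short script. This is a heuristic sketch only: it assumes the usual reading that bptm waiting for a full buffer points at the data source being slow, while bpbkar waiting for an empty buffer points at the write side (media server to storage) being slow. The function names are made up for illustration, and the sample text is copied from the job details.

```python
import re

# Patterns for the two counter lines as they appear in the job details.
BPTM_RE = re.compile(r"waited for full buffer (\d+) times, delayed (\d+) times")
BPBKAR_RE = re.compile(r"waited (\d+) times for empty buffer, delayed (\d+) times")

def parse_buffer_waits(log_text):
    """Return {'full_buffer': (waits, delays), 'empty_buffer': (waits, delays)}."""
    result = {}
    match = BPTM_RE.search(log_text)
    if match:
        result["full_buffer"] = (int(match.group(1)), int(match.group(2)))
    match = BPBKAR_RE.search(log_text)
    if match:
        result["empty_buffer"] = (int(match.group(1)), int(match.group(2)))
    return result

def likely_slow_side(waits):
    """Heuristic: the side that reports more delays is starved by the other side."""
    full_delays = waits.get("full_buffer", (0, 0))[1]
    empty_delays = waits.get("empty_buffer", (0, 0))[1]
    if empty_delays > full_delays:
        return "write side (media server to storage)"
    if full_delays > empty_delays:
        return "read side (data source to media server)"
    return "inconclusive"

SAMPLE = """
Info bptm(pid=6964) waited for full buffer 856 times, delayed 2908 times
Info bpbkar32(pid=9664) bpbkar waited 2105 times for empty buffer, delayed 5346 times.
"""

waits = parse_buffer_waits(SAMPLE)
print(waits)
print("Likely slower:", likely_slow_side(waits))
```

Both counters are non-zero in this job, so the comparison is a hint about where to look first rather than a diagnosis.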

 

 

RamNagalla
Moderator
Partner VIP Certified

1) vCenter 5.1 is fully supported in NetBackup 7.5.0.5, so please upgrade the backup host to 7.5.0.4 first, and update the master server too if possible.

2) I wonder why the detailed status is not showing the transport type, something like the line below:

Info bpbkar32 (pid=XXXXX) INF - Transport Type = san 

We may need to check the bpbkar log to confirm which transport type it is using.

 

3) How are you selecting the VMs in the policy: with the Query Builder or by manual selection?

If you are using the Query Builder:

Set the number of jobs per datastore to 1. As you have 30 datastores, you can still have 30 jobs running at a time with this setting, if other resources permit.

  1. In the NetBackup Administration Console, click Host Properties > Master Servers and double-click the NetBackup master server.

  2. In the Properties screen, scroll down in the left pane and click Resource Limit.

  3. Click in the Resource Limit column to set the maximum NetBackup usage for the resource type. The settings apply to all policies.

If the policy uses manual backup selection, set "Limit jobs per policy" to 2 and test the jobs.

4) Perform an iperf test between the backup host and the Data Domain to check the network speed.

You can get the steps to perform the iperf test from the DD website (I have also just attached them).

Toddman214
Level 6

Hi Nagalla,

1) Please correct me if I am wrong, but didn't support for vm 5.1 start with 7.5.0.4? Is that correct? I am running 7.5.0.4 on Master and Media servers now.

2) I'm not sure why the job details don't mention the transport type. I ran the test job again, checked the bpbkar file, and I see no mention of the transport type in there either. Is there something else specific I should be looking for inside the bpbkar log?

3) For my vm policies, I use the query method. BUT, for my test policy, I only have one test client, so I just added it in manually. I did go in and change the resource limit for datastore to 1, applied, and ran a manual backup of a policy that uses VM query. The results were about the same, at about 25MB/s.

I installed iperf, having never used it before, and I believe the results here are telling. Take a look.

C:\>iperf -c 10.20x.xxx.xxx -t 60 -i 10
------------------------------------------------------------
Client connecting to 10.20x.xxx.xxx, TCP port 5001
TCP window size: 8.00 KByte (default)
------------------------------------------------------------
[172] local 10.20x.xxx.xxx port 62xxx connected with 10.20x.xxx.xxx port 5xxx
[ ID] Interval       Transfer     Bandwidth
[172]  0.0-10.0 sec   736 KBytes   603 Kbits/sec
[172] 10.0-20.0 sec   584 KBytes   478 Kbits/sec
[172] 20.0-30.0 sec   536 KBytes   439 Kbits/sec
[172] 30.0-40.0 sec   528 KBytes   433 Kbits/sec
[172] 40.0-50.0 sec   536 KBytes   439 Kbits/sec
[172] 50.0-60.0 sec   560 KBytes   459 Kbits/sec
[172]  0.0-60.1 sec  3.41 MBytes   475 Kbits/sec

This looks horribly slow to me.
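
To put the iperf figures in the same units as the backup rates quoted in this thread, the sketch below parses the output above and converts the summary line to MB/s. It assumes iperf's bandwidth prefixes are decimal (K = 1000), which matches the transfer and bandwidth columns shown; the regular expression and function names are illustrative.

```python
import re

# Matches iperf interval lines such as:
# [172]  0.0-10.0 sec   736 KBytes   603 Kbits/sec
LINE_RE = re.compile(
    r"\[\s*\d+\]\s+([\d.]+)-\s*([\d.]+) sec\s+[\d.]+ [KMG]?Bytes\s+([\d.]+) ([KMG]?)bits/sec"
)
SCALE = {"": 1.0, "K": 1e3, "M": 1e6, "G": 1e9}  # assumed decimal prefixes

def parse_iperf(text):
    """Return a list of (start_sec, end_sec, bits_per_sec) tuples."""
    rows = []
    for match in LINE_RE.finditer(text):
        start, end, value, unit = match.groups()
        rows.append((float(start), float(end), float(value) * SCALE[unit]))
    return rows

def to_megabytes_per_sec(bits_per_sec):
    """Convert bits per second to decimal megabytes per second."""
    return bits_per_sec / 8 / 1e6

SAMPLE = """
[ ID] Interval       Transfer     Bandwidth
[172]  0.0-10.0 sec   736 KBytes   603 Kbits/sec
[172] 10.0-20.0 sec   584 KBytes   478 Kbits/sec
[172] 20.0-30.0 sec   536 KBytes   439 Kbits/sec
[172] 30.0-40.0 sec   528 KBytes   433 Kbits/sec
[172] 40.0-50.0 sec   536 KBytes   439 Kbits/sec
[172] 50.0-60.0 sec   560 KBytes   459 Kbits/sec
[172]  0.0-60.1 sec  3.41 MBytes   475 Kbits/sec
"""

rows = parse_iperf(SAMPLE)
summary = rows[-1]  # the last line covers the whole 60-second run
print(f"{to_megabytes_per_sec(summary[2]):.3f} MB/s over {summary[1]:.1f} s")
```

The summary works out to roughly 0.06 MB/s, far below the 17 to 29 MB/s backup rates reported at the top of the thread, so the test may not reflect the same path or settings as the backups; the 8 KByte default TCP window shown in the output is one possible factor worth checking.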

 

 
