cancel
Showing results for 
Search instead for 
Did you mean: 

Slow backup performance

spenukon
Level 4

Hi

 

We have extremely slow performance in our new netbackup environment. It cannot do more than 8mb/sec. We don't have many clients running backups on this except few NDMP shares and nbu servers backing up itself.

Here is the info about our environment

Netbackup 7.5

OS : linux 6.2

Data domain

1 master and 2 media servers.

NDMP shares are direct attached to tape drives.

Filer: Sun7410.

We have mounted nfs filesystem from data domain on media server and tried copying 2 GB file and it works fine.

Physically mounted a tape using robtest and tried dd command to send the data to tape and that works great as well.

Tried copying tar files using scp between nbu servers, that is good also.

The network team said everything is good from switch point of view.

I am clueless whereelse to look. The server network settings seems to be good also.

Both san (ndmp) and networks backups gives the same performance which is even more confusing. Any help would be appreciated.

 

Thanks

Sai

1 ACCEPTED SOLUTION

Accepted Solutions

Nicolai
Moderator
Moderator
Partner    VIP   

0/12/2012 04:51:11 - Info bptm (pid=7528) using 1024 data buffer size

You need to check this value  - it shoud be 262144. If the backup really use 1K block size it will be very very slow. Check SIZE_DATA_BUFFERS_NDMP on the media server that control the NDMP backup.

more info on SIZE_DATA_BUFFERS_NDMP http://www.symantec.com/docs/TECH147938

View solution in original post

14 REPLIES 14

Nicolai
Moderator
Moderator
Partner    VIP   

Did you configure NUMBER_DATA_BUFFERS and SIZE_DATA_BUFFERS ?

http://www.mass.dk/netbackup/guides/49-netbackup-buffer-tuning.html

spenukon
Level 4

Yes I do have that settings.

 

NUMBER_DATA_BUFFERS - 16

SIZE_DATA_BUFFERS - 262144

mph999
Level 6
Employee Accredited

Can you post up the details of a slow job that completes successfully from activity monitor (details tab)

Martin

Marianne
Moderator
Moderator
Partner    VIP    Accredited Certified

Please also post bptm log on media server that contains start and and of slow job.
(Post log file as file attachment.)

spenukon
Level 4

Hi Mariann

 

Please check the attachment.

 

Thanks

Sai

spenukon
Level 4

Here is the successfull job completion of NDMP backup. The throughput was only 143kb/sec.

 

 

10/12/2012 04:50:39 - Info nbjm (pid=21227) starting backup job (jobid=476) for client mtprdsan1002.pr.mdwr.s.nokia.com, policy sun-ndmp-lbsp_splunk-monthly-prod, schedule daily-differential
10/12/2012 04:50:39 - Info nbjm (pid=21227) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=476, request id:{FCB8BD98-141F-11E2-89EF-96D0EC7673EE})
10/12/2012 04:50:39 - requesting resource mtprdsan6010-hcart-robot-tld-0-mtprdsan1002
10/12/2012 04:50:39 - requesting resource mtprdsan6009.atl.stor.s.nokia.com.NBU_CLIENT.MAXJOBS.mtprdsan1002.pr.mdwr.s.nokia.com
10/12/2012 04:50:39 - requesting resource mtprdsan6009.atl.stor.s.nokia.com.NBU_POLICY.MAXJOBS.sun-ndmp-lbsp_splunk-monthly-prod
10/12/2012 04:50:39 - Waiting for scan drive stop HP.ULTRIUM4-SCSI.000, Media server: mtprdsan6010.atl.stor.s.nokia.com
10/12/2012 04:51:10 - Info bpbrm (pid=7525) mtprdsan1002.pr.mdwr.s.nokia.com is the host to backup data from
10/12/2012 04:51:10 - Info bpbrm (pid=7525) reading file list from client
10/12/2012 04:51:10 - Info bpbrm (pid=7525) starting ndmpagent on client
10/12/2012 04:51:10 - granted resource  mtprdsan6009.atl.stor.s.nokia.com.NBU_CLIENT.MAXJOBS.mtprdsan1002.pr.mdwr.s.nokia.com
10/12/2012 04:51:10 - granted resource  mtprdsan6009.atl.stor.s.nokia.com.NBU_POLICY.MAXJOBS.sun-ndmp-lbsp_splunk-monthly-prod
10/12/2012 04:51:10 - granted resource  200197
10/12/2012 04:51:10 - granted resource  HP.ULTRIUM4-SCSI.000
10/12/2012 04:51:10 - granted resource  mtprdsan6010-hcart-robot-tld-0-mtprdsan1002
10/12/2012 04:51:10 - estimated 1349718373 kbytes needed
10/12/2012 04:51:10 - Info nbjm (pid=21227) started backup (backupid=mtprdsan1002.pr.mdwr.s.nokia.com_1350013870) job for client mtprdsan1002.pr.mdwr.s.nokia.com, policy sun-ndmp-lbsp_splunk-monthly-prod, schedule daily-differential on storage unit mtprdsan6010-hcart-robot-tld-0-mtprdsan1002
10/12/2012 04:51:10 - started process bpbrm (pid=7525)
10/12/2012 04:51:10 - connecting
10/12/2012 04:51:10 - connected; connect time: 0:00:00
10/12/2012 04:51:11 - Info ndmpagent (pid=7527) Backup started
10/12/2012 04:51:11 - Info bpbrm (pid=7525) bptm pid: 7528
10/12/2012 04:51:11 - Info bptm (pid=7528) start
10/12/2012 04:51:11 - Info bptm (pid=7528) using 30 data buffers
10/12/2012 04:51:11 - Info bptm (pid=7528) using 1024 data buffer size
10/12/2012 04:51:11 - Info bptm (pid=7528) start backup
10/12/2012 04:51:11 - Info bptm (pid=7528) Waiting for mount of media id 200197 (copy 1) on server mtprdsan6010.atl.stor.s.nokia.com.
10/12/2012 04:51:11 - mounting 200197
10/12/2012 04:52:05 - Info bptm (pid=7528) media id 200197 mounted on drive index 0, drivepath /dev/rmt/0n, drivename HP.ULTRIUM4-SCSI.000, copy 1
10/12/2012 04:52:05 - mounted 200197; mount time: 0:00:54
10/12/2012 04:52:10 - positioning 200197 to file 11
10/12/2012 04:54:08 - Info ndmpagent (pid=7527) mtprdsan1002.pr.mdwr.s.nokia.com: Direct Access Restore information is supported
10/12/2012 04:54:08 - positioned 200197; position time: 0:01:58
10/12/2012 04:54:08 - begin writing
10/12/2012 04:54:09 - Info ndmpagent (pid=7527) mtprdsan1002.pr.mdwr.s.nokia.com: Backing up "/export/Splunk".
10/12/2012 04:54:09 - Info ndmpagent (pid=7527) mtprdsan1002.pr.mdwr.s.nokia.com: Tape record size: 1024.
10/12/2012 04:54:09 - Info ndmpagent (pid=7527) mtprdsan1002.pr.mdwr.s.nokia.com: File history: N.
10/12/2012 04:54:09 - Info ndmpagent (pid=7527) mtprdsan1002.pr.mdwr.s.nokia.com: Date of the last level '3': Fri Oct 12 01:10:24 2012.
10/12/2012 04:54:09 - Info ndmpagent (pid=7527) mtprdsan1002.pr.mdwr.s.nokia.com: Date of this level '4': Fri Oct 12 03:54:08 2012.
10/12/2012 04:54:09 - Info ndmpagent (pid=7527) mtprdsan1002.pr.mdwr.s.nokia.com: Update: TRUE.
10/12/2012 06:58:49 - Info ndmpagent (pid=7527) mtprdsan1002.pr.mdwr.s.nokia.com: Runtime [/export/Splunk] 1102577664 bytes (1102577664): 134 seconds
10/12/2012 06:58:53 - Info ndmpagent (pid=7527) NDMP backup successfully completed, path = /export/Splunk
10/12/2012 06:59:11 - Info bptm (pid=7528) EXITING with status 0 <----------
10/12/2012 06:59:11 - Info bpbrm (pid=7525) validating image for client mtprdsan1002.pr.mdwr.s.nokia.com
10/12/2012 06:59:11 - Info ndmpagent (pid=7527) done. status: 0: the requested operation was successfully completed
10/12/2012 06:59:11 - end writing; write time: 2:05:03
the requested operation was successfully completed  (0)
 

spenukon
Level 4

Here is the job completion of normal backups that go to data domain.  Througput was only 2mb/sec.

 

10/12/2012 03:10:48 - Info nbjm (pid=21227) starting backup job (jobid=475) for client atlnagios01.pr.mdwr.s.nokia.com, policy unix-lan-backupservices-monthly-all, schedule daily-differential
10/12/2012 03:10:48 - Info bpbrm (pid=3065) atlnagios01.pr.mdwr.s.nokia.com is the host to backup data from
10/12/2012 03:10:48 - Info bpbrm (pid=3065) reading file list from client
10/12/2012 03:10:48 - Info nbjm (pid=21227) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=475, request id:{09C0298A-1412-11E2-8706-54DF7F705CA4})
10/12/2012 03:10:48 - requesting resource mtprdsan3002lsu-ost-sug
10/12/2012 03:10:48 - requesting resource mtprdsan6009.atl.stor.s.nokia.com.NBU_CLIENT.MAXJOBS.atlnagios01.pr.mdwr.s.nokia.com
10/12/2012 03:10:48 - requesting resource mtprdsan6009.atl.stor.s.nokia.com.NBU_POLICY.MAXJOBS.unix-lan-backupservices-monthly-all
10/12/2012 03:10:48 - granted resource  mtprdsan6009.atl.stor.s.nokia.com.NBU_CLIENT.MAXJOBS.atlnagios01.pr.mdwr.s.nokia.com
10/12/2012 03:10:48 - granted resource  mtprdsan6009.atl.stor.s.nokia.com.NBU_POLICY.MAXJOBS.unix-lan-backupservices-monthly-all
10/12/2012 03:10:48 - granted resource  MediaID=@aaaab;DiskVolume=mtprdsan3002lsu;DiskPool=mtprdsan3002lsu-dp;Path=mtprdsan3002lsu;StorageServer=mtprdsan3002.atl.stor.s.nokia.com;MediaServer=mtprdsan6010.atl.stor.s.nokia.com
10/12/2012 03:10:48 - granted resource  mtprdsan6010-ost-mtprdsan3002lsu-su
10/12/2012 03:10:48 - estimated 0 kbytes needed
10/12/2012 03:10:48 - Info nbjm (pid=21227) started backup (backupid=atlnagios01.pr.mdwr.s.nokia.com_1350007848) job for client atlnagios01.pr.mdwr.s.nokia.com, policy unix-lan-backupservices-monthly-all, schedule daily-differential on storage unit mtprdsan6010-ost-mtprdsan3002lsu-su
10/12/2012 03:10:48 - started process bpbrm (pid=3065)
10/12/2012 03:10:48 - connecting
10/12/2012 03:10:49 - Info bpbrm (pid=3065) starting bpbkar on client
10/12/2012 03:10:49 - Info bpbkar (pid=16631) Backup started
10/12/2012 03:10:49 - Info bpbrm (pid=3065) bptm pid: 3066
10/12/2012 03:10:49 - Info bptm (pid=3066) start
10/12/2012 03:10:49 - Info bptm (pid=3066) using 1024 data buffer size
10/12/2012 03:10:49 - Info bptm (pid=3066) using 30 data buffers
10/12/2012 03:10:49 - connected; connect time: 0:00:00
10/12/2012 03:10:50 - Info bptm (pid=3066) start backup
10/12/2012 03:10:51 - Info bptm (pid=3066) backup child process is pid 3083
10/12/2012 03:10:51 - begin writing
10/12/2012 03:16:34 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/vz/private/7/var/run/saslauthd/mux] is a socket special file. Skipping
10/12/2012 03:22:05 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/vz/private/5/var/log/nagios/rw/live] is a socket special file. Skipping
10/12/2012 03:46:54 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/vz/private/3/var/run/saslauthd/mux] is a socket special file. Skipping
10/12/2012 03:48:27 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/vz/private/3/var/log/nagios/rw/live] is a socket special file. Skipping
10/12/2012 04:16:36 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/vz/private/1/var/run/saslauthd/mux] is a socket special file. Skipping
10/12/2012 05:09:08 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/vz/root/1] is in a different file system from [/vz]. Skipping
10/12/2012 05:09:08 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/vz/root/7] is in a different file system from [/vz]. Skipping
10/12/2012 05:09:08 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/vz/root/3] is in a different file system from [/vz]. Skipping
10/12/2012 05:54:23 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/vz/private/2] is in a different file system from [/vz]. Skipping
10/12/2012 05:54:23 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/vz/private/1] is in a different file system from [/vz]. Skipping
10/12/2012 05:54:23 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/vz/private/6] is in a different file system from [/vz]. Skipping
10/12/2012 05:54:23 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/vz/private/4] is in a different file system from [/vz]. Skipping
10/12/2012 05:54:23 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/vz/private/7] is in a different file system from [/vz]. Skipping
10/12/2012 05:54:23 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/vz/private/5] is in a different file system from [/vz]. Skipping
10/12/2012 05:54:23 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/vz/private/8] is in a different file system from [/vz]. Skipping
10/12/2012 05:54:23 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/vz/private/3] is in a different file system from [/vz]. Skipping
10/12/2012 07:28:53 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/var/run/acpid.socket] is a socket special file. Skipping
10/12/2012 07:28:53 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/var/run/sdp] is a socket special file. Skipping
10/12/2012 07:28:53 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/var/run/audispd_events] is a socket special file. Skipping
10/12/2012 07:28:53 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/var/run/pcscd.comm] is a socket special file. Skipping
10/12/2012 07:28:53 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/var/run/avahi-daemon/socket] is a socket special file. Skipping
10/12/2012 07:28:53 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/var/run/dbus/system_bus_socket] is a socket special file. Skipping
10/12/2012 07:28:53 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/var/run/cups/cups.sock] is a socket special file. Skipping
10/12/2012 07:28:57 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/var/lib/nfs/rpc_pipefs] is in a different file system from [/var]. Skipping
10/12/2012 07:29:11 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/var] is in a different file system from [/]. Skipping
10/12/2012 07:29:51 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/home] is in a different file system from [/]. Skipping
10/12/2012 07:30:07 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/sys] is on file system type sysfs. Skipping
10/12/2012 07:30:07 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/boot] is in a different file system from [/]. Skipping
10/12/2012 07:36:14 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/vz] is in a different file system from [/]. Skipping
10/12/2012 07:38:43 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/proc] is on file system type PROC. Skipping
10/12/2012 07:38:43 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/dev] is in a different file system from [/]. Skipping
10/12/2012 07:38:43 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/tmp/.font-unix/fs7100] is a socket special file. Skipping
10/12/2012 07:38:52 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/misc] is in a different file system from [/]. Skipping
10/12/2012 07:38:52 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/net] is in a different file system from [/]. Skipping
10/12/2012 07:39:50 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/usr/openv/var/vnetd/terminate_bpcd.uds] is a socket special file. Skipping
10/12/2012 07:39:50 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/usr/openv/var/vnetd/terminate_vnetd.uds] is a socket special file. Skipping
10/12/2012 07:39:50 - Info bpbrm (pid=3065) from client atlnagios01.pr.mdwr.s.nokia.com: TRV - [/usr/openv/var/vnetd/bpcd.uds] is a socket special file. Skipping
10/12/2012 07:51:14 - Info bptm (pid=3066) waited for full buffer 974900 times, delayed 975018 times
10/12/2012 07:51:14 - Info bptm (pid=3066) EXITING with status 0 <----------
10/12/2012 07:51:14 - Info bpbrm (pid=3065) validating image for client atlnagios01.pr.mdwr.s.nokia.com
10/12/2012 07:51:14 - Info bpbkar (pid=16631) done. status: 0: the requested operation was successfully completed
10/12/2012 07:51:14 - end writing; write time: 4:40:23
the requested operation was successfully completed  (0)
 

Marianne
Moderator
Moderator
Partner    VIP    Accredited Certified

 

Sure about your buffer settings? This is what is reported in bptm log:

03:51:11.454 [7528] <2> read_config_file: using 256 value from /usr/openv/netbackup/db/config/SIZE_DATA_BUFFERS_NDMP
03:51:11.454 [7528] <2> io_init: using 30 data buffers
03:51:11.454 [7528] <2> io_init: using 1024 data buffer size

Please double-check?

mph999
Level 6
Employee Accredited

Here we are:

10/12/2012 07:51:14 - Info bptm (pid=3066) waited for full buffer 974900 times, delayed 975018 times

There are a lot of 'delays', we see that bptm is waiting for a full bufer, which means the data isn't being sent from the client very quickly.

The most likely causes are :

1.  Data read speed of the client

2. Network (this can includenetwork settings in the os and possibly network buffer size setting)

But ...

10/12/2012 03:10:49 - Info bptm (pid=3066) using 1024 data buffer size

10/12/2012 03:10:49 - Info bptm (pid=3066) using 30 data buffers

On the media servers create these files :

/usr/openv/netbackup/db/config/SIZE_DATA_BUFFERS

Put the value 262144 in this file

/usr/openv/netbackup/db/config/NUMBER_DATA_BUFFERS

Put the value 128 in this file

(Nicolia has covered the ndmp tuning file )

Re-run a backup that uses this media server - does it go any quicker.

Martin

Nicolai
Moderator
Moderator
Partner    VIP   

0/12/2012 04:51:11 - Info bptm (pid=7528) using 1024 data buffer size

You need to check this value  - it shoud be 262144. If the backup really use 1K block size it will be very very slow. Check SIZE_DATA_BUFFERS_NDMP on the media server that control the NDMP backup.

more info on SIZE_DATA_BUFFERS_NDMP http://www.symantec.com/docs/TECH147938

spenukon
Level 4

Marianne

 

You are right, the settings was wrong on media server. I changed the SIZE_DATA_BUFFERS_NDMP settings to 262144 as per Nicolai and the backups are running as faster as 90mb/sec. Thanks a lot.

spenukon
Level 4

Hi Nicolai and mph99

Thanks a lot for your help. The NDMP backups are running faster after I changed the NDMP settings but the data domain backups are still slow. The data domain backups are little better than before. Its doing 14mb/sec now but it should be faster than this.  We have same setup in other region and it's doing much faster there.

spenukon
Level 4

Hi

I have changed the SIZE_DATA_BUFFERS to 1048576 and the data domain backups are running faster.

It is doing 120mb/sec. Thank you all for your time.

 

Thanks

Sai

Marianne
Moderator
Moderator
Partner    VIP    Accredited Certified

Please select the post that has helped you most and Mark as Solution?
(My vote is for Nicolai.)