
NDMP backup type (Which one is better)

Amaan
Level 6

We are backing up NetApp through NDMP. We have too many volumes and only 8 tape drives (LTO5), and a lot of jobs are failing with status 196.

1. Would doing remote NDMP (sending data to a Media Manager storage unit) improve things, because of multiplexing?

2. How does multiplexing actually improve backup performance?

3. Need your thoughts on this: are there any tunings available that could improve performance of NDMP backups?

4. I have found only the NDMP admin guide and the best-practices guide; are there any other useful documents on NDMP backups?

 

24 REPLIES

Marianne
Level 6
Partner    VIP    Accredited Certified

The only tuning that I am aware of is SIZE_DATA_BUFFERS_NDMP on the NBU server associated with the NAS device. See http://www.symantec.com/docs/HOWTO56057
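For reference, on Windows these tunables are plain single-value 'touch' files under the NetBackup config directory. A minimal sketch for creating them in Python, assuming the F:\Veritas install path that shows up in the bptm log later in this thread (the values are the ones from that log, not recommendations; SIZE_DATA_BUFFERS_NDMP must be a multiple of 1024):

from pathlib import Path

# Assumed install path - copied from the bptm log excerpt in this thread.
CONFIG_DIR = Path(r"F:\Veritas\NetBackup\db\config")

# 262144 bytes (256 KB) per buffer and 32 buffers are the values visible
# in the poster's log below; test what suits your own drives.
(CONFIG_DIR / "SIZE_DATA_BUFFERS_NDMP").write_text("262144\n")
(CONFIG_DIR / "NUMBER_DATA_BUFFERS").write_text("32\n")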

What kind of throughput are you seeing on each stream?

If changing the buffer size does not improve throughput, you might be better off doing remote NDMP with multiplexing enabled - multiplexing interleaves several slow streams onto one drive, so the drive keeps streaming instead of waiting on a single slow source.

You will have to test it to find out which method is better/best for you.

Amaan
Level 6
One more question: is it possible to check whether the buffers are not filling (how to check delays)? I know how to do that for normal backups by checking the bpbkar and bptm logs; is it possible to check it for NDMP backups? In short, how can I find out whether multiplexing will help us? I need to have some evidence that I can show before making the changes, ideally from the logs. We are getting around 35-40 MB/sec per stream, sometimes up to 50 MB/sec, but not always.

Marianne
Level 6
Partner    VIP    Accredited Certified

NDMP data transfer to tape is logged by bptm on the media server.

Look for 'waited for full/empty buffers' in the same way as for other backups.

Obviously no bpbkar log, but bptm log should give enough info.
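If you want to script the check, a minimal sketch along these lines should work. The log location and file name are assumptions (bptm legacy logs are named by date, under the logs\bptm directory on the media server):

import re
from pathlib import Path

# Hypothetical log file name - adjust the path and date for your media server.
LOG = Path(r"F:\Veritas\NetBackup\logs\bptm\022012.log")

# Print every buffer-wait summary line so delays are easy to spot.
wait_line = re.compile(r"waited for (full|empty) buffer", re.IGNORECASE)
for line in LOG.read_text(errors="replace").splitlines():
    if wait_line.search(line):
        print(line)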

Amaan
Level 6

One more question, Marianne:

Will there be one buffer setting for all jobs, or a separate one for every job and every drive?

Marianne
Level 6
Partner    VIP    Accredited Certified

Nope - the buffer setting in SIZE_DATA_BUFFERS_NDMP will apply to ALL NDMP backup jobs.

There is no way to specify different settings for different jobs/drives.

Amaan
Level 6

I couldn't find any 'waited' entries in the bptm logs. If our NDMP is local, how can bptm be the receiver? All the data is sent from the filers to tape without involving the media server.

Which log can I check to find that value?

Marianne
Level 6
Partner    VIP    Accredited Certified

The NBU for NDMP Admin Guide explains how bptm on the NetBackup for NDMP server is involved in the backup process.

Amaan
Level 6

So, if I don't see any 'waited' entries, does this mean the client is waiting for the buffers to be freed?

Marianne
Level 6
Partner    VIP    Accredited Certified

There is ALWAYS a 'waited for ....' entry at the end of a backup.

bptm is either waiting for full or empty buffers. Even '0' waits are logged.
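For reference, that summary typically looks something like this (the counts here are made up for illustration):

08:15:32.111 [1234.5678] <2> write_backup: waited for full buffer 1681 times, delayed 12296 times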

Amaan
Level 6

Please help, I am not able to find this entry in the bptm log.

Config details: Master NBU 7.0.1, OS Win 2008

NAS: NetApp 3240 (I guess), and NDMP is local (the tape drives are directly attached to the filers).

The issue is that I am not able to find a 'waited for' entry in the bptm log.

Marianne
Level 6
Partner    VIP    Accredited Certified

Are you looking at bptm on the NetBackup for NDMP server?

Please post the bptm log file from after a successful backup (even if it is a day or two old).

Please post the log as an attachment (not level 5 please - I don't have the time to go through these enormous logs, unlike my good friend Martin in support!). A level 0 log provides all the info that we need for this.

Amaan
Level 6

Sorry, our environment is already at level 5. I cannot change it to 0, as that requires a bounce of the services. I will provide the logs in case you have time.

Marianne
Level 6
Partner    VIP    Accredited Certified

Changes to bptm logging do not need a restart. The very next backup will run with the new logging level.

Please do not leave logging at level 5 for extended periods. Only enable level 5 when requested by Symantec Support to do so (they have tools to extract the relevant bits and pieces), and lower the level again as soon as the requested set of logs has been collected.

Level 5 logs will consume a lot of disk space in a short period, as well as slow down your servers.
Level 0 logs are sufficient for 99% of day-to-day troubleshooting.

Amaan
Level 6

Hi Marianne,

I have attached the bptm log. I decreased the log level at 9:30 AM, so you can look at the job that started and ended after that time.

I really need your help with finding this.

Marianne
Level 6
Partner    VIP    Accredited Certified

OK - we can see that the job started here (PID 14700):

09:48:21.269 [14700.12988] <2> bptm: INITIATING (VERBOSE = 0): -w -pid 6600 -c adcna02.fmi.com -den 14 -rt 8 -rn 17 -stunit adcutil30-hcart2-robot-tld-17-adcna02.fmi.com -cl ADC_NDMP_NetApp_ADCNA02_VM -bt 1329756500 -b adcna02.fmi.com_1329756500 -st 0 -cj 4 -p NetBackup -reqid -1328565224 -jm -brm -hostname adcna02.fmi.com -ru root -rclnt adcna02.fmi.com -rclnthostname adcna02.fmi.com -rl 0 -rp 2592000 -sl Weekly_Full -ct 19 -maxfrag 1048576 -mediasvr adcutil30 -nonrsvdports -connect_options 0x01020001 -jobid 264611 -jobgrpid 264490 -masterversion 700000 -bpbrm_shm_id Global\NetBackup_BPBRM_SHM_Path_-1677965691_6600_14832 -blks_per_buffer 512 -ndmpport 50024 

......
09:48:21.316 [14700.12988] <2> read_config_file: using 262144 value from F:\Veritas\NetBackup\db\config\SIZE_DATA_BUFFERS_NDMP
09:48:21.316 [14700.12988] <2> read_config_file: using 32 value from F:\Veritas\NetBackup\db\config\NUMBER_DATA_BUFFERS
09:48:21.316 [14700.12988] <2> io_init: using 32 data buffers
09:48:21.316 [14700.12988] <2> io_init: using 262144 data buffer size
.....
10:14:00.037 [14700.12988] <4> report_throughput: VBRT 1 14700 1 1 HP.ULTRIUM5-SCSI.004 500295 0 1 0 29440  29440 (bptm.c.22646)
10:14:00.037 [14700.12988] <2> write_data: Total Kbytes transferred 29440

10:34:03.724 [14700.12988] <4> report_throughput: VBRT 1 14700 1 1 HP.ULTRIUM5-SCSI.004 500295 0 1 0 41935360  41935360 (bptm.c.22646)
10:34:03.724 [14700.12988] <2> write_data: Total Kbytes transferred 41964800

10:54:04.385 [14700.12988] <4> report_throughput: VBRT 1 14700 1 1 HP.ULTRIUM5-SCSI.004 500295 0 1 0 40867072  40867072 (bptm.c.22646)
10:54:04.385 [14700.12988] <2> write_data: Total Kbytes transferred 82831872

No more info for PID 14700. It seems the backup was not yet finished by the time the log was collected.

I can see other jobs completing:

06:17:30.603 [1708.12140] <4> write_backup: successfully wrote backup id adcna02.fmi.com_1329708684, copy 1, fragment 1, 1048576256 Kbytes at 29872.116 Kbytes/sec

6:27:07.219 [7720.14080] <4> write_backup: successfully wrote backup id adcna02.fmi.com_1329695643, copy 1, fragment 3, 371356416 Kbytes at 36893.370 Kbytes/sec
06:27:19.607 [12968.8312] <4> write_backup: successfully wrote backup id adcna02.fmi.com_1329691078, copy 1, fragment 3, 339503104 Kbytes at 44597.617 Kbytes/s
07:56:03.310 [7160.10564] <4> write_backup: successfully wrote backup id adcna01.fmi.com_1329714611, copy 1, fragment 3, 797832448 Kbytes at 76355.523 Kbytes/s
08:03:07.535 [4684.8860] <4> write_backup: successfully wrote backup id adcna01.fmi.com_1329749784, copy 1, fragment 1, 13627904 Kbytes at 34752.650 Kbytes/sec
08:04:40.268 [2976.7496] <4> write_backup: successfully wrote backup id adcna01.fmi.com_1329750210, copy 1, fragment 1, 2132224 Kbytes at 35630.895 Kbytes/sec
09:26:22.103 [8252.11916] <4> write_backup: successfully wrote backup id adcna01.fmi.com_1329755122, copy 1, fragment 1, 2074880 Kbytes at 41720.387 Kbytes/sec
09:27:31.450 [9460.15268] <4> write_backup: successfully wrote backup id adcna01.fmi.com_1329755205, copy 1, fragment 1, 15360 Kbytes at 430.529 Kbytes/sec
09:28:54.026 [8992.15172] <4> write_backup: successfully wrote backup id adcna01.fmi.com_1329755272, copy 1, fragment 1, 2105856 Kbytes at 41331.816 Kbytes/sec
09:47:58.975 [8668.9592] <4> write_backup: successfully wrote backup id adcna02.fmi.com_1329751579, copy 1, fragment 1, 134391552 Kbytes at 27494.293 Kbytes/sec
10:04:43.967 [12076.12572] <4> write_backup: successfully wrote backup id adcna01.fmi.com_1329755355, copy 1, fragment 1, 23531264 Kbytes at 11121.140 Kbytes/sec
10:06:14.984 [7428.7720] <4> write_backup: successfully wrote backup id adcna01.fmi.com_1329757512, copy 1, fragment 1, 2532608 Kbytes at 48899.598 Kbytes/sec
10:10:01.153 [9224.8036] <4> write_backup: successfully wrote backup id adcna01.fmi.com_1329757596, copy 1, fragment 1, 5694976 Kbytes at 29428.511 Kbytes/sec

 

So, you are 100% right - for NDMP backups, there is no 'waited for full/empty buffers' entry, in either the level 0 or the level 5 log.

The throughput you are seeing is enough evidence that data is not being fed to the drives fast enough.
We know that your drives are capable of writing in excess of 100 MB/sec. Not a single stream is getting CLOSE to that. (Well, there was one stream writing at 76 MB/sec...)
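As a rough cross-check, the report_throughput samples in the excerpt above can be converted to MB/sec; a minimal sketch in Python, using the two PID 14700 samples (the interval counter matches the delta of the 'Total Kbytes transferred' lines, which is how I read its meaning):

# Interval between the 10:14:00.037 and 10:34:03.724 samples of PID 14700.
interval_kbytes = 41935360
interval_seconds = (20 * 60) + 3.724 - 0.037   # = 1203.687 s
mb_per_sec = interval_kbytes / 1024 / interval_seconds
print(f"{mb_per_sec:.1f} MB/sec")              # ~34 MB/sec - far below LTO5 speed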

If you have sufficient network bandwidth between the NAS and the media server (at least 1 Gbit/sec), it seems like a good idea to at least test remote NDMP with multiplexing enabled. It seems you have nothing to lose.

It will be as easy as changing the STU and MPX settings in the policy, and changing them back if you don't see sufficient improvement.

Amaan
Level 6

Thanks so much for your help, Marianne. We will try remote NDMP and will let you know the results. Please keep an eye on this post in case I have any more questions. :)

Thanks again.

Marianne
Level 6
Partner    VIP    Accredited Certified

I will keep an eye out. Will be very busy over the next 2 days, but will have a look when possible...

Amaan
Level 6

Just an update: I have provided my suggestions to my team lead and am just waiting for approval to make the changes.

Sagar_Kolhe
Level 6

Throughput may be lower if you are using remote NDMP, because other client backups will also be running on the media server at the same time.