10-20-2012 12:13 AM
All,
The NetBackup 7.5 catalog backup is running at a low speed of 11-15 MB/s over a 10 Gbps LAN to Data Domain. It took 6 hours to back up 226.32 GB of data. I am wondering how I can reduce the wait and delay counts shown below.
10/19/2012 10:02:57 AM - Info bpbkar32(pid=592) Backup started
10/19/2012 10:02:57 AM - Info bptm(pid=15712) start
10/19/2012 10:02:57 AM - Info bptm(pid=15712) using 262144 data buffer size
10/19/2012 10:02:57 AM - Info bptm(pid=15712) setting receive network buffer to 1049600 bytes
10/19/2012 10:02:57 AM - Info bptm(pid=15712) using 128 data buffers
10/19/2012 10:03:00 AM - Info bptm(pid=15712) start backup
10/19/2012 10:03:01 AM - begin writing
10/19/2012 3:57:34 PM - Info bptm(pid=15712) waited for full buffer 340550 times, delayed 1260728 times
10/19/2012 3:57:34 PM - Info bpbkar32(pid=592) bpbkar waited 102 times for empty buffer, delayed 2051 times.
As it is writing to disk, the parameters are below.
NetBackup is installed on a 1 TB SAN disk.
NUMBER_DATA_BUFFERS 64 / NUMBER_DATA_BUFFERS_DISK 128 / SIZE_DATA_BUFFERS_DISK 262144
Total clients: 925. No online DB backups, one NAS backup through NDMP, and two file servers.
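As a sanity check on the figures above, here is a minimal Python sketch that derives the average throughput from the "begin writing" and final bptm timestamps in the job log. The function name is made up for illustration, and it assumes 1 GB = 1024 MB; the result comes out at roughly 11 MB/s, which matches the speed reported.

```python
from datetime import datetime

# Timestamp layout used in the job details above.
FMT = "%m/%d/%Y %I:%M:%S %p"


def throughput_mb_per_s(size_gb, start, end):
    """Average rate in MB/s, assuming 1 GB = 1024 MB (an assumption)."""
    elapsed = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return size_gb * 1024 / elapsed.total_seconds()


# Size and timestamps taken from the post above.
rate = throughput_mb_per_s(
    226.32, "10/19/2012 10:03:01 AM", "10/19/2012 3:57:34 PM"
)
print(f"{rate:.1f} MB/s")
```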
10-20-2012 02:32 AM
The buffer settings look right to me. I suspect an infrastructure issue that is not related to NetBackup.
First check the disk subsystem performance with the following command:
# Windows
d:\VERITAS\NetBackup\bin\bpbkar32.exe -nocont D:\path\to\netbackup\catalog 1> nul 2> nul
# Unix
time /usr/openv/netbackup/bin/bpbkar -nocont -nofileinfo -nokeepalives /usr/openv/netbackup/db > /dev/null 2> /tmp/file.out
On Windows you will need to measure the time manually. Basically, this command reads the entire NetBackup catalog and throws the data away. The time bpbkar takes is the best the system can do given the limits of the infrastructure. If bpbkar takes 6 hours, it is a disk subsystem issue.
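Since Windows has no built-in `time` command, one way to avoid using a stopwatch is a small Python wrapper. This is only a sketch: `time_command` and `implied_mb_per_s` are hypothetical helper names, Python is assumed to be available on the master server, and 1 GB is taken as 1024 MB.

```python
import subprocess
import time


def time_command(argv):
    """Run argv, discard its output, return (elapsed_seconds, exit_code)."""
    start = time.perf_counter()
    result = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return time.perf_counter() - start, result.returncode


def implied_mb_per_s(size_gb, elapsed_seconds):
    """Read rate implied by the elapsed time (assumes 1 GB = 1024 MB)."""
    return size_gb * 1024 / elapsed_seconds
```

It could be called with the Windows command above split into a list, for example `time_command([r"d:\VERITAS\NetBackup\bin\bpbkar32.exe", "-nocont", r"D:\path\to\netbackup\catalog"])`, and the elapsed time then passed to `implied_mb_per_s` together with the catalog size.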
Test Data Domain:
Using the GEN_DATA file list directive, you can baseline how fast you can push data into the Data Domain. I have reached 800 MB/s on a DD890. If you see lower speeds, check the optical switch cabling. 10G is very sensitive. Always clean cables and SFPs when inserting them. This feature requires either a UNIX/Linux media server or a known-fast Linux client.
http://www.symantec.com/docs/TECH75213
10-20-2012 02:51 AM
I don't have any Linux/UNIX box to try out GEN_DATA :(
The issue is only with the catalog backup. The rest of the backups run at 50-60 MB/s easily.
BTW, how big will the generated file be? The same as the catalog size?
10-20-2012 02:59 AM
This message indicates that the problem is not related to the buffer settings:
waited for full buffer 340550 times, delayed 1260728 times
The media server spends a very long time waiting for the buffers to fill up, which means the data is reaching them very slowly. Check your disk I/O, or find out whether any other applications are taking up the I/O resources.
If the catalog is written to SAN disk, try moving it to local disk to see whether the speed improves; that would tell you whether the SAN disk is slowing things down.
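To pull these counters out of a job log without reading them by eye, a short Python sketch like the one below could be used. The regular expressions are written against the two message formats quoted in this thread; the helper names and the "compare the two delay counts" rule are assumptions for illustration, not an official NetBackup diagnostic.

```python
import re

# Patterns matched against the two messages quoted in this thread.
FULL_RE = re.compile(r"waited for full buffer (\d+) times, delayed (\d+) times")
EMPTY_RE = re.compile(r"waited (\d+) times for empty buffer, delayed (\d+) times")

SAMPLE = """\
10/19/2012 3:57:34 PM - Info bptm(pid=15712) waited for full buffer 340550 times, delayed 1260728 times
10/19/2012 3:57:34 PM - Info bpbkar32(pid=592) bpbkar waited 102 times for empty buffer, delayed 2051 times.
"""


def buffer_waits(log_text):
    """Return the wait/delay counters found in log_text (0 if absent)."""
    full = FULL_RE.search(log_text)
    empty = EMPTY_RE.search(log_text)
    return {
        "full_waits": int(full.group(1)) if full else 0,
        "full_delays": int(full.group(2)) if full else 0,
        "empty_waits": int(empty.group(1)) if empty else 0,
        "empty_delays": int(empty.group(2)) if empty else 0,
    }


def likely_bottleneck(counts):
    """Rough heuristic (an assumption): the side with more delays is starved."""
    if counts["full_delays"] > counts["empty_delays"]:
        return "read side: data reaches the buffers too slowly"
    if counts["full_delays"] < counts["empty_delays"]:
        return "write side: buffers are not emptied fast enough"
    return "unclear"


counts = buffer_waits(SAMPLE)
print(counts, likely_bottleneck(counts))
```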
10-20-2012 03:42 AM
The catalog is written to the DD890.
The catalog resides on SAN disk, and NetBackup is also installed on the SAN disk.
If I am not wrong, you suspect an I/O issue on the disk where the catalog resides and NBU is installed.
10-20-2012 04:26 AM
Correct.
10/19/2012 3:57:34 PM - Info bptm(pid=15712) waited for full buffer 340550 times, delayed 1260728 times
10/19/2012 3:57:34 PM - Info bpbkar32(pid=592) bpbkar waited 102 times for empty buffer, delayed 2051 times.
The data for a backup comes from the client, goes into the buffers, and then goes out to tape/disk.
Think of the buffers as buckets (128 of them here) and the data as water.
The rules are:
Water (data) can only be poured into an empty bucket.
A bucket must be full before it is tipped out to the tape drive.
bpbkar reads the data off the disk and sends it to the buffers.
bptm takes the water out of the buffers and sends it to the drives.
What the log shows is that bpbkar hardly ever had to wait for an empty buffer (it was delayed 2051 times, but that is a low number in this context).
However, bptm was delayed 1260728 times waiting for a buffer to become full so that it could be sent to the DD.
So what this tells us is that the data is not being sent from the 'client' to the buffers quickly enough.
Martin
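The bucket picture can also be played out in a few lines of Python. This is a toy, deterministic model and not how bptm and bpbkar are actually implemented: the tick-based timing, the function name, and the way waits are counted are all assumptions made for illustration.

```python
def simulate(ticks, n_buffers, fill_every, drain_every):
    """Toy bucket model.

    The reader tries to fill one bucket every `fill_every` ticks and the
    writer tries to empty one every `drain_every` ticks.
    Returns (reader_waits, writer_waits).
    """
    full = 0          # number of full buckets right now
    reader_waits = 0  # reader found no empty bucket
    writer_waits = 0  # writer found no full bucket
    for t in range(1, ticks + 1):
        if t % fill_every == 0:
            if full < n_buffers:
                full += 1
            else:
                reader_waits += 1
        if t % drain_every == 0:
            if full > 0:
                full -= 1
            else:
                writer_waits += 1
    return reader_waits, writer_waits


# Slow reader, fast writer: the writer keeps waiting for a full bucket,
# which is the pattern seen in the log above.
slow_reader = simulate(ticks=1000, n_buffers=128, fill_every=10, drain_every=1)

# Fast reader, slow writer: the reader runs out of empty buckets instead.
slow_writer = simulate(ticks=1000, n_buffers=4, fill_every=1, drain_every=10)

print(slow_reader, slow_writer)
```

In the first run only the writer accumulates waits, in the second only the reader does, which is the distinction the two log counters are meant to expose.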
10-20-2012 06:48 AM
The size depends on what you pick; it is a configurable parameter.
10-20-2012 06:50 AM
Tarangdce - run the bpbkar command as shown in my first post. This will help identify whether the SAN disk is the issue.
10-20-2012 06:51 AM
I really like the water/bucket picture
10-20-2012 12:16 PM
It does not seem to be a dbm problem.
By the way, the upcoming release should have better catalog backup performance.
10-20-2012 10:13 PM
Hi Nicolai,
It took 6 hours 3 minutes, so the culprit is the disk where the catalog actually resides.
10-20-2012 10:52 PM
It looks like the slow catalog backup issue occurs only with upgrades. Two master servers were upgraded from NBU 7.1 to 7.5.0.3, and their catalog backup speed is 11-14 MB/s. The details are below.
Server 1: 128 data buffers / SAN disk (VMAX) / 270 GB catalog backup.
Server 2: 64 data buffers / local disk / 75 GB catalog backup.
A fresh build of NetBackup 7.5 runs at 42-45 MB/s and completes in 15-17 minutes:
64 data buffers / Clariion disk / 40 GB catalog backup.
10-21-2012 01:13 AM
VMAX has known issues if the storage administrator overcommits the SATA disk layer. One indication of this is excellent write speed (cache) but very poor read speed.
11-06-2012 07:48 PM
Thanks everyone...
The actual culprit was the disk. The SAN team moved the particular LUN to a different FAST pool (from FC-SATA to FC-EFD) and performance improved.
On the other server, which has local disk, I ran a defrag, as 86% of the files were fragmented.
11-06-2012 09:12 PM
Nicolai was correct with his 'First check the disk subsystem performance' suggestion.
Please mark his post as solution.