
Slow DB2 DB Backups / Fast Log Backups

Volker_Spies
Level 3

Hi,

I have a question regarding DB2 backups.

We run DB2 on AIX, and the problem is slow DB2 database backups; the log backups, on the other hand, are running very well.

DB backups go to SAN-connected LTO5 drives and I get 50-70 MB/s. This is way too slow for the storage and the tape drives.

Log backups are network backups to a media server with a DSSU attached. The logs are too small to give reliable speed figures.

I found a huge difference between the two backups regarding buffers, but first, these are the buffer settings on the client:

NUMBER_DATA_BUFFERS: 256

SIZE_DATA_BUFFERS: 262144
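For scale, here is a minimal sketch of what these two values add up to. It assumes the usual NetBackup sizing rule that data-buffer shared memory is the buffer count times the buffer size, multiplied by the number of drives and the multiplexing factor; the function name and the `drives`/`mpx` parameters are illustrative, not NetBackup settings.

```python
# Values taken from the settings listed in this post.
NUMBER_DATA_BUFFERS = 256
SIZE_DATA_BUFFERS = 262144  # bytes per buffer (256 KiB)


def shared_memory_bytes(number_buffers, buffer_size, drives=1, mpx=1):
    """Data-buffer shared memory: count * size * drives * multiplexing.

    `drives` and `mpx` are illustrative parameters, not NetBackup settings.
    """
    return number_buffers * buffer_size * drives * mpx


per_drive = shared_memory_bytes(NUMBER_DATA_BUFFERS, SIZE_DATA_BUFFERS)
print(f"{per_drive} bytes = {per_drive // (1024 * 1024)} MiB per drive")
```

With the values above this comes to 64 MiB per drive, so the configured buffer pool itself is not small.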

I have never tweaked NET_BUFFER_SZ or any other buffer setting. The BACKUP DB command in the script that NetBackup starts for DB2 backups is not tweaked either, except for the OPEN 8 SESSIONS and INCLUDE LOGS clauses.

But I see weird behavior in the backups themselves.

This is from a DB Backup:

Info bpbkar(pid=59572348) dbclient(pid=59572348) wrote first buffer(size=4096)

This is from a log backup:

Info bphdb(pid=65536066) dbclient(pid=65536066) wrote first buffer(size=131072)

Why is the DB backup using bpbkar and a buffer size of 4096, while the log backup is using bphdb and a buffer size of 131072?

This would explain the slow data transfer to the tape drives.

Can someone explain this a little further? And how can I increase this bpbkar/dbclient buffer?

Volker

 

 

3 REPLIES

Douglas_A
Level 6
Partner Accredited Certified

If you run no -a | grep tcp, what are the send and receive values set to? AIX will limit the buffer size to this value, so if it is smaller than your NetBackup buffer, the AIX value will take precedence and could cause a mismatch.

Another thing to check is sb_max. I forget the exact reasoning, but on my AIX systems we had this at the default of "131072" and it limited our buffer sizes as well, so make sure it is set to the equivalent of 1 MB.

Hope this helps some.

Volker_Spies
Level 3

Hi Douglas,

thanks for the reply.

Send/receive:

 tcp_recvspace = 16384
 tcp_sendspace = 16384

But I think my problem is not the network part, since the DB backups go to SAN-connected tape drives.

I think I might have one of two bottlenecks, or both!

1. The configuration of the LTO5 drives in AIX is incorrect.

I could not find much here; Symantec and IBM say: install Atape, enable variable buffer in smitty, and you are ready to go.

2. Some sort of buffer setting, either in AIX or in NBU, is wrong.

The only thing I'm able to find regarding buffers is NUMBER_DATA_BUFFERS/SIZE_DATA_BUFFERS.

 

When I look at the output of such a slow backup job I see this:

 Info bpbkar(pid=37814752) dbclient waited 470702 times for empty buffer, delayed 701816 times

As far as I know, this means that NBU waits until the tape drive has emptied a buffer? So the tape is the bottleneck?

Or can this mean that the buffers are too small, so the LTO5 drive is in start/stop mode all the time, causing these wait times?
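To put numbers on these counters, here is a minimal sketch that parses the job line quoted above and reports how many delays occurred per wait. The function names are hypothetical. The length of a single delay is not shown in the job output, so `delay_ms` is left as a parameter to fill in rather than a known NetBackup value.

```python
import re

# Job detail line quoted in this post.
LINE = ("Info bpbkar(pid=37814752) dbclient waited 470702 times "
        "for empty buffer, delayed 701816 times")

PATTERN = re.compile(
    r"waited (\d+) times for (empty|full) buffer, delayed (\d+) times"
)


def parse_wait_counters(line):
    """Return the wait/delay counters from a job detail line, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return {
        "waited": int(match.group(1)),
        "kind": match.group(2),      # "empty" or "full"
        "delayed": int(match.group(3)),
    }


def estimated_wait_seconds(delayed, delay_ms):
    """Total time spent in delays; delay_ms is an assumption you supply."""
    return delayed * delay_ms / 1000.0


counters = parse_wait_counters(LINE)
print(counters)
print("delays per wait:", counters["delayed"] / counters["waited"])
```

A process that waits for an empty buffer is the one filling the buffers, so a high count here suggests that whatever drains the buffers is not keeping up. On its own it does not tell you whether the cause is the drive, the path to it, or the buffer size.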

How can I test the tape performance in AIX without NBU interference?

Volker

Douglas_A
Level 6
Partner Accredited Certified

Hi Volker

 

Those buffers are WAY too small to be of any use to you. Both of them really should be set much larger, something like 262144; then you should get much better performance.

cmd: no -p -o tcp_sendspace=262144 (I think it's -p that makes it permanent, but I'm unsure; you can run it without -p and it will revert after a reboot if issues occur)
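As a quick way to check which tunables fall below a target, here is a minimal sketch that parses text in the "name = value" form printed by no -a. The sample text is the output quoted earlier in this thread and the target of 262144 is the value suggested above; the function names are hypothetical.

```python
# Output quoted earlier in this thread (from: no -a | grep tcp).
SAMPLE = """\
 tcp_recvspace = 16384
 tcp_sendspace = 16384
"""

TARGET = 262144  # value suggested in this reply


def parse_no_output(text):
    """Parse 'name = value' lines into a dict, keeping numeric values only."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if not sep:
            continue  # not a 'name = value' line
        name, value = name.strip(), value.strip()
        if value.isdigit():
            values[name] = int(value)
    return values


def undersized(values, names, target):
    """Return the named tunables whose current value is below target."""
    return {n: values[n] for n in names if n in values and values[n] < target}


current = parse_no_output(SAMPLE)
print(undersized(current, ["tcp_sendspace", "tcp_recvspace"], TARGET))
```

On a live system the text would come from the no -a output instead of SAMPLE. It is probably worth checking sb_max the same way, since as far as I know the send and receive spaces cannot usefully exceed it.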

 

 

As for testing the tape without NBU interference, I don't know of any way to do that from the client; there might be an AIX utility somewhere that could help with that. If you are writing from the client directly to the tape drive, you might have your SAN team check the SAN and see whether your traffic is routed over any uplinks or slower switches. That could easily bottleneck your traffic, especially if you have other things going over something smaller like a 2 Gb uplink. The LTO5 drives all use 4 Gb, and if you have multiple drives you choke the link.
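On the question of testing write speed outside NBU, one generic approach is to time fixed-size block writes to the device yourself. The following is only a sketch under assumptions: the function name is made up, nothing here is an AIX or NetBackup utility, and pointing it at a tape device overwrites whatever is on the loaded tape. It writes a repeated random block so that drive compression should not inflate the result much.

```python
import os
import tempfile
import time


def measure_write_throughput(path, block_size=262144,
                             total_bytes=64 * 1024 * 1024):
    """Write total_bytes to path in block_size chunks; return MB/s.

    `path` may be a regular file or a device node (hypothetical example:
    a tape device). Timing includes the final close, since a device may
    flush on close.
    """
    block = os.urandom(block_size)  # random data, compresses poorly
    blocks = total_bytes // block_size
    start = time.perf_counter()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        for _ in range(blocks):
            view = memoryview(block)
            while view:  # handle short writes
                written = os.write(fd, view)
                view = view[written:]
        try:
            os.fsync(fd)  # flush regular files; may not apply to devices
        except OSError:
            pass
    finally:
        os.close(fd)
    elapsed = max(time.perf_counter() - start, 1e-9)
    return blocks * block_size / elapsed / 1e6


# Demo against a scratch file so the sketch runs anywhere.
with tempfile.NamedTemporaryFile(delete=False) as scratch:
    scratch_path = scratch.name
rate = measure_write_throughput(scratch_path, total_bytes=8 * 1024 * 1024)
print(f"{rate:.1f} MB/s")
os.remove(scratch_path)
```

Running it with block sizes of 4096 and 262144 against the same target would show how much the block size alone matters. A result from a scratch file mostly measures disk and cache, not tape.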