Why does NET_BUFFER_SZ dramatically improve the backup speed?
Hi All,
I ran into a real-world issue and managed to resolve it, but I am more interested in why the NET_BUFFER_SZ setting works.
Below is the description:
Issue
There is a newly installed NBU 7.5 client on an HP-UX 11.31, IA64 platform.
When I try to back up the Oracle database, the backup speed is slow, and the same thing happens for plain file backups.
The file backup policy is defined to back up /stand, which holds about 366 MB.
Error
The job details show the file backup running at about 1 MB/s, yet manually copying the same files between the client and the master is much faster.
The bpbrm, bptm, and bpbkar processes run fine without any errors, but bpbkar spends 7 minutes sending out 366 MB of data: 366 MB / 7 min ≈ 52 MB/minute ≈ 0.87 MB/second.
13:00:33.197 [20457] <2> logparams: bpbkar32 -r 604800 -ru root -dt 0 -to 0 -bpstart_time 1394686961 -clnt domsgs-BackupDB -class test_stand -sched full -st FULL -bpstart_to 300 -bpend_to 300 -read_to 300 -blks_per_buffer 512 -use_otm -fso -b domsgs-BackupDB_1394686661 -kl 28 -use_ofb
13:00:33.301 [20457] <2> bpbkar SelectFile: INF - Resolved_path = /stand
13:00:33.301 [20457] <4> bpbkar SelectFile: INF - VxFS filesystem is /stand for /stand
13:00:33.335 [20457] <2> bpbkar resolve_path: INF - Actual mount point of /stand is /stand
13:00:33.353 [20457] <2> fscp_is_tracked: disabled tla_init
13:07:22.596 [20457] <2> bpbkar resolve_path: INF - Actual mount point of / is /
13:07:22.596 [20457] <4> bpbkar expand_wildcards: end backup for filelist /stand
13:07:22.596 [20457] <4> bpbkar main: INF - Client completed sending data for backup
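The timestamps bear this out: from resolving /stand at 13:00:33 to completion at 13:07:22 is about 409 seconds, and 366 MB / 409 s ≈ 0.9 MB/second, matching the job details.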
However, running bpbkar standalone, so that it only reads the data from /stand without sending it to a media server, finishes in about 5 seconds:
bpbkar -nocont -dt 0 -nofileinfo -nokeepalives /stand
15:25:00.411 [22525] <2> bpbkar SelectFile: INF - Resolved_path = /stand
15:25:00.411 [22525] <4> bpbkar SelectFile: INF - VxFS filesystem is /stand for /stand
15:25:00.412 [22525] <2> bpbkar resolve_path: INF - Actual mount point of /stand is /stand
15:25:00.419 [22525] <2> fscp_is_tracked: disabled tla_init
15:25:05.063 [22525] <2> bpbkar resolve_path: INF - Actual mount point of / is /
15:25:05.063 [22525] <4> bpbkar expand_wildcards: end backup for filelist /stand
15:25:05.063 [22525] <4> bpbkar main: INF - Client completed sending data for backup
15:25:05.063 [22525] <2> bpbkar main: INF - Total Size:366915603
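From 15:25:00.419 to 15:25:05.063 is roughly 4.6 seconds, so the pure read rate is:

366915603 bytes / ~4.6 s ≈ 80 MB/second

The client can read the data from disk almost two orders of magnitude faster than the real backup moved it, which points at the network path between bpbkar and bptm rather than the disk.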
Reading the bptm log, I found:
12:57:46.126 [4748.8120] <2> io_set_recvbuf: setting receive network buffer to 1049600 bytes
The job details show:
bptm(pid=2572) waited for full buffer 2370 times, delayed 92131 times
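A note on those counters, since they are the key symptom: "waited for full buffer" means bptm, the writer on the media server, found no filled shared data buffer ready and had to sleep, and each wait is made up of many short delays, so 92131 delays says bptm was constantly starved of data arriving over the network. One plausible mechanism, with purely illustrative numbers since the actual RTT and effective window on this link are unknown: TCP can never move data faster than the receive window divided by the round-trip time, and because the window scale option is only negotiated during the connection handshake, a socket buffer forced at the wrong moment can leave the advertised window capped at 64 KB:

max throughput ≈ window / RTT, e.g. 65536 bytes / 0.05 s ≈ 1.3 MB/second

which is in the same range as the ~1 MB/s this backup actually achieved.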
Environment:
Master/media: Windows 2008 R2, NBU 7.5
Client: HP-UX 11.31, IA64, NBU 7.5
Solution:
On the client:
echo "0" > /usr/openv/netbackup/NET_BUFFER_SZ
On the master:
Create the file NET_BUFFER_SZ under /install_path/netbackup and put the value 0 in it.
NET_BUFFER_SZ=0 means NetBackup stops forcing a fixed socket buffer size, so the operating systems on the two ends negotiate with each other and choose better TCP parameters for moving the data between the client and the master.
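For illustration, here is how to inspect what each side falls back to once NetBackup stops forcing a size; the parameter names below come from standard OS documentation, so verify them against your own systems:

rem On the Windows 2008 R2 master/media server: show whether TCP
rem receive-window auto-tuning is enabled (it is by default).
netsh interface tcp show global

# On the HP-UX 11.31 client: show the kernel's default TCP
# receive/send buffer high-water marks, which apply when the
# application does not set its own socket buffer size.
ndd -get /dev/tcp tcp_recv_hiwater_def
ndd -get /dev/tcp tcp_xmit_hiwater_def

As the io_set_recvbuf line in the bptm log shows, a non-zero NET_BUFFER_SZ makes NetBackup set the socket buffer explicitly; with 0, the sockets are left at these OS-managed values.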
My question is: why does NET_BUFFER_SZ=0 dramatically improve my backup speed, from 1 MB/s to 40 MB/s?
Please explain this in depth, including the relevant TCP/IP details.
Thanks for your help in advance!
Gavin
It is all about hitting the sweet spot on all of your buffer settings. The values can be different for every backup type and environment, and there is a correlation between the network buffers and the data buffers (both size and number).
The 7.1 tuning guide goes through it all. It gets pretty heavy, but it shows how all of the buffers interact with each other. It sounds like you have found the right set of buffers for your setup.
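For anyone retracing these steps, the knobs the guide pairs with NET_BUFFER_SZ are the shared data buffer touch files on the media server. A minimal sketch with commonly cited starting values, examples only rather than prescriptions (on a Windows media server the same files go under install_path\NetBackup\db\config):

# Size in bytes of each shared data buffer that bptm uses (256 KB here).
echo 262144 > /usr/openv/netbackup/db/config/SIZE_DATA_BUFFERS
# Number of shared data buffers allocated per drive.
echo 64 > /usr/openv/netbackup/db/config/NUMBER_DATA_BUFFERS

The "waited for full buffer ... delayed ..." counters in the job details are the feedback loop: re-run the backup after each change and check whether the waits drop.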
The tuning guide is here: