Shared memory for NetBackup
Hi, I am running backup jobs with various settings, but for some reason I am not able to submit a job with more than about 2 GB of shared memory for buffers.
I am using a basic disk storage unit, with the number of concurrent jobs set to 24.
I have set SIZE_DATA_BUFFERS to 256 KB, i.e. 262144 (256 * 1024).
I couldn't set NUMBER_DATA_BUFFERS to more than about 8200 (just a rough estimate); it works fine with an 8150 setting.
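As a quick sanity check (my own arithmetic, using the 256 KB buffer size above), the failure point lands almost exactly on the 2 GiB boundary:

echo $((8150 * 262144))   # 2136473600 bytes (~1.99 GiB) - works
echo $((8192 * 262144))   # 2147483648 bytes (exactly 2^31)
echo $((8200 * 262144))   # 2149580800 bytes (just over 2 GiB) - fails

That looks suspiciously like a 2 GiB (signed 32-bit) limit on a single shared memory segment, though that's just my guess.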
My NetBackup server has 256 GB of RAM.
I've set the shared memory limit to 128 GB at the OS level.
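For reference, here is how the kernel-side limits can be double-checked on Linux (other UNIX platforms use different controls):

sysctl kernel.shmmax kernel.shmall   # max bytes per segment / total shm pages system-wide
ipcs -l                              # summary of all kernel IPC limits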
/usr/openv/netbackup/bin/bpbackup -p Gen_Test -s Gen_Test -w GEN_DATA GEN_KBSIZE=2000 GEN_MAXFILES=500 GEN_PERCENT_RANDOM=100
EXIT STATUS 89: problems encountered during setup of shared memory
That means it is not letting me use more than approximately 2 GB of shared memory per job. Here is the bptm log:
16:22:22.913 [7969] <2> io_set_recvbuf: setting receive network buffer to 262144 bytes
16:22:22.913 [7969] <2> read_legacy_touch_file: Found /usr/openv/netbackup/db/config/NUMBER_DATA_BUFFERS; requested from (tmcommon.c.3525).
16:22:22.913 [7969] <2> read_legacy_touch_file: 8200 read ; requested from (tmcommon.c.3525).
16:22:22.913 [7969] <2> io_init: using 8200 data buffers
16:22:22.913 [7969] <2> io_init: child delay = 10, parent delay = 15 (milliseconds)
16:22:22.914 [7969] <16> create_shared_memory: could not allocate enough shared memory for backup buffers, Invalid argument
So, is there a maximum cap on the amount of shared memory that can be used per NetBackup job?
Wow ... that's not going to work well.
Let me give you an example of settings that generally work well ...
SIZE_DATA_BUFFERS 262144 (which is 256 x 1024)
NUMBER_DATA_BUFFERS 128 or 256
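For reference, these are plain touch files under /usr/openv/netbackup/db/config on the media server (the same NUMBER_DATA_BUFFERS file your bptm log reads), so setting them is just, for example:

echo 262144 > /usr/openv/netbackup/db/config/SIZE_DATA_BUFFERS
echo 256 > /usr/openv/netbackup/db/config/NUMBER_DATA_BUFFERS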
Using the settings you have, quite simply you'll run out of memory.
The equation Marianne posted:
Shared memory needed = number of tape drives * MPX * NUMBER_DATA_BUFFERS * SIZE_DATA_BUFFERS
is not quite the whole story ... it may use even more memory than this (though this is not documented, as far as I am aware).
NBU rounds the number of streams up to the nearest multiple of 4.
It's easier to show by example:
From my bptm log, you can see I have 12 data buffers, each with a size of 131072 bytes ...
08:55:45.665 [18200] <2> io_init: using 131072 data buffer size
08:55:45.665 [18200] <2> io_init: CINDEX 0, sched Kbytes for monitoring = 60000
08:55:45.665 [18200] <2> io_init: using 12 data buffers
... therefore, each tape drive, or each stream to a tape drive, will require 131072 x 12 = 1572864 bytes.
Now, this example is actually from an MPX backup with 2 streams (MPX = 2), so you might think that the amount of shared memory would be 1572864 x 2 = 3145728 bytes.
Now, here is the catch ...
Looking down my bptm log I find these lines ...
08:55:45.748 [18200] <2> mpx_setup_shm: buf control for CINDEX 0 is ffffffff79a00000
08:55:45.748 [18200] <2> mpx_setup_shm: shared memory address for group 0 is ffffffff76800000, size is 6291456, shmid is 117440636
08:55:45.748 [18200] <2> mpx_setup_shm: shared memory address for CINDEX 0 is ffffffff76800000, group 0, num_active 1
So we see the amount of memory is actually 6291456 bytes, and 6291456 / 1572864 = 4.
So, what has happened is that even though I have one tape drive and 2 streams, the amount of memory NBU allocates is the amount required by 4 streams: NBU has 'rounded it up'. In fact, it will round up to the nearest multiple of 4, so if you have 5 streams, it will allocate the same amount of memory as if there were 8 streams. I cannot remember the exact explanation for this; NBU has always worked this way, and it is done for efficiency (though, as I say, I can't remember the detailed explanation).
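To put rough numbers on that rounding against your own settings (my own arithmetic, and I'm assuming the same formula applies to a basic disk storage unit, with 'streams' being your concurrent job count):

streams=24                             # your concurrent jobs setting
rounded=$(( (streams + 3) / 4 * 4 ))   # round up to the nearest multiple of 4 (24 stays 24)
echo $(( rounded * 8200 * 262144 ))    # ~51.6 GB with your current buffer settings
echo $(( rounded * 256 * 262144 ))     # ~1.6 GB with 256 buffers of 256 KB

Which is why dropping NUMBER_DATA_BUFFERS back to 128 or 256 is usually the sensible move.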