
shared memory for netbackup

edla_shravya
Level 4

Hi, I am running backup jobs with various settings. Somehow, I am not able to submit a job with more than about 2 GB of shared memory for data buffers.

I am using a basic disk storage unit, with the number of concurrent jobs set to 24.

I have set SIZE_DATA_BUFFERS to 256 KB (i.e. 262144, which is 256 * 1024).

I cannot set NUMBER_DATA_BUFFERS to more than about 8200 (a rough estimate); it works fine with a setting of 8150.

My NetBackup server has 256 GB of RAM.

I have set shared memory to 128 GB at the OS level.

 /usr/openv/netbackup/bin/bpbackup -p Gen_Test -s Gen_Test -w GEN_DATA GEN_KBSIZE=2000 GEN_MAXFILES=500 GEN_PERCENT_RANDOM=100
EXIT STATUS 89: problems encountered during setup of shared memory

That means it is not allowing me to use more than approximately 2 GB of shared memory per job. Here is the bptm log:

16:22:22.913 [7969] <2> io_set_recvbuf: setting receive network buffer to 262144 bytes
16:22:22.913 [7969] <2> read_legacy_touch_file: Found /usr/openv/netbackup/db/config/NUMBER_DATA_BUFFERS; requested from (tmcommon.c.3525).
16:22:22.913 [7969] <2> read_legacy_touch_file: 8200 read ; requested from (tmcommon.c.3525).
16:22:22.913 [7969] <2> io_init: using 8200 data buffers
16:22:22.913 [7969] <2> io_init: child delay = 10, parent delay = 15 (milliseconds)
16:22:22.914 [7969] <16> create_shared_memory: could not allocate enough shared memory for backup buffers, Invalid argument

So, is there a maximum cap on the amount of shared memory that can be used per NetBackup job?
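For what it's worth, the per-stream arithmetic lands almost exactly on the 2 GB boundary; whether that limit comes from the kernel or from NetBackup itself is only a guess on my part, but the numbers line up with 8150 working and 8200 failing:

# one stream: NUMBER_DATA_BUFFERS x SIZE_DATA_BUFFERS
echo $((8200 * 262144))   # 2149580800 bytes - just over 2^31 (2147483648)
echo $((8150 * 262144))   # 2136473600 bytes - just under 2^31, which matches 8150 working

# on Linux, the per-segment cap could be checked with:
#   sysctl kernel.shmmax
#   ipcs -lm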



 


2 REPLIES

Marianne
Level 6
Partner    VIP    Accredited Certified
8200 in /usr/openv/netbackup/db/config/NUMBER_DATA_BUFFERS? Way too high.

Shared memory needed = number of tape drives * MPX * NUMBER_DATA_BUFFERS * SIZE_DATA_BUFFERS
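Plugging rough numbers into that formula for this case (a sketch only - with a basic disk storage unit I am treating drives * MPX as the number of concurrent streams, which may not be exact):

# 24 concurrent jobs x 8200 buffers x 262144 bytes per buffer
echo $((24 * 8200 * 262144))   # 51589939200 bytes, roughly 48 GiB
# with the more usual NUMBER_DATA_BUFFERS of 256:
echo $((24 * 256 * 262144))    # 1610612736 bytes, 1.5 GiB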

mph999
Level 6
Employee Accredited

Wow ... that's not going to work well.

Let me give you an example of settings that generally work well ...

SIZE_DATA_BUFFERS  262144    (which is 256 x 1024 )

NUMBER_DATA_BUFFERS 128 or 256
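If it helps, those are just touch files under the same config directory shown in the bptm log above; setting them is nothing more than this (a sketch - adjust the values to taste, and they take effect on the next backup):

echo 262144 > /usr/openv/netbackup/db/config/SIZE_DATA_BUFFERS
echo 256 > /usr/openv/netbackup/db/config/NUMBER_DATA_BUFFERS

# confirm what bptm will read on the next job
cat /usr/openv/netbackup/db/config/SIZE_DATA_BUFFERS
cat /usr/openv/netbackup/db/config/NUMBER_DATA_BUFFERS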

Using the settings you have, quite simply you'll run out of memory.

The equation Marianne posted:

Shared memory needed = number of tape drives * MPX * NUMBER_DATA_BUFFERS * SIZE_DATA_BUFFERS

is not quite the whole story ... it may use even more memory than this (though this is not documented, as far as I am aware).

NBU rounds up the number of streams to the nearest multiple of 4.

It's easier to show by example:

From my bptm log, we see I have 12 data buffers, each with a size of 131072 bytes ...

08:55:45.665 [18200] <2> io_init: using 131072 data buffer size
08:55:45.665 [18200] <2> io_init: CINDEX 0, sched Kbytes for monitoring = 60000
08:55:45.665 [18200] <2> io_init: using 12 data buffers


... therefore, each tape drive, or each stream to a tape drive, will require 131072 x 12 = 1572864 bytes.

Now, this example is actually from an MPX backup with 2 streams (MPX = 2), so you might think that the amount of shared memory will be 1572864 x 2 = 3145728 bytes.

Now, here is the catch ...

Looking down my bptm log I find these lines ...

08:55:45.748 [18200] <2> mpx_setup_shm: buf control for CINDEX 0 is ffffffff79a00000
08:55:45.748 [18200] <2> mpx_setup_shm: shared memory address for group 0 is ffffffff76800000, size is 6291456, shmid is 117440636
08:55:45.748 [18200] <2> mpx_setup_shm: shared memory address for CINDEX 0 is ffffffff76800000, group 0, num_active 1


So we see the amount of shared memory allocated is 6291456 bytes.

Now, 6291456 / 1572864 = 4.

So what has happened is that even though I have one tape drive and 2 streams, the amount of memory NBU allocates is the amount required by 4 streams - NBU has 'rounded it up'. In fact, it will round up to the nearest multiple of 4, so if you have 5 streams, it will allocate the same amount of memory as if there were 8 streams. I cannot remember the exact explanation for this; NBU has always worked this way, and it is done for efficiency (though, as I say, I can't remember the detailed explanation).
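A quick sketch of that arithmetic, just to make the rounding concrete (this simply mirrors the numbers above; it is not NBU's internal code):

# per-stream requirement, from the bptm log above
size_data_buffers=131072
number_data_buffers=12
per_stream=$((size_data_buffers * number_data_buffers))   # 1572864

# round the active stream count up to the next multiple of 4
streams=2
rounded=$(( (streams + 3) / 4 * 4 ))                       # 4

echo $((per_stream * rounded))                             # 6291456 - matches mpx_setup_shm above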