mantu2a · ‎01-29-2013

Hi all,

We have many backups failing for a media server, and the error log is as below:

Jan 29, 2013 3:41:00 PM - Error bpbrm (pid=10127) Could not get shared memory for bpbrm child process communication, No space left on device (28)
Jan 29, 2013 3:40:59 PM - started process bpbrm (pid=10127)
Jan 29, 2013 3:41:06 PM - end writing
problems encountered during setup of shared memory (89).

We tried rerunning the backups after changing the NUMBER_DATA_BUFFERS value from 32 to 16, but the backups still fail.

Please assist.

Regards

Mantu

Andy_Welburn · ‎01-29-2013

"We tried rerunning the backups after making some changes in NUMBER_DATA_BUFFERS value. We changed it to 16 from 32, but still backups fail."

Did you make the changes before or after the backups started to fail?

If before, then change them back to the values that allowed the backups to complete successfully.

The following may be of use regarding the values required for shared memory in a Solaris 10 media server:

http://www.symantec.com/business/support/index?page=content&id=TECH56434
http://www.symantec.com/business/support/index?page=content&id=TECH62633

and possible reasons:

A status 89 error comes from one or more of the following:
- Low shared memory limit (shmmax)
- Too few shared memory identifiers (shmmni) to track all of the memory segments being used
- Excessive NBU buffer sizes (consuming resources up to the limit of shmmni or shmmax)
- Excessive number of concurrent job streams (consuming resources up to the limit of shmmni or shmmax)
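
As a rough way to reason about the last two causes, the sketch below multiplies the buffer settings by the number of concurrent streams and compares the result with a limit. It assumes each active stream allocates NUMBER_DATA_BUFFERS x SIZE_DATA_BUFFERS bytes of shared memory, which is only an approximation. The function names and the sample figures (256 KB buffers, 20 streams, a 128 MB limit) are illustrative assumptions, not values taken from this server.

```shell
#!/bin/sh
# Rough estimate of the shared memory used by NetBackup data buffers.
# Assumption: every concurrent stream allocates its own set of buffers.

# estimate_shm_bytes NUMBER_DATA_BUFFERS SIZE_DATA_BUFFERS STREAMS
estimate_shm_bytes() {
    echo $(( $1 * $2 * $3 ))
}

# fits_limit NEEDED_BYTES LIMIT_BYTES -> exit status 0 when NEEDED fits
fits_limit() {
    [ "$1" -le "$2" ]
}

# Hypothetical figures for illustration only.
size_buffers=262144        # 256 KB per buffer
streams=20                 # concurrent job streams
limit=134217728            # 128 MB shared memory limit

for number_buffers in 32 16; do
    needed=$(estimate_shm_bytes "$number_buffers" "$size_buffers" "$streams")
    if fits_limit "$needed" "$limit"; then
        echo "NUMBER_DATA_BUFFERS=$number_buffers: $needed bytes, within limit"
    else
        echo "NUMBER_DATA_BUFFERS=$number_buffers: $needed bytes, exceeds limit"
    fi
done
```

Substituting the real buffer settings, stream count, and the limit reported by the system gives a first indication of whether lowering the buffers or raising the limit is the more likely fix.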


Mark_Solutions · ‎01-29-2013

Please tell us more ...

NetBackup Version and patch level and Operating System type, version and patch level

This will at least give us a starting point

My initial thought is a lack of memory. Since memory is buffered, this may be related to the amount of physical RAM or the paging file, which in turn relates to free disk space.

So tell us the details of what you have and how much RAM and free disk space are available. Once we know the O/S we may be able to assist further.

mantu2a · ‎01-29-2013

Hi,

The OS and NetBackup version are: NetBackup-Solaris10 6.5.6

The disk space details are as follows:

Filesystem            kbytes    used   avail capacity  Mounted on
/dev/vx/dsk/bootdg/rootvol
                     32380893 5810426 26246659    19%    /
/devices                   0       0       0     0%    /devices
ctfs                       0       0       0     0%    /system/contract
proc                       0       0       0     0%    /proc
mnttab                     0       0       0     0%    /etc/mnttab
swap                 45475696    1752 45473944     1%    /etc/svc/volatile
objfs                      0       0       0     0%    /system/object
sharefs                    0       0       0     0%    /etc/dfs/sharetab
/platform/SUNW,SPARC-Enterprise-T5220/lib/libc_psr/libc_psr_hwcap2.so.1
                     32380893 5810426 26246659    19%    /platform/sun4v/lib/libc_psr.so.1
/platform/SUNW,SPARC-Enterprise-T5220/lib/sparcv9/libc_psr/libc_psr_hwcap2.so.1
                     32380893 5810426 26246659    19%    /platform/sun4v/lib/sparcv9/libc_psr.so.1
fd                         0       0       0     0%    /dev/fd
/dev/vx/dsk/bootdg/var
                     42242298 11831151 29988725    29%    /var
swap                 45477064    3120 45473944     1%    /tmp
swap                 45473984      40 45473944     1%    /var/run
swap                 45473944       0 45473944     0%    /dev/vx/dmp
swap                 45473944       0 45473944     0%    /dev/vx/rdmp
/dev/vx/dsk/bootdg/extra
                     29564590   29337 29239608     1%    /extra
/dev/vx/dsk/bootdg/home
                     15483819   84947 15244034     1%    /export/home
afpddr11-nas1:/backup/src_rep/ebr/asprd223/nfs_stu001
                     33914903040 23534295552 10380607488    70%    /opt/app/ebr/asprd223/nfs_stu001
afpddr11-nas2:/backup/src_rep/ebr/asprd223/nfs_stu002
                     33914903040 23534295552 10380607488    70%    /opt/app/ebr/asprd223/nfs_stu002
afpddr11-nas3:/backup/src_rep/ebr/asprd223/nfs_stu003
                     33914903040 23534295552 10380607488    70%    /opt/app/ebr/asprd223/nfs_stu003
afpddr11-nas4:/backup/src_rep/ebr/asprd223/nfs_stu004
                     33914903040 23534295552 10380607488    70%    /opt/app/ebr/asprd223/nfs_stu004
/dev/vx/dsk/openvdg/openv
                     20410368 10312523 9467008    53%    /opt/openv
afbi01n1-rep.aldc.att.com:/vol/v00_fs01_admin/q00_fs01_admin/sun
                     52428800 18143744 34285056    35%    /nas/usr/sbc

Regards

Mantu

Mark_Solutions · ‎01-29-2013

So /tmp, /dev/vx/*, /var/run and /export/home are all full - some being swap files - so looks like you are running out of swap capability which is made worse with the /tmp being almost full also

You need to tidy up the system, try to allocate some more disk space for /tmp (as it does get used by NetBackup), and also sort out plenty of swap for your paging operations

Hope this helps - sure someone else will be along soon to explain all of the file systems better than me

Nicolai · ‎01-29-2013

Hey Mark. I am no Solaris expert, but file system /tmp is only 1% full not 100%.

Mark_Solutions · ‎01-29-2013

Nicolai - thanks - been a long week! - So available and capacity are two columns - DOH!

I still think it is a paging issue somewhere, which is usually disk space or paging related - I am better on Windows!
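
The capacity column in the df output above can also be checked mechanically rather than by eye. The following is a minimal sketch, assuming `df -k` style output in which long device names wrap onto their own line, as they do in the listing posted earlier. The function name `list_full_filesystems` and the threshold are hypothetical, and mount points containing spaces are not handled.

```shell
#!/bin/sh
# list_full_filesystems THRESHOLD
# Reads df -k style output on stdin and prints every mount point whose
# capacity (percent used) is at or above THRESHOLD.
list_full_filesystems() {
    awk -v limit="$1" '
        NR == 1 { next }      # skip the header row
        NF == 1 { next }      # device name wrapped onto its own line
        {
            # Counting from the end keeps wrapped and unwrapped rows aligned:
            # the last field is the mount point, the one before is capacity.
            mount = $NF
            cap = $(NF - 1)
            sub(/%/, "", cap)
            if (cap + 0 >= limit) print mount, cap "%"
        }
    '
}

# Example usage (not executed here):
#   df -k | list_full_filesystems 90
```

Run against the listing in this thread with a threshold of 90, this would print nothing, which matches the correction that /tmp is 1% used rather than full.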

Yasuhisa_Ishika · ‎01-29-2013

To check the tunables relating to shared memory, please post the output of the following commands if possible.

# cat /usr/openv/netbackup/db/config/SIZE_DATA_BUFFERS

# cat /usr/openv/netbackup/db/config/SIZE_DATA_BUFFERS_DISK

# ps -o pid,ppid,project,args

# prctl <PID of inetd>

# ipcs -A
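
Since `cat` simply fails when a tunable file has never been created, a small wrapper can make the missing case explicit. This is a minimal sketch: the helper name `show_tunable` and the `CONFIG_DIR` override are hypothetical, and including NUMBER_DATA_BUFFERS assumes that file lives in the same directory as the two files listed above.

```shell
#!/bin/sh
# Directory holding the NetBackup buffer tunables; can be overridden
# through the environment for testing.
CONFIG_DIR=${CONFIG_DIR:-/usr/openv/netbackup/db/config}

# show_tunable NAME
# Prints NAME=value when the file is readable, or a marker when it is not.
show_tunable() {
    file="$CONFIG_DIR/$1"
    if [ -r "$file" ]; then
        echo "$1=$(cat "$file")"
    else
        echo "$1=(file not present)"
    fi
}

for name in NUMBER_DATA_BUFFERS SIZE_DATA_BUFFERS SIZE_DATA_BUFFERS_DISK; do
    show_tunable "$name"
done
```

A "file not present" line means no explicit value has been set for that tunable on the media server.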

problems encountered during setup of shared memory (89)