01-29-2013 02:33 AM
Hi all,
We have many backups failing for a media server and the error log is as below:
Jan 29, 2013 3:41:00 PM - Error bpbrm (pid=10127) Could not get shared memory for bpbrm child process communication, No space left on device (28)
Jan 29, 2013 3:40:59 PM - started process bpbrm (pid=10127)
Jan 29, 2013 3:41:06 PM - end writing
problems encountered during setup of shared memory (89).
We tried rerunning the backups after changing the NUMBER_DATA_BUFFERS value. We changed it from 32 to 16, but the backups still fail.
Please assist.
Regards
Mantu
01-29-2013 02:40 AM
Please tell us more ...
NetBackup Version and patch level and Operating System type, version and patch level
This will at least give us a starting point
My initial thought is a lack of memory. That may come down to the amount of physical RAM or to the paging file, which in turn depends on free disk space
So tell us the details of what you have and how much RAM / free disk space you have and once we know the O/S we may be able to assist further
01-29-2013 03:01 AM
Hi,
The OS is Solaris 10 and the NetBackup version is 6.5.6 (NetBackup-Solaris10 6.5.6).
The disk space details are as follows:
Filesystem kbytes used avail capacity Mounted on
/dev/vx/dsk/bootdg/rootvol 32380893 5810426 26246659 19% /
/devices 0 0 0 0% /devices
ctfs 0 0 0 0% /system/contract
proc 0 0 0 0% /proc
mnttab 0 0 0 0% /etc/mnttab
swap 45475696 1752 45473944 1% /etc/svc/volatile
objfs 0 0 0 0% /system/object
sharefs 0 0 0 0% /etc/dfs/sharetab
/platform/SUNW,SPARC-Enterprise-T5220/lib/libc_psr/libc_psr_hwcap2.so.1 32380893 5810426 26246659 19% /platform/sun4v/lib/libc_psr.so.1
/platform/SUNW,SPARC-Enterprise-T5220/lib/sparcv9/libc_psr/libc_psr_hwcap2.so.1 32380893 5810426 26246659 19% /platform/sun4v/lib/sparcv9/libc_psr.so.1
fd 0 0 0 0% /dev/fd
/dev/vx/dsk/bootdg/var 42242298 11831151 29988725 29% /var
swap 45477064 3120 45473944 1% /tmp
swap 45473984 40 45473944 1% /var/run
swap 45473944 0 45473944 0% /dev/vx/dmp
swap 45473944 0 45473944 0% /dev/vx/rdmp
/dev/vx/dsk/bootdg/extra 29564590 29337 29239608 1% /extra
/dev/vx/dsk/bootdg/home 15483819 84947 15244034 1% /export/home
afpddr11-nas1:/backup/src_rep/ebr/asprd223/nfs_stu001 33914903040 23534295552 10380607488 70% /opt/app/ebr/asprd223/nfs_stu001
afpddr11-nas2:/backup/src_rep/ebr/asprd223/nfs_stu002 33914903040 23534295552 10380607488 70% /opt/app/ebr/asprd223/nfs_stu002
afpddr11-nas3:/backup/src_rep/ebr/asprd223/nfs_stu003 33914903040 23534295552 10380607488 70% /opt/app/ebr/asprd223/nfs_stu003
afpddr11-nas4:/backup/src_rep/ebr/asprd223/nfs_stu004 33914903040 23534295552 10380607488 70% /opt/app/ebr/asprd223/nfs_stu004
/dev/vx/dsk/openvdg/openv 20410368 10312523 9467008 53% /opt/openv
afbi01n1-rep.aldc.att.com:/vol/v00_fs01_admin/q00_fs01_admin/sun 52428800 18143744 34285056 35% /nas/usr/sbc
Regards
Mantu
01-29-2013 03:13 AM
So /tmp, /dev/vx/*, /var/run and /export/home are all full, some of them being swap-backed, so it looks like you are running out of swap capacity, which is made worse by /tmp being almost full as well
You need to tidy up the system and try to allocate some more disk space for /tmp (as it does get used by NetBackup) and also sort out plenty of swap for your paging operations
Hope this helps - I am sure someone else will be along soon to explain all of the file systems better than me
01-29-2013 03:27 AM
"We tried rerunning the backups after making some changes in NUMBER_DATA_BUFFERS value. We changed it to 16 from 32, but still backups fail."
Did you make the changes before or after the backups started to fail?
If the former, then change them back to the values that previously allowed the backups to complete successfully.
The following may be of use regarding the values required for shared memory in a Solaris 10 media server:
http://www.symantec.com/business/support/index?page=content&id=TECH56434
http://www.symantec.com/business/support/index?page=content&id=TECH62633
and possible reasons:
Status 89 error comes from one or more of the following
- Low shared memory limit (shmmax)
- Low shared memory identifiers (shmmni) to track all of the memory segments being used.
- Excessive NBU buffer sizes ( consuming resources to the limit of shmmni or shmmax )
- Excessive number of concurrent job streams ( consuming resources to the limit of shmmni or shmmax )
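To get a feel for how buffer settings and concurrent streams add up, here is a rough sketch. It assumes each active stream takes about NUMBER_DATA_BUFFERS x SIZE_DATA_BUFFERS bytes of shared memory, which is a simplification, and shm_estimate is a made-up helper name, not a NetBackup command. The 262144-byte buffer size and the 20 streams are example values only, not taken from this server.

```shell
# Rough estimate only. Assumption: each active stream allocates
# NUMBER_DATA_BUFFERS * SIZE_DATA_BUFFERS bytes of shared memory.
# shm_estimate is a hypothetical helper, not part of NetBackup.
# Needs a shell with $(( )) arithmetic (ksh or bash; Solaris 10's
# /bin/sh is the older Bourne shell and will not run this).
shm_estimate() {
    number_data_buffers=$1
    size_data_buffers=$2
    streams=$3
    echo $(( number_data_buffers * size_data_buffers * streams ))
}

# Example values: 32 buffers of 262144 bytes, 20 concurrent streams
shm_estimate 32 262144 20    # prints 167772160 (bytes)
```

Compare the result with the shared memory limits the media server actually has. If the estimate is close to or above the limit, lowering the buffer count, buffer size or number of concurrent streams, or raising the limit, are the options to look at.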
01-29-2013 03:30 AM
Hey Mark. I am no Solaris expert, but file system /tmp is only 1% full not 100%.
01-29-2013 03:40 AM
Nicolai - thanks - been a long week! - So available and capacity are two separate columns - DOH!
I still think it is a paging issue somewhere, which usually comes down to disk space or paging - I am better on Windows!
01-29-2013 05:30 AM
To check the tunables relating to shared memory, please post the output of the following commands if possible.
# cat /usr/openv/netbackup/db/config/SIZE_DATA_BUFFERS
# cat /usr/openv/netbackup/db/config/SIZE_DATA_BUFFERS_DISK
# ps -o pid,ppid,project,args
# prctl <PID of inetd>
# ipcs -A
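If it is easier, the buffer touch files can be dumped in one go. This is only a sketch: show_buffer_config is a made-up helper name, and it assumes the NUMBER_DATA_BUFFERS and NUMBER_DATA_BUFFERS_DISK files, if present, sit in the same config directory as the SIZE_DATA_BUFFERS files above.

```shell
# Hypothetical helper: print each buffer touch file, or note that it is absent.
# Run under ksh or bash (uses $( ) command substitution).
show_buffer_config() {
    # Default directory is the one used in the cat commands above
    dir=${1:-/usr/openv/netbackup/db/config}
    for name in NUMBER_DATA_BUFFERS NUMBER_DATA_BUFFERS_DISK \
                SIZE_DATA_BUFFERS SIZE_DATA_BUFFERS_DISK; do
        if [ -f "$dir/$name" ]; then
            echo "$name=$(cat "$dir/$name")"
        else
            # No touch file means NetBackup falls back to its default value
            echo "$name=(not set, NetBackup default applies)"
        fi
    done
}

show_buffer_config
```

A file reported as not set is not an error in itself. It just means that setting has not been tuned on this media server.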