NBU Master server hangs and needs restart

mrmadej
Level 4
Partner Accredited
Hi all Masters,
 
 
 
I have a problem at a customer site: the NBU master server hangs from time to time.
 
When it hangs, new jobs do not start and the Activity Monitor stops updating.
 
In server.log I found the following entries:
 
I. 06/17 05:06:53. Disconnecting shared memory client, process id not found
I. 06/17 05:06:53. Disconnected SharedMemory client's AppInfo: IP=xx.xx.xx.xx;HOST=xxx.xx.xxl;OSUSER=root;OS='Linux 2.6.18-308.1.1.el5 #1 SMP Fri Feb 17 16:51:01 EST 2012 x';EXE=/usr/openv/netbackup/bin/bpdbm;PID=0x2463;VERSION=11.0.1.2745;API=ODBC;TIMEZONEADJUSTMENT=120
 
 
In messages:
 
Jun 17 05:06:53 tygrys SQLAnywhere(nb_master): Disconnecting shared memory client, process id not found
Jun 17 05:06:53 tygrys SQLAnywhere(nb_master): Disconnected SharedMemory client's AppInfo: IP=xx.xx.xx.xx;HOST=xxx.xx.xx;OSUSER=root;OS='Linux 2.6.18-308.1.1.el5 #1 SMP Fri Feb 17 16:51:01 EST 2012 x';EXE=/usr/openv/netbackup/bin/bpdbm;PID=0x2463;VERSION=11.0.1.2745;API=ODBC;TIMEZONEADJUSTMENT=120
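
For anyone reading such entries: the AppInfo string is a semicolon-separated list of KEY=value pairs, and the PID appears to be hexadecimal (the 0x prefix suggests so). A minimal Python sketch for splitting one line and converting the PID to decimal, so it can be compared with ps output, might look like this. The function names are made up for illustration and are not part of NetBackup or SQL Anywhere.

```python
def parse_appinfo(line):
    """Split an AppInfo log line into a dict of KEY -> value.

    Assumes fields are separated by ';' and that no value contains ';',
    which holds for the lines quoted in this thread.
    """
    # Keep only the text after "AppInfo:" if that marker is present.
    _, _, payload = line.partition("AppInfo:")
    if not payload:
        payload = line
    fields = {}
    for item in payload.strip().split(";"):
        key, sep, value = item.partition("=")
        if sep:
            # Drop the single quotes around values such as OS='...'.
            fields[key.strip()] = value.strip().strip("'")
    return fields


def client_pid(fields):
    """Return the client PID as a decimal integer (assumes hex notation)."""
    return int(fields["PID"], 16)


if __name__ == "__main__":
    sample = ("I. 06/17 05:06:53. Disconnected SharedMemory client's AppInfo: "
              "IP=xx.xx.xx.xx;HOST=xxx.xx.xx;OSUSER=root;"
              "EXE=/usr/openv/netbackup/bin/bpdbm;PID=0x2463;API=ODBC")
    info = parse_appinfo(sample)
    print(info["EXE"], client_pid(info))
```

Here the disconnected client is bpdbm, and 0x2463 corresponds to decimal PID 9315.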
 
On the Symantec website I found a tech note (TECH65028), but it relates to the old 6.x versions.
 
My env is:
 
NBU: 7.5.0.4
 
OS: RedHat 5.8
 
Kernel setting:
 
kernel.sem = 250 32000 32 128
kernel.msgmnb = 65536
kernel.msgmni = 16
kernel.msgmax = 65536
kernel.shmmni = 4096
kernel.shmall = 4294967296
kernel.shmmax = 34359738368
 
server.conf content:
 
-n NB_master
 
   -x tcpip(LocalOnly=YES;ServerPort=13785) -gtc 8 -gp 4096 -dt /opt/netbck/sybasetmp -gd DBA -gk DBA -gl DBA -ti 0 -c 512M -ch 6000M -cl 512M -zl -os 1M  -o /usr/openv/db//log/server.log
 
-ud -m
 
 
 
Any ideas would be appreciated.
 
 
 
Regards
 
Madej
3 REPLIES

Yasuhisa_Ishika
Level 6
Partner Accredited Certified

First of all, before restarting NetBackup, determine which NetBackup daemon crashed. Run "bpps -x" and check which daemon has disappeared.
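
If you save the "bpps -x" output once while the system is healthy and once while it hangs, the comparison can be scripted. The following is only a rough Python sketch: it assumes each process line in the output contains the full executable path under /usr/openv/, and the function names are hypothetical.

```python
import re

# Assumption: NetBackup processes show up with a path starting with /usr/openv/.
_OPENV_EXE = re.compile(r"(/usr/openv/\S+)")


def daemons(ps_text):
    """Return the set of executable names found under /usr/openv/ in ps-style text."""
    names = set()
    for line in ps_text.splitlines():
        match = _OPENV_EXE.search(line)
        if match:
            # Keep only the last path component, e.g. "bpdbm".
            names.add(match.group(1).rsplit("/", 1)[-1])
    return names


def missing_daemons(before_text, after_text):
    """List daemons present in the first snapshot but absent from the second."""
    return sorted(daemons(before_text) - daemons(after_text))


if __name__ == "__main__":
    # File names are placeholders for two saved snapshots.
    with open("bpps_healthy.txt") as healthy, open("bpps_hung.txt") as hung:
        print(missing_daemons(healthy.read(), hung.read()))
```

A daemon that has several instances running only counts as missing when all of its instances are gone.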

One more thing: if you have never enabled core file generation for the processes, enable it by following TECH52289.

http://www.symantec.com/docs/TECH52289

BTW, your current kernel.sem and kernel.msgmni values are insufficient. Have a look at TECH167095 and set the proper values.

http://www.symantec.com/docs/TECH167095
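
To check a set of kernel parameters against recommended minimums, something like the Python sketch below could be used. It assumes "key = value" lines with integer values, as in the sysctl listing in the first post. The minimums in the example are placeholders for illustration only and are not taken from TECH167095; use the values from the tech note instead.

```python
def parse_sysctl(text):
    """Parse 'key = v1 v2 ...' lines into a dict of key -> list of ints."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # ignore comments
        key, sep, value = line.partition("=")
        if sep and value.strip():
            settings[key.strip()] = [int(v) for v in value.split()]
    return settings


def below_minimum(settings, minimums):
    """Return (key, current, wanted) for each key that is missing or too low."""
    problems = []
    for key, wanted in minimums.items():
        current = settings.get(key)
        too_low = (
            current is None
            or len(current) != len(wanted)
            or any(c < w for c, w in zip(current, wanted))
        )
        if too_low:
            problems.append((key, current, wanted))
    return problems


if __name__ == "__main__":
    current = parse_sysctl("kernel.sem = 250 32000 32 128\nkernel.msgmni = 16\n")
    # PLACEHOLDER minimums, not from the tech note.
    placeholder_minimums = {"kernel.sem": [300, 32000, 32, 256],
                            "kernel.msgmni": [64]}
    for key, have, want in below_minimum(current, placeholder_minimums):
        print(key, "current:", have, "wanted at least:", want)
```

On a live system the input text could come from the output of "sysctl -a" or from /etc/sysctl.conf.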

Brook_Humphrey
Level 4
Employee Accredited Certified

Yes, and 7.5.0.4 had a logging issue in which the db/error log was written too aggressively, which could cause performance problems. You may want to look at upgrading to 7.5.0.5.

If you have any further issues please let me know.

Thanks

mrmadej
Level 4
Partner Accredited
Thanks for the responses.
 
I changed the kernel.msgmni and kernel.sem parameters as Yasuhisa Ishikawa suggested. 
 
The whole server was restarted and has been running well since then: 20 hours and ~4500 jobs so far.
 
I opened a case with support because I am obligated to do so.
 
Regarding the upgrade to 7.5.0.5, I have to discuss it with the customer. It is not easy to do in this environment.
 
 
 
Until last Monday I had not had any issues with NBU since upgrading from 7.0.1 to 7.5.0.4.
 
Regards
 
Madej