Anyone run into nbars memory leak on NetBackup Solaris master?

elanmbx
Level 6

I've got a slow memory leak in the nbars binary.  Eventually gets big enough (>6GB) that I have to restart NetBackup services on the master.  The leak is pretty slow - I can go several weeks before it becomes an issue and I need to restart.

Just wondering if anyone else has run into this leak on their Solaris master?

Solaris 10:NetBackup Enterprise 7.6.0.3

26 REPLIES

revarooo
Level 6
Employee

There are a few instances of nbars memory leak, but nothing for 7.5.0.3 springs up.

You could try adding the following to your /usr/openv/netbackup/bp.conf (no restart is necessary)

NBARS_DISCOVERY_TIMER = 300

nbars is set to do a discovery every 1 second - this setting slows it down to every 5 minutes (300 seconds) and should slow the rate at which the memory leak grows.

If that setting gives any undesired results, remove it.

 

When you are having the issue, it might be worth checking the nbars log (oid 362) to see if anything suspect is showing up:

vxlogview -i 362 -d all -t 04:00:00 > nbars.txt  

(this will capture the last 4 hours of nbars logs)

You can always stop nbars on its own without restarting the whole of NetBackup:

nbars -terminate

then run nbars to start it again.

 

If the above does not help, I would suggest logging a call with Symantec.

 

jim_dalton
Level 6

Just taken a look at mine, you may have hit upon something there. I'm on 7.6.0.1 (note: not 7.5.0.3 as written by revarooo!) and top shows nbars as the topmost process when ordered by size: SIZE 5029M, RES 821M. Interesting.

Jim

jim_dalton
Level 6

Mine is now SIZE 5401M and RES 894M. Looks like a leak to me.

elanmbx
Level 6

Yup.  Mine is 7540.  I'm going to restart nbars...

elanmbx
Level 6

Leak is pretty consistent - 8MB/hr

NBARS_DISCOVERY_TIMER setting has no effect from what I can tell...
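
As a rough sanity check on the figures quoted in this thread, the sketch below estimates how long a process growing at a constant 8MB/hr would take to reach the roughly 6GB point at which a restart was needed. The function name and the starting size of 0MB are assumptions made for illustration; a real nbars process starts above zero and may not grow linearly.

```python
# Rough time-to-restart estimate for a steadily growing process.
# Figures taken from the thread: ~8 MB/hr growth, restart at roughly 6 GB.
# hours_until_threshold is a hypothetical helper, not part of NetBackup.

def hours_until_threshold(current_mb: float, threshold_mb: float,
                          rate_mb_per_hr: float) -> float:
    """Return hours until current_mb reaches threshold_mb at a constant rate."""
    if rate_mb_per_hr <= 0:
        raise ValueError("growth rate must be positive")
    # Already at or past the threshold means zero hours remain.
    return max(0.0, (threshold_mb - current_mb) / rate_mb_per_hr)


if __name__ == "__main__":
    threshold_mb = 6 * 1024  # 6 GB expressed in MB
    # Starting size of 0 MB is a simplifying assumption.
    hours = hours_until_threshold(0, threshold_mb, 8)
    print(f"{hours:.0f} hours, about {hours / 24:.0f} days")
    # prints "768 hours, about 32 days"
```

That works out to a little over four weeks, which lines up with being able to go "several weeks" between restarts.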

Marianne
Level 6
Partner    VIP    Accredited Certified

I do not see anyone offering a solution here.

Have you logged a Support call yet?

elanmbx
Level 6

I had a case opened but God help me it was not pleasant.  I eventually gave up because of lack of resources on my end (and the fact that it was fairly easily "manageable") - I'm HOPING that it will be fixed in 7.6.1.  If it is not, I will open a new case.

Marianne
Level 6
Partner    VIP    Accredited Certified

Just for info - NBU 7.6.1 was released a couple of days ago:

 http://www.symantec.com/docs/TECH224245 

elanmbx
Level 6

Yep.  My master was upgraded to 7.6.1 a week ago.  I am monitoring to see if this issue was addressed in the most recent release...
 

elanmbx
Level 6

Although... looking at it... I don't think the memory leak has been fixed:

load averages:  0.32,  0.47,  0.55;    up 252+04:38:50    15:11:11
281 processes: 276 sleeping, 2 zombie, 1 stopped, 2 on cpu
CPU states: 99.9% idle,  0.1% user,  0.1% kernel,  0.0% iowait,  0.0% swap
Memory: 32G phys mem, 7386M free mem, 32G total swap, 32G free swap

   PID USERNAME LWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
 15754 root     105  59    0 3255M 3143M cpu/28  54.7H  0.07% dbsrv16
 16278 root      16  59    0 1565M 1212M sleep  197:59  0.00% nbars
 16204 root      18  59    0  793M  718M sleep   74:08  0.00% nbpem

I will continue to monitor and see...
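
For anyone who wants to track this over time rather than eyeball top, here is a minimal sketch that pulls the SIZE and RES columns out of a process line in the format shown above. It assumes the Solaris top column order from that output (PID USERNAME LWP PRI NICE SIZE RES STATE TIME CPU COMMAND); the function names are made up for illustration, and other top variants lay out their columns differently.

```python
import re

# Multipliers to convert top's K/M/G suffixes to megabytes.
_UNITS_TO_MB = {"K": 1 / 1024, "M": 1.0, "G": 1024.0}


def size_to_mb(text: str) -> float:
    """Convert a value such as '1565M' or '3G' to megabytes."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([KMG])", text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return float(match.group(1)) * _UNITS_TO_MB[match.group(2)]


def parse_top_line(line: str) -> dict:
    """Parse one process line, assuming the column order
    PID USERNAME LWP PRI NICE SIZE RES STATE TIME CPU COMMAND."""
    fields = line.split()
    if len(fields) < 11:
        raise ValueError("not a process line")
    return {
        "pid": int(fields[0]),
        "size_mb": size_to_mb(fields[5]),
        "res_mb": size_to_mb(fields[6]),
        "command": fields[10],
    }


if __name__ == "__main__":
    sample = " 16278 root      16  59    0 1565M 1212M sleep  197:59  0.00% nbars"
    print(parse_top_line(sample))
```

Logging the parsed size_mb with a timestamp at regular intervals would give a series from which the growth rate can be read off directly.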

jim_dalton
Level 6

I don't understand that, mph999: if there's no call, you won't fix it? Symantec has zero proactive support? Symantec, with all its resources, won't (not even can't) monitor the memory usage of one of its own processes on a system that is listed in the compatibility matrix? That doesn't put Symantec in a good light, does it?

For all the free support that gets handed out by customers on this forum..!

Jim 

mph999
Level 6
Employee Accredited

Unfortunately, if it's not reported we can't fix it ...

(By this I mean that if, for whatever reason, we don't know about an issue, we can't fix it.)

If we find the issue internally, then yes, it would be addressed.

 

If the issue is still there as you suggest, get another call logged.

If you run into issues, ask the tse to contact me (Martin Holt) and I'll get involved.

Other option, post case number up here ...

mph999
Level 6
Employee Accredited

I've edited my post to be more clear.

What I meant was if we are unaware of an issue, we can't fix something we don't know about.

If we find things internally, then they do get addressed.

How this particular issue got missed I have no idea - maybe it's one of those issues that only becomes apparent in real-life systems under particular conditions.

jim_dalton
Level 6

Good. I look forward to a fix! Jim

m_karampasis
Level 4
Partner Accredited

Dear all,

We are experiencing the same issue on our master server (AIX 6.1 - NetBackup 7.6.0.1). We have had a case open with Symantec and IBM for more than one year. Symantec's engineer believes that the issue lies on the IBM side. Below is the latest response from IBM:

Hello,

Comparing memory usage from prior to shutting down the NetBackup application, restarting it and letting it run for several days (from April 6 through May 6), I see growth in the memory usage of several of the NetBackup processes, but probably the most notable is what I see in the nbars process.  This process always has 18 threads, and while its in-use memory remains fairly constant, its virtual memory size increases and its paging space consumption increases at an even faster rate.

As there is nothing else I can do from the AIX side of things and since this PMR has aged to nearly a year I would like to close it now.  If, after consulting with Symantec, they believe the problem lies within AIX and they are able to point to proof of that, we can open a new PMR and go from there.  Are you agreeable to closing this PMR now and leaving the investigation to Symantec?

Regards,


 

 

mph999
Level 6
Employee Accredited

Do you have the case number .... I'll look to see why it's taking so long.

m_karampasis
Level 4
Partner Accredited

The case number is 08333096 and after the email from IBM regarding the nbars process, Symantec's engineer created a new one with Case number 08958771.

 

Thank you and regards.

 

 

mph999
Level 6
Employee Accredited

Many thanks.

elanmbx
Level 6

It seems unlikely that it would be an issue with the OS, since this seems to be an issue on *both* AIX and Solaris 10.

At this point I'm *hoping* that it is fixed with NetBackup 7.7.  If not, I suppose I will open another case.