12-03-2014 07:38 PM
I've got a slow memory leak in the nbars binary. Eventually gets big enough (>6GB) that I have to restart NetBackup services on the master. The leak is pretty slow - I can go several weeks before it becomes an issue and I need to restart.
Just wondering if anyone else has run into this leak on their Solaris master?
Solaris 10, NetBackup Enterprise 7.6.0.3
12-04-2014 01:14 AM
There are a few instances of nbars memory leak, but nothing for 7.5.0.3 springs up.
You could try adding the following to your /usr/openv/netbackup/bp.conf (no restart is necessary)
NBARS_DISCOVERY_TIMER = 300
nbars is set to do a discovery every 1 second - this will slow it down to every 5 minutes and should slow down the rate of the memory leak.
If that setting gives any undesired results, remove it.
When you are having the issue, it might be worth checking the nbars log (oid 362) to see if anything suspect is showing up:
vxlogview -i 362 -d all -t 04:00:00 > nbars.txt
(this will capture the last 4 hours of nbars logs)
You can always stop nbars on its own without restarting the whole of NetBackup:
nbars -terminate
then run: nbars to start it again
If the above does not help, I would suggest logging a call with Symantec.
12-04-2014 03:14 AM
Just taken a look at mine, you may have hit upon something there. I'm on 7.6.0.1 (note not 7.5.0.3 as written by revaroo!) and top shows nbars as the topmost when size ordered, size 5029MB, RES 821M. Interesting.
Jim
12-12-2014 03:33 AM
Mine is now size 5401M and res 894M . Looks like a leak to me.
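For anyone wanting to put a number on the growth, here is a minimal Python sketch that turns two size readings into an hourly rate. It assumes the 5029MB and 5401M figures come from the same server and uses the post timestamps as stand-ins for when the readings were taken; the function name leak_rate_mb_per_hr is made up for illustration.

```python
from datetime import datetime

# Timestamp layout used by the forum posts, e.g. "12-04-2014 03:14 AM"
POST_TIME_FORMAT = "%m-%d-%Y %I:%M %p"

def leak_rate_mb_per_hr(t0, size0_mb, t1, size1_mb):
    """Average growth in MB per hour between two (time, size) readings."""
    start = datetime.strptime(t0, POST_TIME_FORMAT)
    end = datetime.strptime(t1, POST_TIME_FORMAT)
    hours = (end - start).total_seconds() / 3600
    if hours <= 0:
        raise ValueError("second reading must be later than the first")
    return (size1_mb - size0_mb) / hours

# Assumption: both readings are from the same master, taken at post time.
rate = leak_rate_mb_per_hr("12-04-2014 03:14 AM", 5029,
                           "12-12-2014 03:33 AM", 5401)
print(f"{rate:.2f} MB/hr")
```

Two readings only give an average, so more samples over a longer period would give a better picture.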
12-12-2014 12:09 PM
Yup. Mine is 7540. I'm going to restart nbars...
12-15-2014 12:52 PM
Leak is pretty consistent - 8MB/hr
NBARS_DISCOVERY_TIMER setting has no effect from what I can tell...
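As a rough check on what 8MB/hr means in practice, the sketch below estimates how long it takes to reach the 6GB point mentioned in the first post. It assumes strictly linear growth and, for simplicity, a starting size of 0 MB (a freshly started nbars will be larger than that, so the real figure is somewhat shorter); hours_until is a hypothetical helper name.

```python
def hours_until(threshold_mb, current_mb, rate_mb_per_hr):
    """Hours until a linearly growing process reaches threshold_mb."""
    if rate_mb_per_hr <= 0:
        raise ValueError("rate must be positive")
    # Already at or past the threshold means zero hours remaining.
    return max(0.0, (threshold_mb - current_mb) / rate_mb_per_hr)

# 6 GB threshold, assumed starting size of 0 MB, observed rate of 8 MB/hr
hours = hours_until(6 * 1024, 0, 8)
print(f"{hours:.0f} hours, about {hours / 24:.0f} days")
# prints "768 hours, about 32 days"
```

That works out to a bit over four weeks, which fits with going several weeks between restarts.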
01-05-2015 11:43 PM
I do not see anyone offering a solution here.
Have you logged a Support call yet?
01-06-2015 08:54 AM
I had a case opened but God help me it was not pleasant. I eventually gave up because of lack of resources on my end (and the fact that it was fairly easily "manageable") - I'm HOPING that it will be fixed in 7.6.1. If it is not, I will open a new case.
01-22-2015 12:10 AM
Just for info - NBU 7.6.1 was released a couple of days ago:
01-22-2015 02:10 PM
Yep. My master was upgraded to 7.6.1 a week ago. I am monitoring to see if this issue was addressed in the most recent release...
01-22-2015 02:12 PM
Although... looking at it... I don't think the memory leak has been fixed:
load averages: 0.32, 0.47, 0.55; up 252+04:38:50 15:11:11
281 processes: 276 sleeping, 2 zombie, 1 stopped, 2 on cpu
CPU states: 99.9% idle, 0.1% user, 0.1% kernel, 0.0% iowait, 0.0% swap
Memory: 32G phys mem, 7386M free mem, 32G total swap, 32G free swap

  PID USERNAME LWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
15754 root     105  59    0 3255M 3143M cpu/28  54.7H  0.07% dbsrv16
16278 root      16  59    0 1565M 1212M sleep  197:59  0.00% nbars
16204 root      18  59    0  793M  718M sleep   74:08  0.00% nbpem
I will continue to monitor and see...
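If you want to track this over time rather than eyeballing top, a small Python sketch like the one below can pull SIZE and RES for nbars out of top-style process lines. It assumes the 11-column layout shown in the output above (SIZE in the 6th column, RES in the 7th) and sizes suffixed with K, M or G; size_to_mb and process_sizes are illustrative names, not part of any NetBackup tooling.

```python
def size_to_mb(field):
    """Convert a top size field such as '1565M' or '2G' to megabytes."""
    units = {"K": 1 / 1024, "M": 1, "G": 1024}
    return float(field[:-1]) * units[field[-1]]

def process_sizes(top_lines, names):
    """Map process name to (SIZE, RES) in MB for the named processes."""
    sizes = {}
    for line in top_lines:
        cols = line.split()
        # Assumed layout: PID USERNAME LWP PRI NICE SIZE RES STATE TIME CPU COMMAND
        if len(cols) >= 11 and cols[-1] in names:
            sizes[cols[-1]] = (size_to_mb(cols[5]), size_to_mb(cols[6]))
    return sizes

# Process lines copied from the top output above
sample = [
    "15754 root 105 59 0 3255M 3143M cpu/28 54.7H 0.07% dbsrv16",
    "16278 root 16 59 0 1565M 1212M sleep 197:59 0.00% nbars",
    "16204 root 18 59 0 793M 718M sleep 74:08 0.00% nbpem",
]
sizes = process_sizes(sample, {"nbars"})
print(sizes)
```

Logging the result with a timestamp every hour or so would give a growth curve to attach to a support case.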
01-23-2015 03:15 AM
I don't understand that, mph999: if there's no call you won't fix it? Symantec have zero proactive support? Symantec with all their resources won't (not even can't) monitor the mem usage of one of their own processes on a system that is declared on the compatibility matrix? That doesn't put Symantec in a good light, does it?
For all the free support that gets handed out by customers on this forum..!
Jim
01-23-2015 03:40 AM
Unfortunately, if it's not reported we can't fix it ...
(By this I mean if for whatever reason we don't know about it, we can't fix something we don't know about)
If we find the issue internally, then yes, it would be addressed.
If the issue is still there as you suggest, get another call logged.
If you run into issues, ask the TSE to contact me (Martin Holt) and I'll get involved.
Other option, post case number up here ...
01-23-2015 03:43 AM
I've edited my post to be more clear.
What I meant was if we are unaware of an issue, we can't fix something we don't know about.
If we find things internally, then they do get addressed.
How this particular issue got missed I have no idea - maybe it's one of those issues that only becomes apparent in real life systems under particular conditions.
01-23-2015 04:43 AM
Good, I look forward to a fix!
Jim
06-09-2015 03:46 AM
Dear all,
We are experiencing the same issue on our master server (AIX 6.1 - NetBackup 7.6.0.1). We have had a case open with Symantec and IBM for more than one year. Symantec's engineer believes that the issue lies on the IBM side. Below is the latest response from IBM:
Hello,
Comparing memory usage from prior to shutting down the Netbackup application, restarting it and letting it run for several days (from April 6 through May 6th), I see growth in the memory usage of several of the Netbackup processes, but probably the most notable is what I see in the nbars process. This process always has 18 threads and while its in-use memory remains fairly constant, its virtual memory size increases and its paging space consumption increases at an even faster rate.
As there is nothing else I can do from the AIX side of things and since this PMR has aged to nearly a year I would like to close it now. If, after consulting with Symantec, they believe the problem lies within AIX and they are able to point to proof of that, we can open a new PMR and go from there. Are you agreeable to closing this PMR now and leaving the investigation to Symantec?
Regards,
06-16-2015 01:36 AM
Do you have the case number? I'll look to see why it's taking so long.
06-16-2015 01:41 AM
The case number is 08333096 and after the email from IBM regarding the nbars process, Symantec's engineer created a new one with Case number 08958771.
Thank you and regards.
06-16-2015 02:18 AM
Many thanks.
06-16-2015 02:28 PM
It seems unlikely that it would be an issue with the OS, since this seems to be an issue on *both* AIX and Solaris 10.
At this point I'm *hoping* that it is fixed with NetBackup 7.7. If not, I suppose I will open another case.