Netbackup Appliance 5230 memory usage

tarmizi
Level 5
Partner Accredited Certified

Hi all,

We installed a NetBackup Appliance 5230 (v2.6.1.2) at our customer's site almost 2 years ago. About 5 months ago, we installed 2 SAN clients on AIX hosts to expedite the Oracle backups. For almost 4 months everything went smoothly, with an 800 GB backup completing within 1 hour 30 minutes. Then one day the customer noticed that the backup was very slow, and it turned out the backup was running over the LAN instead of Fibre Transport.

At first we thought the zoning on the SAN switch was the cause, but it turned out that the nbftsrvr service had stopped in the middle of the backup, so we simply ran bp.start_all. After a few weeks it happened again and we restarted all services the same way, but this time the SAN client backup was very slow: 1 TB of data took 5-6 hours. It happened again last week, and luckily I was at the customer site and noticed it.
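To put the slowdown in numbers, the figures above can be converted to average throughput. This is a minimal sketch that assumes decimal units (1 GB = 1000 MB, 1 TB = 1000 GB) and treats the quoted durations as the full transfer time; the function name is only illustrative.

```python
def throughput_mb_per_s(data_gb: float, hours: float) -> float:
    """Average throughput in MB/s, assuming decimal units (1 GB = 1000 MB)."""
    return data_gb * 1000 / (hours * 3600)


if __name__ == "__main__":
    # Figures quoted in the post: 800 GB in 1.5 h before, 1 TB in 5-6 h after.
    print(f"before:         {throughput_mb_per_s(800, 1.5):.1f} MB/s")
    print(f"after (5 h):    {throughput_mb_per_s(1000, 5):.1f} MB/s")
    print(f"after (6 h):    {throughput_mb_per_s(1000, 6):.1f} MB/s")
```

Under those assumptions the backup dropped from roughly 148 MB/s to somewhere around 46-56 MB/s, i.e. to about a third of the earlier rate.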

Then I ran top on the appliance to check memory usage and saw that up to 127 GB of the 128 GB of RAM was in use, most of it consumed by a service named spoold. At times only 700+ MB of memory was left. After some deeper checking in /msdp/data/dp1/pdvol/log/spoold, I saw these:

/msdp/data/dp1/pdvol/log/spoold # cat spoold.log.20160910165808

September 10 16:27:06 INFO [139705691690752]: CPU Usage Overall : 20.61
September 10 16:27:06 INFO [139705691690752]: CPU Usage Self : 104.72
September 10 16:27:06 INFO [139705691690752]: Memory : 128587.34 MB real, 66671.99 MB swap
September 10 16:27:06 INFO [139705691690752]: Memory Usage : 99.18 % real (73.89 % buffers/cached), 20.85 % swap, 24 % total
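To track these figures over time rather than reading individual lines, the "Memory Usage" entries can be pulled out of the log with a small script. This is a minimal sketch that assumes every such line has exactly the format shown above; the function name and dictionary keys are illustrative, not part of any NetBackup tool.

```python
import re

# Matches lines such as:
# "... Memory Usage : 99.18 % real (73.89 % buffers/cached), 20.85 % swap, 24 % total"
MEM_USAGE_RE = re.compile(
    r"Memory Usage : ([\d.]+) % real \(([\d.]+) % buffers/cached\), "
    r"([\d.]+) % swap"
)


def parse_memory_usage(lines):
    """Return one dict per 'Memory Usage' line found in the given lines."""
    samples = []
    for line in lines:
        match = MEM_USAGE_RE.search(line)
        if match:
            real, cached, swap = (float(x) for x in match.groups())
            samples.append({
                # Everything before " INFO " is the timestamp in the sample log.
                "timestamp": line.split(" INFO ")[0].strip(),
                "real_pct": real,
                "buffers_cached_pct": cached,
                "swap_pct": swap,
            })
    return samples


if __name__ == "__main__":
    sample = [
        "September 10 16:27:06 INFO [139705691690752]: Memory Usage : "
        "99.18 % real (73.89 % buffers/cached), 20.85 % swap, 24 % total",
    ]
    for entry in parse_memory_usage(sample):
        print(entry)
```

Note that the log reports the buffers/cached share separately. How much of the "real" usage is reclaimable cache rather than memory held by spoold itself is not clear from the log alone, so the parsed values are best compared against top or free on the appliance.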
 

And also these:

September 10 06:32:50 INFO [139706029340416]: dataStorePrefetchEntry pool:9, has no enough mem for entry 5954(size:98790, mem:272629198), current prefetching batch stops!
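To see how often the prefetch runs out of memory, the warning above can be counted and its fields extracted. This is a minimal sketch that assumes the message format is exactly as in the line quoted above; the meaning and units of the size and mem fields are not documented in this thread, so they are kept as raw integers.

```python
import re

# Matches the "has no enough mem" prefetch message quoted from spoold.log.
PREFETCH_RE = re.compile(
    r"dataStorePrefetchEntry pool:(\d+), has no enough mem for entry "
    r"(\d+)\(size:(\d+), mem:(\d+)\)"
)


def parse_prefetch_warnings(lines):
    """Return (pool, entry, size, mem) tuples for each prefetch warning."""
    hits = []
    for line in lines:
        match = PREFETCH_RE.search(line)
        if match:
            hits.append(tuple(int(x) for x in match.groups()))
    return hits


if __name__ == "__main__":
    sample = [
        "September 10 06:32:50 INFO [139706029340416]: dataStorePrefetchEntry "
        "pool:9, has no enough mem for entry 5954(size:98790, mem:272629198), "
        "current prefetching batch stops!",
    ]
    warnings = parse_prefetch_warnings(sample)
    print(f"{len(warnings)} prefetch warning(s): {warnings}")
```

Feeding it the lines of a whole log file, for example via open(path) with the path of the spoold log, gives a rough count of how frequently prefetch batches are being cut short.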

Currently, MaxCacheSize for spoold is 75%. Users can change it to an acceptable value as described in https://www.veritas.com/support/en_US/article.000024174, but the article doesn't say what to do after the change or what the impact on MSDP is.
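For a rough sense of scale, the cache ceiling implied by a given MaxCacheSize can be estimated. This is a minimal sketch that assumes the percentage applies to total physical RAM as reported in the spoold log (128587.34 MB); that interpretation is an assumption here, and the alternative values are hypothetical examples, not recommendations.

```python
def cache_budget_mb(total_ram_mb: float, max_cache_pct: float) -> float:
    """Cache ceiling in MB, assuming MaxCacheSize is a share of physical RAM."""
    return total_ram_mb * max_cache_pct / 100


if __name__ == "__main__":
    total_ram_mb = 128587.34  # "real" memory from the spoold log above
    for pct in (75, 60, 50):  # 75 is the current setting; others are examples
        budget = cache_budget_mb(total_ram_mb, pct)
        print(f"MaxCacheSize {pct}% -> about {budget:.0f} MB for the cache, "
              f"{total_ram_mb - budget:.0f} MB for everything else")
```

Under that assumption the current 75% setting leaves roughly 32 GB for the operating system and the other NetBackup processes.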

I hope to get some input from other experienced implementers regarding spoold memory usage.

Thank you

Accepted Solutions

Systems_Team
Level 6
   VIP   

Hi Tarmizi,

I've had a look through the appliance release notes for versions later than 2.6.1.2 to see if there are any known high memory usage issues.

I found this one. The process consuming excessive memory there is SPAD rather than SPOOLD, but since both are part of dedupe, SPOOLD is also involved:

https://www.veritas.com/support/en_US/article.000080715

There is an EEB for this issue on your version, and it was also scheduled to be fixed in 2.7.1. It might be worth raising a support case with Veritas first to see if this is your issue.

Hope this helps,

Steve

Marianne
Level 6
Partner    VIP    Accredited Certified

Please schedule an upgrade sooner rather than later...

All NBU versions up to 7.6.0.x (incl 2.6.0.x) will reach EOSL within a couple of months.

 

tarmizi
Level 5
Partner Accredited Certified

Hi all,

I have talked to my management about upgrading, as Marianne advised and as suggested in the link Systems_Team provided. I hope they will decide quickly. I will mark Marianne's and Systems_Team's posts as solutions.

Thank you