04-10-2014 03:05 AM
I have a master, media and all clients running on NetBackup 126.96.36.199.
Master and media servers OS: Windows 2008 R2
Client OS: Windows 2003 (64-bit), virtual server
I have an issue with one particular SQL server that we back up using the NetBackup SQL agent. I sometimes see a large number of bpclntcmd.exe processes running on the server (more than 15 instances), and together they consume 100% of the CPU, causing performance issues on a production server and rendering it inaccessible.
This issue has occurred around 6 times within a span of 1 year, and I am being asked for its cause. Logging a call did not help much, as the only reply I got was "Upgrade NetBackup to latest version", which is definitely not the answer I am looking for. I will upgrade, but what is the cause? Why do only one or two particular clients face this issue, when others with similar configurations (OS, network, SQL etc.) run successfully without any problem?
I do not have many logs handy, as I have only minimal logging enabled on the clients. I resolved the issue by killing the processes manually. I also observed a process called "progress.exe" when listing backup processes using bpps. What is this process?
Has anyone come across this and found a fix other than the "upgrade"?
04-10-2014 03:19 AM
To add to it, no backups were active during the time of the incident.
04-10-2014 03:25 AM
- There are potential workarounds, but the tech note (T/N) seems to indicate that the solution will be an upgrade, i.e. the issue is fixed in 188.8.131.52
*EDIT* Saw your additional post, so maybe not.
04-10-2014 03:42 AM
progress.exe is interesting, as this is the Job Activity monitor used from the BAR GUI to see how restore jobs are going.
It may well be that the NetBackup Job Tracker also uses this, and if it is active on those clients it could be causing your issue.
Whenever I install a Windows client I use the Custom install option, because Job Tracker is not enabled by default there. If you do a Typical installation, it does enable Job Tracker.
If it is active you should see it as a tray icon on the client. Right-click to exit it, and then remove it from the startup folder / Run key in the registry.
I wasn't aware that it used progress.exe, though, but it may be tied to it.
04-10-2014 03:42 AM
I've never heard of such an issue, where bpclntcmd runs without any manual operation.
Are there any clues in the debug logs that show when bpclntcmd is started and what initiates it? It is better to enable all debug logs with mklogdir.bat and set the verbose level to 5, if you have not done so already.
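Since the problem is intermittent, it may also help to record when the bpclntcmd.exe count starts climbing, so the time can be matched against the debug logs. Below is a minimal sketch, not a NetBackup tool: it assumes Python is available on the client and that the standard Windows `tasklist /FO CSV /NH` output is used. The names `count_processes`, `sample` and the threshold of 15 (taken from the figure mentioned in the first post) are only illustrative.

```python
import csv
import io
import subprocess
import time

PROCESS_NAME = "bpclntcmd.exe"
THRESHOLD = 15  # illustrative: the thread reports trouble above ~15 instances


def count_processes(tasklist_csv, image_name=PROCESS_NAME):
    """Count rows of `tasklist /FO CSV /NH` output whose image name matches."""
    reader = csv.reader(io.StringIO(tasklist_csv))
    wanted = image_name.lower()
    # The first CSV column of tasklist output is the image name.
    return sum(1 for row in reader if row and row[0].lower() == wanted)


def sample():
    """Run tasklist once (Windows only) and return the current count."""
    output = subprocess.check_output(
        ["tasklist", "/FO", "CSV", "/NH"], universal_newlines=True
    )
    return count_processes(output)


if __name__ == "__main__":
    count = sample()
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    print("{0} {1} count={2}".format(stamp, PROCESS_NAME, count))
    if count > THRESHOLD:
        print("WARNING: count is above {0}; check the debug logs "
              "around this time".format(THRESHOLD))
```

Run from a scheduled task every few minutes with the output appended to a file, this would give a rough timeline of when the processes begin to pile up. It only observes; it does not kill anything.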
04-10-2014 10:45 AM
You mentioned you have had this for a period of time now. TECH214830 covers NetBackup 7.0 – 7.6.0 and does state that the bpclntcmd -resync_host_cache process is started by bpcd.
You also mentioned that this issue has occurred around 6 times within a span of 1 year. That's very few times for 365 days. I believe this tech note may apply.
The one major difficulty in troubleshooting this is that it is a known intermittent issue that is not very easy to replicate. So much so that it spans the whole NetBackup 7.x line.
We do have an EEB for this issue, but in order to even apply that, you would have to get to 184.108.40.206.
04-11-2014 01:44 AM
Thank you for the valuable update. Can you please share the EEB details here, or would I need to log a call for it? Earlier, I was told to upgrade to 220.127.116.11.
If there is an option of upgrading to 18.104.22.168 and applying an EEB on it, I would like to explore it!
08-13-2014 04:50 AM
Has the issue been fixed? We have now had the same problem and are working on the RCA for it.
The site admin had to reboot to fix the issue during the weekend.
Let us know if you have fixed it!
08-13-2014 04:56 AM
No. We have not got a fix yet. Though the issue is rare, we still cannot confirm that it will not occur again.
09-29-2014 03:27 AM
We have upgraded the problematic clients to 22.214.171.124 and the servers are under observation. I hope the issue has really been fixed under 126.96.36.199.
09-30-2014 12:51 PM
Yes, 188.8.131.52 contains the EEB. Hopefully this does resolve the issue.