
All jobs, including discovery jobs fail due to ERROR - Error connecting to services / authentication

verita_engineer
Level 2

Hello guys,

We have a problem with our NetBackup platform that we have been facing for about a year. When the problem first appeared, we just increased the client timeout value in Host Properties. After that we no longer received errors, but jobs would hang on the system for 15-30 minutes.

Our platform's properties are as follows:

We're using NetBackup 8.0 and our master server runs Windows Server 2012 R2 (it was 2008 R2; we moved the master server to the new OS using a catalog backup, but we still face the issue after that OS upgrade). We have 25 media servers: 21 of them communicate with the master server over WAN, the others over LAN. 2 of them are Linux and the rest are Windows.

We see roughly 45,000 jobs per day in the Activity Monitor. It used to be 55,000-60,000; we brought the number down by reducing the frequency of tlog jobs.

Our platform is growing day by day and we now have more data than before. As a result, jobs started to hang more often than before, and after some time increasing the client timeout no longer helped.

We get errors 811, 10, 52 and other timeout and hang errors in NetBackup. We have made a lot of changes; I have listed some of them below.

- We opened a support case about it. We worked with Veritas engineers about 5 times, but we couldn't find anything.

- We changed the master server OS from 2008 R2 to 2012 R2 via catalog restore. After that restore we upgraded our NetBackup platform from 7.7.2 to 8.0. Our master server and media servers are now on 8.0, but we still have 7.7.2 clients.

- We increased our master server's resources. It now has 48 GB of memory and 24 processors.

- We thought the problem was related to image cleanup jobs, because we saw the system hang while image cleanup jobs were running. Later we realized that the system also hangs when no image cleanup jobs are running.

- We have roughly 15 media servers at remote locations. We back up the servers at those remote locations to those media servers, and we also run duplication jobs to our central datacenter (where our master server resides). We realized that the system hangs while those disk-to-disk duplication jobs are running. If we stop the duplication jobs, the system works without any hang problem.
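The correlation described in the last point can be checked with numbers rather than by eye. Below is a minimal Python sketch, assuming the duplication job start/end times and the times of the hangs have been collected by hand (for example from the Activity Monitor). The function name and all sample timestamps are hypothetical and are not real data from this environment.

```python
from datetime import datetime

def hangs_during_duplication(hangs, dup_windows):
    """Return the hang timestamps that fall inside any duplication window."""
    return [
        h for h in hangs
        if any(start <= h <= end for start, end in dup_windows)
    ]

FMT = "%Y-%m-%d %H:%M"

def ts(text):
    """Parse a 'YYYY-MM-DD HH:MM' string into a datetime."""
    return datetime.strptime(text, FMT)

# Hypothetical sample data, for illustration only.
dup_windows = [
    (ts("2017-05-01 01:00"), ts("2017-05-01 03:30")),
    (ts("2017-05-01 22:00"), ts("2017-05-01 23:45")),
]
hangs = [
    ts("2017-05-01 02:15"),  # inside the first duplication window
    ts("2017-05-01 12:00"),  # outside every window
    ts("2017-05-01 22:30"),  # inside the second duplication window
]

overlapping = hangs_during_duplication(hangs, dup_windows)
print(f"{len(overlapping)} of {len(hangs)} hangs fell inside a duplication window")
```

If most hangs fall inside the duplication windows over several days, that supports the observation above; if many fall outside, the duplication jobs are probably not the only cause.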

We are now considering some solutions for those duplication jobs, but before that we would like to know whether anyone has experienced this problem before and solved it without any operational changes.

Thank you for your help.

1 REPLY

verita_engineer
Level 2

1. Make sure your nbwmc service is running correctly.

The NetBackup Web Management Console (nbwmc) process manages requests for certificate and host management.

This process now also controls the configuration parameters that are related to NetBackup Cloud Storage.

The process is installed as a NetBackup service on Windows and runs as a standard daemon on UNIX.

2. If not, please restart that service. Use netbackup stop/start (on a Windows master server, the equivalent commands are bpdown and bpup).

3. For further configuration, you can also run Device Configuration again. Sometimes the tape drives and library need to rebuild their connection to the EMM service.
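The check in step 1 can be scripted on a Windows master server. Below is a minimal Python sketch that parses the output of the Windows `sc query` command to see whether a service is running. The helper names are hypothetical, and the service name you pass in is an assumption you need to verify yourself (look up the actual name of the Web Management Console service in the Services console). The sample text only mimics the usual `sc query` layout so the parser can be shown without touching a real system.

```python
import re
import subprocess

def parse_sc_state(sc_output):
    """Extract the state word (e.g. RUNNING, STOPPED) from `sc query` output."""
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_output)
    return match.group(1) if match else None

def service_is_running(service_name):
    """Run `sc query <service_name>` (Windows only) and check for RUNNING."""
    result = subprocess.run(
        ["sc", "query", service_name],
        capture_output=True,
        text=True,
    )
    return parse_sc_state(result.stdout) == "RUNNING"

# Sample text in the usual `sc query` layout; ExampleService is a placeholder.
sample = """SERVICE_NAME: ExampleService
        TYPE               : 10  WIN32_OWN_PROCESS
        STATE              : 4  RUNNING
"""
print(parse_sc_state(sample))
```

On the master server you would call `service_is_running()` with the real service name and only go on to step 2 if it returns False.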