
User-initiated jobs complete per NBU Master but client BAR still shows job as 'in progress'

BR549
Level 4

I have run into an odd situation on one of my NetBackup 7.5.0.7 client servers.  Whenever a user-initiated backup is started from the client, whether by the BAR tool or a PowerShell script, it launches successfully and, according to the Veritas NBU Master Server, the job completes successfully.  However, the client does not show that the job completed and still shows it as "in progress".

 

I have attached a document to this inquiry that shows the details from the NBU Master server and from the client server where it was initiated. The client server is the only one that is having this problem. Does anyone have any suggestions as to how to remediate this issue or why it is occurring?  

Yes, I know that 7.5.0.7 is well beyond EOL and we do have a plan to upgrade to at least 8.0 in the near future.  I would like to get this issue resolved before that upgrade takes place if possible.

1 ACCEPTED SOLUTION

Finally got the issue resolved by an unusual process.

Tried repairing the 7.5.0.7 installation, but it would error out with a 1606 error.  Tried uninstalling the NBU 7.5.0.7 client, but that also errored out with a 1606 error.

The fix:

Installed NBU 8.0 client. Removed NBU 8.0 client, as master server is currently at 7.5.0.7. Verified a clean removal of Veritas NBU client software. Installed NBU 7.5.0.7 onto the server with latest patches required.  Tested local user initiated backup and got good status back.  Had user do a test of their MySQL PowerShell backup script and it also got a good status back that the job completed successfully.  Therefore I consider this issue as being closed at the current time.

Thanks for all the suggestions.  Yes, the environment will be updated to NBU 8.0 in the not too distant future.

6 REPLIES

Marianne
Moderator
Partner    VIP    Accredited Certified

Seems to me that there is some sort of an intermittent connection issue between the media server and the client. 
Look at this: 

Error bpbrm(pid=13548) cannot connect to f05h9m-c04, status = 25

and then 5 seconds later: 

10/8/2017 7:15:19 AM - connected; connect time: 00:05:00

You need to check the bpbrm log on the media server (logging level 3) to see what happened during the 1st connection attempt and during the 2nd, successful connection. Check the IP address and port number.
This seems to be purely a network issue.
The same network issue may be the cause of delayed or intermittent comms issue between the client and master where successful status is not sent back to bprd on the master. I cannot say for sure without logs.
Try to repeat telnet tests (at least 2 in a row) on port 1556 from the media server to the client.
Is that successful each time?
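
The repeated connection test suggested above can also be scripted. This is only a minimal sketch, not a NetBackup tool: the function name `probe_port`, the attempt count, and the timeout are illustrative assumptions, while the host name and port 1556 are taken from this thread. It is meant to be run from the media server against the client.

```python
import socket
import time


def probe_port(host, port=1556, attempts=3, timeout=10.0, pause=1.0):
    """Open a TCP connection several times in a row.

    Returns one (ok, detail) tuple per attempt: on success, detail is
    the connect time in seconds; on failure, it is the error text.
    """
    results = []
    for attempt in range(attempts):
        start = time.monotonic()
        try:
            # create_connection resolves the name and connects, like telnet.
            with socket.create_connection((host, port), timeout=timeout):
                results.append((True, time.monotonic() - start))
        except OSError as exc:
            results.append((False, str(exc)))
        if attempt < attempts - 1:
            time.sleep(pause)
    return results


# Example call (client name as it appears in the job details):
#   for ok, detail in probe_port("f05h9m-c04"):
#       print("connected" if ok else "FAILED", detail)
```

A slow or failed attempt mixed in with successful ones would point toward the intermittent behaviour described above. A clean run does not rule it out, since a basic TCP connect exercises less than a full bpbrm-to-client connection.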

This is certainly not something that should delay NBU upgrade.

You can see in job details that bpbkar on the client exited with status 0: 

10/8/2017 7:15:46 AM - Info bpbkar32(pid=1212) done. status: 0: the requested operation was successfully completed

So, check bpbkar log on the client (create the folder if it does not exist). 
If the status in job monitor shows status 0 and you can restore from it, then all you need to look at is the initial status 25.
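
To compare the statuses quickly, the job-detail lines can be scanned for the reporting process and its status code. The following is a rough sketch that assumes the two line shapes quoted in this post; the helper name `summarize_statuses` and the regular expression are illustrative and are not part of NetBackup.

```python
import re

# Matches "<process>(pid=<n>) ... status = <code>" and
# "<process>(pid=<n>) ... status: <code>", as quoted in this thread.
STATUS_RE = re.compile(r"(\w+)\(pid=(\d+)\).*?status\s*[:=]\s*(\d+)")


def summarize_statuses(lines):
    """Return (process, pid, status) for every line that reports a status."""
    found = []
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            proc, pid, status = match.groups()
            found.append((proc, int(pid), int(status)))
    return found


job_details = [
    "Error bpbrm(pid=13548) cannot connect to f05h9m-c04, status = 25",
    "10/8/2017 7:15:46 AM - Info bpbkar32(pid=1212) done. status: 0: "
    "the requested operation was successfully completed",
]

for proc, pid, status in summarize_statuses(job_details):
    print(f"{proc} (pid {pid}) reported status {status}")
```

On the two lines above this separates the bpbrm connection failure (status 25) from the bpbkar32 exit (status 0), which is the distinction that matters here.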

Marianne - Thanks for your response. I took a look at the bpbkar logs on the client, searched them for EXIT STATUS, and did not see any 25 errors.  I then searched for just Status and only saw 0, successfully completed.  I took a look at the media server's bpbrm logs, found Exit Status 25 errors for the client server, and set the logging level for this service to 3.  I ran the telnet test on port 1556 multiple times and it completed successfully each time.

The attached zipped file contains the same 3 day period from 8 Oct to current time on 10 Oct for the client's bpbkar and media bpbrm log files. For the bpbrm the logging level was set to 0 when the data was gathered for the logs.  The 25 error, cannot connect on socket, is still appearing in the bpbrm log of the media server during this period.

Again thank you for your assistance in this matter.

Marianne
Moderator

Yes. It is bpbrm that reports the status 25 when it cannot connect to the client.
Once it has connected, bpbkar backs up the data and exits with status 0.

From bpbrm - media server tries to connect to client:

01:27:39.698 [4836.17756] <2> logconnections: BPCD CONNECT FROM 172.18.118.71.63711 TO 172.18.118.78.1556

01:32:39.795 [4836.17756] <16> connect_to_service: connect failed STATUS (18) CONNECT_FAILED
status: FAILED, (11) TIMEOUT; system: (10036) A blocking operation is currently executing. ; FROM 0.0.0.0 TO 172.18.118.78 172.18.118.78 vnetd VIA pbx
status: FAILED, (11) TIMEOUT; system: (10036) A blocking operation is currently executing. ; FROM 0.0.0.0 TO 172.18.118.78 172.18.118.78 vnetd VIA vnetd

10036 is a Windows socket error on the client that causes the NBU connection to fail intermittently.
You need to work with the server/Windows admin to find out what is causing this.
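
The gap between the two bpbrm entries quoted above can be checked with a few lines of Python. This is a sketch that assumes each log line starts with an HH:MM:SS.mmm timestamp, as in the excerpt; the helper names are made up for illustration.

```python
from datetime import datetime, timedelta


def log_time(line):
    """Parse the leading HH:MM:SS.mmm timestamp of a log line."""
    return datetime.strptime(line.split()[0], "%H:%M:%S.%f")


def elapsed_seconds(first_line, second_line):
    """Seconds between two log lines, allowing for one midnight rollover."""
    delta = log_time(second_line) - log_time(first_line)
    if delta < timedelta(0):
        # The log has no date on each line, so assume the next day.
        delta += timedelta(days=1)
    return delta.total_seconds()


connect = ("01:27:39.698 [4836.17756] <2> logconnections: BPCD CONNECT "
           "FROM 172.18.118.71.63711 TO 172.18.118.78.1556")
failed = ("01:32:39.795 [4836.17756] <16> connect_to_service: "
          "connect failed STATUS (18) CONNECT_FAILED")

print(f"{elapsed_seconds(connect, failed):.3f} s between attempt and failure")
```

For these two lines the gap is almost exactly five minutes, which lines up with the "connect time: 00:05:00" shown in the job details and with the (11) TIMEOUT reported in the log.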

Once this issue is sorted out, I believe that the issue of BAR still showing the job as 'in progress' will also disappear.

bpbrm shows that the job is successful and disconnects from nbjm.

PS:
NBU 8.1 introduces support for agent-based MySQL.

Thanks for the additional information that indicates that the issue resides on the client side of the process.  I am researching how to determine what process is specifically causing this issue to occur and hopefully will find a fairly simple fix for the issue.

Once I find the solution I will post the results to this thread.

Marianne
Moderator

Once again - this is certainly not something that should delay NBU upgrade.
