I have a very odd problem I'd appreciate some assistance with.
Master = Windows 2008 R2, NBU 7.7.3
Media = CentOS 7, NBU 7.7.3
Client = Any
Backups of databases taken with the NetBackup Database Agents (MS-SQL and Oracle) will start, load tapes into the tape drives and go to a status of "begin writing". However, no actual data is written to NetBackup, and after a while (timeout) the backup job goes to status failed with various error codes: we see codes 13, 150, 811, 909 and so on. The same problem is seen on all backups taken with a NetBackup DB agent.
We can see this behaviour in the bptm log on the Media Server:
14:31:29.373  <4> write_backup: waiting for client data or brm message
And then nothing happens until it hits the timeout and then fails the job!
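To put a number on how long bptm sits idle after that message, a small script along these lines might help. It is only a sketch: it assumes each bptm line starts with an HH:MM:SS.mmm timestamp followed by a severity in angle brackets, as in the line quoted above (an optional bracketed field in between is tolerated), and the names `find_stalls` and `min_gap_seconds` are made up for illustration. The second sample line is hypothetical, not taken from a real log.

```python
import re
from datetime import datetime

# Assumed line shape, based on the bptm line quoted in this post:
#   "14:31:29.373  <4> write_backup: waiting for client data or brm message"
# An optional "[...]" field between timestamp and severity is tolerated.
LINE_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}\.\d{3})\s+(?:\[[^\]]*\]\s+)?<(\d+)>\s+(.*)$"
)
MARKER = "waiting for client data or brm message"


def parse_time(stamp):
    """Parse an HH:MM:SS.mmm timestamp (the log line carries no date)."""
    return datetime.strptime(stamp, "%H:%M:%S.%f")


def find_stalls(lines, min_gap_seconds=60.0):
    """Return (wait_timestamp, gap_seconds, next_message) for every
    'waiting for client data' line followed by a long silence."""
    stalls = []
    pending = None  # (timestamp string, parsed time) of the last wait line
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if not match:
            continue  # skip lines that do not fit the assumed shape
        stamp, _severity, message = match.groups()
        now = parse_time(stamp)
        if pending is not None:
            gap = (now - pending[1]).total_seconds()
            if gap < 0:
                gap += 86400  # the log rolled over midnight
            if gap >= min_gap_seconds:
                stalls.append((pending[0], gap, message))
            pending = None
        if MARKER in message:
            pending = (stamp, now)
    return stalls


sample = [
    "14:31:29.373  <4> write_backup: waiting for client data or brm message",
    # Hypothetical follow-up line, invented only to exercise the parser:
    "14:36:29.373  <16> example: hypothetical next log entry",
]
for stamp, gap, message in find_stalls(sample):
    print(f"stall after {stamp}: {gap:.0f}s of silence, next entry: {message}")
```

If the gap matches the configured client read timeout, that would at least confirm the media server never received any data from the client before giving up.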
Backups of the File System on the same servers, to the same tapes in the same drives on the same Media Server, work perfectly, whether triggered from the NetBackup Admin Console (scheduled or manual) or run as a "user backup" triggered from the Client using the BAR tool.
I suspect this may be a firewall ports issue. The Master Server (but not the Media Server and Client) is behind VMware NSX. We are sure we've opened the correct ports in the NSX configuration following the guidance in this document:
We've done all of the recommended testing using bpclntcmd, telnet and so on, and these all work correctly.
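For anyone wanting to repeat the telnet-style check from each host in a scripted way, a rough equivalent could look like the sketch below. The port numbers in `ASSUMED_PORTS` are the commonly documented NetBackup defaults as I understand them and are an assumption here, so they should be checked against the ports document for your version. The helper names are made up. A successful connect only shows that the TCP port is reachable in that one direction; it says nothing about the reverse direction.

```python
import socket

# ASSUMPTION: commonly documented NetBackup default ports; verify them
# against the official ports reference for your release before relying on this.
ASSUMED_PORTS = {
    1556: "PBX",
    13724: "vnetd",
    13720: "bprd",
    13782: "bpcd",
}


def check_port(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_ports(host, ports, timeout=5.0):
    """Return {port: reachable} for every port in the iterable."""
    return {port: check_port(host, port, timeout) for port in ports}
```

Running something like `check_ports("master.example.com", ASSUMED_PORTS)` (hypothetical hostname) on the client, and the equivalent on the master and media server pointing the other way, would cover each direction separately.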
I'd appreciate being pointed in the direction of which logs to look at in NetBackup to see what's going wrong.
How about the filesystem backups on the same clients that are failing for DB backups? Are they running fine?
If filesystem backups are running fine, that at least confirms there is good communication between the media server and the client. In that case, please try another test backup using the master server as the media server and see how it goes (this is to ensure that FS backups and communication are good between the clients and the master server). If this test is also successful, I would not expect a firewall issue. However, if you see FS backup failures when using the master server as the media server, I would fix that before looking into the DB backup failures.
If FS backups are successful from both the master server and the media server, go ahead and look into the bpcd and dbclient logs on the client for any clues.
You have also posted over here about Basic Disk Backups:
I have listed the logs that you need to check in order to track the process flow.
Please have another look at my reply in that post.
Comms for database backups work differently from filesystem backups.
When the script starts on the client, a request is sent directly to bprd on the master server.
This does not happen with filesystem backups. So, troubleshooting of database backups needs to start with looking at the client's dbclient that is initiating the backup request, followed by bprd on the master to see if the request is received and how the request is interpreted.
To simulate the client -> master connection request, run this command on a database client:
Please share the output as well as the relevant lines in bprd log.
Next, we need to check bpbrm log on the media server to verify comms with client.
bptm will log data received from client and bpbrm will show metadata received from the client.
Best to have level 3 logging for bpbrm and bptm.
dbclient log will show when data is sent to the media server.
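Since the flow crosses dbclient, bprd, bpbrm and bptm on three different hosts, it can help to interleave the relevant lines into one timeline. A minimal sketch, assuming every log line of interest starts with an HH:MM:SS.mmm timestamp and that the clocks (and time zones) on client, master and media server agree; `merge_logs` and the sample lines are hypothetical, for illustration only:

```python
import re

# Assumed: each relevant line begins with a fixed-width HH:MM:SS.mmm timestamp.
TS_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3})\s")


def merge_logs(logs):
    """Interleave lines from several logs by timestamp.

    logs maps a label (e.g. "dbclient", "bprd") to an iterable of lines.
    Returns a list of (timestamp, label, line) sorted by timestamp.
    Lines without a leading timestamp are skipped.
    """
    entries = []
    for label, lines in logs.items():
        for line in lines:
            match = TS_RE.match(line)
            if match:
                entries.append((match.group(1), label, line.rstrip("\n")))
    # Fixed-width timestamps sort correctly as plain strings; the sort is
    # stable, so lines with equal timestamps keep their original order.
    entries.sort(key=lambda entry: entry[0])
    return entries


# Hypothetical sample lines, invented to show the shape of the output.
sample = {
    "bptm": ["14:31:29.373  <4> write_backup: waiting for client data or brm message"],
    "dbclient": ["14:31:20.100  <4> example: hypothetical dbclient entry"],
}
for stamp, label, line in merge_logs(sample):
    print(f"{label:>8} | {line}")
```

Feeding it only the lines from the window around the failed job keeps the timeline readable. This does not handle logs that span midnight, so it is best used on a single day's files.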