Oracle RMAN backup fails to validate after replication

DPeaco
Moderator
   VIP   

Netbackup 7.6.0.1 running on a Netbackup appliance 5330.

Linux client - runs RMAN DB backup to NBU master node A (on appliance) to advanced disk.

Node A later replicates this backup to Node B in a different city. 

DBA says the DB backups are good but the validate fails. I've had them try a validate against Node A and it fails; I had them try a validate against Node B and it fails too.

Has anyone had this issue and since resolved it? Am I replicating the backup from Node A too soon? Should the validate run against Node A first, and only then the replication to Node B for offsite backup?

Please advise.

Thanks,
Dennis
1 ACCEPTED SOLUTION

Accepted Solutions

Michael_G_Ander
Level 6
Certified

I think you should stop focusing on the replication, as it is a pure NetBackup operation that happens after the Oracle backup has finished on the primary master.

Have you checked that the validation is not trying to run against what you called the old shared backup solution?

I am still sure you have some sort of name resolution issue.

Please go through the steps in

https://www-secure.symantec.com/connect/forums/general-connectivity-troubleshooting-status-23242558-...

and

https://www-secure.symantec.com/connect/forums/general-database-backup-error-troubleshooting

The standard questions: Have you checked: 1) What has changed. 2) The manual 3) If there are any tech notes or VOX posts regarding the issue

12 REPLIES

Marianne
Moderator
Partner    VIP    Accredited Certified
I am not an Oracle DBA, so I don't know exactly what 'validate' means in Oracle terms. Maybe show us the command and its output, as well as where the command is run?

From the NBU point of view, only the client that performed the backup can list and restore it. The hostname the request comes from must exactly match the hostname in the NBU policy/catalog. Any other name must be treated as a redirected restore, and 'altnames' files must be created on the master.

In the meantime, the bprd log on the master server where the DBA is trying to validate/list the Oracle backup will help to show the request received by the master and how it was interpreted.

Michael_G_Ander
Level 6
Certified

In my experience, when the validate after the backup fails, it often has to do with name resolution or a timeout.

If you are referring to the RMAN validate command, the troubleshooting process is the same as for a restore.

The logs I would use on the client are bphdb and dbclient, as always for database backups; in this case I would also create the bplist log.

Marianne
Moderator
Partner    VIP    Accredited Certified

I agree that the dbclient log will help, but bprd on the master is crucial to see the IP address of the incoming request, how the master resolves it to a hostname, and how it looks for this client name in the NBU config.

If NB_ORA_CLIENT is not specified in the client request, then the CLIENT_NAME in bp.conf on the requesting client is sent to the master. The dbclient log will help to identify all of these entries.
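
To make the fallback described above concrete, here is a minimal Python sketch of the logic, not NetBackup code. It assumes bp.conf uses `KEY = value` lines and that the comparison with the policy name is an exact, case-sensitive string match (my reading of "must match exactly"). The function names and the `romeo.example.com` value are hypothetical and only for illustration.

```python
def client_name_from_bp_conf(bp_conf_text):
    """Return the CLIENT_NAME value from bp.conf text, or None if absent."""
    for line in bp_conf_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "CLIENT_NAME":
            return value.strip()
    return None


def requested_client_name(nb_ora_client, bp_conf_text):
    """Name sent to the master: NB_ORA_CLIENT if set, else CLIENT_NAME."""
    return nb_ora_client or client_name_from_bp_conf(bp_conf_text)


def matches_policy_name(requested, policy_client):
    """Assumed rule: exact, case-sensitive match against the policy name."""
    return requested is not None and requested == policy_client


# Hypothetical bp.conf content: the client is configured with its FQDN.
bp_conf = "SERVER = master1\nCLIENT_NAME = romeo.example.com\n"

# No NB_ORA_CLIENT given, so CLIENT_NAME from bp.conf is what gets sent.
sent = requested_client_name(None, bp_conf)
print(sent, matches_policy_name(sent, "romeo"))
```

If the policy lists the client as `romeo` but the request arrives as `romeo.example.com`, a mismatch like this would be one way for a list or validate request to find nothing while the backups themselves still run.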

Michael_G_Ander
Level 6
Certified

I can see I have been unclear. I meant: use these client-side logs together with the bprd log from the master.

DPeaco
Moderator
   VIP   

From the client side:

14:37:43.599 [26654] <4> VxBSAQueryObject: INF - No match was found for query
14:37:43.599 [26654] <2> int_FindBackupImage: INF - dkq45fq2_1_2 not found
14:37:43.599 [26654] <16> int_StartJob: ERR - Backup file <dkq45fq2_1_2> not found in NetBackup catalog


Every file we attempted to "validate" via Oracle RMAN gave the same log entries until the job failed.
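
For anyone wanting to pull the affected backup piece names out of a long dbclient log, a small Python sketch along these lines may help. The regular expression is based only on the error line quoted above; other NetBackup versions may word the message differently, so treat the pattern as an assumption. The function name is hypothetical.

```python
import re

# Pattern taken from the "int_StartJob: ERR" line quoted above (assumption:
# the wording is the same across the whole log).
MISSING_PIECE = re.compile(
    r"ERR - Backup file <([^>]+)> not found in NetBackup catalog"
)


def missing_backup_pieces(log_lines):
    """Return backup piece names reported as missing, in first-seen order."""
    seen = []
    for line in log_lines:
        match = MISSING_PIECE.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


sample = [
    "14:37:43.599 [26654] <4> VxBSAQueryObject: INF - No match was found for query",
    "14:37:43.599 [26654] <2> int_FindBackupImage: INF - dkq45fq2_1_2 not found",
    "14:37:43.599 [26654] <16> int_StartJob: ERR - Backup file <dkq45fq2_1_2> not found in NetBackup catalog",
]

print(missing_backup_pieces(sample))  # ['dkq45fq2_1_2']
```

The resulting list can then be compared with what the master's catalog shows for the client name the request was actually made under.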

RMAN backup runs on Master Node A from client Node Z

Master Node A then replicates the backup to Master Node B in a different city for offsite.

As soon as the backup finishes, Node Z tries to do the RMAN validate to make sure that the RMAN backup was good.....but it fails.

The new backup solution is all running on Netbackup Appliances and we are having these specific issues.

All these backups were running on an older backup solution without issue. The difference is that the new solution replicates the backup offsite, whereas the old solution kept it on the same master server and wrote it to tape.

I'm a bit confused because I thought that both master nodes would know about the RMAN backup: Master Node A because it was the original master, and Master Node B because the backup was replicated to it.

So....enlighten me if this isn't how it really works. :)

Thanks,
Dennis

Michael_G_Ander
Level 6
Certified

If it is every file, you most likely have a name resolution issue with the client. Did the bprd and bplist logs show the expected client name?

Have you run the connectivity tests, like bpclntcmd -pn and so on?

Sorry, but we cannot help with name resolution when you call the involved machines A, B & Z.
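
As a rough companion to the connectivity tests mentioned above, here is a Python sketch of a forward and reverse lookup round trip. It is not a replacement for bpclntcmd, which shows how the master itself sees the client; it only checks what the local resolver returns. The function name and the report layout are hypothetical, and the resolver arguments are injectable so the logic can be tried without touching DNS.

```python
import socket


def resolution_report(name, forward=socket.gethostbyname,
                      reverse=socket.gethostbyaddr):
    """Resolve name -> IP -> name and report whether the round trip agrees."""
    report = {"name": name, "ip": None, "reverse": None,
              "consistent": False, "error": None}
    try:
        report["ip"] = forward(name)
        # gethostbyaddr returns (hostname, aliases, addresses).
        report["reverse"] = reverse(report["ip"])[0]
    except OSError as exc:  # socket.gaierror and socket.herror derive from OSError
        report["error"] = str(exc)
        return report
    # Case-insensitive comparison here; NetBackup itself may be stricter.
    report["consistent"] = report["reverse"].lower() == name.lower()
    return report
```

Running `resolution_report("romeo")` on the client and on the master, and again with the fully qualified name, would show whether short and long names resolve to the same thing on both sides.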

DPeaco
Moderator
   VIP   

Michael,

The backups run perfectly. No issue with the backups. It's the validate that fails.

Thanks,
Dennis

Michael_G_Ander
Level 6
Certified

As far as I know, validate is an integral part of an Oracle backup, which RMAN uses to update the control file and the RMAN catalog with information about the backup.

I don't understand how a DBA can say the backup is good when the validation of it has failed.

And if the validate fails, you probably cannot restore either, as restore and validate use a lot of the same function calls on the client, like listing the backup set files in the NetBackup catalog.

DPeaco
Moderator
   VIP   

Michael,

I understand what you are saying. The standard file system backups for this same backup client run flawlessly on the new Master Node A. It's the Oracle piece that we are working through. :)

I am continuing to look through logs.

Thanks,
Dennis

DPeaco
Moderator
   VIP   

Here's the story:

romeo was being backed up on an old shared backup solution going to tape. We are migrating everybody off of the old shared solutions and putting them on our new shared backup solutions.

New master & media servers are Netbackup Appliances (5230's).

master1 with media1 and media2 (total of 3 appliances)

master2 with media3 is in another city and we replicate from our location to the other location to use as our "offsite" backups.

We move the backup policies from "romeo" over to master1. File system backups run fine and replicate to master2. RMAN backup runs fine to master1 and replicates to master2. DBA says the backup ran fine and tries to run a "validate" against the backup. The validate fails. I thought it was because the backup was replicated to master2 from master1. All the data movement is done via SLP.

So....could my issue be the replication of the RMAN backup from master1 to master2 while the validate is trying to run against master1? I've also got the DBAs to try and run a validate against master2, but that failed as well. Host name resolution works perfectly and all other backups of these clients run perfectly; it's just the RMAN pieces that I'm trying to sort out.

Thanks,
Dennis

Michael_G_Ander
Level 6
Certified

I think you should stop focusing on the replication, as it is a pure NetBackup operation that happens after the Oracle backup has finished on the primary master.

Have you checked that the validation is not trying to run against what you called the old shared backup solution?

I am still sure you have some sort of name resolution issue.

Please go through the steps in

https://www-secure.symantec.com/connect/forums/general-connectivity-troubleshooting-status-23242558-...

and

https://www-secure.symantec.com/connect/forums/general-database-backup-error-troubleshooting

DPeaco
Moderator
   VIP   

Thank you Michael. I'll do this.

Thanks,
Dennis