02-07-2018 08:57 AM
We are trying to restore an Oracle database from one server to another. It all starts well, but this is where it gets stuck:
Error bpbrm (pid=11257) cannot get server's peername on client rome_pres-bu
Error bpbrm (pid=11257) listen for client protocol error - couldn't write necessary information on /usr/openv/netbackup/logs/user_ops/dbext/logs/10774.0.1517947887
02-07-2018 09:21 AM
02-07-2018 10:00 AM
We are running an older version of NetBackup, 6.5.6, on the master, media server and client.
Connectivity looks good from all sides. Yes, we are restoring to a different client and using a different media server. We also tried changing the client name on the destination server to the name of the original server, but every attempt ends at the error mentioned in my previous post.
The RMAN restore script was created by the DBAs, and we have configured a policy to call that script.
02-07-2018 04:14 PM
02-08-2018 03:36 AM
Please tell us about "using different media server".
Have you changed image or tape ownership to the "different media server"?
To troubleshoot the error, you need to trace the process flow via the logs.
Since you are not getting past this initial comms process, we need to check all of these logs to see where it's going wrong.
If you need assistance with reading the logs, please copy them to .txt files (dbclient.txt, bpcd.txt, bprd.txt, bpbrm.txt) and upload as attachments.
02-08-2018 06:21 AM
This is just to explain how things are set up.
Master Server A
Source Client B using Media Server C
Destination Client D using Media Server E
Last week, we attempted the restore on D using E and it got stuck, as I shared in my first post.
This week, we renamed D to B and used E for restore but it failed again at the same point. (Attaching logs for this attempt from Feb 6th)
02-08-2018 08:18 AM
Attaching the restore script. It is called by the small script below so that it runs as the oracle user.
#!/bin/bash
su - oracle -c /usr/openv/netbackup/scripts/restore.sh
02-09-2018 12:02 AM
Seems your Oracle DBA is not trying to restore from NBU but rather from a local directory on the Oracle server:
restore controlfile from '/u01/app/oracle/product/10.2.0.5.0/db_1/dbs/iwhprod_bkup_control_db.20180120090005'
But they are specifying the channel as NBU:
SBT_TAPE' parms='SBT_LIBRARY=/usr/openv/netbackup/bin/libobk.so64.1'
This is why we see this info message in dbclient log:
VxBSAQueryObject: INF - Object </u01/app/oracle/product/10.2.0.5.0/db_1/dbs/iwhprod_bkup_control_db.20180120090005> was not found in the NetBackup catalog.
So, the DBA first needs to determine where exactly the control file was backed up to and whether an RMAN catalog or autobackup was used.
NBU cannot restore from a disk location that was not written (backed up) by NBU.
PS:
In future, please only post logs as requested. We really don't need a zip of entire log folder.
We only need the log for the restore that failed for each of the processes mentioned.
Files with .txt extensions can be easily opened on any device, including mobile. Any other format can only be downloaded from a computer.
02-09-2018 01:40 PM
Thanks a lot Marianne! I am sorry about uploading all the logs, and in zip format.
I will try to bring this up with the DBA. In the meantime, I am testing with the FQDN of the source client under altnames. We already had an empty file named No.Restrictions. Now I am planning to use an empty file named after the destination client's FQDN.
master> ls /usr/openv/netbackup/db/altnames/
No.Restrictions romepres-sol10-bu.eus.elnk.net
Also, the DBA changed the script to match the control file and we tried the restore three more times this morning, with the restore always getting stuck at the same point :(
02-10-2018 05:55 AM
02-10-2018 09:32 PM
02-13-2018 06:16 AM
@Amol_Nair - We already tested that option with no luck; it fails at the same point. You can see the take1 logs.
@Marianne - thanks again for your input. Using two different names for the destination client in bp.conf was just part of a test (in our take2).
Master > atlbaxxxxmaster, Source Client > rome_prexx-bu using media server virt-phoxxx-bu, destination client > romepresxxxsol10-bu using media server virt-hooxxx-bu
I'll upload the bprd logs from the master for the date of our restore testing.
02-13-2018 06:40 AM
Source client name as CLIENT_NAME in bp.conf is good.
We now need to see bprd log to see why the master server is rejecting the restore request.
Important that we see one set of logs for the same restore attempt - firstly dbclient and bprd.
Only when all looks good in these logs will we go to bpbrm on the media server.
I am still confused about the restore via different media server.
How are you doing that?
Have you moved tape/image ownership to the other media server?
Or added FORCE_RESTORE... entry in master's bp.conf?
02-13-2018 07:39 AM
So you mean the bplist command failed? Or did it run successfully and list the files backed up?
When you attempt an RMAN restore, it executes the bplist command in the background before the actual restore is initiated, and any failure of the bplist command results in the job failing.
You mentioned the destination client as "destination client > romepresxxxsol10-bu" and the source client name as "Source Client > rome_prexx-bu" (with an underscore).
The names I see in the take1 logs are written with a hyphen (-). There are no names with underscores in the bplist request sent to the master server.
Search for these keywords in the take1 log and you should be able to find the entries I am referring to
---------------------
14:24:37.307 [2440] <4> dbc_GetServerClientConfig:
14:24:37.309 [2440] <4> sendRequest: request:
14:26:38.026 [2440] <4> serverResponse:
---------------------
Once you get the bplist command working to list all the Oracle images it would be a lot easier to simply use the same entries as SEND parameters in the restore script. So if the bplist command itself is failing at the moment, I would suggest that it would be a better idea to focus on that command. If the bprd logs show an error like an invalid name is being used to make the request (usually in case of alternate client restore and when No.Restrictions file or altnames are not there on the master server) then the same error would also be reported when you attempt the bplist command.
You can start by running the bplist command on the master server. There you do not need any extra entries, nor can you get an invalid server made the request error. Once you get the bplist command working on the master server using the same names in -S and -C parameter, you can then attempt to run it on the client machine as the root user as well as the oracle user.
Running the command as oracle user is what is required but at times if there is some permission related issue, the command may work as the root user but fail as the oracle user, hence suggesting both of them.
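To make the bplist test above concrete, here is a minimal sketch of the command line for listing Oracle images. The helper name `build_bplist_cmd` and both host names are hypothetical placeholders, and the helper only prints the command so it can be reviewed before being run on the master or the client. The flags shown (`-S`, `-C`, `-t 4` for the Oracle policy type, `-R`, `-l`) are standard bplist options, but please check them against the command reference for your NBU version.

```shell
#!/bin/bash
# Hypothetical helper: prints (does not run) a bplist command for Oracle images.
# Both arguments are placeholders; use the names exactly as they appear in bp.conf.
build_bplist_cmd() {
  local master="$1" client="$2"
  # -S master server, -C client whose backups are listed, -t 4 Oracle policy type,
  # -R recurse, -l long listing, / start at the top of the image namespace.
  printf '/usr/openv/netbackup/bin/bplist -S %s -C %s -t 4 -R -l /\n' "$master" "$client"
}

# Example with made-up names; copy the printed line and run it as root, then as oracle.
build_bplist_cmd master.example.com source-client.example.com
```

Running the printed command first on the master and then on the destination client, with identical `-S` and `-C` values, keeps the comparison between the two hosts honest.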
02-13-2018 09:23 AM
Yes, the source client name is in bp.conf on the destination.
rome_prexx-bu is non-global server on media server virt-phooxx-bu
romeprexx-sol10-bu is non-global server on media server virt-hooxx-bu
The source client with its media server and the destination client with its media server are on different networks, which is why we are using a different media server for the restore.
Yes, I have used the force restore entry in bp.conf on the master.
---------------------------------------------------------------
bplist is successful.
Don't confuse the source and destination client names, since they look similar (source client is rome_prxx-bu and destination client is rompresx-sol10-bu).
We already have No.Restrictions in place on the master.
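Since the client names in this thread are easy to mix up, a small check like the following can flag two names that differ only in underscores versus hyphens. This is only a sketch: the function name `names_look_alike` and the example names are made up, and it does nothing NetBackup-specific.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when two names are different strings
# but become identical once every '_' and '-' is removed.
names_look_alike() {
  local a="${1//[-_]/}" b="${2//[-_]/}"
  [ "$1" != "$2" ] && [ "$a" = "$b" ]
}

# Example with made-up names.
if names_look_alike "db_src-bu" "db-src-bu"; then
  echo "names differ only by underscore/hyphen"
fi
```

It is meant for comparing the name in the restore script or bp.conf against the name that shows up in the dbclient and bprd logs.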
02-13-2018 11:12 PM
We are still waiting to see bprd log....
02-15-2018 10:52 AM
And here is the fix that worked.....
After plenty of troubleshooting, it was found that the DBAs were not using the same DBID as the source client in the restore script on the destination client. The SID was also incorrect. The restore is running smoothly at the moment.
The DBAs are moving to Oracle 12c and are proposing to do their own backups instead of using NetBackup, so that they have control over backup and restore. They are a little nervous after the restore issue, but it was entirely due to their poor script.
What are the benefits of using NetBackup for database backups (to tape/disk) compared with the DBAs doing their own? This will help me talk about NetBackup in our upcoming meeting with them.
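For anyone hitting the same problem, a rough pre-flight check along these lines could catch a restore script with no DBID or SID set before a job is started. It is a sketch under assumptions: the function name `check_restore_script` is made up, it assumes the script sets the DBID with RMAN's `SET DBID` and the SID through an `ORACLE_SID` variable, and it can only tell whether they are present. Whether the values match the source database still has to be compared by hand.

```shell
#!/bin/bash
# Hypothetical pre-flight check: reports whether a restore script mentions
# a DBID and an ORACLE_SID. It prints the matching lines so the values can
# be compared by hand with the source database; it cannot verify them.
check_restore_script() {
  local script="$1" status=0
  if ! grep -iE 'set[[:space:]]+dbid' "$script"; then
    echo "missing: SET DBID"
    status=1
  fi
  if ! grep -E 'ORACLE_SID' "$script"; then
    echo "missing: ORACLE_SID"
    status=1
  fi
  return "$status"
}

# Usage (path taken from the wrapper script posted earlier in the thread):
#   check_restore_script /usr/openv/netbackup/scripts/restore.sh
```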
02-15-2018 10:41 PM
Please get proof of all these mistakes when you go into your meeting.
We all agree that they are trying to blame NetBackup for their mess-ups.
The outcome would be no different if they restored from disk backup.
I guess with 'own backup' they want to use disk instead of NBU sbt_tape.
My issues with that:
The additional space needed for this.
The additional task to keep an eye on disk usage and manual cleanup of older backups.
Backing up to disk followed by file-level backup to NBU will take twice as long.
Scheduling becomes an issue as disk backup must be finished before file-level backup can start.
If disk backup has been removed and restore from NBU is needed, the restore will take twice as long because of the 2-step process.
Just one more thing - you really need to upgrade NBU.
If anything goes wrong on NBU side, you cannot log Support calls with Veritas.
As you have experienced, we do not have the time and in-depth experience to assist here on VOX.
Highest level logs and insight into the Oracle side is needed, which we here on VOX simply do not have the time for.
All the best for your meeting!
02-16-2018 07:12 AM
Thanks Marianne for your input. You are right, they don't want NetBackup as a mediator between their database and the backup target. They want to use RMAN to back up to disk.
Yesterday's restore surprised them by taking just 6 hours for approx. 9 TB with 4 channels running at a time. We used 4 channels because the same number of channels is configured in the RMAN backup script. So here are the questions, which we are planning to test:
Question 1: Does the RMAN restore script need to have the same number of channels as the RMAN backup script? If not, I think we can restore in less time.
Question 2 (includes 3 questions :) ): All channels completed but the parent job failed with status code 6. The DBA found the restore successful and applied the archive logs. Does the failing parent restore job make any difference? Is it considered a success? Or should I try a test restore from another date?
Yeah, I have notes on what happened before. The DBAs were not even using a DBID in the previous RMAN restore scripts.
We may upgrade and get Data Domain too.
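As a baseline for the channel test, the figures above work out roughly as follows. This is back-of-the-envelope arithmetic only: it assumes binary units (1 TB = 1024 x 1024 MB), uses integer division, and takes the 9 TB and 6 hours as the approximate values quoted.

```shell
#!/bin/bash
# Figures quoted in the post: roughly 9 TB restored in 6 hours over 4 channels.
size_tb=9
hours=6
channels=4

size_mb=$(( size_tb * 1024 * 1024 ))           # binary TB to MB (assumption)
seconds=$(( hours * 3600 ))
total_mb_s=$(( size_mb / seconds ))            # aggregate rate, integer division
per_channel_mb_s=$(( total_mb_s / channels ))  # average rate per channel

echo "aggregate: ${total_mb_s} MB/s, per channel: ${per_channel_mb_s} MB/s"
# prints: aggregate: 436 MB/s, per channel: 109 MB/s
```

Comparing the per-channel number from a run with a different channel count against this one should show whether the extra channels actually add throughput.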
02-18-2018 10:13 PM