
Some oracle restores are failing

Our master servers are version 8.2 on Linux boxes. Our DBAs perform their own restores. For the last couple of weeks, some of their restores have been failing. Both the unsuccessful and successful ones come from the same client and the same storage (5330 appliance), and attempt to restore to the same server. These are cross-datacenter restores (the source is at one location; the destination is at another).

Since there are so many common factors between the "good" and "bad" ones, we're having trouble figuring it out.

From the NetBackup Detailed Status for a failed restore:

"Apr 23, 2020 12:25:16 PM - Error bpbrm (pid=278942) socket read failed: errno = 110 - Connection timed out
Apr 23, 2020 12:25:18 PM - Error bptm (pid=278943) media manager terminated by parent process
Apr 23, 2020 12:25:29 PM - Warning bptm (pid=278944) cannot connect to bpcd on destclientt.company.com: cannot connect on socket (25)
Apr 23, 2020 12:25:29 PM - Info appl-5330-xex02 (pid=278943) StorageServer=PureDisk:appl-5330-xex02; Report=PDDO Stats for
(bkmapp-5330-web02): read: 26850816 KB, CR received: 11669966 KB, CR received over FC: 0 KB, dedup: 0.0%
Apr 23, 2020 12:25:30 PM - Info dbclient (pid=58573) done. status: 13: file read failed
Apr 23, 2020 12:25:30 PM - Error bpbrm (pid=278942) client restore EXIT STATUS 13: file read failed
Apr 23, 2020 12:25:43 PM - restored from image sourceclientt.company.com_1587271313; restore time: 0:47:31
Apr 23, 2020 12:25:43 PM - end Restore; elapsed time 0:47:32

Oracle policy restore error (2801)"


From DBA log:

"channel ch01: reading from backup piece client_ONLINE_20200418_2200.p1.s1495584.1038102099
channel ch02: ORA-27192: skgfcls: sbtclose2 returned error - failed to close file
ORA-19511: non RMAN, but media manager or vendor specific failure, error text:
VxBSAEndTxn: Failed with error:
The sequence of calls is incorrect.
ORA-19870: error while restoring backup piece client_ONLINE_20200418_2200.p1.s1495568.1038100738
ORA-19501: read error on file "client_ONLINE_20200418_2200.p1.s1495568.1038100738", block number 9"

Anyone encountered anything like this? If so, how was it resolved?

5 Replies

Re: Some oracle restores are failing

Apr 23, 2020 12:25:29 PM - Warning bptm (pid=278944) cannot connect to bpcd on destclientt.company.com: cannot connect on socket (25)

Try two things:

1: Run bptestbpcd from the master to destclientt.company.com. Which port responds to the connection requests (13782, 13724 or 1556)?

2: Run a ping test from the appliance to destclientt.company.com for an hour or two. The result should be close to zero lost ping packets.

If an Oracle listener on the Oracle host occupies port 1556, tell the DBA to change the listener port. Port 1556 is an IANA-registered port for NetBackup.
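
As a rough complement to the two checks above, here is a minimal Python sketch that tries a plain TCP connection to each of the three ports mentioned. It is not a replacement for bptestbpcd: it only shows whether something accepts a connection on the port, not whether that something is NetBackup. The function names `check_port` and `check_ports` and the 5-second timeout are assumptions made for this illustration.

```python
import socket

# Ports named in the reply above: 1556 (PBX), 13724 (vnetd), 13782 (bpcd).
NETBACKUP_PORTS = (1556, 13724, 13782)


def check_port(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_ports(host, ports=NETBACKUP_PORTS, timeout=5.0):
    """Return a dict mapping each port to True (connects) or False."""
    return {port: check_port(host, port, timeout) for port in ports}


# Example (run from the master, media server or appliance shell):
#   print(check_ports("destclientt.company.com"))
```

Running it in a loop from the media server while a restore is in progress might help show whether connectivity drops intermittently, which a single test could miss.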


Re: Some oracle restores are failing

I'll try that. Thanks. But would that explain the intermittent nature of the issue? This morning I watched as the DBA ran the script. It errors out on a specific file he's looking to restore, but then jumps to an earlier file and that one is successful. (As mentioned: same client, same type of file, same storage, same destination, etc.)


Re: Some oracle restores are failing

@DoubleP 

You will need to dig into logs to see where exactly the failure is.

1. bprd on the master server
2. dbclient and RMAN logs on the Oracle client
3. bpbrm and bptm on media server
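
Before opening the debug logs, it may help to pull the Error and Warning entries out of the job's Detailed Status text so you know which timestamps and pids to look for. The following is a minimal Python sketch that assumes the line layout shown in the original post ("Apr 23, 2020 12:25:16 PM - Error bpbrm (pid=278942) ..."). It does not parse the bprd, bpbrm, bptm or dbclient debug log files themselves, whose format is different. The names `parse_status_lines` and `problems` are hypothetical.

```python
import re
from datetime import datetime

# Matches Detailed Status lines of the form pasted in the original post.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M) - "
    r"(?P<level>Error|Warning|Info) (?P<proc>\S+) "
    r"\(pid=(?P<pid>\d+)\) (?P<msg>.*)$"
)


def parse_status_lines(text):
    """Return one dict per line that carries a level, process and pid."""
    events = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip().strip('"'))
        if match:
            events.append({
                "time": datetime.strptime(match["ts"], "%b %d, %Y %I:%M:%S %p"),
                "level": match["level"],
                "process": match["proc"],
                "pid": int(match["pid"]),
                "message": match["msg"],
            })
    return events


def problems(events):
    """Keep only the Error and Warning entries, in their original order."""
    return [e for e in events if e["level"] in ("Error", "Warning")]
```

The process name and pid from each entry (for example bpbrm, pid 278942) can then be searched for in the matching debug log on the media server around the same timestamp.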


Re: Some oracle restores are failing

Thanks. We'll be doing that.


Re: Some oracle restores are failing

I think you have two issues.

One is a general network connectivity issue, as described in my previous post.

The other issue, not being able to find a file, sounds as though the DBA hasn't run an RMAN crosscheck. By default, RMAN believes all backup pieces exist for eternity, but that's not true. To "sync" RMAN and NetBackup, the DBA needs to run the RMAN "crosscheck" command. RMAN will then know which backup pieces to pick from NetBackup in order to fulfill the restore request.

Please see:

RMAN crosscheck

https://docs.oracle.com/html/E10643_07/rcmsynta015.htm

Best Practice for maintaining a consistent backup catalog between Oracle RMAN and NetBackup

https://www.veritas.com/support/en_US/article.100017876