
Oracle RMAN backups not writing logs to dbclient

rjrumfelt
Level 6
I am troubleshooting a backup failure for one of our hot db backups.  The DBA has verified that the libraries have been linked correctly, and to make sure, he relinked them.  I've got VERBOSE = 5 in the bp.conf file.

The flat file backups to this host work without an issue; however, the sparse details we get in the Activity Monitor suggest some type of connectivity issue (child streams fail with status 58 and the following message):

11/09/2009 14:50:40 - Error bprd (pid=9942) Unable to write progress log </usr/openv/netbackup/logs/user_ops/dbext/logs/16212.0.1257777824> on client <clientname-bkup>. Policy=NONE Sched=NONE
11/09/2009 14:50:40 - Error bprd (pid=9942) CLIENT <clientname-bkup>  POLICY NONE  SCHED NONE  EXIT STATUS 58 (can't connect to client)

Although it does not look like it from the error above, the necessary NetBackup parameters are being passed to Oracle: in the RMAN error log I can see the master server name listed for "NB_ORA_SERV" and the policy listed for "NB_ORA_POLICY".

The RMAN error stack also indicates the same type of connectivity issue.  I'm curious as to what could be preventing the logs from being written to the /usr/openv/netbackup/logs/dbclient directory on the host.  It has the same permissions as the bpcd directory (755), which is having logs written to it.  I'm assuming that for some reason NBU and Oracle are not communicating as they should, but I'm having difficulty nailing down where the disconnect might be.

RMAN log:

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ch01 channel at 11/09/2009 14:58:50
ORA-19506: failed to create sequential file, name="bk_96_1_702485024", parms=""
ORA-27028: skgfqcre: sbtbackup returned error
ORA-19511: Error received from media manager layer, error text:
   VxBSACreateObject: Failed with error:
   Server Status:  Communication with the server has not been initiated or the server status has not been retrieved from the serve
 
RMAN>

Also, I've made sure the NB_ORA_SCHED parameter does not need to be passed to Oracle: I deleted one of the two schedules I had set up, so there is only one schedule that I am attempting to back up.

Any thoughts on the issue?
1 ACCEPTED SOLUTION

Accepted Solutions

Marianne
Level 6
Partner    VIP    Accredited Certified
You need to stop/start NetBackup after creating bprd directory on master.
You can restart bprd as follows:
bprdreq -terminate; initbprd

Only client-initiated actions will be affected during stop/start (not running backups).


rjrumfelt
Level 6
So to add to it, I think there may be an issue with one of the SBT_TAPE channels in Oracle, however I have no clue as to how to track that down.

Nicolai
Moderator
Moderator
Partner    VIP   
I think you get the "Unable to write progress log" message because NetBackup can't connect to the client's bpcd daemon (multiple network cards?).

Try running this RMAN script; it's boiled down to the minimum. If it works, we know we have connectivity.

rman nocatalog

connect target
run {
allocate channel ch00 type 'sbt_tape';
send 'NB_ORA_POLICY={YOUR_POLICY}, NB_ORA_SCHED={YOUR_SCHED}';
backup current controlfile;
release channel ch00;
}

But do verify that Oracle has write permissions to /usr/openv/netbackup/logs/user_ops/dbext

The security guys we have don't like 777 permissions so I always do a "chown root:dba dbext" and "chmod 775 dbext"
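
A minimal sketch of how that write-access check could be scripted, assuming it is run as the Oracle software owner on the client (for example after `su - oracle`). The helper name `check_writable` is made up for illustration and is not part of NetBackup; the directories in the usage comment are the ones discussed in this thread.

```sh
#!/bin/sh
# check_writable DIR: report whether the current user can create files in DIR.
# Hypothetical helper, not a NetBackup command.
check_writable() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "MISSING      $dir"
        return 2
    fi
    # Creating a file needs both write and search (execute) permission.
    if [ -w "$dir" ] && [ -x "$dir" ]; then
        echo "WRITABLE     $dir"
        return 0
    fi
    echo "NOT WRITABLE $dir"
    return 1
}

# Example, run as the oracle user on the client:
#   check_writable /usr/openv/netbackup/logs/user_ops/dbext/logs
#   check_writable /usr/openv/netbackup/logs/dbclient
```

Running it as root would not tell you much, since root can write almost anywhere; the point is to test with the account that RMAN runs under.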





Andy_Welburn
Level 6
look at the top of the post as opposed to the bottom :)

Just under the 'Forums' tab?

rjrumfelt
Level 6

setting the "REQUIRED_INTERFACE" variable in the bp.conf file to the backup interface of the client.  I'll pass your script on to the DBA and see what he can do with it.
 

Nicolai
Moderator
Moderator
Partner    VIP   
If your master/media server is on the same subnetwork as the Oracle host, REQUIRED_INTERFACE should do it; if not, you need to add a static route.

Also make sure that in both NetBackup and RMAN you specify the DNS hostname of the NIC you intend to use.


rjrumfelt
Level 6
as client.

/usr/openv/netbackup/logs/user_ops/dbext has permissions set to 777


Could it be a firewall/routing issue even though I can connect and back up without issue when attempting a flat file backup of the system?

Marianne
Level 6
Partner    VIP    Accredited Certified
Does dbclient have 777 permission? Backups will fail if directory is present and oracle user does not have write access.

Nicolai
Moderator
Moderator
Partner    VIP   
File system backups and Oracle backups work very differently.  Is it possible (as a test) to unplumb (unconfigure) the secondary NIC to see if it changes anything?  If that's not possible, you can use tcpdump to see the traffic flow.



rjrumfelt
Level 6
I did not see that tab up there until you pointed it out

:)

Marianne
Level 6
Partner    VIP    Accredited Certified
I agree with Nicolai - have a look at bprd log file on the master to see IP address of client that's connecting to master to request backup. Then see if master can resolve that IP address back to the same hostname that is listed in the policy config. Also check reverse lookup on the media server for incoming IP address that you see in bprd log on the master.
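
A rough sketch of the reverse-lookup check described above, assuming `getent` is available on the master and media server. The function names are made up for illustration, and the IP address in the usage comment is a placeholder, not a value from this thread. The comparison is an exact, case-insensitive match, so a short name and an FQDN count as different names.

```sh
#!/bin/sh
# name_in_list EXPECTED NAME...: succeed if EXPECTED equals one of the names
# (case-insensitive, exact match). Hypothetical helper.
name_in_list() {
    expected=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    shift
    for name in "$@"; do
        candidate=$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]')
        [ "$candidate" = "$expected" ] && return 0
    done
    return 1
}

# check_reverse IP EXPECTED: resolve IP and compare the names returned
# against the client name used in the policy.
check_reverse() {
    ip=$1
    expected=$2
    # getent prints: address, canonical name, then any aliases
    set -- $(getent hosts "$ip")
    [ $# -ge 2 ] || { echo "no reverse entry for $ip"; return 2; }
    shift   # drop the address field, keep the names
    if name_in_list "$expected" "$@"; then
        echo "$ip resolves to $expected"
    else
        echo "$ip resolves to: $* (expected $expected)"
        return 1
    fi
}

# Example, run on the master and again on the media server
# (192.0.2.10 stands in for the IP seen in the bprd log):
#   check_reverse 192.0.2.10 clientname-bkup
```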

rjrumfelt
Level 6
the secondary nic)

in the meantime, below is a snippet from dbclient, when things start heading downhill:

16:56:11.808 [22197] <4> readCommMessages: Entering readCommMessages
17:11:16.701 [22198] <16> readCommFile: ERR - timed out after 900 seconds while reading from /usr/openv/netbackup/logs/user_ops/dbext/logs/22198.0.1257785771
17:11:16.702 [22198] <32> serverResponse: ERR - could not read from comm file </usr/openv/netbackup/logs/user_ops/dbext/logs/22198.0.1257785771>
17:11:16.702 [22198] <16> CreateNewImage: ERR - serverResponse() failed
17:11:16.702 [22198] <4> closeApi: entering closeApi.
17:11:16.702 [22198] <4> closeApi: INF - EXIT STATUS 6: the backup failed to back up the requested files
17:11:16.702 [22198] <16> VxBSACreateObject: ERR - Could not create new image with file /bk_102_1_702492971.
17:11:16.702 [22198] <2> xbsa_ProcessError: INF - entering
17:11:16.702 [22198] <2> xbsa_ProcessError: INF - leaving
17:11:16.702 [22198] <16> xbsa_CreateObject: ERR - VxBSACreateObject: Failed with error: Server Status:  Communication with the server has not been initiated or the server status has not been retrieved from the serve
17:11:16.702 [22198] <2> xbsa_CreateObject: INF - leaving (3)
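
With VERBOSE = 5 the dbclient log gets noisy, so a small filter can help pick out the lines that matter. This is only a sketch: it keeps lines tagged `<16>` or `<32>`, which are the severities on the ERR lines in the snippet above. The function name `nbu_errors` and the log file name in the usage comment are assumptions for illustration.

```sh
#!/bin/sh
# nbu_errors: read legacy-log lines on stdin and keep only those
# tagged with severity <16> or <32>. Hypothetical helper.
nbu_errors() {
    grep -E '<(16|32)> '
}

# Example on the client (file name is an assumption, adjust to your log):
#   nbu_errors < /usr/openv/netbackup/logs/dbclient/log.110909
```

Note that `grep` exits non-zero when nothing matches, so an empty result also shows up as a failed exit status.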

but looking at /usr/openv/netbackup/logs/user_ops/dbext/logs, I've got permissions blown out (777):



****EDIT****

bpbrm and bpcd are also clean on the master server

rjrumfelt
Level 6
bprd isn't logging.  I've got the directory there, and the verbosity on the master is cranked up to 5.

Marianne
Level 6
Partner    VIP    Accredited Certified
This error sends me straight to bprd log on the master:
<16> CreateNewImage: ERR - serverResponse() failed
and
Communication with the server has not been initiated or the server status has not been retrieved from the server


rjrumfelt
Level 6
I'm not getting anything generated under bprd on the master.  The directory is blank.  Double checked and VERBOSE = 5 is still in the bp.conf file.

Marianne
Level 6
Partner    VIP    Accredited Certified
You need to stop/start NetBackup after creating bprd directory on master.
You can restart bprd as follows:
bprdreq -terminate; initbprd

Only client-initiated actions will be affected during stop/start (not running backups).
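
For reference, the same steps could be wrapped up roughly as below. This is a sketch that assumes the standard UNIX install paths under /usr/openv/netbackup and that it is run as root on the master; the `run` and `bounce_bprd` helpers and the `DRY_RUN` switch are made up for illustration.

```sh
#!/bin/sh
# Assumed default install locations; override via the environment if yours differ.
NBU_BIN=${NBU_BIN:-/usr/openv/netbackup/bin}
NBU_LOGS=${NBU_LOGS:-/usr/openv/netbackup/logs}

# run CMD...: print the command; execute it only when DRY_RUN is not 1.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# bounce_bprd: make sure the bprd log directory exists, then stop and
# start the request daemon so it begins writing to that directory.
bounce_bprd() {
    run mkdir -p "$NBU_LOGS/bprd" &&
    run "$NBU_BIN/admincmd/bprdreq" -terminate &&
    run "$NBU_BIN/initbprd"
}

# Usage on the master server, as root:
#   DRY_RUN=1 bounce_bprd   # only print the commands
#   bounce_bprd             # actually restart bprd
```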

rjrumfelt
Level 6
So it is now working correctly.

The fix?  Bouncing bprd.  I bounced bprd per Marianne's suggestion above in order to get my bprd logs running.  An unintended side effect was that bouncing bprd actually fixed the issue.  Not sure how, but it did, so I'm going to give Marianne the nod, and thank you Nicolai for your suggestions as well.


****EDIT****

Now does anyone know why bouncing bprd might have resolved the problem?

Marianne
Level 6
Partner    VIP    Accredited Certified
I expected the bprd log to point out a lookup issue when the master server tries to resolve the incoming IP address from the client.
The only logical explanation is that bprd might have been in some hung state.... I'm as stunned as you are!!

Nicolai
Moderator
Moderator
Partner    VIP   
Are you running LDAP authentication?

If the LDAP server has been unresponsive, bad things happen with NetBackup. Most symptoms point to network connectivity, but when you bounce NetBackup all the problems are gone.

Claudio_Veronez
Level 6
Partner Accredited
Did you use all of the suggestions in this thread to solve it?


Was networking part of the problem?

I'm having the same problem, and it didn't work for me.


Thanks