Intermittent Backup Problems (RMAN, Oracle, Netbackup, SAP)

nlsme3
Not applicable

Hi,

We recently rolled out a new way of performing our DB and redolog backups, based on RMAN with NetBackup. We implemented it on approx. 50 servers (not all on the same day). After a couple of days, issues have arisen on some of those systems (5 of them so far). These are all SAP systems performing their backups with RMAN (backup_dev_type = rman_util).

The symptoms are as follows. The RMAN backup is implemented and the level 0 DB backup runs fine. Subsequent differential incremental DB backups and redolog backups run fine as well. Then, suddenly, a redolog backup fails. So far the problem has always shown up first in a redolog backup, but since we of course run more redolog backups than DB backups, that could be just coincidence. Not only does the backup fail, but from then on no new DB connections can be established (processes that are already running continue to run fine). We can connect only as user SYS; logging on as SYSTEM is not possible either. We need to restart the DB (more precisely, restart the Oracle service) to get around the issue.

Unfortunately, we had not configured the logfile directory in the netbackup\logs folder. We have done that now, but we will need to wait until the problem occurs again, which might take a couple of days.

The logfile of the first failing redologs backup is shown below.

BR0278E Command output of 'G:\oracle\SSS\11203\BIN\rman nocatalog':
Recovery Manager: Release 11.2.0.3.0 - Production on Sun Jul 1 02:30:02 2012
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
RMAN>
RMAN> connect target *
connected to target database: SSS (DBID=4128056176)
using target database control file instead of recovery catalog
RMAN> **end-of-file**
RMAN>
host command complete
RMAN> 2> 3> 4> 5>
allocated channel: sbt_1
channel sbt_1: SID=1508 device type=SBT_TAPE
channel sbt_1: Veritas NetBackup for Oracle - Release 7.1 (20110203)
Starting backup at 01-JUL-12
channel sbt_1: starting archived log backup set
channel sbt_1: specifying archived log(s) in backup set
input archived log thread=1 sequence=3232 RECID=3206 STAMP=787457523
channel sbt_1: starting piece 1 at 01-JUL-12
released channel: sbt_1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on sbt_1 channel at 07/01/2012 02:30:05
ORA-19506: failed to create sequential file, name="SSS_aeiwtbpg.731_1", parms=""
ORA-27028: skgfqcre: sbtbackup returned error
ORA-19511: Error received from media manager layer, error text:
   VxBSACreateObject: Failed with error:
   Server Status:  Communication with the server has not been initiated or the server status has not been retrieved from the serve
RMAN>
specification does not match any backup in the repository
RMAN>
Recovery Manager complete.
BR0200I BR_TRACE: location BrRmanCall-41, commands for RMAN in: G:\oracle\SSS\saparch\.aeiwtbpg.cmd
'@G:\oracle\SSS\saparch\..aeiwtbpg..cmd^'
'host 'G:\usr\sap\SSS\SYS\exe\uc\NTAMD64\brtools.exe -f delete G:\oracle\SSS\saparch\..aeiwtbpg..cmd';^'
'run { allocate channel sbt_1 device type 'SBT_TAPE'^'
'parms 'BLKSIZE=1048576 ENV=(NB_ORA_SAP=G:\oracle\SSS\11203\database\initSSS.utl)';^'
'backup tag aeiwtbpg format 'SSS_aeiwtbpg.%s_%p' filesperset 1 check logical^'
'archivelog from sequence 3232 until sequence 3232 thread 1;^'
'release channel sbt_1; }^'
'list backup of archivelog from sequence 3232 until sequence 3232 thread 1 tag aeiwtbpg;^'
'exit;^'
BR0280I BRARCHIVE time stamp: 2012-07-01 02.30.06
BR0279E Return code from 'G:\oracle\SSS\11203\BIN\rman nocatalog': 1
BR0522I 1 of 1 file / save set processed by RMAN
BR0536E RMAN call for database instance SSS failed
BR0280I BRARCHIVE time stamp: 2012-07-01 02.30.06
BR0543E Offline redolog backup using RMAN failed
BR0016I 0 offline redolog files processed, total size 0.000 MB
BR0280I BRARCHIVE time stamp: 2012-07-01 02.30.06
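
For anyone triaging a log like the one above across many servers, the following is a minimal sketch (not part of BR*Tools, RMAN, or NetBackup) that pulls the distinct error codes out of a BRARCHIVE log. The function name `extract_error_codes` and the shortened sample text are illustrative assumptions. The pattern assumes that BR*Tools error messages are the ones whose ID ends in `E` (as in BR0278E above) and that the RMAN-00571/RMAN-00569 banner lines carry no information.

```python
import re

# Oracle/RMAN codes (e.g. ORA-19506, RMAN-03009) plus BR*Tools messages whose
# ID ends in "E". Informational BR....I lines are deliberately not matched.
ERROR_CODE = re.compile(r"\b(?:ORA|RMAN)-\d{5}\b|\bBR\d{4}E\b")

# Assumption: these two codes only frame the RMAN error stack banner.
BANNER_CODES = {"RMAN-00571", "RMAN-00569"}


def extract_error_codes(log_text):
    """Return the distinct error codes in order of first appearance."""
    seen = []
    for match in ERROR_CODE.finditer(log_text):
        code = match.group(0)
        if code not in BANNER_CODES and code not in seen:
            seen.append(code)
    return seen


# Shortened excerpt of the log above, used only to exercise the function.
sample = """\
BR0278E Command output of 'rman nocatalog':
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on sbt_1 channel at 07/01/2012 02:30:05
ORA-19506: failed to create sequential file, name="SSS_aeiwtbpg.731_1", parms=""
ORA-27028: skgfqcre: sbtbackup returned error
ORA-19511: Error received from media manager layer, error text:
BR0280I BRARCHIVE time stamp: 2012-07-01 02.30.06
BR0536E RMAN call for database instance SSS failed
"""

print(extract_error_codes(sample))
```

Collecting the codes per server makes it easier to see whether every affected system fails with the same ORA-19506 / ORA-27028 / ORA-19511 sequence or whether the errors differ.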

 

Environment:

  • OS: Windows 2008 Server R2
  • Netbackup: 7.1.0.2 (on client as well as on server)
  • Oracle: 11.2.0.3 (with OLTP table compression)
  • SAP: BR*TOOLS 7.01 as well as 7.20

 

Has anyone seen this issue before? Does anyone have a clue how to fix this? Especially the fact that we need to restart the DB to resolve the issue bothers me.

 

Thanks a lot in advance!

Cheers.

4 REPLIES

Yogesh9881
Level 6
Accredited

Can you please post the output of the command below? Run it on the master server.

<install_path>\NetBackup\bin\admincmd\bppllist <netbackup_policy> -L

 

Nicolai
Moderator
Partner    VIP   

I have 10 years of experience with SAP backups, primarily on UNIX, but I have never heard of an issue like that. It sounds like an Oracle bug to me.

 

Marianne
Level 6
Partner    VIP    Accredited Certified

Anything in Windows Application log?

Have you enabled NBU logs in the meantime? (dbclient as a start).

The following could be a timeout or Comms failure:

Server Status:  Communication with the server has not been initiated or the server status has not been retrieved from the server

To troubleshoot, you will need dbclient log on Oracle client as well as the bprd log on the master (NBU needs to be restarted to enable bprd log).

Without logs, anything from our side is mere guessing...
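
NetBackup legacy debug logs are switched on by creating a directory named after the process under the NetBackup logs folder, so preparing the dbclient and bprd logs mentioned above could be sketched as below. This is only an illustration: the helper name `ensure_legacy_log_dirs` is made up, the logs folder is passed in because the real install path is not known here, and the demo runs against a temporary directory instead of a real NetBackup installation.

```python
import tempfile
from pathlib import Path


def ensure_legacy_log_dirs(logs_root, process_names):
    """Create one directory per process name under logs_root.

    Returns the directories that were newly created, so a second run
    returns an empty list.
    """
    created = []
    root = Path(logs_root)
    for name in process_names:
        target = root / name
        if not target.is_dir():
            target.mkdir(parents=True)
            created.append(target)
    return created


# Demo against a throwaway directory. On a real system logs_root would be
# <install_path>\NetBackup\logs, with "dbclient" on the Oracle client and
# "bprd" on the master server.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp) / "NetBackup" / "logs"
    first_run = [p.name for p in ensure_legacy_log_dirs(demo_root, ["dbclient"])]
    second_run = ensure_legacy_log_dirs(demo_root, ["dbclient"])
    print(first_run, second_run)
```

As noted above, the bprd log on the master only starts being written after NetBackup has been restarted.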

Will_Restore
Level 6

We need to restart the DB (more precisely, restart the Oracle Service) to get around the issue.

 

I would log a support case with Oracle ASAP!