Oracle Database backup hangs!

Arshad_Khateeb
Level 5
Certified

Configuration Setup

Master Server - SunOS 5.9     UNIX    Master Server    6.5.6    Connected

Media Server - SunOS 5.10     UNIX    Media Server    6.5.6    Connected

Client - SunOS 5.9     UNIX    Client    6.5.6    Connected

 

We have been experiencing this Oracle DB backup issue since last month. All the child jobs complete, but the parent job just hangs.

dbclient log

> tail log.051915 
10:52:20.027 [10505] <2> int_DumpSbtInfo: INF - Media Information for Backup File : <iwhprod_20150518214547_db_a8q7ao2l_1_1>
10:52:20.027 [10505] <2> int_DumpSbtInfo: INF - Media Sharing Mode : <Multiple Concurrent Users>
10:52:20.027 [10505] <2> int_DumpSbtInfo: INF - File Ordering Mode : <Sequential file access>
10:52:20.027 [10505] <2> int_DumpSbtInfo: INF - Media ID : <AL4417>
10:52:20.027 [10505] <2> int_DumpSbtInfo: INF - File Creation Date and Time : <1432045250>
10:52:20.027 [10505] <2> int_DumpSbtInfo: INF - File Expiration Date and Time : <1440080450>
10:52:20.027 [10505] <2> int_DumpSbtInfo: INF - Comment : <Backup ID : rome_pres-bu_1432045250>
10:52:20.027 [10505] <2> int_DumpSbtInfo: INF - File Creation Method : <Stream>
10:52:20.027 [10505] <2> int_DumpSbtInfo: INF - leaving
10:52:20.027 [10505] <2> sbtinfo2: INF - leaving
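For anyone reading these dbclient lines later, here is a minimal Python sketch (not a NetBackup tool) that pulls the `Key : <value>` pairs out of the `int_DumpSbtInfo` lines and converts the two epoch timestamps. The function names are hypothetical, and the regex assumes the log layout is exactly as pasted above.

```python
import re
from datetime import datetime, timezone

# Assumption: every field line ends with "INF - <key> : <<value>>" as in the paste above.
FIELD_RE = re.compile(r"INF - (?P<key>[^:<]+?) : <(?P<value>.*)>\s*$")


def parse_sbt_info(lines):
    """Collect the key/value pairs from int_DumpSbtInfo log lines (hypothetical helper)."""
    fields = {}
    for line in lines:
        match = FIELD_RE.search(line)
        if match:  # lines such as "INF - leaving" carry no field and are skipped
            fields[match.group("key").strip()] = match.group("value")
    return fields


def retention_days(fields):
    """Days between the creation and expiration timestamps (both epoch seconds)."""
    created = int(fields["File Creation Date and Time"])
    expires = int(fields["File Expiration Date and Time"])
    return (expires - created) / 86400


def to_utc(epoch_text):
    """Convert an epoch-seconds string to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(int(epoch_text), tz=timezone.utc)


sample = [
    "10:52:20.027 [10505] <2> int_DumpSbtInfo: INF - Media ID : <AL4417>",
    "10:52:20.027 [10505] <2> int_DumpSbtInfo: INF - File Creation Date and Time : <1432045250>",
    "10:52:20.027 [10505] <2> int_DumpSbtInfo: INF - File Expiration Date and Time : <1440080450>",
    "10:52:20.027 [10505] <2> int_DumpSbtInfo: INF - Comment : <Backup ID : rome_pres-bu_1432045250>",
    "10:52:20.027 [10505] <2> int_DumpSbtInfo: INF - leaving",
]

info = parse_sbt_info(sample)
print(info["Media ID"], to_utc(info["File Creation Date and Time"]), retention_days(info))
```

With the values from the log, the backup piece was created on 2015-05-19 (UTC) and the expiration is 93 days later, so this piece was already written to media and cataloged with a retention by the time the job appeared hung.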

 

rman log

> tail iwhprod_20150518214547_db_bkup_inc0.log
RMAN-06731: command backup:100.0% complete, time left 00:00:00
RMAN-06731: command backup:100.0% complete, time left 00:00:00
RMAN-06731: command backup:100.0% complete, time left 00:00:00
RMAN-06731: command backup:100.0% complete, time left 00:00:00
RMAN-06731: command backup:100.0% complete, time left 00:00:00
RMAN-06731: command backup:100.0% complete, time left 00:00:00
RMAN-06731: command backup:100.0% complete, time left 00:00:00
RMAN-06731: command backup:100.0% complete, time left 00:00:00
RMAN-06731: command backup:100.0% complete, time left 00:00:00
RMAN-06731: command backup:100.0% complete, time left 00:00:00
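The repeated `RMAN-06731 ... 100.0% complete` lines are what the hang looks like from the RMAN side: the data movement is done, but the command never returns. A rough Python sketch for spotting this pattern in a log tail follows. The function names and the threshold of 5 are assumptions for illustration only; how often RMAN prints this message depends on the script, so tune the threshold to your own logs.

```python
import re

# Matches lines like: RMAN-06731: command backup:100.0% complete, time left 00:00:00
PROGRESS_RE = re.compile(r"RMAN-06731: command (\w+):\s*([\d.]+)% complete")


def trailing_complete_count(lines):
    """Count consecutive '100% complete' progress lines at the end of the log."""
    count = 0
    for line in reversed(lines):
        line = line.strip()
        if not line:
            continue  # ignore blank lines at the end of the tail
        match = PROGRESS_RE.search(line)
        if match and float(match.group(2)) >= 100.0:
            count += 1
        else:
            break  # any other line ends the run
    return count


def looks_stalled(lines, threshold=5):
    """Heuristic only: many trailing 100% lines suggest RMAN is waiting, not working."""
    return trailing_complete_count(lines) >= threshold


tail = ["RMAN-06731: command backup:100.0% complete, time left 00:00:00"] * 10
print(trailing_complete_count(tail), looks_stalled(tail))
```

This only flags the symptom. It does not say whether RMAN is waiting on the media manager, the catalog, or the network.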

 

 

 

26 REPLIES

Arshad_Khateeb
Level 5
Certified

- Hotbackup

- RMAN Script

- Oracle 10.2.0.5

- You mean RMAN trace logs?

- Nope

 

BTW, we have incremental DB backups running today. Both NBU and Oracle logging are enabled. I'll share the logs with you once the backup starts and finishes.

Varunthilak_B
Level 3
Certified

Yes, please. Also send us your RMAN script. If you are using DB authentication, please send us the tnsnames.ora file as well.

Arshad_Khateeb
Level 5
Certified

The change we made was setting the DBMto file to 120. We had a good backup over the weekend, but the return code on the RMAN side was 1 instead of 0.

The DBA says that tracing is turned on within the script, which is the reason for RETURN CODE=1. Once we turn tracing off, this return code should not appear unless there is a real failure.

It looks like this fix worked, but we have another scheduled backup on Thursday. We will see how it goes. I'll keep you all posted.

Arshad_Khateeb
Level 5
Certified

The backups got stuck again.

The next run is on Saturday. We'll be disabling catalog backups during that run.

This is turning into a very interesting troubleshooting exercise, and I imagine you are all waiting to see what fixes this issue (apart from upgrading this old environment).

Updates to follow...

Nicolai
Moderator
Partner    VIP   

Thanks for the update

Arshad_Khateeb
Level 5
Certified

Hip hip hooray!

The database backup this weekend was successful.

We still need to confirm that disabling the catalog backups was indeed the change that worked.

The communication issue between the client and the NBU master server appears to be solved, but I will withhold final judgment until we can repeat this success. A number of changes have been made to both Oracle and NBU, but this last change was specifically aimed at keeping the NBU database quiet enough to notice when RMAN finishes the backup. That suggests there may be a performance issue on the NBU master server. We’ll keep the change in place for Thursday’s incremental backup.

Arshad_Khateeb
Level 5
Certified

Guess what, the issue got resolved.

The root cause is still a mystery. We applied a couple of fixes on the NetBackup side and excluded the catalog backup, the Oracle team made some changes to RMAN too, and the network team might have done something as well (but we were not informed).

In the end, we don't know whose fix worked. We have reverted the changes we made on the NetBackup side and the backup still works.

To conclude this thread, I can say with confidence that this is not a NetBackup issue ;)