10-18-2012 02:12 PM
We have an NBU 7.5.0.3 environment (Windows master) where we are having issues with our RMAN backups. When they run, they fail with NBU error 130, which is just about as vague as you can get. I am able to run a standard backup and grab files from the server without issue, though.
The client bpcd log shows the following error:
16:57:22.160 [18754] <8> file_to_cache_item: [vnet_addrinfo.c:6555] fopen() failed ERRNO=0 FILE=/usr/openv/var/host_cache/1ff/ffffffff+bpcd,1,1,0,1,0+.txt
The dbclient log shows the following:
13:47:21.718 [13983] <2> int_GetMMInfo: INF - Initialized Signal
13:47:21.718 [13983] <2> int_GetMMInfo: INF - support for Proxy Copy enabled
13:47:23.361 [13979] <2> conf_update_time: ../../libvlibs/nbconf_glue.cpp.436: errno: 2 2 0x00000002
13:47:23.361 [13979] <2> conf_update_time: ../../libvlibs/nbconf_glue.cpp.437: path: /home/oracle/bp.conf
13:47:23.361 [13979] <8> file_to_cache_item: [vnet_addrinfo.c:6555] fopen() failed ERRNO=2 FILE=/usr/openv/var/host_cache/062/f06de062+0,1,402,0,1,0+srvycnbmas01.txt
13:47:23.458 [13979] <8> file_to_cache_item: [vnet_addrinfo.c:6555] fopen() failed ERRNO=2 FILE=/usr/openv/var/host_cache/062/f06de062+veritas_pbx,1,20,2,1,0+srvycnbmas01.txt
13:47:23.507 [13979] <8> file_to_cache_item: [vnet_addrinfo.c:6555] fopen() failed ERRNO=2 FILE=/usr/openv/var/host_cache/0a4/6b4e98a4+0,1,402,0,1,0+10.246.123.110.txt
13:47:23.507 [13979] <8> file_to_cache_item: [vnet_addrinfo.c:6555] fopen() failed ERRNO=2 FILE=/usr/openv/var/host_cache/062/f06de062+vnetd,1,20,2,1,0+srvycnbmas01.txt
13:47:23.543 [13977] <2> conf_update_time: ../../libvlibs/nbconf_glue.cpp.436: errno: 2 2 0x00000002
13:47:23.544 [13977] <2> conf_update_time: ../../libvlibs/nbconf_glue.cpp.437: path: /home/oracle/bp.conf
13:47:23.544 [13977] <8> file_to_cache_item: [vnet_addrinfo.c:6555] fopen() failed ERRNO=2 FILE=/usr/openv/var/host_cache/062/f06de062+0,1,402,0,1,0+srvycnbmas01.txt
13:47:23.555 [13979] <8> file_to_cache_item: [vnet_addrinfo.c:6555] fopen() failed ERRNO=2 FILE=/usr/openv/var/host_cache/062/f06de062+bprd,1,20,2,1,0+srvycnbmas01.txt
13:47:23.607 [13979] <2> vnet_pbxConnect: pbxConnectEx Succeeded
13:47:23.607 [13979] <2> logconnections: BPRD CONNECT FROM 10.246.18.113.1853 TO 10.246.123.110.1556 fd = 21
13:47:23.640 [13977] <8> file_to_cache_item: [vnet_addrinfo.c:6555] fopen() failed ERRNO=2 FILE=/usr/openv/var/host_cache/062/f06de062+veritas_pbx,1,20,2,1,0+srvycnbmas01.txt
13:47:23.690 [13977] <8> file_to_cache_item: [vnet_addrinfo.c:6555] fopen() failed ERRNO=2 FILE=/usr/openv/var/host_cache/0a4/6b4e98a4+0,1,402,0,1,0+10.246.123.110.txt
Any suggestions? Any other information you require?
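For anyone reading logs like these, a quick way to see which errno values dominate is to tally the `fopen() failed` lines. The sketch below is only an illustration, not a NetBackup tool: the function names are made up, and it assumes the log line layout shown above. ERRNO=2 is ENOENT ("No such file or directory"), so those lines say the host_cache file was not there when it was looked up.

```python
import errno
import re
from collections import Counter

# Matches the "fopen() failed ERRNO=<n> FILE=<path>" part of a log line
# (layout assumed from the bpcd/dbclient excerpts in this thread).
FOPEN_RE = re.compile(r"fopen\(\) failed ERRNO=(\d+) FILE=(\S+)")


def summarize_fopen_failures(lines):
    """Count fopen() failures per errno value (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = FOPEN_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


def errno_name(code):
    """Translate a numeric errno to its symbolic name, if the OS knows it."""
    return errno.errorcode.get(code, "UNKNOWN")
```

Feeding it the lines of a dbclient log (for example `summarize_fopen_failures(open(path))`) gives a per-errno count that can be printed alongside `errno_name(code)`.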
10-19-2012 02:05 AM
10-19-2012 05:46 AM
I'll grab the logs for you shortly. Our DBA group manages the RMAN jobs, and they are pretty typical RMAN scripts. We have a few that seem to work fine, and others that have never worked since we migrated over to our new 7.5.0.3 environment. If I migrate them back to the 7.0.1 environment that they used to work on, they now fail with the same error code. I tried re-installing the client with no luck either. I am able to run a backup from my master and grab files on the server, but the other direction fails.
10-19-2012 08:00 AM
Attached are the BPCD and the DBCLIENT logs. I'm going to try re-linking them to see if that will do anything. Any other suggestions would be appreciated.
Dan
10-22-2012 05:54 AM
Does anyone have any ideas? If not, I will open a ticket.
10-22-2012 03:23 PM
What's the O/S of the clients? (Guessing /usr/openv means maybe not Windows)
How full are the disks?
10-23-2012 01:33 AM
Info in dbclient log:
Veritas NetBackup for Oracle - Release 7.5 (2012060523)
System name: Linux
Node name: SRVYCORAP28
Release: 2.6.18-164.15.1.el5
Version: #1 SMP Mon Mar 1 10:56:08 EST 2010
Machine: x86_64
User name: oracle
Client Host: SRVYCORAP28
10-23-2012 01:58 AM
All I can see in dbclient log is a timeout:
08:29:03.671 [23100] <16> readCommFile: ERR - timed out after 900 seconds while reading from /usr/openv/netbackup/logs/user_ops/dbext/logs/23100.0.1350656041
Large databases normally require Client Read and Client Connect timeouts of at least 1800.
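To pick lines like the timeout above out of a busy dbclient log, something along these lines can help. It is a rough sketch with hypothetical names, and it assumes the `HH:MM:SS.mmm [pid] <severity> function: message` layout seen in the excerpts in this thread, with `<16>` and above treated as errors.

```python
import re

# time, pid, severity, function name, message (layout assumed from this thread)
LOG_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3}) \[(\d+)\] <(\d+)> (\w+): (.*)$")


def error_lines(lines, min_severity=16):
    """Return (time, pid, severity, function, message) for severe entries."""
    found = []
    for line in lines:
        match = LOG_RE.match(line.strip())
        if not match:
            continue  # skip headers and continuation lines
        time, pid, severity, func, message = match.groups()
        if int(severity) >= min_severity:
            found.append((time, int(pid), int(severity), func, message))
    return found
```

Running it over the log file and printing the result should leave only entries such as the `readCommFile` timeout, which makes it easier to see whether anything else is failing.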
Please double-check that all of these log folders on the client have 777 permissions (I have seen similar errors when the Oracle user account did not have write permission on the log folders):
dbclient
bphdb
user_ops
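The permission check on those three folders can be scripted. This is a minimal sketch with made-up function names; it assumes the folders sit under /usr/openv/netbackup/logs (the path that appears in the timeout message above) and simply reports any folder that is missing or not fully open (777).

```python
import os
import stat


def check_log_dirs(base, names=("dbclient", "bphdb", "user_ops")):
    """Map each folder name to its permission bits, or None if it is missing."""
    results = {}
    for name in names:
        path = os.path.join(base, name)
        try:
            results[name] = stat.S_IMODE(os.stat(path).st_mode)
        except FileNotFoundError:
            results[name] = None
    return results


def problems(results, want=0o777):
    """Names of folders that are missing or lack any of the wanted bits."""
    return sorted(
        name for name, mode in results.items()
        if mode is None or (mode & want) != want
    )
```

For example, `problems(check_log_dirs("/usr/openv/netbackup/logs"))` returns an empty list when all three folders exist with 777 permissions.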
10-23-2012 10:23 AM
Hey Chris,
One of the servers giving us problems is a brand new RHEL 6 build with very little data sitting on it. I have checked the space on the other servers and we are good, but good suggestion.
Dan
10-23-2012 10:25 AM
We tried changing it to 777 without any luck, and the client in this example is a new database, so it shouldn't be having a timeout issue.
We did find that re-linking the database seems to be working (on two we have tried). We will keep testing and I will advise.
10-23-2012 11:57 AM
After a bit more testing, we are noticing that after any change to the script (in our case a Schedule change), it fails again. Do we need to re-link every time we make a change to the client? We never have in the past.
10-26-2012 04:48 PM
10-29-2012 06:16 AM
It looks like rebooting the client has fixed the two giving us grief, but I will keep an eye on things and keep everyone updated if this occurs during the remaining upgrades. Thanks for all your help.