12-02-2018 07:39 PM
Hello All,
Oracle backups are failing for one of our servers with error 54.
The client is a virtual IP hosted on one of the physical machines, and it backs up over the network to a media server.
A few streams complete successfully without any issue, but others fail with 54.
bptestbpcd to that virtual IP shows the following when I run it with the verbose option:
20:42:32.818 [27678] <8> retry_getaddrinfo_for_real: [vnet_addrinfo.c:1082] getaddrinfo() failed RV=8 NAME=uspebrms01 SVC=0 errno=0
20:42:32.818 [27678] <8> retry_getaddrinfo: [vnet_addrinfo.c:922] retry_getaddrinfo_for_real failed RV=8 NAME=uspebrms01 SVC=0
20:42:32.818 [27678] <8> vnet_cached_getaddrinfo_and_update: [vnet_addrinfo.c:1732] retry_getaddrinfo() failed RV=8 NAME=uspebrms01 SVC=NULL
20:42:32.821 [27678] <8> vnet_cached_getaddrinfo: [vnet_addrinfo.c:1360] vnet_cached_getaddrinfo_and_update() failed 6 0x6
20:42:32.821 [27678] <8> vnet_same_host_and_update: [vnet_addrinfo.c:2971] vnet_cached_getaddrinfo() failed STAT=6 RV=8 NAME1=uspebrms01
20:42:34.150 [27678] <8> retry_getaddrinfo_for_real: [vnet_addrinfo.c:1082] getaddrinfo() failed RV=8 NAME=uspebr02 SVC=0 errno=0
20:42:34.150 [27678] <8> retry_getaddrinfo: [vnet_addrinfo.c:922] retry_getaddrinfo_for_real failed RV=8 NAME=uspebr02 SVC=0
20:42:34.150 [27678] <8> vnet_cached_getaddrinfo_and_update: [vnet_addrinfo.c:1732] retry_getaddrinfo() failed RV=8 NAME=uspebr02 SVC=NULL
20:42:34.153 [27678] <8> vnet_cached_getaddrinfo: [vnet_addrinfo.c:1360] vnet_cached_getaddrinfo_and_update() failed 6 0x6
20:42:34.153 [27678] <8> vnet_same_host_and_update: [vnet_addrinfo.c:2971] vnet_cached_getaddrinfo() failed STAT=6 RV=8 NAME1=uspebr02
20:42:34.157 [27678] <8> vnet_cached_getaddrinfo_and_update: [vnet_addrinfo.c:1637] in failed cache ERR=8 NAME=uspebrms01 SVC=NULL
20:42:34.157 [27678] <8> vnet_cached_getaddrinfo: [vnet_addrinfo.c:1360] vnet_cached_getaddrinfo_and_update() failed 6 0x6
20:42:34.157 [27678] <8> vnet_same_host_and_update: [vnet_addrinfo.c:2978] vnet_cached_getaddrinfo() failed STAT=6 RV=8 NAME2=uspebrms01
20:42:34.157 [27678] <8> vnet_cached_getaddrinfo_and_update: [vnet_addrinfo.c:1637] in failed cache ERR=8 NAME=uspebr02 SVC=NULL
20:42:34.157 [27678] <8> vnet_cached_getaddrinfo: [vnet_addrinfo.c:1360] vnet_cached_getaddrinfo_and_update() failed 6 0x6
20:42:34.157 [27678] <8> vnet_same_host_and_update: [vnet_addrinfo.c:2978] vnet_cached_getaddrinfo() failed STAT=6 RV=8 NAME2=uspebr0220:42:34.627 [27678] <8> vnet_vnetd_push_ipaddr: [vnet_vnetd.c:1820] bptr[i] 135 0x87
20:42:34.628 [27678] <8> vnet_vnetd_push_ipaddr: [vnet_vnetd.c:1820] bptr[i] 209 0xd1
20:42:34.628 [27678] <8> vnet_vnetd_push_ipaddr: [vnet_vnetd.c:1820] bptr[i] 14 0xe
20:42:34.628 [27678] <8> vnet_vnetd_push_ipaddr: [vnet_vnetd.c:1820] bptr[i] 11 0xb
Could someone suggest what might be the issue?
I have verified the following:
The virtual IP exists on the host. It is not an app_cluster; it is a single-node database.
All entries exist in the bp.conf file. Ping and telnet to all NBU ports connect, traceroute works, and the client is on the same subnet as the media server.
Services are fine. All ports are in the LISTEN state.
File level backups for physical host are completing fine.
Media server: Solaris, NBU version 7.7.2
Please let me know if any further information is required.
12-03-2018 03:17 AM
Can you please check name resolution for all interfaces of the master and media servers from the client, and vice versa, using the bpclntcmd and nslookup commands?
I suspect a name resolution issue here, which can be fixed with DNS or /etc/hosts entries.
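As a rough cross-check alongside bpclntcmd and nslookup, here is a minimal Python sketch that does a forward lookup and then a reverse lookup for each address, using the operating system resolver. The function name check_resolution is made up for illustration and is not part of NetBackup. The RV=8 in the log is presumably the getaddrinfo() return code; on Solaris that value should correspond to EAI_NONAME (name not known), but treat that mapping as an assumption.

```python
import socket


def check_resolution(name, forward=socket.getaddrinfo, reverse=socket.gethostbyaddr):
    """Resolve name to addresses, then map each address back to a host name."""
    result = {"name": name, "addresses": [], "reverse": {}, "error": None}
    try:
        infos = forward(name, None)
    except socket.gaierror as exc:
        # Same class of failure that the log reports for getaddrinfo().
        result["error"] = str(exc)
        return result
    # getaddrinfo returns one tuple per socket type; keep unique addresses in order.
    for info in infos:
        addr = info[4][0]
        if addr not in result["addresses"]:
            result["addresses"].append(addr)
    for addr in result["addresses"]:
        try:
            result["reverse"][addr] = reverse(addr)[0]
        except (socket.herror, socket.gaierror) as exc:
            result["reverse"][addr] = "reverse lookup failed: %s" % exc
    return result

# Example (run on the master, the media server and the client in turn):
#   for host in ("uspebrms01", "uspebr02"):
#       print(check_resolution(host))
```

If the forward lookup fails on one host but works on another, or the reverse lookup returns a different name than expected, that points at DNS or /etc/hosts on the host where it fails.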
12-03-2018 03:52 AM
I am missing something - why would the Oracle server have a virtual IP if not part of a cluster?
bptestbpcd does not seem to show all of the output.
Could you please share everything? (including the initial command that shows client hostname)
Where did you run bptestbpcd from?
Master or media server?
It is important to run the command from both the master and the media server.
The bptestbpcd snippet contains 3 hostnames:
uspebrms01
uspebr02
uspebr0220
Could you tell us the role of each of these names?
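For longer bptestbpcd output, a small Python sketch like the one below can pull out the distinct names that failed to resolve. The helper names failed_names and pushed_ip are hypothetical, and the regular expressions only assume the line layout seen in the snippet posted above. Reading the four bptr[i] values as the bytes of an IPv4 address is a guess based on the function name vnet_vnetd_push_ipaddr. Note also that the third name may simply be uspebr02 run together with the timestamp of the next log line, so it is worth confirming against the raw output.

```python
import re

# Matches NAME=, NAME1= and NAME2= fields as they appear in the pasted log.
NAME_RE = re.compile(r"\bNAME\d?=([A-Za-z0-9_.-]+)")
# Matches the decimal value logged after "bptr[i]".
BYTE_RE = re.compile(r"bptr\[i\] (\d+)")


def failed_names(log_text):
    """Return distinct host names from lines that mention a failure, in first-seen order."""
    seen = []
    for line in log_text.splitlines():
        if "failed" not in line:
            continue
        for name in NAME_RE.findall(line):
            if name not in seen:
                seen.append(name)
    return seen


def pushed_ip(log_text):
    """Join the bptr[i] values with dots (assumed here to be IPv4 address bytes)."""
    return ".".join(BYTE_RE.findall(log_text))
```

Feeding the whole log file to failed_names gives a short list of names to check with bpclntcmd -hn or nslookup on each host.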
Have you checked output of these commands on the client?
bpclntcmd -pn
bpclntcmd -self
If all hostname lookup/connection is good, then there is one more thing to check:
Timeouts on master and media server.
For large databases, it is sometimes necessary to increase CLIENT_CONNECT_TIMEOUT and CLIENT_READ_TIMEOUT on the master and media servers.
Try 1800 (default is 300).
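To make the timeout change concrete, here is a minimal Python sketch that sets both values in the text of a bp.conf file. It assumes the two settings meant are CLIENT_CONNECT_TIMEOUT and CLIENT_READ_TIMEOUT, and that bp.conf uses `KEY = value` lines. The helper name set_bp_conf_entry and the server name master01 are made up for illustration. In practice the same change is usually made through Host Properties or by editing /usr/openv/netbackup/bp.conf on the master and media servers, so treat this only as an illustration of the resulting entries.

```python
def set_bp_conf_entry(conf_text, key, value):
    """Return conf_text with `key = value` set, replacing the first existing entry or appending."""
    new_line = "%s = %s" % (key, value)
    lines = conf_text.splitlines()
    for i, line in enumerate(lines):
        # Compare only the part before the first "=" so values are ignored.
        if "=" in line and line.split("=", 1)[0].strip() == key:
            lines[i] = new_line
            break
    else:
        # No existing entry was found, so add one at the end.
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Hypothetical bp.conf content; master01 is a placeholder server name.
sample = "SERVER = master01\nCLIENT_READ_TIMEOUT = 300\n"
updated = set_bp_conf_entry(sample, "CLIENT_READ_TIMEOUT", 1800)
updated = set_bp_conf_entry(updated, "CLIENT_CONNECT_TIMEOUT", 1800)
print(updated)
```

Work on a copy of the file and compare it with the original before replacing anything on a production server.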
12-14-2018 09:22 AM