01-15-2015 09:29 AM
We have a Solaris NBU master, and Solaris and Linux clients, each of which has a NIC on the public network and a NIC on the backup network.
The Solaris and Linux clients are all configured with a bp.conf of:
SERVER = NBU master backup NIC
CLIENT = client backup NIC
A policy has been set up for each O/S with the client using the backup NIC. The Solaris clients work, but the Linux clients do not: the host will not connect.
If we change bp.conf to use the public NIC for SERVER and CLIENT, and change the client in the policy to use the public NIC, then Linux will connect.
I have checked all the obvious stuff between the NBU master and the Linux client:
Any ideas what else to check?
The NBU master is 7.5.0.6 on Solaris 10
The Solaris client is 7.5 on Solaris 10
The Linux client is 7.0.1 on Redhat 6.4
Mike
01-15-2015 11:12 AM
Are you using a REQUIRED_INTERFACE entry in the bp.conf file on the Linux servers?
CLIENT_NAME = <name> - this specifies your client name
REQUIRED_INTERFACE = "Backup-vlan-IP-Address"
This forces ALL NetBackup traffic through the IP address given for REQUIRED_INTERFACE.
01-15-2015 11:36 AM
I have tried both REQUIRED_INTERFACE and PREFERRED_INTERFACE, and the client will still not connect if I add it using the hostname that resolves to the backup interface.
Mike
01-15-2015 11:45 AM
Required Interface can be used when the desired result is compatible with the configuration of the operating system and the networking layers and devices.
Please read the tech doc: Required Interface and best practices (TECH135376). This will give you a detailed checklist of things to watch out for.
01-15-2015 12:37 PM
I have already read this. As I understand, you use REQUIRED_INTERFACE and PREFERRED_INTERFACE if your NBU master does not have a backup interface so that communication between client and NBU master can happen on public network while backup data goes over the backup network.
We do not need this (I only tried out of desperation) as the NBU master is on the backup network, so ALL traffic should be able to use the backup network.
Are there any logs in NBU to say what the issue is? If I do a tcpdump on the client, I can see the request come from the NBU master, but the client does not respond.
I have tried 4 different Linux clients, some VMs and some physical, and get the same issue on all of them.
Mike
01-15-2015 05:07 PM
Does the Red Hat Linux client have iptables configured and running, which may have rules for the public NIC but not the backup NIC?
01-16-2015 12:28 AM
I agree with Watsons
Use "iptables -F" to flush all active tables
What does a traceroute between the client and the master show? Any chance that the subnet mask on the Linux client is wrongly configured?
tcpdump is a pretty good tool to identify where traffic is going.
01-16-2015 03:06 AM
iptables is running and has the following rules:
# iptables -L
WARNING: All config files need .conf: /etc/modprobe.d/fc-hba.conf.old, it will be ignored in a future release.
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
I have substituted IP addresses, so the output below has:
nbu_master (192.10.10.176)
nbu_master_bck (10.10.10.238)
lnx_client (164.10.10.246)
lnx_client_bck (10.10.10.244)
Output of traceroute and telnet to ports:
From NBU client:
[root@lnx_client netbackup]# traceroute nbu_master
traceroute to nbu_master (192.10.10.176), 30 hops max, 60 byte packets
1 164.10.10.130 (164.10.10.130) 0.233 ms 0.277 ms 0.318 ms
2 nbu_master.domain (192.10.10.176) 0.239 ms 0.249 ms 0.261 ms
[root@lnx_client netbackup]# traceroute nbu_master_bck
traceroute to nbu_master_bck (10.10.10.238), 30 hops max, 60 byte packets
1 nbu_master_bck.domain (10.10.10.238) 0.973 ms 0.992 ms 0.986 ms
[root@lnx_client netbackup]# telnet nbu_master_bck 1556
Trying 10.10.10.238...
Connected to nbu_master_bck.
Escape character is '^]'.
^]
telnet> quit
Connection closed.
[root@lnx_client netbackup]# telnet nbu_master_bck 13782
Trying 10.10.10.238...
Connected to nbu_master_bck.
Escape character is '^]'.
^CConnection closed by foreign host.
From NBU master:
root@nbu_master # traceroute lnx_client
traceroute: Warning: Multiple interfaces found; using 192.10.10.176 @ igb0
traceroute to lnx_client (164.10.10.246), 30 hops max, 40 byte packets
1 164.10.10.130 (164.10.10.130) 0.370 ms 0.267 ms 0.243 ms
2 lnx_client.domain (164.10.10.246) 0.215 ms 0.191 ms 0.177 ms
root@nbu_master # traceroute lnx_client_bck
traceroute: Warning: Multiple interfaces found; using 10.10.10.238 @ igb1
traceroute to lnx_client_bck (10.10.10.244), 30 hops max, 40 byte packets
1 lnx_client_bck.domain (10.10.10.244) 0.419 ms 0.232 ms 0.203 ms
root@nbu_master # telnet lnx_client_bck 1556
Trying 10.10.10.244...
Connected to lnx_client_bck.domain.
Escape character is '^]'.
^]
telnet> quit
Connection to lnx_client_bck.domain closed.
root@nbu_master # telnet lnx_client_bck 13782
Trying 10.10.10.244...
Connected to lnx_client_bck.domain.
Escape character is '^]'.
^]
telnet> quit
Connection to lnx_client_bck.domain closed.
So as you can see, the backup IPs for both hosts are in the same subnet, so there are no routes involved and the traffic just goes from interface to interface.
For the public network, the NBU master and Linux client are in different networks, so they require a route to communicate.
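The same-subnet reasoning above can be checked mechanically. This is only a sketch: the helper names ip_to_int and same_subnet are made up for illustration, and the netmasks (/24 for the backup network, /25 for the client's public network) are assumptions taken from the client's routing table posted later in this thread.

```shell
# Convert a dotted-quad address (e.g. 10.10.10.238) to a single integer.
ip_to_int() {
    _old_ifs=$IFS
    IFS=.
    set -- $1          # split on dots into $1..$4
    IFS=$_old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# same_subnet ADDR_A ADDR_B NETMASK -> exit status 0 if both share a subnet.
same_subnet() {
    _a=$(ip_to_int "$1")
    _b=$(ip_to_int "$2")
    _m=$(ip_to_int "$3")
    [ $(( _a & _m )) -eq $(( _b & _m )) ]
}

# Backup NICs (assumed /24): expected to be on one subnet.
if same_subnet 10.10.10.238 10.10.10.244 255.255.255.0; then
    echo "backup NICs: same subnet, no gateway needed"
fi
# Public NICs (assumed /25 on the client side): expected to be routed.
if ! same_subnet 192.10.10.176 164.10.10.246 255.255.255.128; then
    echo "public NICs: different subnets, routed"
fi
```

Note that this only covers the physical addresses of the master. The cluster VIP addresses would need the same check.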
If this is a network issue, then can anyone explain what I need to check at the network level as everything seems to be working fine, else what can I check from an NBU config perspective.
Mike
01-16-2015 03:58 AM
I would use the bpcd log together with the bptestbpcd and bpclntcmd commands as described here:
both with and without the firewall running.
01-16-2015 04:47 AM
Hi Mike
Best to disable iptables on Linux client.
As from NBU 7.0.1, NBU will first try to connect using PBX (port 1556).
If that fails, vnetd will be tried (port 13724).
If vnetd fails as well, bpcd will be tried (port 13782).
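To reproduce that fallback order by hand, something like the sketch below could be run from the master. The function names are made up for illustration, and it assumes bash (for /dev/tcp) and the coreutils timeout command are available, which may not be true on a stock Solaris 10 host.

```shell
# Order in which the server tries the client ports, per the post above.
nbu_connect_order() {
    echo "pbx:1556 vnetd:13724 bpcd:13782"
}

# probe_port HOST PORT -> exit 0 if a TCP connection opens within 5 seconds.
# Assumption: bash and coreutils timeout are installed.
probe_port() {
    timeout 5 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# probe_client HOST -> report each port in the order above.
probe_client() {
    for entry in $(nbu_connect_order); do
        name=${entry%%:*}
        port=${entry##*:}
        if probe_port "$1" "$port"; then
            echo "$name ($port): open"
        else
            echo "$name ($port): no connection"
        fi
    done
}
```

Usage would be `probe_client lnx_client_bck`. Like telnet, this connects from whatever source address the operating system picks, which is not necessarily the address NetBackup itself connects from, so an "open" result here does not prove that NetBackup can connect.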
This is how I would troubleshoot:
Check on client that bpcd, vnetd and pbx_exchange are running.
Create bpcd log folder on client (under /usr/openv/netbackup/logs).
Troubleshoot connectivity using NBU commands as per Michael's suggestions.
Check forward and reverse name lookup between client and servers using bpclntcmd.
See http://www.symantec.com/docs/TECH27430
Test connectivity using bptestbpcd from master and media server:
bptestbpcd -client lnx_client_bck -debug -verbose
Post output of command as well as client's bpcd log.
If nothing is logged in bpcd, iptables or company firewall is blocking port connection.
01-16-2015 04:58 AM
If you can telnet to the required ports then the firewall is not the issue. Can you connect to the client from the Admin GUI? Host Properties > Client > Client-bkp >Properties
01-16-2015 06:42 AM
That is the way I view it: the firewall is not an issue if I can telnet to the ports, unless there are other ports that NBU uses or NBU uses protocols other than TCP (for example, if NBU uses UDP).
Anyway, I stopped iptables and still have the same issue.
Connecting to the client from the Admin GUI (Host Properties > Client > Client-bkp > Refresh Selected; Properties is greyed out) is what I am doing to verify whether it works. It works with lnx_client, but not with lnx_client_bck (I also change bp.conf to match and restart NetBackup).
So with bp.conf:
SERVER = nbu_master_bck
CLIENT_NAME = lnx_client_bck
MEDIA_SERVER = nbu_media_bck
Then if I stop and start NetBackup (service netbackup stop and service netbackup start) and add client lnx_client_bck to a policy, then when I click on "Refresh Selected" for the client in Host Properties the status says "cannot connect on socket". But with bp.conf:
SERVER = nbu_master
CLIENT_NAME = lnx_client
MEDIA_SERVER = nbu_media_bck
Then if I delete client in policy and recreate as lnx_client, then client connects when I click on "Refresh Selected"
Output of forward and reverse lookup:
root@nbu_master # ./bpclntcmd -hn lnx_client_bck
host lnx_client_bck: lnx_client_bck.domain at 10.10.10.244
aliases: lnx_client_bck.domain lnx_client_bck 10.10.10.244
root@nbu_master # ./bpclntcmd -hn nbu_master_bck
host nbu_master_bck: nbu_master_bck at 10.10.10.238
aliases: nbu_master_bck 10.10.10.238
root@nbu_master # ./bpclntcmd -hn 10.10.10.244
host 10.10.10.244: lnx_client_bck.domain at 10.10.10.244
aliases: lnx_client_bck.domain 10.10.10.244
root@nbu_master # ./bpclntcmd -hn 10.10.10.238
host 10.10.10.238: nbu_master_bck at 10.10.10.238
aliases: nbu_master_bck 10.10.10.238
[root@lnx_client bin]# ./bpclntcmd -hn lnx_client_bck
host lnx_client_bck: lnx_client_bck at 10.10.10.244
aliases: lnx_client_bck 10.10.10.244
[root@lnx_client bin]# ./bpclntcmd -hn nbu_master_bck
host nbu_master_bck: nbu_master_bck at 10.10.10.238
aliases: nbu_master_bck 10.10.10.238
[root@lnx_client bin]# ./bpclntcmd -hn 10.10.10.244
host 10.10.10.244: lnx_client_bck.domain at 10.10.10.244
aliases: lnx_client_bck.domain 10.10.10.244
[root@lnx_client bin]# ./bpclntcmd -hn 10.10.10.238
host 10.10.10.238: nbu_master_bck.domain at 10.10.10.238
aliases: nbu_master_bck.domain 10.10.10.238
Actually the client connects via the NBU VIP as the NBU master is clustered (there is a public and a bck VIP), and bpclntcmd works for the VIP too, as does telnetting to the ports:
Output from bptestbpcd :
root@nbu_master # bptestbpcd -client lnx_client_bck -debug -verbose
14:14:42.493 [20317] <2> bptestbpcd: VERBOSE = 0
14:14:42.494 [20317] <2> ConnectionCache::connectAndCache: Acquiring new connection for host nbu_mast_vip, query type 223
14:14:42.499 [20317] <2> logconnections: BPDBM CONNECT FROM 192.10.10.183.40627 TO 192.10.10.183.13721 fd = 4
14:14:42.592 [20317] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:922] VxSS magic 1450726227 0x56785353
14:14:42.593 [20317] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:923] *use_vxss 0 0x0
14:14:42.613 [20317] <8> check_if_machine_cred: [vnet_vxss_helper.c:1735] authenticated peer machine name nbu_mast_vip
14:14:42.614 [20317] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1017] PEER INFORMATION
14:14:42.614 [20317] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1018] Name nbu_mast_vip
14:14:42.614 [20317] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1019] Domain NBU_Machines@nbu_mast_vip
14:14:42.614 [20317] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1020] Expiry Jan 13 09:28:29 2016 GMT
14:14:42.614 [20317] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1021] Issued by /CN=broker/OU=root@nbu_mast_vip/O=vx
14:14:42.616 [20317] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1026] Local INFORMATION
14:14:42.616 [20317] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1027] Name root
14:14:42.616 [20317] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1028] Domain nbu_mast_vip
14:14:42.616 [20317] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1029] Expiry Nov 27 08:49:13 2014 GMT
14:14:42.617 [20317] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1030] Issued by /CN=broker/OU=root@nbu_mast_vip/O=vx
14:14:42.617 [20317] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1034] Using VxSS authentication 1 0x1
14:14:42.944 [20317] <2> db_CLIENTsend: reset client protocol version from 0 to 8
14:14:43.009 [20317] <2> db_getCLIENT: db_CLIENTreceive: no entity was found 227
14:14:43.012 [20317] <2> copy_preferred_network_list: ../../libvlibs/nbconf.c.1610: pseudo PN for : other_host
14:14:43.013 [20317] <2> copy_preferred_network_list: ../../libvlibs/nbconf.c.1610: pseudo PN for : other_host
14:14:43.014 [20317] <2> copy_preferred_network_list: ../../libvlibs/nbconf.c.1610: pseudo PN for : other_host
14:17:55.043 [20317] <8> async_connect: [vnet_connect.c:1653] getsockopt SO_ERROR returned 145 0x91
14:18:07.045 [20317] <8> async_connect: [vnet_connect.c:1653] getsockopt SO_ERROR returned 145 0x91
14:18:19.047 [20317] <8> async_connect: [vnet_connect.c:1653] getsockopt SO_ERROR returned 145 0x91
What does:
14:14:43.009 [20317] <2> db_getCLIENT: db_CLIENTreceive: no entity was found 227
mean from above?
From client:
# ps -ef | egrep "bp|vnet|pbx"
root 5625 1 0 Jan13 ? 00:00:00 /opt/VRTSpbx/bin/pbx_exchange
root 57954 43610 0 14:28 pts/0 00:00:00 egrep bp|vnet|pbx
root 60172 1 0 13:48 ? 00:00:00 /usr/openv/netbackup/bin/vnetd -standalone
root 60175 1 0 13:48 ? 00:00:00 /usr/openv/netbackup/bin/bpcd -standalone
I don't get anything created in /usr/openv/netbackup/logs/bpcd when I click on "Refresh Selected" for lnx_client_bck.
tcpdump from lnx_client when I click on "Refresh Selected" for lnx_client_bck gives:
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
14:31:07.964130 IP nbu_mast_vip.domain.30833 > lnx_client_bck.domain.veritas_pbx: Flags [S], seq 394969139, win 32770, options [mss 1460,nop,nop,TS val 855551417 ecr 0,nop,wscale 7,nop,nop,sackOK], length 0
14:31:09.102579 IP nbu_mast_vip.domain.30833 > lnx_client_bck.domain.veritas_pbx: Flags [S], seq 394969139, win 32770, options [mss 1460,nop,nop,TS val 855551531 ecr 0,nop,wscale 7,nop,nop,sackOK], length 0
14:31:11.372748 IP nbu_mast_vip.domain.30833 > lnx_client_bck.domain.veritas_pbx: Flags [S], seq 394969139, win 32770, options [mss 1460,nop,nop,TS val 855551758 ecr 0,nop,wscale 7,nop,nop,sackOK], length 0
14:31:15.893077 IP nbu_mast_vip.domain.30833 > lnx_client_bck.domain.veritas_pbx: Flags [S], seq 394969139, win 32770, options [mss 1460,nop,nop,TS val 855552209 ecr 0,nop,wscale 7,nop,nop,sackOK], length 0
14:31:17.965195 IP nbu_mast_vip.domain.33929 > lnx_client_bck.domain.vnetd: Flags [S], seq 408876887, win 32770, options [mss 1460,nop,nop,TS val 855552417 ecr 0,nop,wscale 7,nop,nop,sackOK], length 0
14:31:19.103285 IP nbu_mast_vip.domain.33929 > lnx_client_bck.domain.vnetd: Flags [S], seq 408876887, win 32770, options [mss 1460,nop,nop,TS val 855552531 ecr 0,nop,wscale 7,nop,nop,sackOK], length 0
14:31:21.373398 IP nbu_mast_vip.domain.33929 > lnx_client_bck.domain.vnetd: Flags [S], seq 408876887, win 32770, options [mss 1460,nop,nop,TS val 855552758 ecr 0,nop,wscale 7,nop,nop,sackOK], length 0
14:31:24.915167 IP nbu_mast_vip.domain.30833 > lnx_client_bck.domain.veritas_pbx: Flags [S], seq 394969139, win 32770, options [mss 1460,nop,nop,TS val 855553112 ecr 0,nop,wscale 7,nop,nop,sackOK], length 0
14:31:25.893792 IP nbu_mast_vip.domain.33929 > lnx_client_bck.domain.vnetd: Flags [S], seq 408876887, win 32770, options [mss 1460,nop,nop,TS val 855553209 ecr 0,nop,wscale 7,nop,nop,sackOK], length 0
14:31:27.967769 IP nbu_mast_vip.domain.meter > lnx_client_bck.domain.bpcd: Flags [S], seq 421468782, win 32770, options [mss 1460,nop,nop,TS val 855553417 ecr 0,nop,wscale 7,nop,nop,sackOK], length 0
14:31:29.103935 IP nbu_mast_vip.domain.meter > lnx_client_bck.domain.bpcd: Flags [S], seq 421468782, win 32770, options [mss 1460,nop,nop,TS val 855553531 ecr 0,nop,wscale 7,nop,nop,sackOK], length 0
14:31:31.374039 IP nbu_mast_vip.domain.meter > lnx_client_bck.domain.bpcd: Flags [S], seq 421468782, win 32770, options [mss 1460,nop,nop,TS val 855553758 ecr 0,nop,wscale 7,nop,nop,sackOK], length 0
14:31:34.914353 IP nbu_mast_vip.domain.33929 > lnx_client_bck.domain.vnetd: Flags [S], seq 408876887, win 32770, options [mss 1460,nop,nop,TS val 855554112 ecr 0,nop,wscale 7,nop,nop,sackOK], length 0
14:31:35.894376 IP nbu_mast_vip.domain.meter > lnx_client_bck.domain.bpcd: Flags [S], seq 421468782, win 32770, options [mss 1460,nop,nop,TS val 855554209 ecr 0,nop,wscale 7,nop,nop,sackOK], length 0
14:31:40.969334 IP nbu_mast_vip.domain.30833 > lnx_client_bck.domain.veritas_pbx: Flags [R], seq 394969140, win 32770, length 0
14:31:40.969550 IP nbu_mast_vip.domain.33929 > lnx_client_bck.domain.vnetd: Flags [R], seq 408876888, win 32770, length 0
14:31:40.969750 IP nbu_mast_vip.domain.meter > lnx_client_bck.domain.bpcd: Flags [R], seq 421468783, win 32770, length 0
14:31:41.402170 IP nbu_mast_vip.domain.37983 > lnx_client_bck.domain.veritas_pbx: Flags [S], seq 439646784, win 32770, options [mss 1460,nop,nop,TS val 855554760 ecr 0,nop,wscale 7,nop,nop,sackOK], length 0
14:31:42.534831 IP nbu_mast_vip.domain.37983 > lnx_client_bck.domain.veritas_pbx: Flags [S], seq 439646784, win 32770, options [mss 1460,nop,nop,TS val 855554874 ecr 0,nop,wscale 7,nop,nop,sackOK], length 0
14:31:44.804929 IP nbu_mast_vip.domain.37983 > lnx_client_bck.domain.veritas_pbx: Flags [S], seq 439646784, win 32770, options [mss 1460,nop,nop,TS val 855555101 ecr 0,nop,wscale 7,nop,nop,sackOK], length 0
14:31:49.325172 IP nbu_mast_vip.domain.37983 > lnx_client_bck.domain.veritas_pbx: Flags [S], seq 439646784, win 32770, options [mss 1460,nop,nop,TS val 855555553 ecr 0,nop,wscale 7,nop,nop,sackOK], length 0
14:31:51.403007 IP nbu_mast_vip.domain.39648 > lnx_client_bck.domain.vnetd: Flags [S], seq 454198869, win 32770, options [mss 1460,nop,nop,TS val 855555760 ecr 0,nop,wscale 7,nop,nop,sackOK], length 0
14:31:52.535442 IP nbu_mast_vip.domain.39648 > lnx_client_bck.domain.vnetd: Flags [S], seq 454198869, win 32770, options [mss 1460,nop,nop,TS val 855555874 ecr 0,nop,wscale 7,nop,nop,sackOK], length 0
14:31:54.805609 IP nbu_mast_vip.domain.39648 > lnx_client_bck.domain.vnetd: Flags [S], seq 454198869, win 32770, options [mss 1460,nop,nop,TS val 855556101 ecr 0,nop,wscale 7,nop,nop,sackOK], length 0
14:31:58.347183 IP nbu_mast_vip.domain.37983 > lnx_client_bck.domain.veritas_pbx: Flags [S], seq 439646784, win 32770, options [mss 1460,nop,nop,TS val 855556455 ecr 0,nop,wscale 7,nop,nop,sackOK], length 0
14:31:59.325936 IP nbu_mast_vip.domain.39648 > lnx_client_bck.domain.vnetd: Flags [S], seq 454198869, win 32770, options [mss 1460,nop,nop,TS val 855556553 ecr 0,nop,wscale 7,nop,nop,sackOK], length 0
14:32:01.404457 IP nbu_mast_vip.domain.778 > lnx_client_bck.domain.bpcd: Flags [S], seq 466613952, win 32770, options [mss 1460,nop,nop,TS val 855556760 ecr 0,nop,wscale 7,nop,nop,sackOK], length 0
14:32:02.536118 IP nbu_mast_vip.domain.778 > lnx_client_bck.domain.bpcd: Flags [S], seq 466613952, win 32770, options [mss 1460,nop,nop,TS val 855556874 ecr 0,nop,wscale 7,nop,nop,sackOK], length 0
14:32:04.806181 IP nbu_mast_vip.domain.778 > lnx_client_bck.domain.bpcd: Flags [S], seq 466613952, win 32770, options [mss 1460,nop,nop,TS val 855557101 ecr 0,nop,wscale 7,nop,nop,sackOK], length 0
14:32:08.346538 IP nbu_mast_vip.domain.39648 > lnx_client_bck.domain.vnetd: Flags [S], seq 454198869, win 32770, options [mss 1460,nop,nop,TS val 855557455 ecr 0,nop,wscale 7,nop,nop,sackOK], length 0
14:32:09.326573 IP nbu_mast_vip.domain.778 > lnx_client_bck.domain.bpcd: Flags [S], seq 466613952, win 32770, options [mss 1460,nop,nop,TS val 855557553 ecr 0,nop,wscale 7,nop,nop,sackOK], length 0
14:32:14.406279 IP nbu_mast_vip.domain.37983 > lnx_client_bck.domain.veritas_pbx: Flags [R], seq 439646785, win 32770, length 0
So packets are coming through from the NBU master to the client, so I am not sure which socket connection is failing.
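One way to summarise a capture like the one above is to count the handshake flags. The sketch below uses a hypothetical helper, count_handshake_flags, that reads tcpdump text output on stdin. It assumes the flag notation shown in this capture: [S] for SYN, [S.] for SYN-ACK and [R] for RST.

```shell
# count_handshake_flags: read tcpdump text output on stdin and count
# SYN, SYN-ACK and RST segments.
count_handshake_flags() {
    awk '
        /Flags \[S\],/   { syn++ }
        /Flags \[S\.\],/ { synack++ }
        /Flags \[R\],/   { rst++ }
        END { printf "syn=%d synack=%d rst=%d\n", syn, synack, rst }
    '
}
```

With the capture saved to a file (say capture.txt, a made-up name), `count_handshake_flags < capture.txt` gives the totals. Every line in the capture above is a [S] or [R] sent by nbu_mast_vip, so synack would be 0. Because tcpdump was listening on eth0 only, this shows that no SYN-ACK left via eth0. It does not show whether a reply left via another interface.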
Mike
01-16-2015 06:55 AM
I get message:
14:47:17.654 [609] <2> db_CLIENTsend: reset client protocol version from 0 to 8
on successful connection to lnx_client - below is output:
root@nbu_master # bptestbpcd -client lnx_client -debug -verbose
14:47:17.204 [609] <2> bptestbpcd: VERBOSE = 0
14:47:17.205 [609] <2> ConnectionCache::connectAndCache: Acquiring new connection for host nbu_mast_vip, query type 223
14:47:17.210 [609] <2> logconnections: BPDBM CONNECT FROM 192.10.10.183.14735 TO 192.10.10.183.13721 fd = 4
14:47:17.300 [609] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:922] VxSS magic 1450726227 0x56785353
14:47:17.300 [609] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:923] *use_vxss 0 0x0
14:47:17.321 [609] <8> check_if_machine_cred: [vnet_vxss_helper.c:1735] authenticated peer machine name nbu_mast_vip
14:47:17.321 [609] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1017] PEER INFORMATION
14:47:17.321 [609] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1018] Name nbu_mast_vip
14:47:17.321 [609] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1019] Domain NBU_Machines@nbu_mast_vip
14:47:17.321 [609] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1020] Expiry Jan 13 09:28:29 2016 GMT
14:47:17.321 [609] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1021] Issued by /CN=broker/OU=root@nbu_mast_vip/O=vx
14:47:17.324 [609] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1026] Local INFORMATION
14:47:17.324 [609] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1027] Name root
14:47:17.324 [609] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1028] Domain nbu_mast_vip
14:47:17.324 [609] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1029] Expiry Nov 27 08:49:13 2014 GMT
14:47:17.324 [609] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1030] Issued by /CN=broker/OU=root@nbu_mast_vip/O=vx
14:47:17.324 [609] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1034] Using VxSS authentication 1 0x1
14:47:17.654 [609] <2> db_CLIENTsend: reset client protocol version from 0 to 8
14:47:17.718 [609] <2> db_getCLIENT: db_CLIENTreceive: no entity was found 227
14:47:17.721 [609] <2> copy_preferred_network_list: ../../libvlibs/nbconf.c.1610: pseudo PN for : gbahex130
14:47:17.722 [609] <2> copy_preferred_network_list: ../../libvlibs/nbconf.c.1610: pseudo PN for : gbahex130
14:47:17.723 [609] <2> copy_preferred_network_list: ../../libvlibs/nbconf.c.1610: pseudo PN for : gbahex130
14:47:17.725 [609] <2> vnet_pbxConnect: pbxConnectEx Succeeded
14:47:17.726 [609] <2> logconnections: BPCD CONNECT FROM 192.10.10.183.14786 TO 164.10.10.246.1556 fd = 4
14:47:17.728 [609] <2> copy_preferred_network_list: ../../libvlibs/nbconf.c.1610: pseudo PN for : gbahex130
14:47:17.728 [609] <2> copy_preferred_network_list: ../../libvlibs/nbconf.c.1610: pseudo PN for : gbahex130
14:47:17.730 [609] <2> vnet_pbxConnect: pbxConnectEx Succeeded
14:47:17.731 [609] <8> do_pbx_service: [vnet_connect.c:2108] via PBX VNETD CONNECT FROM 192.10.10.183.14797 TO 164.10.10.246.1556 fd = 10
14:47:17.731 [609] <8> vnet_vnetd_connect_forward_socket_begin: [vnet_vnetd.c:443] VN_REQUEST_CONNECT_FORWARD_SOCKET 10 0xa
14:47:17.772 [609] <8> vnet_vnetd_connect_forward_socket_begin: [vnet_vnetd.c:460] ipc_string /tmp/vnet-51722421419637771337000000000-97hEON
14:47:17.891 [609] <2> ConnectToBPCD: bpcd_connect_and_verify(lnx_client, lnx_client) failed: 46
<16>bptestbpcd main: Function ConnectToBPCD(lnx_client) failed: 46
14:47:17.892 [609] <16> bptestbpcd main: Function ConnectToBPCD(lnx_client) failed: 46
<2>bptestbpcd: server not allowed access
14:47:17.892 [609] <2> bptestbpcd: server not allowed access
<2>bptestbpcd: EXIT status = 46
14:47:17.892 [609] <2> bptestbpcd: EXIT status = 46
server not allowed access
root@nbu_master #
So the outputs when using bptestbpcd to lnx_client and lnx_client_bck are the same except at the end, where for the bck interface I get:
14:14:43.009 [20317] <2> db_getCLIENT: db_CLIENTreceive: no entity was found 227
14:17:55.043 [20317] <8> async_connect: [vnet_connect.c:1653] getsockopt SO_ERROR returned 145 0x91
14:18:07.045 [20317] <8> async_connect: [vnet_connect.c:1653] getsockopt SO_ERROR returned 145 0x91
14:18:19.047 [20317] <8> async_connect: [vnet_connect.c:1653] getsockopt SO_ERROR returned 145 0x91
^C
and it doesn't come back, so I have to <ctrl><c>. With the public interface I get:
14:47:17.718 [609] <2> db_getCLIENT: db_CLIENTreceive: no entity was found 227
14:47:17.725 [609] <2> vnet_pbxConnect: pbxConnectEx Succeeded
14:47:17.726 [609] <2> logconnections: BPCD CONNECT FROM 192.10.10.183.14786 TO 164.10.10.246.1556 fd = 4
14:47:17.730 [609] <2> vnet_pbxConnect: pbxConnectEx Succeeded
14:47:17.731 [609] <8> do_pbx_service: [vnet_connect.c:2108] via PBX VNETD CONNECT FROM 192.10.10.183.14797 TO 164.10.10.246.1556 fd = 10
14:47:17.731 [609] <8> vnet_vnetd_connect_forward_socket_begin: [vnet_vnetd.c:443] VN_REQUEST_CONNECT_FORWARD_SOCKET 10 0xa
14:47:17.772 [609] <8> vnet_vnetd_connect_forward_socket_begin: [vnet_vnetd.c:460] ipc_string /tmp/vnet-51722421419637771337000000000-97hEON
14:47:17.891 [609] <2> ConnectToBPCD: bpcd_connect_and_verify(lnx_client, lnx_client) failed: 46
<16>bptestbpcd main: Function ConnectToBPCD(lnx_client) failed: 46
14:47:17.892 [609] <16> bptestbpcd main: Function ConnectToBPCD(lnx_client) failed: 46
<2>bptestbpcd: server not allowed access
14:47:17.892 [609] <2> bptestbpcd: server not allowed access
<2>bptestbpcd: EXIT status = 46
14:47:17.892 [609] <2> bptestbpcd: EXIT status = 46
server not allowed access
So although this connects, it is still showing errors. Are these errors normal?
Mike
01-16-2015 08:08 AM
You are now getting a Server Not Allowed Access error, which indicates that something is off in the bp.conf file. I believe it is your nbu_mast_vip name that is conflicting.
14:47:17.321 [609] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1017] PEER INFORMATION
14:47:17.321 [609] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1018] Name nbu_mast_vip
14:47:17.321 [609] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1019] Domain NBU_Machines@nbu_mast_vip
14:47:17.321 [609] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1020] Expiry Jan 13 09:28:29 2016 GMT
01-19-2015 04:09 AM
No, these errors are not normal, and I would not call the connection to lnx_client successful as long as bptestbpcd returns anything but 0.
As your master is clustered, I would recommend to put both the logical and physical node bck names as SERVER entries in bp.conf.
Would like to see your bpclntcmd -pn and bpclntcmd -self output from the lnx_client together with the route table
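A quick way to see whether a given server name is already listed in a client's bp.conf is sketched below. The helper server_listed is hypothetical, and it only does an exact string match on "SERVER = name" and "MEDIA_SERVER = name" lines. NetBackup's own matching of the connecting peer (short name versus FQDN, aliases) is more involved and is not modelled here.

```shell
# server_listed NAME: read bp.conf content on stdin, exit 0 if NAME appears
# as a SERVER or MEDIA_SERVER entry. Assumes the "KEY = value" layout with
# spaces around the equals sign, as in the bp.conf shown in this thread.
server_listed() {
    awk -v want="$1" '
        ($1 == "SERVER" || $1 == "MEDIA_SERVER") && $2 == "=" && $3 == want {
            found = 1
        }
        END { exit !found }
    '
}
```

For example, `server_listed nbu_mast_vip_bck < /usr/openv/netbackup/bp.conf` would return success only if that exact name is listed, which could be repeated for each logical and physical node name of the cluster.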
01-19-2015 06:18 AM
"server not allowed access" message was because I had not changed bp.conf to use public interface when running bptestbpcd, so bp.conf still had _bck interface in it - so below shows successful connection when bp.conf is using public interface:
root@nbu_master # bptestbpcd -client lnx_client -debug -verbose
13:47:17.321 [13498] <2> bptestbpcd: VERBOSE = 0
13:47:17.322 [13498] <2> ConnectionCache::connectAndCache: Acquiring new connection for host nbu_mast_vip, query type 223
13:47:17.327 [13498] <2> logconnections: BPDBM CONNECT FROM 192.10.10.183.17093 TO 192.10.10.183.13721 fd = 4
13:47:17.424 [13498] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:922] VxSS magic 1450726227 0x56785353
13:47:17.424 [13498] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:923] *use_vxss 0 0x0
13:47:17.445 [13498] <8> check_if_machine_cred: [vnet_vxss_helper.c:1735] authenticated peer machine name nbu_mast_vip
13:47:17.446 [13498] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1017] PEER INFORMATION
13:47:17.446 [13498] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1018] Name nbu_mast_vip
13:47:17.446 [13498] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1019] Domain NBU_Machines@nbu_mast_vip
13:47:17.446 [13498] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1020] Expiry Jan 13 09:28:29 2016 GMT
13:47:17.446 [13498] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1021] Issued by /CN=broker/OU=root@nbu_mast_vip/O=vx
13:47:17.448 [13498] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1026] Local INFORMATION
13:47:17.448 [13498] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1027] Name root
13:47:17.448 [13498] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1028] Domain nbu_mast_vip
13:47:17.448 [13498] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1029] Expiry Nov 27 08:49:13 2014 GMT
13:47:17.448 [13498] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1030] Issued by /CN=broker/OU=root@nbu_mast_vip/O=vx
13:47:17.448 [13498] <8> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:1034] Using VxSS authentication 1 0x1
13:47:17.786 [13498] <2> db_CLIENTsend: reset client protocol version from 0 to 8
13:47:17.851 [13498] <2> db_getCLIENT: db_CLIENTreceive: no entity was found 227
13:47:17.851 [13498] <8> file_to_cache_item: [vnet_addrinfo.c:6574] fopen() failed ERRNO=2 FILE=/usr/openv/var/host_cache/1fc/b1d75ffc+0,1,50,0,2,0+lnx_client.txt
13:47:17.858 [13498] <8> file_to_cache_item: [vnet_addrinfo.c:6574] fopen() failed ERRNO=2 FILE=/usr/openv/var/host_cache/1fc/b1d75ffc+veritas_pbx,1,4,2,2,0+lnx_client.txt
13:47:17.905 [13498] <8> file_to_cache_item: [vnet_addrinfo.c:6574] fopen() failed ERRNO=2 FILE=/usr/openv/var/host_cache/0e3/42701ee3+0,1,50,0,2,0+164.10.10.246.txt
13:47:17.906 [13498] <2> copy_preferred_network_list: ../../libvlibs/nbconf.c.1610: pseudo PN for : storage_mgr_host
13:47:17.907 [13498] <8> file_to_cache_item: [vnet_addrinfo.c:6574] fopen() failed ERRNO=2 FILE=/usr/openv/var/host_cache/1fc/b1d75ffc+vnetd,1,4,2,2,0+lnx_client.txt
13:47:17.978 [13498] <2> copy_preferred_network_list: ../../libvlibs/nbconf.c.1610: pseudo PN for : storage_mgr_host
13:47:17.979 [13498] <8> file_to_cache_item: [vnet_addrinfo.c:6574] fopen() failed ERRNO=2 FILE=/usr/openv/var/host_cache/1fc/b1d75ffc+bpcd,1,4,2,2,0+lnx_client.txt
13:47:18.057 [13498] <2> copy_preferred_network_list: ../../libvlibs/nbconf.c.1610: pseudo PN for : storage_mgr_host
13:47:18.059 [13498] <2> vnet_pbxConnect: pbxConnectEx Succeeded
13:47:18.060 [13498] <2> logconnections: BPCD CONNECT FROM 192.10.10.183.17274 TO 164.10.10.246.1556 fd = 4
13:47:18.061 [13498] <8> file_to_cache_item: [vnet_addrinfo.c:6574] fopen() failed ERRNO=2 FILE=/usr/openv/var/host_cache/0e3/42701ee3+veritas_pbx,1,4,2,2,0+164.10.10.246.txt
13:47:18.063 [13498] <2> copy_preferred_network_list: ../../libvlibs/nbconf.c.1610: pseudo PN for : storage_mgr_host
13:47:18.063 [13498] <8> file_to_cache_item: [vnet_addrinfo.c:6574] fopen() failed ERRNO=2 FILE=/usr/openv/var/host_cache/0e3/42701ee3+vnetd,1,4,2,2,0+164.10.10.246.txt
13:47:18.065 [13498] <2> copy_preferred_network_list: ../../libvlibs/nbconf.c.1610: pseudo PN for : storage_mgr_host
13:47:18.066 [13498] <2> vnet_pbxConnect: pbxConnectEx Succeeded
13:47:18.068 [13498] <8> do_pbx_service: [vnet_connect.c:2108] via PBX VNETD CONNECT FROM 192.10.10.183.17282 TO 164.10.10.246.1556 fd = 10
13:47:18.069 [13498] <8> vnet_vnetd_connect_forward_socket_begin: [vnet_vnetd.c:443] VN_REQUEST_CONNECT_FORWARD_SOCKET 10 0xa
13:47:18.109 [13498] <8> vnet_vnetd_connect_forward_socket_begin: [vnet_vnetd.c:460] ipc_string /tmp/vnet-06413421675238108347000000000-VhWyP9
13:47:18.223 [13498] <8> file_to_cache_item: [vnet_addrinfo.c:6574] fopen() failed ERRNO=2 FILE=/usr/openv/var/host_cache/169/19e4e369+0,1,50,0,2,0+lnx_client.domain.txt
1 1 1
192.10.10.183:17274 -> 164.10.10.246:1556
192.10.10.183:17282 -> 164.10.10.246:1556
13:47:18.340 [13498] <8> file_to_cache_item: [vnet_addrinfo.c:6574] fopen() failed ERRNO=2 FILE=/usr/openv/var/host_cache/1ff/ffffffff+vnetd,1,8,0,2,0+.txt
13:47:18.459 [13498] <2> bpcr_get_peername_rqst: Server peername length = 24
13:47:18.579 [13498] <2> bpcr_get_hostname_rqst: Server hostname length = 9
13:47:18.699 [13498] <2> bpcr_get_clientname_rqst: Server clientname length = 9
13:47:18.819 [13498] <2> bpcr_get_version_rqst: bpcd version: 07010000
13:47:18.939 [13498] <2> bpcr_get_platform_rqst: Server platform length = 14
13:47:19.059 [13498] <2> bpcr_get_version_rqst: bpcd version: 07010000
13:47:19.179 [13498] <2> bpcr_patch_version_rqst: theRest == > <
13:47:19.180 [13498] <2> bpcr_get_version_rqst: bpcd version: 07010000
13:47:19.299 [13498] <2> bpcr_patch_version_rqst: theRest == > <
13:47:19.300 [13498] <2> bpcr_get_version_rqst: bpcd version: 07010000
13:47:19.420 [13498] <8> file_to_cache_item: [vnet_addrinfo.c:6574] fopen() failed ERRNO=2 FILE=/usr/openv/var/host_cache/04b/baf4ae4b+0,1,50,0,2,0+gbahex156_bck.txt
PEER_NAME = nbu_mast_vip.domain
HOST_NAME = lnx_client
CLIENT_NAME = lnx_client
VERSION = 0x07010000
PLATFORM = linuxR_x86_2.6
PATCH_VERSION = 7.0.1.0
SERVER_PATCH_VERSION = -1.-1.-1.-1
MASTER_SERVER = nbu_mast_vip
EMM_SERVER = nbu_mast_vip
NB_MACHINE_TYPE = CLIENT
13:47:19.425 [13498] <2> copy_preferred_network_list: ../../libvlibs/nbconf.c.1610: pseudo PN for : storage_mgr_host
13:47:19.426 [13498] <2> copy_preferred_network_list: ../../libvlibs/nbconf.c.1610: pseudo PN for : storage_mgr_host
13:47:19.427 [13498] <2> vnet_pbxConnect: pbxConnectEx Succeeded
13:47:19.429 [13498] <8> do_pbx_service: [vnet_connect.c:2108] via PBX VNETD CONNECT FROM 192.10.10.183.17522 TO 164.10.10.246.1556 fd = 11
13:47:19.429 [13498] <8> vnet_vnetd_connect_forward_socket_begin: [vnet_vnetd.c:443] VN_REQUEST_CONNECT_FORWARD_SOCKET 10 0xa
13:47:19.470 [13498] <8> vnet_vnetd_connect_forward_socket_begin: [vnet_vnetd.c:460] ipc_string /tmp/vnet-06416421675239469332000000000-JbDGLC
192.10.10.183:17522 -> 164.10.10.246:1556
<2>bptestbpcd: EXIT status = 0
13:47:19.592 [13498] <2> bptestbpcd: EXIT status = 0
I don't know NBU very well, but it seems to me there could be 2 issues:
I think a firewall is the least likely cause, but it is feasible that there is some device on the network I don't know about, so if anyone knows of any other ports to check, please let me know.
So it looks to me like some NBU configuration issue, but I am certainly no expert in NBU.
Does anyone know what NBU is trying to do when it gets this error?
async_connect: [vnet_connect.c:1653] getsockopt SO_ERROR returned 145 0x91
Mike
01-19-2015 08:45 AM
More information as requested by Michael (hostnames and IPs have been substituted using sed, as in all other outputs in this post):
[root@lnx_client bin]# cat ../bp.conf
SERVER = nbu_mast_vip_bck
SERVER = nbu_master_bck
CLIENT_NAME = lnx_client_bck
MEDIA_SERVER = nbu_media_bck
[root@lnx_client bin]# ./bpclntcmd -pn
expecting response from server nbu_mast_vip_bck
lnx_client_bck.domain lnx_client_bck 10.10.10.244 33160
[root@lnx_client bin]# ./bpclntcmd -self
yp_get_default_domain failed: (12) Local domain name not set
NIS does not seem to be running: (1) Request arguments bad
gethostname() returned: lnx_client_bck
host lnx_client_bck: lnx_client_bck at 10.10.10.244
aliases: lnx_client_bck 10.10.10.244
[root@lnx_client bin]# netstat -r
Kernel IP routing table
Destination     Gateway         Genmask          Flags  MSS  Window  irtt  Iface
192.168.2.0     *               255.255.255.252  U      0    0       0     bond1
164.10.10.128   *               255.255.255.128  U      0    0       0     bond0
192.168.1.0     *               255.255.255.0    U      0    0       0     bond2
10.10.10.0      *               255.255.255.0    U      0    0       0     eth0
link-local      *               255.255.0.0      U      0    0       0     eth0
link-local      *               255.255.0.0      U      0    0       0     bond0
link-local      *               255.255.0.0      U      0    0       0     bond1
link-local      *               255.255.0.0      U      0    0       0     bond2
default         164.10.10.129   0.0.0.0          UG     0    0       0     bond0
[root@lnx_client bin]# netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask          Flags  MSS  Window  irtt  Iface
192.168.2.0     0.0.0.0         255.255.255.252  U      0    0       0     bond1
164.10.10.128   0.0.0.0         255.255.255.128  U      0    0       0     bond0
192.168.1.0     0.0.0.0         255.255.255.0    U      0    0       0     bond2
10.10.10.0      0.0.0.0         255.255.255.0    U      0    0       0     eth0
169.254.0.0     0.0.0.0         255.255.0.0      U      0    0       0     eth0
169.254.0.0     0.0.0.0         255.255.0.0      U      0    0       0     bond0
169.254.0.0     0.0.0.0         255.255.0.0      U      0    0       0     bond1
169.254.0.0     0.0.0.0         255.255.0.0      U      0    0       0     bond2
0.0.0.0         164.10.10.129   0.0.0.0          UG     0    0       0     bond0
I ran "./bpclntcmd -self" when bp.conf was using public interface and I get the same:
yp_get_default_domain failed: (12) Local domain name not set
NIS does not seem to be running: (1) Request arguments bad
So I guess this is not an issue - just telling me NIS is not running.
Mike
01-19-2015 11:16 PM
There is something not quite right with the network here. An error like
async_connect: [vnet_connect.c:1653] getsockopt SO_ERROR returned 145 0x91
is usually caused by a firewall, or by a connection going out through a different network card than you expect, which I guess is the case here.
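For anyone wondering what that log line describes: the pattern is a non-blocking connect() followed by a getsockopt(SO_ERROR) call to collect the result. On Solaris, errno 145 is ETIMEDOUT, i.e. the connection attempt got no answer at all, which fits a firewall or wrong-interface problem better than a refused port. Below is a minimal sketch of that pattern in Python. It is not NetBackup's code; the function name connect_and_check is made up for illustration, and errno numbers differ per platform (145 is the Solaris value, Linux uses a different number for ETIMEDOUT).

```python
import errno
import select
import socket


def connect_and_check(host, port, timeout=5.0):
    """Start a non-blocking TCP connect and return the final error code.

    Returns 0 on success, otherwise an errno value for the local platform.
    Hypothetical helper for illustration only.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    try:
        # connect_ex returns an errno instead of raising; EINPROGRESS means
        # "the connect has started, ask again later".
        rc = sock.connect_ex((host, port))
        if rc not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
            return rc
        # Wait until the socket becomes writable (connect finished or failed).
        _, writable, _ = select.select([], [sock], [], timeout)
        if not writable:
            # Nothing came back within our own timeout.
            return errno.ETIMEDOUT
        # This is the getsockopt(SO_ERROR) step that the log line reports.
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    finally:
        sock.close()


if __name__ == "__main__":
    # Port 1556 (PBX) is the port seen in the bptestbpcd output above.
    code = connect_and_check("lnx_client_bck", 1556)
    print(code, errno.errorcode.get(code, "OK" if code == 0 else "unknown"))
```

A result of ETIMEDOUT means the SYN (or its reply) was silently dropped somewhere, whereas ECONNREFUSED would mean the host was reached but nothing was listening on the port.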
01-21-2015 08:18 AM
I believe I have found the issue in https://www-secure.symantec.com/connect/forums/i-can-do-back-two-different-networks-havent-connectio...
On Linux the backup network and public network are isolated, so you can only reach the Linux client's backup address from the NBU master if the NBU master uses its _bck interface. On Solaris the two networks are not isolated, so you can reach the backup network from the public network. The traceroutes below demonstrate this:
root@nbu_master # traceroute sol_client_bck
traceroute: Warning: Multiple interfaces found; using 10.10.10.238 @ igb1
traceroute to sol_client_bck (10.10.10.105), 30 hops max, 40 byte packets
 1  sol_client_bck.domain (10.10.10.105)  0.532 ms  0.245 ms  0.226 ms
root@nbu_master # traceroute lnx_client_bck
traceroute: Warning: Multiple interfaces found; using 10.10.10.238 @ igb1
traceroute to lnx_client_bck (10.10.10.244), 30 hops max, 40 byte packets
 1  lnx_client_bck.domain (10.10.10.244)  0.849 ms  0.341 ms  0.330 ms
Above, a standard traceroute works because the standard routing table is used: the destination IP is in the 10.10.10 subnet and the master has an interface, igb1, in this subnet, so that interface is chosen. No route is needed between the public and backup networks, as traffic stays on the backup network using JUST the backup network interfaces, and the NBU media server has a backup interface on this same network.
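The same selection logic can be sketched in a few lines of Python: pick the most specific (longest-prefix) route that contains the destination. The sketch below uses the Linux client's `netstat -rn` table posted earlier in this thread, with the link-local entries left out. The function name pick_interface is made up for illustration, and this only models the lookup; it does not query the real kernel table.

```python
import ipaddress

# Routes copied from the Linux client's "netstat -rn" output in this thread
# (link-local 169.254.0.0 entries omitted). Format: (destination/netmask, interface).
ROUTES = [
    ("192.168.2.0/255.255.255.252", "bond1"),
    ("164.10.10.128/255.255.255.128", "bond0"),
    ("192.168.1.0/255.255.255.0", "bond2"),
    ("10.10.10.0/255.255.255.0", "eth0"),
    ("0.0.0.0/0.0.0.0", "bond0"),  # default route via 164.10.10.129
]


def pick_interface(dest_ip):
    """Return the interface of the longest-prefix route matching dest_ip."""
    dest = ipaddress.ip_address(dest_ip)
    matches = []
    for cidr, iface in ROUTES:
        net = ipaddress.ip_network(cidr)
        if dest in net:
            matches.append((net.prefixlen, iface))
    # The most specific route (largest prefix length) wins.
    return max(matches)[1]


if __name__ == "__main__":
    # Master's backup IP: on the directly attached backup subnet.
    print(pick_interface("10.10.10.238"))
    # Master's public IP as seen in the bptestbpcd log: only the default route matches.
    print(pick_interface("192.10.10.183"))
```

With this table, anything addressed to 10.10.10.x stays on eth0, while traffic to the master's public address can only leave via the default route on bond0.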
But if you use -i interface on traceroute:
-i interface
Specifies the interface through which traceroute should send packets. By
default, the interface is selected according to the routing table.
then you can force traffic over the wrong interface, i.e. an interface that standard networking would not choose:
root@nbu_master # traceroute -i igb0 sol_client_bck
traceroute to sol_client_bck (10.10.10.105), 30 hops max, 40 byte packets
 1  sol_client_bck.domain (10.10.10.105)  0.624 ms  0.214 ms  0.226 ms
root@nbu_master # traceroute -i igb0 lnx_client_bck
traceroute to lnx_client_bck (10.10.10.244), 30 hops max, 40 byte packets
 1  * * *
 2  * * *
 3  * * *
 4  * * *
 5  * * *
 6  * * *
 7  * * *
 8  * * *
 9  * * *
10  * * *
11  * * *
12  * * *
13  * * *
14  * * *
15  * *^C
root@nbu_master #
So above I have forced communication to the backup IP over the public interface, igb0. This works for the Solaris client, as there must be a route between the public and backup networks (a route on a switch in the network, not a route on the host). Forcing traffic to lnx_client_bck over the public interface does not work, as there is no route between the public and backup networks (the Linux clients are on a different public network to the Solaris clients).
I don't understand why NBU is bypassing standard networking on the host and forcing traffic through the public network when using a backup IP. As far as I am concerned, isolated networks are the correct setup. I don't want backup traffic leaking on to the public network, so it should be isolated. Am I missing something here?
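A rough way to repeat this kind of check at the TCP level, rather than with traceroute, is to connect from a chosen source address and see whether the connection completes. The Python sketch below does that; the function name reachable_from is made up for illustration. Note that this is not the same as `traceroute -i`: binding a source address does not force the outgoing interface (the kernel still consults the routing table), but it does show whether a conversation using that source/destination pair can complete.

```python
import socket


def reachable_from(source_ip, dest_ip, port, timeout=3.0):
    """Return True if a TCP connection from source_ip to dest_ip:port completes.

    Hypothetical helper for illustration only.
    """
    try:
        # source_address binds the local end to source_ip (port 0 = any port).
        conn = socket.create_connection(
            (dest_ip, port), timeout=timeout, source_address=(source_ip, 0)
        )
    except OSError:
        # Covers timeouts, refused connections and unusable source addresses.
        return False
    conn.close()
    return True


if __name__ == "__main__":
    # Addresses taken from this thread: master backup IP, master public IP,
    # Linux client backup IP; 1556 is the PBX port seen in the logs.
    for source in ("10.10.10.238", "192.10.10.183"):
        print(source, reachable_from(source, "10.10.10.244", 1556))
```

Run on the master, a True for the backup source address and a False for the public one would match the traceroute results above.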
Mike
02-04-2015 01:46 AM
Thanks for the feedback Mike.
Customers normally forget to add /etc/notrouter on Solaris servers when multiple NICs are added.
Something to remember for future reference - not sure if I would think about this if this was my customer.