NBU 5230 3.2 MR1 - issue with ping

quebek
Moderator

Hello

I recently upgraded NBU on this appliance to 3.2 and then applied the Maintenance Release 1 pack. A few days later I noticed that I am unable to ping one of its clients. The NBU appliance and this client are in the same subnet, so no routing is involved, but pings still do not work. I have opened a case with VRTS and am waiting for their input...

The NBU appliance has a bond in 802.3ad mode made from two 10 Gbps NICs, and on top of that I configured VLAN 10.

vlan10: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.60.208.47 netmask 255.255.255.0 broadcast 10.60.208.255
inet6 fe80::21e:67ff:fef2:cbd5 prefixlen 64 scopeid 0x20<link>
ether 00:1e:67:f2:cb:d5 txqueuelen 1000 (Ethernet)

routing table

netstat -rn
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
0.0.0.0 10.60.208.1 0.0.0.0 UG 0 0 0 vlan10
10.60.208.0 0.0.0.0 255.255.255.0 U 0 0 0 vlan10
169.254.0.0 0.0.0.0 255.255.0.0 U 0 0 0 eth0
169.254.0.0 0.0.0.0 255.255.0.0 U 0 0 0 bond0
169.254.0.0 0.0.0.0 255.255.0.0 U 0 0 0 vlan10
169.254.0.0 0.0.0.0 255.255.0.0 U 0 0 0 vlan20
192.168.229.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0

I am able to ping GW

ping 10.60.208.1
PING 10.60.208.1 (10.60.208.1) 56(84) bytes of data.
64 bytes from 10.60.208.1: icmp_seq=1 ttl=255 time=1.29 ms
^C
--- 10.60.208.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1.290/1.290/1.290/0.000 ms

But I am unable to ping 10.60.208.5:
PING 10.60.208.5 (10.60.208.5) 56(84) bytes of data.
^C
--- 10.60.208.5 ping statistics ---
6 packets transmitted, 0 received, 100% packet loss, time 4999ms

Below is the client configuration:

Windows IP Configuration


Ethernet adapter Live Network Connection:

Connection-specific DNS Suffix . :  some.com
IPv4 Address. . . . . . . . . . . : 10.60.208.5
Subnet Mask . . . . . . . . . . . : 255.255.255.0
Default Gateway . . . . . . . . . : 10.60.208.1

IPv4 Route Table
===========================================================================
Active Routes:
Network Destination Netmask Gateway Interface Metric
0.0.0.0 0.0.0.0 10.60.208.1 10.60.208.5 271
10.60.208.0 255.255.255.0 On-link 10.60.208.5 271
10.60.208.5 255.255.255.255 On-link 10.60.208.5 271
10.60.208.255 255.255.255.255 On-link 10.60.208.5 271
127.0.0.0 255.0.0.0 On-link 127.0.0.1 331
127.0.0.1 255.255.255.255 On-link 127.0.0.1 331
127.255.255.255 255.255.255.255 On-link 127.0.0.1 331
224.0.0.0 240.0.0.0 On-link 127.0.0.1 331
224.0.0.0 240.0.0.0 On-link 10.60.208.5 271
255.255.255.255 255.255.255.255 On-link 127.0.0.1 331
255.255.255.255 255.255.255.255 On-link 10.60.208.5 271
===========================================================================
Persistent Routes:
Network Address Netmask Gateway Address Metric
0.0.0.0 0.0.0.0 10.60.208.1 Default
===========================================================================

IPv6 Route Table
===========================================================================
Active Routes:
If Metric Network Destination Gateway
1 331 ::1/128 On-link
1 331 ff00::/8 On-link
===========================================================================
Persistent Routes:
None

There is no firewall on this client; it is a VM running Windows Server 2016.

I can't see the NBU appliance's IP in the arp -a output, and I'm not sure what to do next. Do you have any ideas?
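
The same-subnet claim can be double-checked with a few lines of Python. This is only a minimal sketch using the addresses and netmask quoted above; the helper name same_subnet is made up for illustration. If both hosts fall in the same network, traffic between them is delivered on-link and depends on ARP rather than on the gateway, which is why the missing ARP entry matters.

```python
import ipaddress


def same_subnet(ip_a, ip_b, netmask):
    """Return True if both addresses fall in the same network for the given mask."""
    # strict=False lets us pass a host address and derive its network from the mask.
    net_a = ipaddress.ip_network(f"{ip_a}/{netmask}", strict=False)
    net_b = ipaddress.ip_network(f"{ip_b}/{netmask}", strict=False)
    return net_a == net_b


# Appliance (vlan10) and client addresses as posted in this thread.
print(same_subnet("10.60.208.47", "10.60.208.5", "255.255.255.0"))  # True
```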

 

 

1 ACCEPTED SOLUTION

quebek
Moderator

Today I started mapping which VMs reside on which ESXi host and pinging them. It turned out that I can ping some VMs, while others in the same subnet and on the same ESXi host do not respond.

I pushed the network team to verify the LACP status on their end, and it turned out that one of the ports in this port channel was suspended (down), while the NBU appliance was showing both ports as UP.

So I elevated to the CLI and ran ifdown bond0 followed by ifup bond0.

Then I had to add the default gateway again.

And everything is working fine again.

I am just wondering why the appliance did not notice that the remote port was down.
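
One plausible explanation for why only some hosts were unreachable: a bond picks the outgoing link per flow with a hash, so flows that hash onto the link the switch had suspended are silently dropped while the rest keep working. The sketch below is a simplified model of a layer2-style transmit hash (XOR of the last MAC bytes, modulo the number of links). It is an assumption that the appliance used this policy, since the thread does not show the bond's hash policy, and the client MAC addresses are hypothetical; only the appliance MAC comes from the ifconfig output above.

```python
def mac_last_byte(mac):
    """Return the last octet of a colon-separated MAC address as an integer."""
    return int(mac.split(":")[-1], 16)


def layer2_slave_index(src_mac, dst_mac, slave_count):
    """Simplified layer2-style hash: XOR of the last MAC bytes modulo link count.

    Assumption: this ignores the packet type ID and is a model for
    illustration, not the appliance's confirmed configuration.
    """
    return (mac_last_byte(src_mac) ^ mac_last_byte(dst_mac)) % slave_count


APPLIANCE_MAC = "00:1e:67:f2:cb:d5"  # from the vlan10 ifconfig output in this thread
CLIENT_MACS = {  # hypothetical client MACs, for illustration only
    "client-a": "00:50:56:aa:bb:10",
    "client-b": "00:50:56:aa:bb:11",
}

for name, mac in CLIENT_MACS.items():
    print(name, "-> link", layer2_slave_index(APPLIANCE_MAC, mac, 2))
```

In this model the two hypothetical clients land on different links, so with one link suspended on the switch side only one of them would be reachable.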

 

Nonetheless thank you for your assistance.

6 REPLIES 6

jnardello
Moderator
The immediate questions that spring to mind:

1) Can the client ping the appliance back successfully?

2) ping is certainly handy but it's not actually required for NBU to function - what kind of backup errors are you seeing? Can you telnet to the PBX, bpcd, and/or vnetd ports Client-->Media Server? Media Server-->Client?

3) Can the appliance ping anything else on that subnet above the client's IP (i.e. x.x.x.6, etc.)?

4) Are backups of everything else on that subnet working except for that one particular client?

5) Have you run any of the bpclntcmd or bptestbpcd commands between the two boxes for more info?

6) How is the client's connection to the Master? Same issues?

My knee-jerk reaction is to blame the client, as Media Server issues tend to be more of the everything-works-or-nothing-works type of thing, particularly where networks are involved.
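
The port test in question 2 can also be scripted instead of using telnet. The following is a minimal sketch, not an NBU tool: the function names are made up, and the port numbers are the usual NetBackup defaults (PBX 1556, vnetd 13724, bpcd 13782), which should be verified against your own environment.

```python
import socket

# Usual NetBackup default ports; treat these as assumptions for your site.
NBU_PORTS = {"pbx": 1556, "vnetd": 13724, "bpcd": 13782}


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_nbu_ports(host, ports=NBU_PORTS, timeout=3.0):
    """Return a dict mapping each service name to its reachability."""
    return {name: can_connect(host, port, timeout) for name, port in ports.items()}


# Example usage (run it from the client and from the media server):
#   print(check_nbu_ports("10.60.208.47"))
```

A TCP check like this only helps once basic layer 2 reachability works; if ARP does not resolve, every port will simply report False.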

quebek
Moderator

Hello

Please find my answers below:

1) Can the client ping the appliance back successfully?

No, it cannot.

2) ping is certainly handy but it's not actually required for NBU to function - what kind of backup errors are you seeing ? Can you telnet to the PBX, bpcd, and/or vnetd ports Client-->Media Server ? Media Server-->Client ?

No, I can't connect either from the client (to the master) or from the master (to the client). If I ping from the client, I can't see the master's MAC address in arp -a. Something is not OK.

3) Can the appliance ping anything else on that subnet above the client's IP (i.e. x.x.x.6, etc. ) ?

Yes, the appliance can ping the gateway and some other clients within the same subnet, which is really weird.

4) Are backups of everything else on that subnet working except for that one particular client ?

Almost all clients are VMs and I am using the VMware policy type to back them up. The problem arose when I wanted to restore some files to this client using BAR on it.

5) Have you run any of the bpclntcmd or bptestbpcd commands between the two boxes for more info ?

No. If nothing is working, this will not help.

6) How is the client's connection to the Master ? 

VM-based backup.

 

Also, on Friday I rebooted this appliance. After that I was able to ping everything within the same subnet, but I was unable to ping the default gateway, so my VM-based backups were jeopardized. So I rebooted it again and ended up in the previous state: I could back up using VMware but was unable to restore. I am really lost.

vtas_chas
Level 6
Employee

If you're using a VMware policy type, then backing up the VM isn't an indicator of network connectivity between the appliance and the VM.  I suspect you have a VLAN problem or a vSwitch problem where the vSwitch your client connects through isn't actually connected to the same physical network as the appliance.  That's what the client not being able to ping the appliance, while the appliance networking functions for other paths, tells me.

Are the VM and the appliance going through different gateways?  Is there a route anywhere to route those networks?

You say there is no firewall, but with Win2016 I've seen lots of admins insist for DAYS that the firewall is turned off only to find out 'oops, on this one server something went wrong with GPO and it's turned on'.  I'd actively go allow NBU ports through, and check that ping responses aren't disabled.

Are you 100% certain the VLAN setup for the appliance NIC(s) and this VM are 100% the same?  Again, lots of time troubleshooting things on this branch of the tree to find days later "oops, we fat-fingered VLAN 1040 when it was supposed to be 1049" or the like.

 

Charles
VCS, NBU & Appliances

jnardello
Moderator
Just to check, did this work before the upgrade, or wasn't it something you'd tested, so you're unsure? The lack of a gateway connection post-reboot is a bit concerning. I would recommend getting your network team involved; it might be worth checking that both of your Media Server ports are actually (still) configured in a port channel. If the error counts on them aren't currently zero, have the network tech reset them so that you can do another check later if needed to see if they're registering issues. You may also want to manually drop each Media Server NIC in turn to make sure that each of them is functional individually and can reach the GW - bonds do a great job of hiding issues sometimes. =) I'd also second Charles' recommendation about triple-checking the firewall status.
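
On a Linux-based host, the per-NIC view of a bond can be read from /proc/net/bonding/bond0. The sketch below parses that text and flags slaves whose link is up but whose aggregator ID differs from the active aggregator, which is one way a port that is not actually part of the LACP bundle can hide behind an "up" status. It is a rough sketch under assumptions: the sample text is abbreviated, the slave names eth2 and eth3 are hypothetical (the thread does not name the bonded NICs), and whether this appliance would have shown such a mismatch is not known.

```python
SAMPLE = """\
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
MII Status: up

802.3ad info
Active Aggregator Info:
    Aggregator ID: 1
    Number of ports: 1

Slave Interface: eth2
MII Status: up
Aggregator ID: 1

Slave Interface: eth3
MII Status: up
Aggregator ID: 2
"""  # abbreviated, hypothetical sample; real output has more fields


def parse_bond_status(text):
    """Return (active_aggregator_id, list of per-slave dicts) from bonding status text."""
    active_aggregator = None
    slaves = []
    current = None  # the slave section currently being read, if any
    for raw in text.splitlines():
        key, sep, value = raw.strip().partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = {"name": value, "mii_status": None, "aggregator_id": None}
            slaves.append(current)
        elif key == "MII Status" and current is not None:
            current["mii_status"] = value
        elif key == "Aggregator ID":
            if current is None:
                # Seen before any slave section: the active aggregator.
                if active_aggregator is None:
                    active_aggregator = value
            elif current["aggregator_id"] is None:
                current["aggregator_id"] = value
    return active_aggregator, slaves


def suspect_slaves(text):
    """Names of slaves that report link up but sit outside the active aggregator."""
    active, slaves = parse_bond_status(text)
    return [s["name"] for s in slaves
            if s["mii_status"] == "up" and s["aggregator_id"] != active]


print(suspect_slaves(SAMPLE))  # ['eth3']
```

Note that MII Status only reflects the local physical link, so a NIC can show "up" while the switch has suspended its side of the channel.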

quebek
Moderator

Hi

Thanks. But I am also part of the admin group, and I was on this Wintel box and checked firewall.cpl: everything there is turned off. Both servers are in the same VLAN 10, with the same gateway, 10.60.208.1.

The network team was involved; according to them all is OK. Now the Wintel/VM admins are on it. I requested a reboot of that box. You know, this is Windows, so it certainly won't hurt.
