Configured bond on 5220, appliance lost existing network connectivity

cmorrall
Level 4

Hi,

I've been setting up a new NetBackup 5220 appliance for a few days. The initial setup went fairly smoothly: I connected my laptop to NIC1 (192.168.1.1) and set NIC2 to my management IP address, 10.x.y.102. I then continued with the configuration from the comfort of my desk chair.

Now, for the actual backup traffic I want to use the remaining four 1 Gbps ports as an LACP bond, and the networking guys configured a corresponding port channel on the Cisco side of things.

In the web GUI I could see eth2-5 as "connected up", and I set an IP address of 10.x.y.103 for a bond of these four NICs, type 802.3ad.

Immediately after I clicked Apply, the web GUI froze, so badly that my IE session got completely stuck.

Now nothing replies to ping, neither 10.x.y.102 nor 10.x.y.103. I obviously can't connect to the web GUI or via ssh to my management IP.

I haven't configured IPMI yet, but I suppose I need to do that when I visit the site. I don't have first-hand physical access to the equipment, so it's a bit of a chore.

One question, maybe blindingly obvious to network folks: is it a problem to have both the management NIC2 and my intended bond on the same subnet, 10.x.y.0/24?
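For anyone wanting to double-check the "same subnet" part before touching the appliance, here is a minimal Python sketch. The addresses are hypothetical stand-ins for the masked 10.x.y.102 and 10.x.y.103, and the /24 prefix is taken from the question. It only tells you whether two addresses share a subnet; it says nothing about how the appliance itself handles that.

```python
import ipaddress

def same_subnet(addr_a: str, addr_b: str, prefix_len: int) -> bool:
    """Return True if both addresses fall in the same network of the given prefix length."""
    net_a = ipaddress.ip_interface(f"{addr_a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{addr_b}/{prefix_len}").network
    return net_a == net_b

# Hypothetical example values, not the real addresses from this thread.
print(same_subnet("10.1.2.102", "10.1.2.103", 24))  # True
print(same_subnet("10.1.2.102", "10.10.8.75", 24))  # False
```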

16 REPLIES

Mark_Solutions
Level 6
Partner Accredited Certified

Generally, if you want a separate management port I would use eth0, which is the default management port.

You can then bond eth1 through to eth5 as your backup network.

To be honest, unless you need a separate replication network, I prefer to keep an appliance with just one IP address. That would allow you to configure all 6 ports in a single bond and then use the same IP address to manage the appliance.

You do need eth0 configured first, but after that you can unconfigure it and add it to the bond.

I always do my network configurations using the CLISH and not the web GUI (as the network gets reset during such operations) and access the CLISH via the IPMI remote control, again as the network reset never throws you off the system.

I would get your IPMI up and running first so that you can connect back in (you need Java 6 or higher installed on the PC that you connect to it from) and do any work via the CLISH.

To actually answer your question now: there should be no problem in bonding different IP addresses on different sets of ports, but the appliance only holds one gateway address, so do the main network config last. Again, do it from the CLISH via IPMI if you can.

But as I said, the fewer IP addresses the better. If you can use just one it will make life much easier for you, and do it all via the CLISH while connected to the IPMI.

Hope this helps

Rusty_Major
Level 4
Partner

I agree that fewer IPs are easier to manage, but in some cases that may not be possible. You definitely need the IPMI IP configured for console access in case the other network configurations get broken. I agree that eth0 (NIC1 on the physical label) should be used for in-band management, if needed. The other interfaces, either 10Gb or 1Gb, should be used for your backup traffic and bonded if you have that luxury.

You'll have to be careful here to get the routing set up correctly. As you can see, the bonding tries to think for you and whacked your routing.

Mark_Solutions
Level 6
Partner Accredited Certified

I had also meant to say that if your switch has not been configured correctly, or the connections are not all on the same network segment, then 802.3ad bonding will not work and you will lose the network connection.

For routing, try to use hosts files where possible. I have set up many appliances with multiple IP addresses and they all work fine just using hosts files.

cmorrall
Level 4

Continuing on: I am on site and indeed, when configuring the bond, the default route got switched to using "bond0" as the interface. However, LACP wasn't quite ready on the switch side, meaning I got completely cut off.

By hooking up my laptop to eth0 I could remove the bond0 and start again.

Ok, now I have eth0 remaining at the default 192.168.1.1 address

Eth1 is my management ip 10.x.y.102.

The remaining ports I configured to a bond0 with interfaces eth2, 3, 4 and 5. I set another ip to this bond: 10.10.x.75, meaning it's on a different subnet to my management ip.

I also added a route, since the default route is on my management network.

Immediately, the appliance stopped responding to management on eth1, my management ip.

The IP still replies to ping, but I can't use ssh or the web GUI. These functions now seem to be accessible only via the bond0 IP address.

Can I somehow instruct the appliance to use a specific interface/IP for management functions? By the looks of it, management functions end up on the last configured interface or the one with the lowest IP address, but this is just guessing.
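Since ping answers while ssh and the web GUI do not, it can help to test the TCP ports directly on each address rather than rely on ping. Below is a minimal Python sketch of such a check. The port numbers are assumptions (22 for ssh, 443 for a web GUI served over HTTPS), and the helper names `tcp_port_open` and `check_management` are made up for this example.

```python
import socket

def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False

def check_management(addresses, ports=(22, 443)):
    """Probe every address/port pair and return {(address, port): reachable}."""
    return {
        (addr, port): tcp_port_open(addr, port)
        for addr in addresses
        for port in ports
    }

# Example call with hypothetical addresses (replace with the eth1 and bond0 IPs):
# for target, ok in check_management(["10.127.5.102", "10.10.8.75"]).items():
#     print(target, "open" if ok else "unreachable")
```

Running this from the same workstation before and after a route change shows which address actually accepts connections, independent of what ping reports.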

Mark_Solutions
Level 6
Partner Accredited Certified

I would suggest you remove your route (not sure where you added that) and see if that resolves it

I will do some additional checking and get back to you shortly

cmorrall
Level 4

I removed the route for the 10.10.x.75 bond0 ip. Indeed, I now can reach the appliance with ssh via the original ip.

Curious...

Mark_Solutions
Level 6
Partner Accredited Certified

It may just take one route for everything - which blocks things!

Did you set a gateway for each connection?

You would need to do it from the CLISH

Gateway Add 192.168.2.1 192.168.2.0 255.255.255.0

or maybe

Gateway Add 192.168.2.1 192.168.2.0 255.255.255.0 eth0

In general I find it best to keep things simple, but I will double-check this and get back to you.

cmorrall
Level 4

Yes, of course I'd like to keep things simple. However, the reality is that I have to deal with segmented networks, firewalls and access restrictions, and I have to keep data and management separate as required by my customers.

I used Gateway Add to add the 10.10.0.0 network to the gateway on that network. That worked as such: I could still ping both IP addresses on my appliance, but the management access magically moved from eth1 to bond0 when adding the gateway.

Since I could ping both addresses, routing was obviously still working. Perhaps when adding a gateway the networking stack gets momentarily restarted and the NetBackup processes bind to the first available configured interface. Just a guess here, but the b in bond0 comes before the e in eth1 when sorted alphabetically. Also, the intended backup data IP has a lower value (10.10.x.x) than management (10.127.x.x).

Currently, I can successfully do a test backup. Incidentally, my test client is on the backup data network 10.10.x.x. I had to add the data network IP 10.10.8.75 to the servers list; it wouldn't accept the 10.127.x.x IP.

Mark_Solutions
Level 6
Partner Accredited Certified

When you make a change to a network it does re-start that network - hence the usefulness of the IPMI interface which always stays up and leaves you in control of the CLISH.

Could you clarify what you mean in your last sentence? Where were you trying to add this IP address, and which servers list?

cmorrall
Level 4

I've configured the IPMI now, so I should be covered from cutting off my own access. However, the question still remains why/how the management access actually binds to multiple interfaces.

As for the last sentence: I meant the servers list on the NetBackup client, the list of servers that are allowed to take a backup. I'm migrating from an old master to a new master, moving client by client.

Mark_Solutions
Level 6
Partner Accredited Certified

As far as I can tell, to set a gateway for each you would use:

Gateway Add 192.168.2.1 192.168.2.0 255.255.255.0 eth0

Gateway Add 10.127.x.x 10.127.x.0 255.x.x.x eth1

Gateway Add 10.10.x.x 10.10.x.0 255.x.x.x bond0

Each command resets the network it applies to, so you do need to wait, say, 30 seconds after each one.

Also, as far as I am aware, you would be able to connect with PuTTY to any of these addresses and get into the CLISH, or indeed the web GUI for the appliance.

Hope this helps

Mark_Solutions
Level 6
Partner Accredited Certified

The appliance is just a server. It responds to login however you connect to it, so you can get to the CLISH on any interface.

Although eth0 is "classed" as a management interface, that is just a term; no such dedicated interface exists or is required. It is really just a SUSE Linux server whose main connection, when logging in as admin, is the CLISH.

You do not need to give it a dedicated management interface as any connection will be fine for management purposes

Rusty_Major
Level 4
Partner

I have set up many appliances the way you describe: an IPMI interface, an in-band mgmt interface on eth0 (NIC1), and the 10Gb or 1Gb interfaces bonded for the backup traffic.

I always set up the interfaces in the IPMI console because, when doing so, one setting or another will mess up mgmt access, etc. I go back and fix the routes after all interfaces are configured and then things are happy.

It doesn't 'bind' management functions to a specific interface; it just messes up the routing when you create the bond. Manually fix that through the IPMI and you'll be in good shape.
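To illustrate how a routing change alone can make management look like it "moved", here is a rough Python sketch of longest-prefix route selection. Everything in it is an assumption for illustration: the addresses are hypothetical, and it presumes the admin workstation sits in a network covered by the newly added route, which this thread does not confirm.

```python
import ipaddress

def table(*entries):
    """Build a routing table as a list of (network, interface) pairs."""
    return [(ipaddress.ip_network(cidr), dev) for cidr, dev in entries]

def pick_route(routes, destination):
    """Return the (network, interface) entry with the longest matching prefix, or None."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, dev) for net, dev in routes if dest in net]
    if not matches:
        return None
    return max(matches, key=lambda entry: entry[0].prefixlen)

# Hypothetical table: default route on the management side, connected networks per interface.
before = table(
    ("0.0.0.0/0", "eth1"),
    ("10.127.5.0/24", "eth1"),
    ("10.10.8.0/24", "bond0"),
)
# Same table after adding a route for the whole 10.10.0.0/16 network via bond0.
after = before + table(("10.10.0.0/16", "bond0"))

admin_pc = "10.10.20.15"  # hypothetical workstation address
print(pick_route(before, admin_pc)[1])  # eth1
print(pick_route(after, admin_pc)[1])   # bond0
```

In this sketch, replies to a workstation in 10.10.0.0/16 leave through bond0 once the extra route exists, even if the request arrived on eth1. Replies that take a different path than the request can be dropped along the way, which would match the symptom of sessions only working against the bond0 address.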

Mark_Solutions
Level 6
Partner Accredited Certified

Are you all sorted on this one now or do you have anything still outstanding?

If you are sorted, then please close off the thread using the Mark as solution option, which helps others looking into a similar matter in the future.

Thanks

cmorrall
Level 4

I took another stab at this yesterday. As far as I can tell, when I have the bond0 interface not configured for any routing, the appliance sort of works. I can manage it from the management ip bound to eth1.

When I add a gateway, either from the CLISH or web gui, management traffic is cut off from eth1 and instead available on bond0, at least for the web gui for the appliance itself.

However, using the Java console and pointing that to the bond0 ip address, I can log on but then the Java console just displays the GUI with an hourglass pointer. I can't manage anything, and I left it for well over 30 minutes.

I've been tinkering and tinkering, restarting the NetBackup processes and even the entire appliance, but I can't get this to work the way I thought it would work. At some point, when I connected with the Java console, the appliance appeared as a NetBackup client. I only got the "Backup, Archive and restore" portion, not the full GUI.

Think I'll just open a support case to get some movement on this.

Mark_Solutions
Level 6
Partner Accredited Certified

Can you tell us how you are configuring the gateway on each interface (CLISH or GUI?)

Are you using different names for each interface, and if so, have you added aliases to NetBackup for them?

Tell us exactly what you have done and how and what you have in its hosts file so that we can get a better picture of things

Thanks