Instances lose connectivity after rebooting the nova-network host

Instances lose their network connection after the nova-network host is restarted.


In my view, restarting or stopping one service on a cloud platform should not affect the others.

That means instance connections should stay alive while an admin restarts the nova-network host, but it does not work out that way. I ran into this issue a month ago and spent about an hour figuring out what was going on.

In a regular nova deployment, the nova-network host is the gateway for all instances. This Linux box works like a router, so it keeps an ARP table.

In my test, if the flat_interface (or vlan_interface) does not come up automatically after reboot, the box loses its ARP table, and you can no longer ping or ssh to the instances. Even after you bring the NIC up manually, you have to wait for the ARP table to be rebuilt.



Here is what a correct ARP entry looks like:
Address                  HWtype  HWaddress           Flags Mask            Iface
192.168.1.1              ether   00:13:49:d0:dd:9c   C                     eth0

Sometimes this drove me crazy. I once waited more than 24 hours for the table to rebuild.


arping will help you discover the host:
arping   X.X.X.X
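
To see which neighbours the gateway has not resolved yet, something like the following sketch may help. It assumes a Linux host where /proc/net/arp is available and treats a Flags value of 0x0 as an incomplete entry. The function name incomplete_arp_entries, the eth1 interface and the 10.0.0.5 address are placeholders of mine, not anything nova provides.

```sh
#!/bin/sh
# List neighbours whose ARP entry is incomplete (Flags 0x0) in a
# /proc/net/arp style table. The optional path argument is an assumption
# added so the function can also be run against a saved copy of the table.
incomplete_arp_entries() {
    arp_file="${1:-/proc/net/arp}"
    # Columns: IP address, HW type, Flags, HW address, Mask, Device.
    # Skip the header line and print the IP and device of unresolved entries.
    awk 'NR > 1 && $3 == "0x0" { print $1, $6 }' "$arp_file"
}

# Example use on the nova-network host:
#   incomplete_arp_entries
# Then probe one of the listed addresses on the flat interface
# (-I and -c are the iputils arping options; eth1 and the IP are placeholders):
#   arping -I eth1 -c 3 10.0.0.5
```

If the list is empty while instances are still unreachable, the problem is probably somewhere other than the ARP table.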


When the network configuration came up correctly after reboot, I never hit this problem.
In the environment above I had been bringing the flat_interface up manually. Once the flat_interface comes up automatically, the ARP table is complete.

The flat_interface, eth1 here, needs to come up automatically without any network parameters so that nova-network can handle the bridge configuration. Here are a few ways to bring the NIC up at boot.


Add to /etc/network/interfaces:
auto eth1
iface eth1 inet static
address 0.0.0.0
netmask 0.0.0.0

Another option with less code (from StackOps)
In /etc/network/interfaces, under the eth1 interface entry:

up ifconfig eth1 0.0.0.0


Advanced approach (from Mr. Unknown)
Edit /etc/init/nova-network.conf
and add this to the pre-start script:
ip link set eth1 up

As the commenter kindly pointed out:
This is the better approach, because nova-network will start only once eth1 has been activated.
And remember, the compute nodes also need to create the bridge, so you can add the same command to /etc/init/nova-compute.conf on the compute nodes.
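
If you want the pre-start step to confirm that the interface really came up before nova-network starts, a small wait loop is one way to do it. This is only a sketch under my own assumptions: it reads the interface flags from /sys/class/net and checks the IFF_UP bit (0x1). The function name wait_for_iface_up, the 10 second default and the overridable sysfs root are all hypothetical.

```sh
#!/bin/sh
# Wait until an interface is administratively up, or give up after a timeout.
# sysfs exposes the interface flags as a hex number; bit 0x1 is IFF_UP.
wait_for_iface_up() {
    iface="$1"
    timeout="${2:-10}"                # seconds to wait (hypothetical default)
    sys_root="${3:-/sys/class/net}"   # overridable so the loop can be tried offline
    waited=0
    while :; do
        if [ -r "$sys_root/$iface/flags" ]; then
            flags=$(cat "$sys_root/$iface/flags")
            # Expand the hex string directly into the arithmetic expression.
            if [ $(( ${flags:-0} & 1 )) -eq 1 ]; then
                return 0
            fi
        fi
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$(( waited + 1 ))
    done
}

# Possible use inside the pre-start script, after the function definition:
#   ip link set eth1 up
#   wait_for_iface_up eth1 10 || exit 1
```

Exiting non-zero from pre-start should keep the job from starting, so nova-network would not come up against a missing interface.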

Comments

  1. In the StackOps distro we add this to the interface entry in /etc/network/interfaces (the management network, which is mandatory):

    up ifconfig eth1 0.0.0.0

    Just an alternative that can save some lines of code ;-)

  2. Haven't seen you in a long time...

    Thanks for your suggestion. Obviously, your code is better. I'll test it.

  3. You can edit /etc/init/nova-network.conf
    and add this to the pre-start script:
    ip link set eth1 up

    This is the better approach, because nova-network will start only once eth1 has been activated.

    And remember, the compute nodes also need to create the bridge, so you can add the same command to /etc/init/nova-compute.conf on the compute nodes.

  4. Wow, Mr. Unknown...

    Awesome, this approach is really nice.
    I have one more question, about dnsmasq: why do I always have to killall dnsmasq after booting the nova-network host?

    Is there a better way to handle it?

    Thanks
    Hugo Kuo

  5. You can simply stop dnsmasq from auto-starting at boot:
    update-rc.d -f dnsmasq remove

  6. Oh, I forgot to explain the reason.
    dnsmasq starts automatically before nova-network while the system is booting, so the dnsmasq instance started by nova-network cannot bind to the interface and fails with the error "address already in use". You can find the error in /var/log/nova/nova-network.log

  7. Upstart initialization lets you do really smart things. Our nova-network.conf upstart script checks the network and filesystems as follows:

    start on (net-device-up IFACE!=lo and local-filesystems and started networking)

    So nova-network only starts once networking is up and running and a network device other than the loopback interface has come up. The filesystem condition is not very relevant for nova-network, but it is important for nova-compute (NFS mounts...)
