
Wednesday, May 11, 2011

The PREROUTING rule for 169.254.169.254 should not be set on the compute node, plus the 500 internal error

iptables rule redirecting metadata requests to the API server on a nova-compute node
Effect: SSH fails

I believe most people testing OpenStack Nova will run into an instance that answers ping but cannot be reached over SSH. There are many possible causes.

First, check the console output of the instance. If the problem is in retrieving metadata, and your topology is like this:

Make sure the problem is not on the nova-network host. Connect your laptop to the Service Network switch and run curl 169.254.169.254. If your laptop gets a response, nova-network is working correctly and the problem must be on your compute node.
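
The laptop check above could be scripted roughly like this. It is only a sketch: the function name check_metadata is made up for illustration, and the /latest/meta-data/ path assumes the EC2-style metadata layout, so adjust it to your setup.

```shell
# Probe the metadata address and report whether it answers.
# The default URL is an assumption (EC2-style path); pass another URL as $1.
check_metadata() {
    url="${1:-http://169.254.169.254/latest/meta-data/}"
    # --fail makes curl return non-zero on HTTP errors such as 500,
    # --max-time keeps the probe from hanging when nothing answers.
    if curl --silent --fail --max-time 5 "$url" >/dev/null; then
        echo "metadata reachable"
    else
        echo "metadata unreachable"
        return 1
    fi
}

# Example, run from the laptop plugged into the Service Network switch:
#   check_metadata
```

If this prints "metadata reachable" from the laptop while instances still fail, that would point at the compute node rather than nova-network.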

Explanation
Under normal conditions, the instance reaches the API server through nova-network.
Instance requests 169.254.169.254:80   ------> redirected to nova_api_ip:8773 on nova-network -----> gets metadata.

What if the PREROUTING rule has been set on the compute node?
Instance requests 169.254.169.254:80 ------> redirected to nova_api_ip:8773 on the compute node -----> br100 -----> local ---->  cannot connect to nova_api_ip:8773

The way to solve this problem is to delete the PREROUTING rule on the compute node.
#iptables -t nat -D  PREROUTING -d 169.254.169.254/32 -p tcp -m tcp --dport 80 -j DNAT --to-destination 192.168.1.2:8773
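
To see whether a compute node carries such a rule at all (before or after deleting it), something like the following may help. This is a sketch under assumptions: find_metadata_dnat is a hypothetical helper name, and it expects rules in the text format printed by iptables-save.

```shell
# Print every nat-table PREROUTING rule that DNATs the metadata address.
# Reads iptables-save style lines on stdin; exits non-zero if none match.
# The chain pattern also accepts prefixed chains such as nova-network-PREROUTING.
find_metadata_dnat() {
    grep -E -- '^-A [A-Za-z0-9_-]*PREROUTING .*-d 169\.254\.169\.254(/32)? .*-j DNAT'
}

# Example, run as root on the compute node:
#   iptables-save -t nat | find_metadata_dnat
```

No output on the compute node means the redirect is not set there. The same command on the nova-network host should still show the rule.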


=======500 internal error======
Add a POSTROUTING rule on the nova-network host:
#iptables -t nat -A  nova-network-POSTROUTING -s 10.0.0.0/12 -d 192.168.1.0/24 -j ACCEPT







22 comments:

  1. This comment has been removed by the author.

  2. This comment has been removed by the author.

  3. Hi Hugo,
    This helped me a lot.

    Could you explain why the VM cannot access the API node if the PREROUTING rule is on it?

    In VLAN mode, a tcpdump trace shows that packets destined for the API server do not reach br100 or go any further. They pass the vnet0 interface and then disappear.

  4. Hello 선현문
    It's 4 AM. You work hard, man....

    I want to give you some suggestions, but I have not actually tried VLAN mode.
    Depending on the network topology, things might be a little different.

    Here are some recommendations from my experience.
    1. Check iptables on the compute nodes. If there is any PREROUTING rule for 169.254.169.254, delete it.

    2. Check brctl to see whether ethN (the instance-network NIC) has been bridged to br100 on the compute node.

    3. Connect your laptop to the network used for instances, then curl 169.254.169.254....

    4. Make sure the API node has a route back to the instances.

    Hope it helps.....

    Hugo Kuo

  5. Hi Hugo,
    Thanks :)

    It works fine with the POSTROUTING rule on the network node plus a routing rule on the API node that redirects vnet traffic to the corresponding network node (I had two, and that was the problem).

    I have learned a lot about networking these days.

    What I still don't understand is why the PREROUTING rule makes vnet traffic pass to the local network.

    On my system, which is in VLAN mode,
    brctl on the compute node shows br100 has four ports which are vlan100, vnet0 and vnet1. I guess the traffic from vnet0 should go out through the Ethernet interface according to the PREROUTING rule and the routing table.

    Do you know why the PREROUTING rule is not working?

  6. Let me make sure I understand....

    Do you mean that the rule on the compute node that PREROUTEs 169.254.169.254 to Nova-API is not working?

    I don't know your current network setup...

    Maybe we can talk about it in detail.

  7. Hi Hugo,
    Sorry for my poor English.

    I'm using VLAN mode, and the metadata service works fine with the POSTROUTING rule plus a routing rule on the API node.
    But if I add the PREROUTING rule on the compute node, it does not work, and I don't know the reason. To me it looks like the right way for an instance to reach the metadata server.

    Hope this is clear.

  8. Hello Hyunsun Moon~

    I'm not sure whether my own understanding is correct, but here is a diagram.
    http://dl.dropbox.com/u/16209558/meta2.png

  9. Hi Hugo,
    Thanks for the diagram. It makes the issue clear to me.

  10. Hi Hugo:
    I have a 169.254.169.254 problem. I posted a question about it for OpenStack, but nobody answered, so I have to bother you for help.
    Would you please have a look at the problem?

    https://answers.launchpad.net/nova/+question/171069

    Thanks

  11. Hi Hugo:
    Thank you for your reply :)
    I have checked my settings: br100 is in promiscuous mode,
    and eth0 is also bridged to br100.

    Now I am trying to VNC into the instance, but it asks for a password. My instance image is maverick-server-cloudimg-i386.tar.gz, and I have tried ubuntu and ubuntu, but it doesn't work.
    Do you have any idea about this?
    Thanks again.

  12. Hi Hugo:
    I just ran vncviewer to VNC into the instance. Is that OK?
    Or is there another way to VNC in without any authentication?
    Thanks

  13. My suggestion is to fire up the TTY image first. The default account on TTY is "root" with no password.

    Once you are inside the TTY instance, you can look into that issue.

  14. Hi Hugo:
    OK, I will try the tty linux image, then come back to you.

    Thanks again

  15. Hi Hugo:
    I have logged into the tty instance. It seems the instance is using a public IP, which it got from the host machine's network:
    stty: /dev/console
    udhcpc (v1.17.2) started
    Sending discover...
    Sending select for 10.140.xxx.155...
    Lease of 10.140.xxx.155 obtained, lease time 21600
    But I think it should be 10.0.0.X, as configured.
    So how can I block the public IP?
    Does this mean I need to set the flag to inject the address?
    Thanks

  16. Obviously, the instance tried to discover a DHCP server,
    and it got the public IP assigned by the public network, because you are using just one NIC for testing. The packet is broadcast to the external network.

    Try injecting the address. But the better way is to add one more NIC for the instance network.
    That will make things easier for you.

  17. Hi Hugo:
    In the tty guest, I see a static IP address 10.0.0.5 in /etc/network/interfaces, but I don't know why it doesn't take effect.

    Thanks

  18. Maybe I need to test with the same environment.....
    Could you try adding one more NIC first?

  19. Hi Hugo:
    Good morning, and sorry for the late response; I had some personal things to do over the weekend. How do I add one more NIC? If you mean a physical NIC, I only have one on hand :).
    root@tiger-desktop:/var/log# ps -ef|grep dns
    nobody 1906 1 0 Aug26 ? 00:00:02 dnsmasq --strict-order --bind-interfaces --pid-file=/var/run/libvirt/network/default.pid --conf-file= --except-interface lo --listen-address 192.168.122.1 --dhcp-range 192.168.122.2,192.168.122.254 --dhcp-lease-max=253 --dhcp-no-override

    Here is the dnsmasq output on the host machine.
    Is that right?

    Thanks again.

  20. Hi, Hugo:
    I changed ifcfg-eth0 in /etc/sysconfig/network-scripts/ to IP address 10.0.0.5
    and ran service network restart.

    Then the instance got the IP address. Maybe /etc/network/interfaces doesn't take effect in tty linux.

    But I cannot ping 10.0.0.5 from the host, and from the guest I cannot ping the gateway 10.0.0.1.
    Really strange.

  21. Hi Hugo:
    I use all-in-one mode. It seems that there is no IP address 10.0.0.1. Shall I allocate that address?

    Thanks
