Neutron Router - Not working

Hi,

I'm not sure whether this is related to Neutron networking on the Hyper-V node running the Cloudbase Nova driver, or whether it is isolated to the controller itself...

I am currently testing OpenStack with a Packstack AIO (Ocata) installation; however, I'm having a hard time getting networking up and running through a Neutron router...

The Packstack instance is running on a CentOS 7.3 Hyper-V guest with six NICs connected:

  • eth0 = openstack data/management network
  • eth1 = controller management ip
  • eth2 = br-public-network
  • eth3 = br-local-network
  • eth4 = br-load-balancing-network
  • eth5 = br-del-corp

I'm using Open vSwitch (rather than Linux Bridge) and therefore created the following bridges before running the Packstack installer:

ovs-vsctl add-br br-public-network ; ovs-vsctl add-port br-public-network eth2
ovs-vsctl add-br br-local-network ; ovs-vsctl add-port br-local-network eth3
ovs-vsctl add-br br-load-balancing-network ; ovs-vsctl add-port br-load-balancing-network eth4
ovs-vsctl add-br br-del-corp ; ovs-vsctl add-port br-del-corp eth5
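
As a sanity check, each bridge's port membership can be confirmed with something along these lines (just a sketch; bridge and NIC names are the ones listed above):

ovs-vsctl list-ports br-public-network            # should show eth2
ovs-vsctl list-ports br-local-network             # should show eth3
ovs-vsctl list-ports br-load-balancing-network    # should show eth4
ovs-vsctl list-ports br-del-corp                  # should show eth5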

My Packstack answer file has the following settings for Neutron/OVS:

CONFIG_NEUTRON_L3_EXT_BRIDGE=br-public-network
CONFIG_NEUTRON_ML2_TYPE_DRIVERS=vlan,flat
CONFIG_NEUTRON_ML2_TENANT_NETWORK_TYPES=vlan
CONFIG_NEUTRON_ML2_MECHANISM_DRIVERS=openvswitch,hyperv
CONFIG_NEUTRON_ML2_FLAT_NETWORKS=*
CONFIG_NEUTRON_ML2_VLAN_RANGES=physnet1,physnet2:500:2000,physnet3:2010:3010
CONFIG_NEUTRON_ML2_TUNNEL_ID_RANGES=
CONFIG_NEUTRON_ML2_VXLAN_GROUP=
CONFIG_NEUTRON_ML2_VNI_RANGES=10:100
CONFIG_NEUTRON_L2_AGENT=openvswitch
CONFIG_NEUTRON_LB_INTERFACE_MAPPINGS=
CONFIG_NEUTRON_OVS_BRIDGE_MAPPINGS=physnet1:br-public-network,physnet2:br-local-network,physnet3:br-load-balancing-network
CONFIG_NEUTRON_OVS_EXTERNAL_PHYSNET=physnet1
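
To double-check that Packstack actually pushed those answers into the running agent configuration, something like the following can be used (file paths are assumptions based on a typical RDO/Packstack Ocata layout; adjust as needed):

grep bridge_mappings /etc/neutron/plugins/ml2/openvswitch_agent.ini      # should match CONFIG_NEUTRON_OVS_BRIDGE_MAPPINGS
grep -E 'network_vlan_ranges|flat_networks' /etc/neutron/plugin.ini      # should match the ML2 VLAN/flat settings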

I've been able to create and use the following networks:

  • public-network, physnet1 (flat), no router, public routable CIDR subnet, no DHCP
  • private-network, physnet2 (VLAN ID: 501), no router, local non-routable CIDR subnet (10.0.10.0/24), no DHCP

** I'm injecting IPs instead of using DHCP.

Both networks work fine. I can spin up instances and they can communicate with either the external network (e.g. the Internet) or the local network.
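
For reference, those two networks were created roughly along these lines (illustrative only; the real CIDRs and names are trimmed):

neutron net-create --provider:network_type=flat --provider:physical_network=physnet1 --router:external=True public-network
neutron subnet-create public-network PUBLIC_CIDR_HERE --disable-dhcp --name public-subnet
neutron net-create --provider:network_type=vlan --provider:physical_network=physnet2 --provider:segmentation_id=501 private-network
neutron subnet-create private-network 10.0.10.0/24 --disable-dhcp --name private-subnet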

Now, the problems start when I try to create a public network intended for load-balancing projects, which therefore needs a Neutron router. I'm creating the "load balancing network" with the following commands:

neutron net-create --provider:network_type=vlan --provider:physical_network=physnet3 --router:external=True PUBLIC-CLUSTER-NETWORK
neutron subnet-create PUBLIC-CLUSTER-NETWORK PUBLIC_CIDR_HERE/27 --gateway GW_IP_HERE --allocation-pool start=IP_START_HERE,end=IP_END_HERE --disable-dhcp --name PUBLIC-CLUSTER-SUBNET --dns-nameservers list=true 8.8.8.8 4.2.2.2
neutron router-create ROUTER-PUBLIC-CLUSTER-NETWORK
neutron router-gateway-set ID_ROUTER ID_NETWORK
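
At this point the gateway can be verified with something like the following (sketch; ID_ROUTER is the same placeholder as above):

neutron router-show ID_ROUTER          # external_gateway_info should point at PUBLIC-CLUSTER-NETWORK
neutron router-port-list ID_ROUTER     # should list the qg- port with an address from PUBLIC-CLUSTER-SUBNET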

Then I create a "local cluster network" that uses the router created above:

neutron net-create --provider:network_type=vlan --provider:physical_network=physnet2 CLIENT0001-CLUSTER-NETWORK --tenant-id=e0f7fb96271f48588e2aac86d66ae42e
neutron subnet-create CLIENT0001-CLUSTER-NETWORK 192.168.23.0/24 --name CLIENT0001-CLUSTER-SUBNET --dns-nameservers list=true 8.8.8.8 4.2.2.2 --disable-dhcp
neutron router-interface-add ID_ROUTER ID_CLIENT_SUBNET
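
To confirm which VLAN Neutron assigned to the tenant network and that the interface really got attached, something like this can be used (sketch, same placeholders as above):

neutron net-show CLIENT0001-CLUSTER-NETWORK    # check provider:segmentation_id (the VLAN carried on physnet2)
neutron router-port-list ID_ROUTER             # should now show both the qg- and the qr- ports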

The resulting IP configuration of the Neutron router is:

# ip netns exec qrouter-78ab8780-6282-4b8b-b840-b92ba0916e62 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
14: qg-0f38cb25-ae: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether fa:16:3e:76:7d:63 brd ff:ff:ff:ff:ff:ff
    inet 198.xxx.xxx.61/27 brd 198.xxx.xxx.63 scope global qg-0f38cb25-ae
       valid_lft forever preferred_lft forever
    inet6 xxxxxx/64 scope link
       valid_lft forever preferred_lft forever
15: qr-b4c68450-23: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether fa:16:3e:53:48:5f brd ff:ff:ff:ff:ff:ff
    inet 192.168.23.1/24 brd 192.168.23.255 scope global qr-b4c68450-23
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe53:485f/64 scope link
       valid_lft forever preferred_lft forever

And the current routing table within the qrouter namespace:

# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         198.27.95.62    0.0.0.0         UG    0      0        0 qg-909f764f-43
192.168.23.0    0.0.0.0         255.255.255.0   U     0      0        0 qr-63f19f8c-f7
198.27.95.32    0.0.0.0         255.255.255.224 U     0      0        0 qg-909f764f-43

Once done, I fire up an instance on the "CLIENT0001-CLUSTER-NETWORK" network. From that VM, I'm unable to ping the gateway (192.168.23.1). From the router namespace, I'm unable to ping the instance's IP (e.g. 192.168.23.3):

# ip netns exec qrouter-78ab8780-6282-4b8b-b840-b92ba0916e62 ping -c 4 192.168.23.3
PING 192.168.23.3 (192.168.23.3) 56(84) bytes of data.
From 192.168.23.1 icmp_seq=1 Destination Host Unreachable
From 192.168.23.1 icmp_seq=2 Destination Host Unreachable
From 192.168.23.1 icmp_seq=3 Destination Host Unreachable
From 192.168.23.1 icmp_seq=4 Destination Host Unreachable

--- 192.168.23.3 ping statistics ---
4 packets transmitted, 0 received, +4 errors, 100% packet loss, time 3000ms
pipe 4
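
To narrow down where the traffic dies, I can also capture on the router's qr- interface while pinging from the VM; if the instance's ARP requests never show up here, the VLAN is presumably not being carried between the Hyper-V vSwitch and br-local-network (sketch; interface name taken from the output above):

ip netns exec qrouter-78ab8780-6282-4b8b-b840-b92ba0916e62 tcpdump -n -e -i qr-b4c68450-23 arp or icmp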

A security group has been added to the instance to allow all egress/ingress traffic (UDP, ICMP, TCP).
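
(The rules were added roughly as follows; shown only for completeness, and SECGROUP_ID_HERE is a placeholder for the actual security group:)

neutron security-group-rule-create --direction ingress --protocol icmp SECGROUP_ID_HERE
neutron security-group-rule-create --direction ingress --protocol tcp SECGROUP_ID_HERE
neutron security-group-rule-create --direction ingress --protocol udp SECGROUP_ID_HERE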

Current ovs-bridge info:

Bridge br-del-corp
    Controller "tcp:127.0.0.1:6633"
        is_connected: true
    fail_mode: secure
    Port "eth5"
        Interface "eth5"
    Port br-del-corp
        Interface br-del-corp
            type: internal
    Port phy-br-del-corp
        Interface phy-br-del-corp
            type: patch
            options: {peer=int-br-del-corp}
Bridge br-local-network
    Controller "tcp:127.0.0.1:6633"
        is_connected: true
    fail_mode: secure
    Port "eth3"
        Interface "eth3"
    Port "phy-br-lo0b229b"
        Interface "phy-br-lo0b229b"
            type: patch
            options: {peer="int-br-lo0b229b"}
    Port br-local-network
        Interface br-local-network
            type: internal
Bridge br-load-balancing-network
    Controller "tcp:127.0.0.1:6633"
        is_connected: true
    fail_mode: secure
    Port "phy-br-lo153fda"
        Interface "phy-br-lo153fda"
            type: patch
            options: {peer="int-br-lo153fda"}
    Port "eth4"
        Interface "eth4"
    Port br-load-balancing-network
        Interface br-load-balancing-network
            type: internal
Bridge br-int
    Controller "tcp:127.0.0.1:6633"
        is_connected: true
    fail_mode: secure
    Port "qg-0f38cb25-ae"
        tag: 4
        Interface "qg-0f38cb25-ae"
            type: internal
    Port "tapbe9f0d7c-1e"
        tag: 3
        Interface "tapbe9f0d7c-1e"
            type: internal
    Port br-int
        Interface br-int
            type: internal
    Port "int-br-lo0b229b"
        Interface "int-br-lo0b229b"
            type: patch
            options: {peer="phy-br-lo0b229b"}
    Port "qr-b4c68450-23"
        tag: 5
        Interface "qr-b4c68450-23"
            type: internal
    Port "int-br-pued1969"
        Interface "int-br-pued1969"
            type: patch
            options: {peer="phy-br-pued1969"}
    Port "int-br-lo153fda"
        Interface "int-br-lo153fda"
            type: patch
            options: {peer="phy-br-lo153fda"}
    Port int-br-del-corp
        Interface int-br-del-corp
            type: patch
            options: {peer=phy-br-del-corp}
Bridge br-public-network
    Controller "tcp:127.0.0.1:6633"
        is_connected: true
    fail_mode: secure
    Port br-public-network
        Interface br-public-network
            type: internal
    Port "phy-br-pued1969"
        Interface "phy-br-pued1969"
            type: patch
            options: {peer="int-br-pued1969"}
    Port "eth2"
        Interface "eth2"
ovs_version: "2.6.1"
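
Since the qr- port sits on br-int with local tag 5, my understanding is that br-local-network should rewrite that local tag to the provider VLAN on the way out (and br-int should do the reverse on the way in). Those flows can be eyeballed with something like (sketch):

ovs-ofctl dump-flows br-local-network | grep mod_vlan_vid    # local tag -> provider VLAN on egress
ovs-ofctl dump-flows br-int | grep mod_vlan_vid              # provider VLAN -> local tag on ingress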

I've enabled DHCP so that I could test pinging between the DHCP namespace and the qrouter namespace; that works fine. I'm still wondering whether the problem lies on the Hyper-V node.
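
For reference, that cross-namespace test looked roughly like this (the qdhcp namespace ID is trimmed; NETWORK_ID_HERE is a placeholder):

ip netns                                                    # lists the qrouter-/qdhcp- namespaces
ip netns exec qdhcp-NETWORK_ID_HERE ping -c 4 192.168.23.1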

I've enabled trunking on the openstack-controller guest, which is running on Hyper-V, since I'm also using that hypervisor as an OpenStack compute node.

The Hyper-V node is running Windows Server 2016 Datacenter with the Ocata Hyper-V compute driver. Current Neutron agent config file on the node:

[DEFAULT]
verbose=false
control_exchange=neutron
rpc_backend=rabbit
log_dir=E:\OpenStack\Log\
log_file=neutron-hyperv-agent.log
[AGENT]
polling_interval=2
physical_network_vswitch_mappings=physnet1:VRACK,physnet2:VRACK,physnet3:VRACK,physnet4:VRACK
enable_metrics_collection=true
enable_qos_extension=false
worker_count=12
[SECURITYGROUP]
firewall_driver=hyperv
enable_security_group=true
[oslo_messaging_rabbit]
rabbit_host=10.236.245.10
rabbit_port=5672
rabbit_userid=xxxxx
rabbit_password=xxxxx

And current nova.conf:

[DEFAULT]
compute_driver=compute_hyperv.driver.HyperVDriver
instances_path=E:\OpenStack\Instances
use_cow_images=true
flat_injected=true
mkisofs_cmd=C:\Program Files\Cloudbase Solutions\OpenStack\Nova\bin\mkisofs.exe
verbose=false
allow_resize_to_same_host=true
running_deleted_instance_poll_interval=120
resize_confirm_window=5
resume_guests_state_on_host_boot=true
rpc_response_timeout=1800
lock_path=E:\OpenStack\Log\
use_neutron=True
vif_plugging_is_fatal=false
vif_plugging_timeout=60
rpc_backend=rabbit
log_dir=E:\OpenStack\Log\
log_file=nova-compute.log
force_config_drive=True
instance_usage_audit=true
instance_usage_audit_period=hour
[placement]
auth_strategy=keystone
auth_type = v3password
auth_url=http://10.236.245.10:5000/v3
project_name=services
username=placement
password=xxxxxxxxx
project_domain_name=Default
user_domain_name=Default
os_region_name=RegionOne
[notifications]
notify_on_state_change=vm_and_task_state
[glance]
api_servers=10.236.245.10:9292
[hyperv]
vswitch_name=VRACK
limit_cpu_features=false
config_drive_inject_password=true
qemu_img_cmd=C:\Program Files\Cloudbase Solutions\OpenStack\Nova\bin\qemu-img.exe
config_drive_cdrom=true
dynamic_memory_ratio=1
enable_instance_metrics_collection=true
[rdp]
enabled=true
html5_proxy_base_url=http://xxxxxxxxx:8000/
[neutron]
url=http://10.236.245.10:9696
auth_strategy=keystone
project_name=services
username=neutron
password=xxxxxxxxxx
auth_url=http://10.236.245.10:35357/v3
project_domain_name=Default
user_domain_name=Default
os_region_name=RegionOne
auth_type = v3password
[oslo_messaging_rabbit]
rabbit_host=10.236.245.10
rabbit_port=5672
rabbit_userid=xxxxxxx
rabbit_password=xxxxxxxx
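
On the controller side, I can at least confirm that the Hyper-V agent registers and that the instance's port gets bound to the Hyper-V host, with something along these lines (sketch; PORT_ID_HERE is the instance's Neutron port):

neutron agent-list                 # the HyperV agent should show up as alive for the compute host
neutron port-show PORT_ID_HERE     # check binding:vif_type and binding:host_id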

Do you guys have any idea of what might be wrong? If more information is needed, please let me know.

Thanks!