openvswitch gre and mysterious RST

Hello Friends - I have just gotten my second Hyper-V compute node connected to my HA Mitaka cluster (well, HA except for Neutron ;)

Using GRE tunnels with VLAN tags. Gnarly.

But I have a mysterious problem that comes and goes: TCP RST. Here's the scenario:

  1. For testing - singleton services for Keystone / Nova Controller (scheduler / API) / Neutron - just to make watching logs easier.
  2. Dual HAproxy LBs in front of everything, with a VIP that defines the cluster entry point (e.g. port 9696 is proxied by HAproxy to the backend Neutron server; the same pattern applies to all of the OpenStack services).
  3. Neutron with ML2 using GRE (not VXLAN).
  4. For this test - two Nova Compute nodes. One is standard KVM (CentOS 7); the other is Hyper-V (Windows Server 2012 R2 with all SPs / patches / latest drivers). Both Compute nodes are configured with Open vSwitch (2.5.0 from the Mitaka yum repo for KVM, the latest CloudBase Mitaka 2.5.1 .MSI for Hyper-V). Both are configured with GRE to match Neutron.
  5. KVM Compute node runs Just Fine, thanks. It creates VMs, no connection problems.
  6. Hyper-V Compute node usually runs OK - but sometimes I get a mysterious RST.
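
Out of curiosity about who actually emits the reset (HAproxy, the backend, or something in between), here's a sketch of how I plan to capture only RST segments on the HAproxy node. The port (35357, the Keystone admin port) comes from my logs below; the flag arithmetic is standard pcap, but treat the exact command as a sketch:

```shell
# TCP flags live in byte 13 of the TCP header; RST is bit 2, i.e. 0x04.
printf 'RST flag mask: 0x%02x\n' $((1 << 2))

# So a capture of only resets on the Keystone admin port looks like this
# (run on the HAproxy node as root):
echo "tcpdump -ni any 'tcp port 35357 and tcp[13] & 0x04 != 0'"
```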

So your next question is...versions and config!

  • Latest CentOS OpenStack Mitaka repo for the controllers; CloudBase Open vSwitch 2.5.1 and the CloudBase Mitaka download for Nova Compute.
  • On the Hyper-V Nova Compute, I dedicate a single 10GbE uplink for use by Hyper-V vSwitch.
  • Manually disabled TSO (and all of its friends) on the virtual NICs, and on the physical NICs as well. I've run my tests both with and without TSO (and GSO / GRO).
  • The Hyper-V Nova Compute node is beefy enough: 2 Xeon 2690 sockets (8 cores each) running hyperthreaded for a total of 32 logical cores. 256GB RAM. Separate dedicated NIC for management (as well as OOB iLO). 8 x 1.2TB disks: two in RAID-1 for the OS, the rest in RAID-6. Tons of disk space.
  • My networking treats the 10GbE uplink on the Hyper-V Nova Compute host as a trunk device. I use a dedicated vif / VLAN for GRE traffic, and I keep that traffic within a standard 1500-byte frame. FWIW, my OpenStack traffic is on a separate vif (and separate VLAN).
  • I use tenant-private networking with overlapping IPs (OpenFlow helps to keep all that sorted out pretty magically).
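
For reference, here's the arithmetic I use to keep GRE traffic inside a standard 1500-byte frame. It assumes the tunnel carries full inner Ethernet frames and uses a GRE key (which in_key=flow on my tunnel ports implies), so the per-packet overhead inside the outer frame is 42 bytes:

```shell
# Overhead added by GRE-over-IPv4 with a tunnel key:
#   outer IPv4 header (20) + GRE header with key (8) + inner Ethernet (14)
outer_ip=20; gre_with_key=8; inner_eth=14
overhead=$((outer_ip + gre_with_key + inner_eth))
echo "overhead: $overhead bytes"                             # 42
echo "tenant MTU on a 1500-byte path: $((1500 - overhead))"  # 1458
```

So tenant instances get an MTU of 1458 (or less) while the underlay stays at 1500.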

Here's how things look to Open vSwitch (and Windows):

PS C:\Users\Administrator> Get-NetAdapter

Name                      InterfaceDescription                    ifIndex Status       MacAddress             LinkSpeed
----                      --------------------                    ------- ------       ----------             ---------
br-tun                    Hyper-V Virtual Ethernet Adapter #7          31 Up           00-15-5D-09-05-0A        10 Gbps
br-int                    Hyper-V Virtual Ethernet Adapter #6          30 Up           00-15-5D-09-05-09        10 Gbps
stg                       Hyper-V Virtual Ethernet Adapter #5          28 Up           00-15-5D-09-05-06        10 Gbps
gre                       Hyper-V Virtual Ethernet Adapter #4          27 Up           00-15-5D-09-05-05        10 Gbps
aos                       Hyper-V Virtual Ethernet Adapter #3          26 Up           00-15-5D-09-05-04        10 Gbps
br-eno1                   Hyper-V Virtual Ethernet Adapter #2          25 Up           00-15-5D-09-05-03        10 Gbps
ens1f3                    HP NC365T PCIe Quad Port Gigabit S...#4      16 Not Present  AC-16-2D-A1-3D-83          0 bps
ens1f2                    HP NC365T PCIe Quad Port Gigabit S...#3      12 Not Present  AC-16-2D-A1-3D-82          0 bps
ens1f1                    HP NC365T PCIe Quad Port Gigabit S...#2      17 Up           AC-16-2D-A1-3D-81         1 Gbps
ens1f0                    HP NC365T PCIe Quad Port Gigabit Ser...      14 Up           AC-16-2D-A1-3D-80         1 Gbps
eno2                      HP Ethernet 10Gb 2-port 560FLR-SFP...#2      13 Not Present  38-EA-A7-17-78-6D          0 bps
eno1                      HP Ethernet 10Gb 2-port 560FLR-SFP+ ...      15 Up           38-EA-A7-17-78-6C        10 Gbps


PS C:\Users\Administrator> ovs-vsctl show
301d2950-38ad-4fc0-8d09-53488020f867
    Bridge "br-eno1"
        Port aos
            tag: 104
            Interface aos
                type: internal
        Port gre
            tag: 120
            Interface gre
                type: internal
        Port stg
            tag: 101
            Interface stg
                type: internal
        Port "br-eno1"
            Interface "br-eno1"
                type: internal
        Port "eno1"
            trunks: [101, 104, 120]
            Interface "eno1"
    Bridge br-int
        fail_mode: secure
        Port patch-tun
            Interface patch-tun
                type: patch
                options: {peer=patch-int}
        Port br-int
            Interface br-int
                type: internal
    Bridge br-tun
        fail_mode: secure
        Port patch-int
            Interface patch-int
                type: patch
                options: {peer=patch-tun}
        Port br-tun
            Interface br-tun
                type: internal
        Port "gre-ac144081"
            Interface "gre-ac144081"
                type: gre
                options: {df_default="true", in_key=flow, local_ip="172.20.65.72", out_key=flow, remote_ip="172.20.64.129"}
        Port "gre-ac144146"
            Interface "gre-ac144146"
                type: gre
                options: {df_default="true", in_key=flow, local_ip="172.20.65.72", out_key=flow, remote_ip="172.20.65.70"}

In the above, I created the br-eno1 bridge as well as the aos ("An OpenStack" - VLAN 104), gre (umm, "GRE" - VLAN 120), and stg ("Storage" - VLAN 101) ports. You can also see the GRE tunnel endpoints on br-tun.
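
For the record, those tagged ports were created along these lines - a reconstruction from the output above rather than my exact command history, so treat it as a sketch:

```shell
# Bridge for the physical uplink, with eno1 trunking the three VLANs:
ovs-vsctl add-br br-eno1
ovs-vsctl add-port br-eno1 eno1 trunks=101,104,120

# Internal access ports, one per VLAN:
ovs-vsctl add-port br-eno1 aos tag=104 -- set interface aos type=internal
ovs-vsctl add-port br-eno1 gre tag=120 -- set interface gre type=internal
ovs-vsctl add-port br-eno1 stg tag=101 -- set interface stg type=internal
```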

A single physical uplink, eno1, carries the three trunked VLANs out to my network topology. Here's the IP info for the pertinent addresses:

PS C:\Users\Administrator> netsh int ip show addr

Configuration for interface "stg"
    DHCP enabled:                         No
    IP Address:                           172.28.3.72
    Subnet Prefix:                        172.28.0.0/22 (mask 255.255.252.0)
    InterfaceMetric:                      5

Configuration for interface "gre"
    DHCP enabled:                         No
    IP Address:                           172.20.65.72
    Subnet Prefix:                        172.20.64.0/22 (mask 255.255.252.0)
    InterfaceMetric:                      5

Configuration for interface "aos"
    DHCP enabled:                         No
    IP Address:                           172.24.9.72
    Subnet Prefix:                        172.24.8.0/22 (mask 255.255.252.0)
    InterfaceMetric:                      5

From a raw networking standpoint (e.g. ping on the assigned IP addresses), things work great - including the stg vif, which happens to leverage jumbo frames :)

Now - if I were reading this I'd be thinking..."Right - that joker has just screwed up his networking and is Very Confused by the pretty blinking lights." So indulge me while I present raw IP networking for each defined interface - including jumbo frame support.

PS C:\Users\Administrator> ping -n 1 -f -l 8972 -S 172.28.3.72 172.28.1.251

Pinging 172.28.1.251 from 172.28.3.72 with 8972 bytes of data:
Reply from 172.28.1.251: bytes=8972 time<1ms TTL=64

Ping statistics for 172.28.1.251:
    Packets: Sent = 1, Received = 1, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 0ms, Average = 0ms
PS C:\Users\Administrator> ping -n 1 -S 172.20.65.72 172.20.64.129

Pinging 172.20.64.129 from 172.20.65.72 with 32 bytes of data:
Reply from 172.20.64.129: bytes=32 time=1ms TTL=64

Ping statistics for 172.20.64.129:
    Packets: Sent = 1, Received = 1, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 1ms, Maximum = 1ms, Average = 1ms
PS C:\Users\Administrator> ping -n 1 -S 172.24.9.72 lvosksclu110.hlsdev.local

Pinging lvosksclu110.hlsdev.local [172.24.8.21] from 172.24.9.72 with 32 bytes of data:
Reply from 172.24.8.21: bytes=32 time=1ms TTL=64

Ping statistics for 172.24.8.21:
    Packets: Sent = 1, Received = 1, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 1ms, Maximum = 1ms, Average = 1ms

That first one is to a device on my storage network ("stg" Open vSwitch port), the second is to the Neutron Controller ("gre" Open vSwitch port), and the third is to the HAProxy LB running in front of my Keystone controller(s) ("aos" Open vSwitch port). So, folks, raw IP networking and L2 are Just Fine, thanks. For that matter - so is DNS.

Now for some logs...here's the failure in nova-compute.log from the Hyper-V Nova Compute node:

2016-10-11 14:24:29.836 2640 DEBUG keystoneauth.session [req-ae959ecc-f7d6-4764-85de-e74f09afc9fa 5781ce717aea4d88999d83a4856250e6 a19f1fb7a28743c9953ad1520ddac4c7 - - -] REQ: curl -g -i --insecure -X GET http://lvosksclu110.hlsdev.local:35357 -H "Accept: application/json" -H "User-Agent: keystoneauth1/2.6.0 python-requests/2.9.1 CPython/2.7.11" _http_log_request D:\Program Files\Cloudbase Solutions\OpenStack\Nova\Python27\lib\site-packages\keystoneauth1\session.py:248
2016-10-11 14:24:29.836 2640 WARNING keystoneauth.identity.generic.base [req-ae959ecc-f7d6-4764-85de-e74f09afc9fa 5781ce717aea4d88999d83a4856250e6 a19f1fb7a28743c9953ad1520ddac4c7 - - -] Discovering versions from the identity service failed when creating the password plugin. Attempting to determine version from URL.
2016-10-11 14:24:29.836 2640 ERROR nova.compute.manager [req-ae959ecc-f7d6-4764-85de-e74f09afc9fa 5781ce717aea4d88999d83a4856250e6 a19f1fb7a28743c9953ad1520ddac4c7 - - -] Instance failed network setup after 1 attempt(s)

Of course, the message is a bit misleading. It's actually a pure connection problem, which I can reproduce by pasting in the logged curl command:

C:\Users\Administrator>curl -g -i --insecure -X GET http://lvosksclu110.hlsdev.local:35357 -H "Accept: application/json" -H "User-Agent: keystoneauth1/2.6.0 python-requests/2.9.1 CPython/2.7.11"
curl: (55) Send failure: Connection was reset

However, the very next time I try that same command from the same command line...success!

C:\Users\Administrator>curl -g -i --insecure -X GET http://lvosksclu110.hlsdev.local:35357 -H "Accept: application/json" -H "User-Agent: keystoneauth1/2.6.0 python-requests/2.9.1 CPython/2.7.11"
HTTP/1.1 300 Multiple Choices
Date: Tue, 11 Oct 2016 18:29:25 GMT
Server: Apache/2.4.6 (CentOS) mod_wsgi/3.4 Python/2.7.5
Vary: X-Auth-Token
Content-Length: 621
Content-Type: application/json

{"versions": {"values": [{"status": "stable", "updated": "2016-04-04T00:00:00Z", "media-types": [{"base": "application/json", "type": "application/vnd.openstack.identity-v3+json"}], "id": "v3.6", "links": [{"href": "http://lvosksclu110.hlsdev.local:35357/v3/", "rel": "self"}]}, {"status": "stable", "updated": "2014-04-17T00:00:00Z", "media-types": [{"base": "application/json", "type": "application/vnd.openstack.identity-v2.0+json"}], "id": "v2.0", "links": [{"href": "http://lvosksclu110.hlsdev.local:35357/v2.0/", "rel": "self"}, {"href": "http:

And - the failing VM creation succeeds beautifully - as long as a pesky RST doesn't occur.
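
One theory I keep circling back to: if the backend Apache's KeepAliveTimeout is shorter than HAproxy's idle timeouts, HAproxy can reuse a server-side connection the backend has already torn down, and the client sees a reset. A hedged haproxy.cfg sketch I may try (the values are guesses, not a confirmed fix):

```
defaults
    mode http
    option http-server-close    # fresh server-side connection per request
    timeout connect 5s
    timeout client  60s
    timeout server  60s
```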

My questions?

  1. Are there specific settings I should have for the vNICs / pNICs defined to Open vSwitch?
  2. Are there any timing variables I can set - could this be occurring because things happen too quickly for Open vSwitch on Windows?
  3. Are there any configuration parameters I can try for my HAproxy? The URL being queried isn't actually the backend Keystone node; it's the HAproxy node, which proxies to the active Keystone (normally ACTIVE-BACKUP across my instances, but for this test I'm running a single Keystone).
  4. Any interest from the CloudBase guys in working on this? I'm happy to set up test cases.