openvswitch gre and mysterious RST
Hello Friends - I have just gotten my second Hyper-V compute node connected to my HA Mitaka cluster (well, HA except for Neutron ;)
Using GRE tunnels with VLAN tags. Gnarly.
But I have a mysterious problem that comes'n'goes - TCP RST. Here's the scenario:
- For testing - singletons for Keystone / Nova Controller (scheduler / api) / Neutron - Just to make watching logs easier.
- Dual HAProxy LBs in front of everything, with a VIP that defines the cluster entrypoint (e.g. port 9696, which HAProxy then proxies to the backend Neutron - same for all of the OpenStack functions)
- Neutron with ML2 using GRE (not VXLAN).
- For this test - two Nova Compute nodes. One is standard KVM (CentOS 7), the other is Hyper-V (W2K12 R2 with all SPs / patches / latest drivers). Both Compute nodes are configured with Open vSwitch (2.5.0 from the Mitaka yum repo for KVM, 2.5.1 from the latest CloudBase Mitaka .MSI for Hyper-V). Both are configured with GRE to match Neutron.
- KVM Compute node runs Just Fine, thanks. It creates VMs, no connection problems.
- Hyper-V Compute node usually runs OK - but sometimes I get a mysterious RST.
So your next question is...versions and config!
- Latest CentOS OpenStack Mitaka repo for the controllers, and CloudBase 2.5.1 Open vSwitch / CloudBase Mitaka download for Nova Compute.
- On the Hyper-V Nova Compute, I dedicate a single 10GbE uplink for use by Hyper-V vSwitch.
- Manually disabled TSO (and all of its friends) for the virtual NICs, and for the physical NICs as well. But I've tried my tests both with and without TSO (and GSO / GRO).
- Hyper-V Nova Compute is beefy enuf: 2 Xeon 2690 sockets (8 cores each) running hyperthreaded for a total of 32 logical cores. 256GB RAM. Separate dedicated NIC for management (as well as OOB iLO). 8x1.2TB disks: two in RAID-1 for the OS, the remaining disks in RAID-6. Tons of disk space.
- My networking treats the 10GbE uplink on the Hyper-V Nova Compute host as a trunk device. I use a dedicated vif / VLAN for GRE traffic - and I keep that traffic fitting in a standard 1500-byte frame. FWIW - my OpenStack traffic is on a separate vif (and separate VLAN).
- I use tenant-private networking with overlapping IPs (OpenFlow helps to keep all that sorted out pretty magically).
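For anyone who wants to check the arithmetic behind "fitting in a standard 1500-byte frame" on the GRE VLAN, here is a rough sketch of the header budget. It assumes an IPv4 outer header without options, a GRE header that carries the optional 4-byte key (used for the tunnel ID), and no GRE checksum or sequence number. The function name and parameters are made up for illustration, and the numbers are not taken from my actual config.

```python
# Rough GRE overhead budget. Assumptions (not from the post's config):
# IPv4 outer header without options, GRE key present, no checksum/sequence.
OUTER_IPV4_HEADER = 20      # bytes
GRE_BASE_HEADER = 4         # flags/version + protocol type
GRE_KEY_FIELD = 4           # optional key field (tunnel ID)
INNER_ETHERNET_HEADER = 14  # encapsulated tenant frame header
VLAN_TAG = 4                # one 802.1Q tag


def inner_ip_mtu(underlay_mtu=1500, gre_key=True, inner_vlan_tags=0):
    """Largest tenant IP packet that fits in one underlay IP packet."""
    gre = GRE_BASE_HEADER + (GRE_KEY_FIELD if gre_key else 0)
    inner_l2 = INNER_ETHERNET_HEADER + inner_vlan_tags * VLAN_TAG
    return underlay_mtu - OUTER_IPV4_HEADER - gre - inner_l2


if __name__ == "__main__":
    print(inner_ip_mtu())                   # 1458
    print(inner_ip_mtu(inner_vlan_tags=1))  # 1454
```

Under those assumptions the instances need an MTU of 1458 or less (1454 if the encapsulated frame carries a VLAN tag) for the tunnelled traffic to stay inside a 1500-byte underlay MTU without fragmentation.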
Here's how things look to Open vSwitch (and Windows):
PS C:\Users\Administrator> Get-NetAdapter
Name      InterfaceDescription                     ifIndex Status      MacAddress         LinkSpeed
----      --------------------                     ------- ------      ----------         ---------
br-tun    Hyper-V Virtual Ethernet Adapter #7           31 Up          00-15-5D-09-05-0A    10 Gbps
br-int    Hyper-V Virtual Ethernet Adapter #6           30 Up          00-15-5D-09-05-09    10 Gbps
stg       Hyper-V Virtual Ethernet Adapter #5           28 Up          00-15-5D-09-05-06    10 Gbps
gre       Hyper-V Virtual Ethernet Adapter #4           27 Up          00-15-5D-09-05-05    10 Gbps
aos       Hyper-V Virtual Ethernet Adapter #3           26 Up          00-15-5D-09-05-04    10 Gbps
br-eno1   Hyper-V Virtual Ethernet Adapter #2           25 Up          00-15-5D-09-05-03    10 Gbps
ens1f3    HP NC365T PCIe Quad Port Gigabit S...#4       16 Not Present AC-16-2D-A1-3D-83      0 bps
ens1f2    HP NC365T PCIe Quad Port Gigabit S...#3       12 Not Present AC-16-2D-A1-3D-82      0 bps
ens1f1    HP NC365T PCIe Quad Port Gigabit S...#2       17 ...