Join us as we discuss various OpenStack Neutron network configuration options and issues experienced with VLAN, VXLAN, L2population, multicast, Neutron routers, Open vSwitch, and more.
DevOops - Lessons Learned from an OpenStack Network Architect
1. DEVOOPS: LESSONS LEARNED FROM A CLOUD NETWORK ARCHITECT
James Denton
Principal Architect
@jimmdenton
Jonathan Almaleh
OpenStack Network Architect
@ckent99999
4. WHY WE'RE HERE: THE LESSON
• Consider everything when designing:
- Business requirements
- "Community" opinions
- Available resources
- Scale
- Performance requirements
• Be willing and able to change and find creative solutions when issues arise or priorities change for your private cloud
• Understand your options and how to migrate or transition rather than scratch and rebuild
5. NOT WHY WE'RE HERE
• To knock or discourage any technology
• To advocate one method or solution over another
• To say that one is better than the other (but we all know Superman > Batman)
7. WHY NEUTRON
• Chose Neutron over Nova-Network
- Eventual nova-network deprecation. How eventual, no one was sure.
• Gained tenant-managed networking
• First glimpse of overlay networking
• Obvious community direction
8. THE ISSUES
• Open vSwitch 1.x
- Packet loss / corruption
- Slow (Microflows vs. Megaflows)
- Kernel panics
• Neutron agents immature
- Oops, sorry about the bridging loops
Bugs:
1228313 - Multiple tap interfaces on controller have overlapping tags
1324703 - Default NORMAL flows on OVS bridges at boot has potential to cause network storm
9. THE FIX
Live migration from OVS to LinuxBridge
- Upgrade from Grizzly to Havana
- Migrate from the OVS plugin to the ML2 plugin
- Hack the database (see the sketch below)
- Convert the OVS schema to the ML2 schema
- Convert GRE networks to VLAN
- Delete the bridges and interfaces
- Restart the agent
- …
- Profit!
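The "hack the database" step can be sketched concretely. The ml2_network_segments table is the real Havana-era ML2 schema, but the provider label and segmentation IDs below are assumptions for illustration, not the exact maintenance script that was run:

  # Hypothetical: map one GRE segment onto a pre-provisioned VLAN
  mysql neutron -e "UPDATE ml2_network_segments \
    SET network_type='vlan', physical_network='physnet1', segmentation_id=200 \
    WHERE network_type='gre' AND segmentation_id=10;"

Each existing network would need a similar tunnel-ID-to-VLAN-ID mapping before the old bridges are deleted and the agents restarted.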
10. THE FUTURE
• LinuxBridge became the standard driver for Rackspace Private Cloud starting with the Icehouse release
• Open vSwitch, and Neutron itself, continue to mature and focus on stability, speed, and functionality
• Some features depend on the use of OVS, such as TAPaaS, OVN, DVR, and more
• Open vSwitch and LinuxBridge are both supported mechanism drivers and switching technologies for RPC
12. TENANT NETWORKING
VXLAN
• Neutron network type that uses a VXLAN overlay to tunnel instance traffic between hosts
• Runs over UDP
• Uses a unique segmentation ID, the VXLAN Network Identifier (VNI)
• ~16 million unique IDs
• Considered a better tunneling protocol for cloud versus GRE

VLAN
• Neutron network type that uses the more traditional 802.1q VLAN tagging to pass and segment traffic between hosts
• Limited to 4096 "real" datacenter VLANs; in practice the limit is often much lower depending on spanning tree mode
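Both types are created through the same API; a minimal sketch using era-appropriate neutron CLI commands, where the network names, physnet label, and segmentation IDs are illustrative:

  # VXLAN tenant-style network, identified by a VNI:
  neutron net-create web-vxlan --provider:network_type vxlan \
    --provider:segmentation_id 15
  # VLAN network, bound to a physical network and an 802.1q tag:
  neutron net-create web-vlan --provider:network_type vlan \
    --provider:physical_network physnet1 --provider:segmentation_id 200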
13. THE ISSUES
• MTU woes (see the workaround sketch below)
- Configure instances to drop MTU by ~50 bytes
- Or, configure a jumbo MTU on the VTEP interface
- 1242534 - Linux Bridge MTU bug when the VXLAN tunneling is used
• L2population
- Dropped packets due to slow FDB and ARP table updates
- Missing entries may require static programming for quick resolution (see below)
- Inability to leverage the allowed-address-pairs extension due to ARP proxy
- 1445089 - allowed-address-pairs broken with l2pop/arp responder and LinuxBridge/VXLAN
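Both workarounds are simple to express. The dnsmasq option and the bridge/ip commands below are real, but the file path, MTU value, MAC/IP addresses, and device names are assumptions for illustration:

  # Push a lower MTU to instances via DHCP; dhcp_agent.ini must point
  # dnsmasq_config_file at this file (1450 leaves room for VXLAN overhead):
  echo "dhcp-option-force=26,1450" >> /etc/neutron/dnsmasq-neutron.conf

  # Statically program a missing forwarding/ARP entry while l2pop lags
  # (MAC, VTEP address, instance IP, and bridge name are hypothetical):
  bridge fdb add fa:16:3e:aa:bb:cc dev vxlan-15 dst 172.29.243.100
  ip neigh replace 172.16.0.10 lladdr fa:16:3e:aa:bb:cc dev brq1a2b3c4d nud permanent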
14. THE ISSUES
• Slow throughput of TCP traffic compared to VLAN
- Newer network cards needed to handle offloading
[Chart: aggregate throughput for TCP traffic on a 10G interface (scale 0-12), comparing VXLAN without offload, VXLAN with offload, and VLAN]
15. THE ISSUES
• FDB bug resulted in duplicate flooding entries
- 1531013 - Duplicate entries in FDB table
- 1568969 - FDB table grows out of control
00:00:00:00:00:00 dev vxlan-15 dst 172.29.243.133 self permanent
00:00:00:00:00:00 dev vxlan-15 dst 172.29.243.133 self permanent
…
00:00:00:00:00:00 dev vxlan-15 dst 172.29.243.133 self permanent
(28,870 entries)
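A quick way to confirm this condition is to count duplicated flooding entries in the kernel FDB; a one-liner sketch, assuming the vxlan-15 device name from the output above:

  # List flooding (all-zero MAC) entries that appear more than once:
  bridge fdb show dev vxlan-15 | grep '^00:00:00:00:00:00' | sort | uniq -cd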
16. THE FIX
Live migration from VXLAN to VLAN
- Provision and trunk a range of VLANs to replace VNIs
- Set the default tenant network type to VLAN
- Hack the database, again
- Convert VXLAN networks to VLAN
- Delete vxlan interfaces on all hosts (see the sketch below)
- Restart the agent
- Watch the magic
Shall I dare tempt fate again?
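The database edit mirrors the earlier GRE-to-VLAN sketch, this time matching network_type='vxlan'. The per-host cleanup might look like the following, where the interface and service names are assumptions:

  # Remove the stale VXLAN interface and restart the agent to rebuild state:
  ip link delete vxlan-15
  service neutron-linuxbridge-agent restart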
18. THE FUTURE
• VLAN tenant networks may have a place after all
- Less work for Neutron agents
- Better performance and better scale in some cases
• VXLAN moving from the host to the ToR switch
• Invest in newer hardware
- Mellanox ConnectX-3 Pro
- Intel XL710
21. NORTH/SOUTH CONNECTIVITY
L3 Agent w/ Routers
• Allows multiple tenant networks to attach to a single provider network via Neutron routers
• Provides routing between instances in different tenant networks
• Allows users to easily create/modify/delete NATs to instances
• Requires little to no change to the physical infrastructure as tenants and their routers are created
• Allows for overlapping subnets
• Ability to use VPNaaS and FWaaS

Provider Networks Only
• Instances are directly in a VLAN provider network behind a physical network gateway such as a router or firewall
• Each compute node can forward instance traffic to a physical gateway device
• As new networks are created, the physical gateway device must be updated with new interfaces
• Requires NATing to be addressed on the physical device
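For reference, the L3-agent model is driven entirely through the API; a minimal sketch with era-appropriate neutron CLI commands, where the router, network, and subnet names are illustrative:

  # Create a tenant router, attach it to a tenant subnet, and allocate a NAT:
  neutron router-create tenant-router
  neutron router-gateway-set tenant-router provider-net
  neutron router-interface-add tenant-router tenant-subnet
  neutron floatingip-create provider-net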
22. THE ISSUES
• A single router became a single point of failure
• Possible network congestion routing through a single router, or a network node hosting multiple routers
• Router failover or reschedule requires reprogramming of interfaces and floating IPs
- Could take minutes to restore connectivity
23. THE ISSUES
• Increased time to reach newly booted instances
[Chart: consistent build times of 6-10 seconds vs. instance accessibility times of up to over 2 minutes; the drop in TTP was the result of a bug fix: 1566007 - l3 iptables floating IP rules don't match iptables rules]
24. THE ISSUES
• The use of NAT had a negative impact on some applications
- WMI/DCOM on Windows
• Sometimes, users wanted access via both the fixed and the floating IP
- Required static routes to Neutron routers (see the sketch below)
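The static-route workaround amounts to pointing the tenant subnet at the Neutron router's external address on the upstream gateway; a hypothetical Linux-style example in which all addresses are assumed:

  # On the upstream gateway: reach the tenant subnet via the router's
  # external (provider) address, bypassing floating IPs:
  ip route add 192.168.10.0/24 via 203.0.113.10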
25. THE FIX
So long, L3…
- Detach the routers from Neutron networks
- Create and address new interfaces on the physical network device
- Address upstream routing
- Delete routers and restart DHCP agents (see the sketch below)
- Reboot instances or renew DHCP leases
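The Neutron side of that teardown uses standard commands; a minimal sketch in which the router, subnet, and service names are illustrative:

  # Detach and delete a tenant router, then bounce the DHCP agent:
  neutron router-gateway-clear tenant-router
  neutron router-interface-delete tenant-router tenant-subnet
  neutron router-delete tenant-router
  service neutron-dhcp-agent restart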
27. THE FUTURE
• BGP speaker functionality will allow access directly to tenant networks behind Neutron routers
- This will also eliminate the need for floating IPs in some cases
• Distributed virtual routers (DVR) address the SPoF and throughput issues
- Requires OVS
• L3 HA provides automatic failover and resiliency via VRRP/keepalived (see the config sketch below)
- No reliance on an external script or check
- A few bugs were addressed in recent releases
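Enabling L3 HA is a Neutron server configuration change; l3_ha and max_l3_agents_per_router are real options, while the values shown are illustrative:

  # neutron.conf - make new routers HA by default, backed by VRRP/keepalived:
  [DEFAULT]
  l3_ha = True
  max_l3_agents_per_router = 3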
30. VTEP LEARNING
L2population
• VTEP learning process that relies on static programming of the forwarding table and ARP table on all hosts
• Implemented and managed by a Neutron agent
• Developed by the Neutron community
• Requires consistent programming across all hosts for proper operation
• Does not require physical switch or router changes

Multicast
• VTEP learning process that uses multicast to distribute forwarding information to hosts in the multicast group
• Not managed by a Neutron agent
• Leverages the vxlan or OVS kernel module for operation
• Requires IGMP configuration on switches and routers
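The choice between the two comes down to a couple of agent options; the settings below are real LinuxBridge-agent options, though the multicast group address is illustrative:

  # linuxbridge_agent.ini - l2population (agent programs FDB/ARP statically):
  [vxlan]
  enable_vxlan = True
  l2_population = True

  # linuxbridge_agent.ini - multicast learning (kernel handles flood-and-learn):
  [vxlan]
  l2_population = False
  vxlan_group = 239.1.1.1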
33. L2POP ISSUES
• With l2pop, some agents would fail to properly build the FDB and ARP tables
- An issue with the agent, server, or message bus could result in missing messages
- A kernel bug resulted in millions of duplicate flooding entries
- Connectivity issues between instances and gateways
• As a result, multicast was recommended over L2population, and became the default in OpenStack-Ansible
34. MULTICAST ISSUES
• Physical infrastructure had not been configured to support multicast
- Requires IGMP snooping and an IGMP querier if no multicast router is present
• Without a reboot, the forwarding database and ARP table failed to properly populate or contained stale data
• As a result, all traffic between instances, DHCP servers, and routers failed on VXLAN networks
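One quick sanity check from a host is to confirm the VTEP actually joined the VXLAN multicast group; a minimal sketch, with the interface name and group address assumed:

  # The configured vxlan_group should appear in the interface's memberships:
  ip maddress show dev bond0 | grep 239.1.1.1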
35. THE FIX
Multicast, out!
- The smaller cloud had little to no scaling issues with l2pop or L3 Neutron routers
- Created an Ansible override to make L2population the default, again (see the sketch below)
- VXLAN interfaces were deleted, and agents restarted, to rebuild the VXLAN mesh and properly build the FDB and ARP tables
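In OpenStack-Ansible terms, such an override is a single line in the deployer configuration; the variable name below is an assumption based on the os_neutron role's conventions, not a verified setting:

  # /etc/openstack_deploy/user_variables.yml (variable name assumed):
  neutron_l2_population: True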
36. THE FUTURE
• L2population can continue to be used for smaller-scale clouds
- Lower number of hosts and networks
• Multicast may work better for larger-scale clouds
- Will require proper configuration of physical switches/routers
• BGP/EVPN for robust propagation between ToR switches
- Move VXLAN off the hosts!
37. RECAP
• If you live on the bleeding edge, prepare to feel some pain
• Sometimes, short-term pain is needed for long-term gain
- Gaining operational experience in the early days was crucial for understanding and adoption of features
• Available hardware and overall network requirements should be considered when choosing supported network types
- How many networks am I expected to support? Do I require "scalable" networking? What types of networking features are necessary?
• Investments in hardware should be made for optimal performance
• End users can lose faith in the system if it doesn't provide reasonable access times, stability, and consistency
• There's nothing wrong with keeping things simple.
Editor's notes
How to get from A->B without a repave.
Figuring out how it works, how to engineer and implement solution, and have it be stable (or more stable than before)
We've seen a lot of clouds come and go, and along with them, configurations that worked and others that didn't work so well. Every now and then, some of those decisions come back to haunt you. Others require low maintenance with the trade-off of a lack of functionality.
Why do we bring up the old stuff? Believe it or not, it's still out there. Waiting…
I don't think there's any question that Neutron has outpaced Nova-Network in adoption and features over the last 5-6 releases.
In 2013, with the move from Folsom to Grizzly, we adopted Neutron (then Quantum) over Nova-Network and with it, Open vSwitch. The writing was on the wall.
With Quantum/Neutron, we got tenant-managed networking for private networks and used provider networks on the front side. We did not yet support the L3 agent.
We lost floating IP functionality vs. nova-network, but the gains outweighed the losses.
The community documentation all referenced OVS. LinuxBridge was unstable or not marketed well, if at all.
Everything was new, or cutting edge, and stability was questionable. Environments were delicate, network-wise, and required more supervision than previous deployments.
Briefly describe how tenant network segmentation can be performed.
VLANs are limiting (there are only 4,096 max), but in reality far fewer are usable due to Spanning Tree instance limitations on physical switches (PVST/MST).
VXLAN offers more scale (more networks) and leverages a single interface or bond.
Allows for extensive on-demand network creation
No switch management needed after initial setup (in some cases)
The reality is, many private clouds we come across don't come near the scale that VXLAN was designed to solve, and VLANs are a perfectly viable option.
VXLAN overhead mixed with a standard MTU of 1500 caused SSH login failures. TCP port 22 connections would pass, but a full SSH login attempt would fail. We solved this by lowering the MTU inside the instance via a DHCP option. Alternatively, you could raise the MTU across all interfaces, but to do so may require buy-in from the network team and planning.
Instances would boot, but the FDB and ARP tables would fail to be updated in time (or at all), causing DHCP to fail and the instance to never get its address. Or, the instance might not be able to communicate with the gateway, or vice versa.
Allowed-address-pairs failures were due to the agent failing to program shared IP addresses and proxy ARP preventing ARP broadcasts.
Manual addition of the ARP entry would solve it. https://bugs.launchpad.net/neutron/+bug/1445089
A Neutron bug with the FDB resulted in thousands of duplicate "flooding" entries. https://bugs.launchpad.net/neutron/+bug/1531013
This one was much simpler than the original maintenance that inspired it. After all, there was no ML2 migration needed. Simply modify the network segments table, delete the vxlan interfaces, restart the agents, and go! The VMs were none the wiser.
We learned that as the cloud scales out with additional nodes and more and more networks, the updates tend to have a major effect on the Neutron agents on the network nodes.
During an event, such as an agent restart or a host going down and coming back up, it may take tens of minutes, or more, to recreate all of the FDB/ARP entries.
When first developed, it was the only way you could have access to tenant networks
Single-router limitation addressed by DVR
NAT had negative impact on some Windows applications
Static routes are meant to be addressed with the BGP speaker
You lose flexibility, but gain stability and move failover to the hardware layer.
One important thing to know about VXLAN is that both physical hosts involved in a connection between virtual machines need to be programmed with information on how to reach the other side. There are two methods of programming available in Neutron today.
We chose the l2population driver in Icehouse to manage FDB/ARP tables across hosts for VXLAN functionality
L2population was preferred over multicast, since there was no need to configure the physical switching infrastructure for multicast.
Issues…
As a result, a change to multicast was made
Better performance and better scale when talking about # of nodes, VMs, and networks.