Programmable network connectivity and network overlay technologies like Docker libnetwork, Weave Net, and Calico are essential tools for DevOps engineers who use orchestration tools to manage and deploy Docker containers in production. Because network troubleshooting and optimization fall within the responsibilities of DevOps, it's vital that DevOps engineers understand exactly how network overlays work. Participants will learn the fundamentals of container networking, see practical examples of common network overlays, and receive guidance on using and tuning them effectively.
Three reasons for SDN
• Permanent connectivity
• Virtualization of everything
• Paradigm shift in software development
Networking had to keep up somehow!
Continuous delivery, virtualize everything, permanent connectivity.
Multi-host Container Networking
Overlay Protocols
• VXLAN
  • Ethernet in UDP; the de facto standard, won the overlay war
• NVGRE
  • Ethernet in IP; Microsoft's answer to a question nobody asked
• STT
  • Ethernet in a fake TCP header, to utilize the TSO of the NIC
• Geneve
  • Ethernet in UDP; best-of-breed approach
  • A+ for extensibility
  • https://packetpushers.net/podcast/podcasts/pq-show-68-geneve-data-center-overlay-update/
Project Calico
• Host is a router for the workloads
• BGP to distribute routes
• etcd-backed
• Pure Layer 3, no encapsulation
Performance Comparison of Networking Solutions for Kubernetes
February 20, 2016
http://machinezone.github.io/research/networking-solutions-for-kubernetes/
to get a feeling for how much experience you have, I want to start with a joke; it's also a good way to start the day: either you will laugh because of the joke or because of me trying to tell it; either way, laughing is healthy no matter what, so here we go
containers, history, usage
container networking: the different options, network overlays, flat networks; we'll take a look at the options and compare the most common ones
software defined everything!
in this case it's about software-defined networking and why you should care
Guidance, pitfalls, and tuning
what is the takeaway? you will use these technologies, and there are some problems you will run into when you get started; I want to tell you upfront what these problems are and how you can prevent them
there is not that much experience with these technologies today, but we have already seen some pitfalls that I want to make you aware of
to that end, I want to explain to you in detail how container networking works, what options there are for container networking, and how those options fit with the common orchestration tools you would definitely need if you want to run containers in production
today there are proprietary protocols and hundreds of network components that need to be configured correctly; there are solutions, but they come with vendor lock-in
network operators today have to "coax" their network equipment and its embedded control plane into meeting their networking objectives
worst-case scenario (an exaggeration): a new load balancer, web servers, and a database cluster -> the network configuration must be done by hand -> this will take forever
classical SDN: the ideas have been around for many years and promote concepts that can help make network administration easier again, e.g.
a centralized vs. decentralized control plane, and a global, network-wide view of the network state
intents can be communicated through the northbound API of the controller, which is independent of the physical network components; not an intent: "create a new OSPF route on switch 16 and update the access control list on switches 3 to 5 and routers 1, 4, and 7"; an intent: "connect servers A and B, make sure that there is at least 20 MBit/s of bandwidth available and the best possible transport quality" (see the sketch below)
direct control of the underlying network components through the southbound API of the controller (e.g. OpenFlow), which configures forwarding and routing tables in commodity hardware
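To make the contrast concrete, here is a tiny illustrative sketch in Python; the class, field names, and API are hypothetical and do not belong to any real SDN controller. The intent only states what should hold between two endpoints, and it would be the controller's job to translate it into device-level steps like the ones quoted above.

```python
from dataclasses import dataclass

# Hypothetical "northbound" intent: declare the desired outcome,
# not the per-device configuration steps. None of these names come
# from a real SDN controller API.
@dataclass
class ConnectivityIntent:
    src: str                  # e.g. "server-a"
    dst: str                  # e.g. "server-b"
    min_bandwidth_mbps: int   # e.g. 20
    prefer_best_quality: bool = True

intent = ConnectivityIntent(src="server-a", dst="server-b", min_bandwidth_mbps=20)

# A controller would decompose the intent into device-level actions via its
# southbound interface, i.e. the imperative, non-intent style from the notes above.
device_level_steps = [
    "create a new OSPF route on switch 16",
    "update the access control list on switches 3 to 5",
    "update the access control list on routers 1, 4, and 7",
]

print(intent)
print("\n".join(device_level_steps))
```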
not for large enterprises that run their own networks across several locations, but for mid-size enterprises that, for example, use the MPLS VPN services of their ISP but don't want to route all their traffic over that connection, since it's quite expensive in terms of dollars per bit; an alternative connection like the public Internet is often good enough
operates on the existing underlying network, regardless of how that network is configured
little investment necessary if you already have a working network in place
NOTHING NEW: IPsec, MPLS, GRE, SSL VPN, ... overlays and encapsulation have been around for many years
although this works, you are limited to the IP addresses of the hosts again, so there is virtually no benefit
service resiliency
etcd, ZooKeeper, Consul: used to keep the network overlay information in sync and share it with all other overlay participants
some ports must be accessible for these tools to exchange information; the exact ports vary
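As a rough illustration of what this shared state can look like, a minimal sketch using the python-etcd3 client; the key name and subnet value are made up for illustration, and real overlay drivers use their own key layouts.

```python
import etcd3  # python-etcd3 client

# Connect to the etcd cluster backing the overlay (host/port are examples).
client = etcd3.client(host="127.0.0.1", port=2379)

# A host could publish which overlay subnet it owns ...
client.put("/overlay/subnets/host-a", "10.244.1.0/24")

# ... and every other participant can read the mapping to learn where to send traffic.
value, _meta = client.get("/overlay/subnets/host-a")
print(value.decode())  # -> 10.244.1.0/24
```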
look in detail at how these two networking options work, and take one example
VXLAN is the de facto standard: everybody knows it, it has been around for X years now, and there is hardware support from several vendors; it's the technology to go for if you don't know what to choose, for now
VMware NSX also relies on VXLAN, and a lot of the networking going on in the OpenStack area relies on VXLAN
Open vSwitch relies on VXLAN
STT (Stateless Transport Tunneling): a fake TCP header to utilize the TCP segmentation offloading of the NIC, so that not all of the computation has to be done by the CPU
NVGRE: an alternative to VXLAN that works quite similarly, but on the IP level
Geneve: the newer kid on the block, already supported by OVN and Open vSwitch
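To make "Ethernet in UDP" concrete, a minimal sketch of the 8-byte VXLAN header that is prepended to the original Ethernet frame before the whole thing is sent as a UDP payload (VXLAN's registered UDP port is 4789; the VNI value below is just an example).

```python
import struct

def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header: flags byte (I bit set) + 24-bit VNI."""
    first_word = 0x08 << 24               # 0x08 = "VNI present" flag, remaining bits reserved
    second_word = (vni & 0xFFFFFF) << 8   # 24-bit VNI in bytes 4-6, byte 7 reserved
    return struct.pack("!II", first_word, second_word)

# The full encapsulated packet is: outer Ethernet + outer IP + outer UDP
# (destination port 4789) + this header + the original inner Ethernet frame.
print(vxlan_header(vni=42).hex())  # -> 0800000000002a00
```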
explain the joke on this slide!
if the encapsulation protocol also used TCP, the self-healing powers of TCP would be redundant: the TCP stack of the container takes care of making sure that the connection is intact; if a UDP datagram is dropped along the way, TCP takes care of requesting that segment of data again, so stateless transport at this level is sufficient
macvlan, ipvlan
ONS2016
overlay adds a layer of abstraction = adds a layer of complexity
VXLAN
good for development environments
greenfield environments that don't have legacy stuff
IPVLAN
the future of large-scale networking!
ridiculously fast, near wire speed
no ARP, no broadcast
turns the primary NIC on a Linux host into a router
how I imagine the idea for Calico was found
Calico's Felix agent for route configuration
in reality it is like every other network ... you'll run into similar troubles, e.g. connectivity problems, while you can't ...
location of services that talk to each other: Kubernetes pods, Marathon application groups, Docker Swarm constraints
watch out that services that talk to each other a lot are colocated to a certain extent, e.g. via Kubernetes pods, Marathon application groups, Swarm constraints, fleet units, ... or manually; make sure that the traffic between two services doesn't go through every switch in your datacenter before arriving at its target
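As one hedged example of such colocation, a sketch of a Kubernetes pod-affinity stanza; the app=cache label and the surrounding workload are hypothetical, but podAffinity with the topology key kubernetes.io/hostname is the standard mechanism for co-scheduling pods onto the same node.

```python
import json

# Hypothetical pod-affinity stanza, written as a Python dict exactly as it
# would appear under spec.affinity in a pod or deployment manifest:
# schedule this pod onto the same node as pods labeled app=cache,
# so that chatty services stay close to each other.
affinity = {
    "podAffinity": {
        "requiredDuringSchedulingIgnoredDuringExecution": [
            {
                "labelSelector": {
                    "matchLabels": {"app": "cache"}  # label of the peer service (example)
                },
                # same node; use a zone or rack topology key for looser grouping
                "topologyKey": "kubernetes.io/hostname",
            }
        ]
    }
}

print(json.dumps({"affinity": affinity}, indent=2))
```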
a workload runs in a container with any of the previously mentioned container networking options, and you notice packet loss and connectivity problems in the offered services
check the log files
iptables is a stateful firewall and TCP is a stateful protocol; the conntrack table tracks all active connections that are processed concurrently on one host. if there are several workloads with high network activity, the connection table can fill up, and you cannot statefully track new connections anymore until older connections are closed
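A minimal troubleshooting sketch for this situation, assuming a Linux host with connection tracking enabled; the 90% threshold is an arbitrary example value, not a recommendation.

```python
# Compare the current conntrack table usage against its limit.
# These /proc paths exist on a Linux host with connection tracking enabled.
def read_int(path: str) -> int:
    with open(path) as f:
        return int(f.read().strip())

count = read_int("/proc/sys/net/netfilter/nf_conntrack_count")
limit = read_int("/proc/sys/net/netfilter/nf_conntrack_max")

usage = count / limit
print(f"conntrack entries: {count}/{limit} ({usage:.1%})")
if usage > 0.9:
    print("warning: conntrack table is nearly full; new connections may be dropped")
    print("consider raising net.netfilter.nf_conntrack_max via sysctl")
```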
MTU: commonly, the largest IP packet that can fit into an Ethernet frame
MTU: encapsulation transfer and CPU overhead; measurements from Packet Pushers, ... 3-6% overhead with encapsulation!
encapsulation protocols have an impact on the MTU; consider a scenario with multiple encapsulations ... what about data connections between two geographically distributed sites that are connected with an IPsec/SSL-VPN tunnel?
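A back-of-the-envelope sketch of that overhead, assuming plain VXLAN over a 1500-byte underlay MTU; the header sizes are the standard ones, and the numbers are illustrative rather than measurements.

```python
# Rough VXLAN encapsulation overhead on a standard 1500-byte Ethernet MTU.
UNDERLAY_MTU = 1500   # bytes available for the outer IP packet
OUTER_IP = 20         # outer IPv4 header
OUTER_UDP = 8         # outer UDP header
VXLAN_HDR = 8         # VXLAN header (flags + VNI)
INNER_ETH = 14        # encapsulated inner Ethernet header

overhead = OUTER_IP + OUTER_UDP + VXLAN_HDR + INNER_ETH   # 50 bytes per packet
inner_mtu = UNDERLAY_MTU - overhead                       # what the container interface should use

print(f"inner MTU: {inner_mtu} bytes")                                           # -> 1450
print(f"per-packet overhead: {overhead} bytes ({overhead / UNDERLAY_MTU:.1%})")  # -> ~3.3%
# Stacking another tunnel (e.g. an IPsec VPN between sites) shrinks the inner MTU further.
```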
VXLAN vs STT vs Geneve: TCP checksum offloading, TCP segmentation offload, ready for encryption, room for custom data
you can destroy your network performance by making administration easier ... the cost/tradeoff should be clear to everyone up front!
one major US retailer correlates a 0.5-second site slowdown with an 11% reduction in conversions!
the choice of your networking driver for your multi-host container communication actually does matter – in fact, it matters a lot!
solving a problem with complexity almost never turns out great; keep it as simple as possible while respecting your needs and requirements
make sure you have an overview of what's going on and don't jump to conclusions too quickly
follow Twitter and see which orchestrators and networking vendors are the most active and have the most integrations; these are also the ones that will help you when you're stuck, and these will be the ones that are going to be around for the longest time
volume-oriented and quality-oriented metrics
a short summary of what we've talked about; let's see if there are any questions