2. • Senior DevOps Engineer at OnMobile Global
• Part of the Bangalore Docker Community
• One of the contributors to Docker Labs
• Have travelled extensively for work
• A Linux and Docker enthusiast, avid runner and cyclist
LinkedIn profile: https://in.linkedin.com/in/balasundaram-natarajan-43471115
Who am I?
3. • Overview of Container Networking Standards
• Docker CNM (Container Network Model) Deep Dive
• Kubernetes CNI (Container Network Interface) Deep Dive
Agenda
6. Putting all Container Standards together
• Image build tools allow users to build container images with any tool they
choose; different tools are good for different use cases.
• The container engine is responsible for creating the config.json file and
unpacking images into a root filesystem.
• OCI-compliant runtimes consume the config.json and root filesystem, and tell
the kernel to create a container.
8. Container Building Blocks
Namespaces
• Linux provides seven different namespaces
(Cgroup, IPC, Network, Mount, PID, User and UTS).
• Network namespaces (CLONE_NEWNET) determine the network resources that are
available to a process.
• Each network namespace has its own network devices, IP addresses, IP routing
tables, /proc/net directory, port numbers, and so on.
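The namespaces a process belongs to can be inspected directly. A minimal sketch, assuming a Linux host; note that newer kernels may expose additional entries beyond the seven listed above (e.g. time, pid_for_children):

```python
import os

# Each entry in /proc/<pid>/ns is a handle to one of the process's namespaces.
ns_types = sorted(os.listdir("/proc/self/ns"))
print(ns_types)

# The network namespace handle looks like "net:[4026531992]".
# Two processes share a network namespace iff this inode number matches.
net_ns = os.readlink("/proc/self/ns/net")
print(net_ns)
```

Comparing `/proc/<pid>/ns/net` across two processes is how tools verify that containers really are isolated (or deliberately sharing a network namespace).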
cgroups:
• v1 controllers: blkio, cpu, cpuacct, cpuset, devices, freezer, hugetlb,
memory, net_cls, net_prio, perf_event, pids (plus the legacy ns subsystem)
• xt_cgroup (cgroup v2)
9. Container Building Blocks
In cgroup v1, threads of the same process can be assigned to different cgroups; in cgroup v2 this is
not possible. RHEL 8 ships with cgroup v2 support, though it still mounts cgroup v1 by default.
Note: cgroup v2 requires kernel version 4.5 or above.
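Which hierarchy a host is running can be detected from the filesystem. A small sketch, assuming a Linux host: on cgroup v2 the unified hierarchy exposes a `cgroup.controllers` file at the mount root, which v1 does not have.

```python
import os

def cgroup_version():
    """Best-effort detection of the mounted cgroup hierarchy version."""
    if os.path.exists("/sys/fs/cgroup/cgroup.controllers"):
        return 2  # unified (v2) hierarchy
    if os.path.isdir("/sys/fs/cgroup"):
        return 1  # legacy per-controller (v1) hierarchies
    return None   # no cgroup filesystem mounted

print("cgroup version:", cgroup_version())
```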
30. What is Service Discovery
The ability to discover services within a Swarm
• Every service registers its name with the Swarm
• Every task registers its name with the Swarm
• Clients can lookup service names
• Service discovery uses the DNS resolver embedded inside each
container and the DNS server inside of each Docker Engine
31. Service Discovery Big Picture
The picture: a “mynet” overlay network spans two hosts. Docker host 1 runs
task1.myservice and task2.myservice; Docker host 2 runs task3.myservice.
Swarm DNS (service discovery) holds:
myservice 10.0.1.18 (service VIP)
task1.myservice 10.0.1.19
task2.myservice 10.0.1.20
task3.myservice 10.0.1.21
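The name table above behaves like an ordinary DNS zone served by the embedded resolver. A toy model of that lookup behaviour (illustration only, not Docker's implementation; the names and IPs are the ones from the slide):

```python
# Service names resolve to the service VIP; task names resolve to task IPs.
swarm_dns = {
    "myservice": "10.0.1.18",        # service VIP
    "task1.myservice": "10.0.1.19",
    "task2.myservice": "10.0.1.20",
    "task3.myservice": "10.0.1.21",
}

def lookup(name):
    """Resolve a service or task name the way the embedded resolver would."""
    try:
        return swarm_dns[name]
    except KeyError:
        raise LookupError(f"NXDOMAIN: {name}")

print(lookup("myservice"))        # the service VIP, 10.0.1.18
print(lookup("task2.myservice"))  # an individual task IP, 10.0.1.20
```

This is why a client inside any container on “mynet” can simply connect to `myservice` by name.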
32. Service Virtual IP (VIP) Load Balancing
• Every service gets a VIP when it’s created
• This stays with the service for its entire life
• Lookups against the VIP get load-balanced across all healthy
tasks in the service
• Behind the scenes it uses Linux kernel IPVS to perform transport
layer load balancing
• docker service inspect <service> (shows the service VIP)
NAME HEALTHY IP
myservice (service VIP) 10.0.1.18
task1.myservice Y 10.0.1.19
task2.myservice Y 10.0.1.20
task3.myservice Y 10.0.1.21
task4.myservice Y 10.0.1.22
task5.myservice Y 10.0.1.23
The tasks form the load-balance group behind the service VIP.
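The effect of VIP load balancing can be sketched with a toy model (IPVS supports several schedulers; plain round-robin is used here for illustration, and the health flags are made up):

```python
import itertools

# Connections to the VIP are spread across all *healthy* tasks only.
tasks = [
    ("task1.myservice", "10.0.1.19", True),
    ("task2.myservice", "10.0.1.20", True),
    ("task3.myservice", "10.0.1.21", False),  # unhealthy: excluded
    ("task4.myservice", "10.0.1.22", True),
]

healthy = [ip for _, ip, ok in tasks if ok]
rr = itertools.cycle(healthy)  # round-robin over the healthy backends

# Six connections to the VIP land only on healthy tasks.
print([next(rr) for _ in range(6)])
```

The key point the toy captures: clients only ever see the stable VIP, while the backend set behind it can change as tasks fail and are rescheduled.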
33. What is the Routing Mesh
Native load balancing of requests coming from an external source
• Services get published on a single port across the entire Swarm
• Incoming traffic to the published port can be handled by all Swarm
nodes
• A special overlay network called “ingress” is used to forward the
requests to a task in the service
• Traffic is internally load balanced as per normal service VIP load
balancing
34. Routing Mesh Example
The picture: three Docker hosts, each running IPVS and attached to the ingress
network on port 8080. task1.myservice (Docker host 1) and task2.myservice
(Docker host 2) are also connected to the “mynet” overlay network; an external
LB sits in front.
1. Three Docker hosts
2. New service with 2 tasks
3. Connected to the mynet overlay network
4. Service published on port 8080 swarm-wide
5. External LB sends a request to Docker host 3 on port 8080
6. Routing mesh forwards the request to a healthy task using the ingress network
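The steps above can be sketched as a toy model (illustration only, not Docker's implementation; host names and the published-port table are made up to match the example):

```python
# Any node that receives traffic on a published port forwards it over the
# ingress network to a node that actually runs a task for that service.
published = {8080: "myservice"}
task_locations = {"myservice": ["host1", "host2"]}  # host3 runs no task

def handle(node, port):
    service = published.get(port)
    if service is None:
        return f"{node}: connection refused on {port}"
    # Real IPVS load-balances here; this sketch just picks the first node.
    target = task_locations[service][0]
    return f"{node} -> ingress -> {target} ({service})"

# A request hitting host3 (which runs no task) is still served.
print(handle("host3", 8080))
```

This is the property the routing mesh provides: an external load balancer can send traffic to any Swarm node without knowing where the tasks actually run.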
41. Services and Tasks
• Services provide a piece of functionality
• Based on a Docker image
• Replicated Services and Global Services
• Tasks are the containers that actually do the work
• A service has 1-n tasks
42. How service deployment works
$ docker service create declares the
service name, network, image:tag
and scale
Managers break the service down into
tasks, schedule them, and
workers execute the tasks
Engines check what is running,
compare it to what was declared,
and “true up” the environment
Declare → Schedule → Reconcile
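The declare/schedule/reconcile cycle can be sketched as a toy control loop (conceptual only, not SwarmKit's code; service and task names are made up):

```python
desired = {"myservice": 3}          # declared scale
running = {"myservice": ["task1"]}  # what the engines currently report

def reconcile(desired, running):
    """Start or stop tasks until the running state matches the declaration."""
    actions = []
    for service, scale in desired.items():
        tasks = running.setdefault(service, [])
        while len(tasks) < scale:   # scale up: schedule missing tasks
            tasks.append(f"task{len(tasks) + 1}")
            actions.append(("start", service, tasks[-1]))
        while len(tasks) > scale:   # scale down: remove surplus tasks
            actions.append(("stop", service, tasks.pop()))
    return actions

print(reconcile(desired, running))  # starts task2 and task3
print(reconcile(desired, running))  # already converged: no actions
```

Because the loop compares desired against observed state every pass, a crashed task is simply re-created on the next reconciliation.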
56. • The container runtime must create a new network namespace for the container before invoking any plugins.
• The runtime must then determine which networks this container should belong to, and for each network, which plugins must
be executed.
• The network configuration is in JSON format and can easily be stored in a file. The network configuration includes mandatory
fields such as "name" and "type" as well as plugin (type) specific ones. The network configuration allows for fields to change
values between invocations. For this purpose there is an optional field "args" which must contain the varying information.
• The container runtime must add the container to each network by executing the corresponding plugins for each network
sequentially.
• Upon completion of the container lifecycle, the runtime must execute the plugins in reverse order (relative to the order in
which they were executed to add the container) to disconnect the container from the networks.
• The container runtime must not invoke parallel operations for the same container, but is allowed to invoke parallel operations
for different containers.
• The container runtime must order ADD and DEL operations for a container, such that ADD is always eventually followed by a
corresponding DEL. DEL may be followed by additional DELs but plugins should handle multiple DELs permissively (i.e. plugin
DEL should be idempotent).
• A container must be uniquely identified by a ContainerID. Plugins that store state should do so using a primary key of
(network name, CNI_CONTAINERID, CNI_IFNAME).
• A runtime must not call ADD twice (without a corresponding DEL) for the same (network name, container id, name of the
interface inside the container). This implies that a given container ID may be added to a specific network more than once only
if each addition is done with a different interface name.
Points to consider when implementing CNI
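Two of the rules above can be made concrete: the shape of a network configuration with the mandatory "name" and "type" fields plus the optional "args" field, and the ADD/reverse-DEL ordering with idempotent DEL. A toy sketch (the config values and plugin names are illustrative, not from a real deployment):

```python
import json

# A minimal CNI network configuration: mandatory "name" and "type",
# optional "args" for values that vary between invocations.
net_conf = {
    "cniVersion": "0.4.0",
    "name": "mynet",
    "type": "bridge",
    "args": {"labels": {"appVersion": "1.0"}},
}
print(json.dumps(net_conf, indent=2))

# Toy model of the ordering rules: ADD executes plugins in order and records
# state keyed by (network name, container id, interface name); DEL executes
# in reverse order, and extra DELs are no-ops (idempotent).
state = {}

def cni_add(plugins, key):
    state[key] = list(plugins)      # plugins ran in this order
    return state[key]

def cni_del(key):
    # Reverse teardown relative to ADD; a missing key makes this a no-op.
    return list(reversed(state.pop(key, [])))

key = ("mynet", "container-1", "eth0")
print(cni_add(["bridge", "ipam", "portmap"], key))  # ['bridge', 'ipam', 'portmap']
print(cni_del(key))  # ['portmap', 'ipam', 'bridge']
print(cni_del(key))  # [] -- a second DEL is harmless
```

The (network name, container id, interface name) key is what lets the same container join one network twice, as long as each addition uses a different interface name.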
87. 3rd Party CNI Plugins
• Calico provides high scalability on distributed architectures such as Kubernetes, Docker, and OpenStack.
• Cilium provides network connectivity and load balancing between application workloads, such as application containers and processes, and ensures transparent security.
• Contiv integrates containers, virtualization, and physical servers based on the container network using a single networking fabric.
• Contrail provides overlay networking for multi-cloud and hybrid cloud through network policy enforcement.
• Flannel makes it easier for developers to configure a Layer 3 network fabric for Kubernetes.
• Multus supports multiple network interfaces in a single pod on Kubernetes for SRIOV, SRIOV-DPDK, OVS-DPDK, and VPP workloads.
• Open vSwitch (OVS) offers a production-grade CNI platform with a standard management interface on OpenShift and OpenStack.
• ovn-kubernetes - a container network plugin built on Open vSwitch (OVS) and Open Virtual Networking (OVN) with support for both Linux and Windows
• Romana makes cloud network functions less expensive to build, easier to operate, and better performing than traditional cloud networks.
• Juniper Contrail / TungstenFabric - Provides overlay SDN solution, delivering multicloud networking, hybrid cloud networking, simultaneous overlay-underlay support, network policy
enforcement, network isolation, service chaining and flexible load balancing
• CNI-Genie - generic CNI network plugin
• Nuage CNI - Nuage Networks SDN plugin with Kubernetes network policy support
• Silk - a CNI plugin designed for Cloud Foundry
• Linen - a CNI plugin designed for overlay networks with Open vSwitch, fitting into SDN/OpenFlow network environments
• Vhostuser - a dataplane network plugin - supports OVS-DPDK & VPP
• Amazon ECS CNI Plugins - a collection of CNI Plugins to configure containers with Amazon EC2 elastic network interfaces (ENIs)
• Bonding CNI - a Link aggregating plugin to address failover and high availability network
• Terway - a collection of CNI Plugins based on alibaba cloud VPC/ECS network product
• Knitter - a CNI plugin supporting multiple networking for Kubernetes
• DANM - a CNI-compliant networking solution for TelCo workloads running on Kubernetes
• VMware NSX – a CNI plugin that enables automated NSX L2/L3 networking and L4/L7 Load Balancing; network isolation at the pod, node, and cluster level; and zero-trust security
policy for your Kubernetes cluster.
• SR-IOV CNI plugin - for discovering and advertising SR-IOV network virtual functions (VFs) in a Kubernetes host.