Containers Docker Kind Kubernetes Istio

@arafkarsh arafkarsh
ARAF KARSH HAMID
Co-Founder / CTO
MetaMagic Global Inc., NJ, USA
@arafkarsh
arafkarsh
Microservices
Architecture Series
Building Cloud Native Apps
Docker
Kubernetes, KinD
Service Mesh: Istio
Part 7 of 11

Docker / Kubernetes / Istio
Containers Container Orchestration Service Mesh
2
MicroservicesArchitectureStyles © 2017 byAraf Karsh Hamid is licensed under CC BY 4.0

Slides are color coded based on the topic colors.
Linux Containers
Docker
1
Kubernetes
2
Kubernetes
Networking &
Packet Path
3
Service Mesh: Istio
Best Practices 4
3

• 12 Factor App Methodology
• Docker Concepts
• Images and Containers
• Anatomy of a Dockerfile
• Networking / Volume
Docker
1
• Kubernetes Concepts
• Namespace
• Pods
• RelicaSet
• Deployment
• Service / Endpoints
• Ingress
• Rollout and Undo
• Auto Scale
Kubernetes
2
• API Gateway
• Load Balancer
• Service Discovery
• Config Server
• Circuit Breaker
• Service Aggregator
Infrastructure Design Patterns
4
• Environment
• Config Map
• Pod Presets
• Secrets
3 Kubernetes – Container App Setup
4

• Docker / Kubernetes Networking
• Pod to Pod Networking
• Pod to Service Networking
• Ingress and Egress – Internet
Kubernetes Networking – Packet Path
7
• Kubernetes IP Network
• OSI | L2/3/7 | IP Tables | IP VS |
BGP | VXLAN
• Kube DNS | Proxy
• LB, Cluster IP, Node Port
• Ingress Controller
Kubernetes Networking Advanced
8
• In-Tree & Out-of-Tree Volume Plugins
• Container Storage Interface
• CSI – Volume Life Cycle
• Persistent Volume
• Persistent Volume Claims
• Storage Class
Kubernetes Volumes
5
• Jobs / Cron Jobs
• Quotas / Limits / QoS
• Pod / Node Affinity
• Pod Disruption Budget
• Kubernetes Commands
Kubernetes Advanced Concepts
6
5

• Docker Best Practices
• Kubernetes Best Practices
• Security Best Practices
13 Best Practices
• Istio Concepts / Sidecar Pattern
• Envoy Proxy / Cilium Integration
10 Service Mesh – Istio
• Security
• RBAC
• Mesh Policy | Policy
• Cluster RBAC Config
• Service Role / Role Binding
Istio – Security and RBAC
12
• Gateway / Virtual Service
• Destination Rule / Service Entry
• AB Testing using Canary
• Beta Testing using Canary
Istio Traffic Management
11
• Network Policy L3 / L4
• Security Policy for Microservices
• Weave / Calico / Cilium / Flannel
Kubernetes Network Security Policies
9
6

Agile
Scrum (4-6 Weeks)
Developer Journey
Monolithic
Domain Driven Design
Event Sourcing and CQRS
Waterfall
Optional
Design
Patterns
Continuous Integration (CI)
6/12 Months
Enterprise Service Bus
Relational Database [SQL] / NoSQL
Development QA / QC Ops
7
Microservices
Domain Driven Design
Event Sourcing and CQRS
Scrum / Kanban (1-5 Days)
Mandatory
Design
Patterns
CI
DevOps
Event Streaming / Replicated Logs
SQL NoSQL
CD
Container Orchestrator Service Mesh

12 Factor App Methodology
4 Backing Services Treat Backing services like DB, Cache as attached resources
5 Build, Release, Run Separate Build and Run Stages
6 Process Execute App as One or more Stateless Process
7 Port Binding Export Services with Specific Port Binding
8 Concurrency Scale out via the process Model
9 Disposability Maximize robustness with fast startup and graceful exit
10 Dev / Prod Parity Keep Development, Staging and Production as similar as possible
11 Logs Treat logs as Event Streams
12 Admin Process Run Admin Tasks as one of Process
Source:
https://12factor.net/
Factors Description
1 Codebase One Code base tracked in revision control
2 Dependencies Explicitly declare dependencies
3 Configuration Configuration driven Apps
8

Cloud Native
9
Cloud Native computing uses an
opensource software stack
to deploy applications as microservices,
packaging each part into its own container,
and dynamically orchestrating those
containers to optimize resource utilization.
As defined by CNCF
https://www.cncf.io/about/who-we-are/

Docker Containers
• 12 Factor App Methodology
• Docker Concepts
• Images and Containers
• Anatomy of a Dockerfile
• Networking / Volume
Source: https://github.com/MetaArivu/k8s-workshop
10
1

What’s a Container?
Virtual
Machine
Looks like a
Walks like a
Runs like a
Containers are a Sandbox inside Linux Kernel sharing the kernel with
separate Network Stack, Process Stack, IPC Stack etc.
They are NOT Virtual Machines or Light weight Virtual Machines.
11

Servers / Virtual Machines / Containers
Hardware
Host OS
HYPERVISOR
App 1 App 1 App 1
Guest
OS
BINS
/ LIB
Guest
OS
BINS
/ LIB
Guest
OS
BINS
/ LIB
Type 2 Hypervisor
App 2
App 3
App 2
OS
Hardware
Desktop / Laptop
BINS
/ LIB
App
BINS
/ LIB
App
Container 1 Container 2
Type 1 Hypervisor
Hardware
HYPERVISOR
App 1 App 1 App 1
Guest
OS
BINS
/ LIB
Guest
OS
BINS
/ LIB
Guest
OS
BINS
/ LIB
App 2
App 3
App 2
Guest OS
Hardware
Type 1 Hypervisor
BINS
/ LIB
App
BINS
/ LIB
App
BINS
/ LIB
App
Container 1 Container 2 Container 3
HYPERVISOR
Virtualizes the OS
Create Secure Sandboxes in OS
Virtualizes the Hardware
Creates Virtual Machines
Hardware
OS
BINS / LIB
App
1
App
2
App
3
Server
Data Center
No Virtualization
Cloud Elastic Computing
12

Docker containers are Linux Containers
CGROUPS
NAME
SPACES
Copy on
Write
DOCKER
CONTAINER
• Kernel Feature
• Groups Processes
• Control Resource
Allocation
• CPU, CPU Sets
• Memory
• Disk
• Block I/O
• Images
• Not a File System
• Not a VHD
• Basically a tar file
• Has a Hierarchy
• Arbitrary Depth
• Fits into Docker
Registry
• The real magic behind
containers
• It creates barriers
between processes
• Different Namespaces
• PID Namespace
• Net Namespace
• IPC Namespace
• MNT Namespace
• Linux Kernel Namespace
introduced between
kernel 2.6.15 – 2.6.26
docker run
lxc-start
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/resource_management_guide/ch01
13

Docker Container – Linux and Windows
Control Groups
cgroups
Namespaces
Pid, net, ipc, mnt, uts
Layer Capabilities
Union File Systems:
AUFS, btrfs, vfs
Control Groups
Job Objects
Namespaces
Object Namespace, Process
Table. Networking
Layer Capabilities
Registry, UFS like
extensions
Namespaces: Building blocks of the Containers
14

Linux Kernel
HOST OS (Ubuntu)
Client
Docker Daemon
Cent OS
Alpine
Debian
Linux
Kernel
Host Kernel
Host Kernel
Host Kernel
All the containers
will have the same
Host OS Kernel
If you require a
specific Kernel
version, then Host
Kernel needs to be
updated
HOST OS (Windows 10)
Client
Docker Daemon
Nano Server
Server Core
Nano Server
Windows
Kernel
Host Kernel
Host Kernel
Host Kernel
Windows Kernel
15

Docker Key Concepts
1. Docker images
1. A Docker image is a read-only template.
2. For example, an image could contain an Ubuntu operating system with Apache and
your web application installed.
3. Images are used to create Docker containers.
4. Docker provides a simple way to build new images or update existing images, or you
can download Docker images that other people have already created.
5. Docker images are the build component of Docker.
2. Docker containers
1. Docker containers are similar to a directory.
2. A Docker container holds everything that is needed for an application to run.
3. Each container is created from a Docker image.
4. Docker containers can be run, started, stopped, moved, and deleted.
5. Each container is an isolated and secure application platform.
6. Docker containers are the run component of Docker.
3. Docker Registries
1. Docker registries hold images.
2. These are public or private stores from which you upload or download images.
3. The public Docker registry is called Docker Hub.
4. It provides a huge collection of existing images for your use.
5. These can be images you create yourself or you can use images that others have
previously created.
6. Docker registries are the distribution component of Docker.
Images
Containers
16

Docker Daemon
Docker Client
How Docker works….
$ docker search ….
$ docker build ….
$ docker container create ..
Docker Hub
Images
Containers
$ docker container run ..
$ docker container start ..
$ docker container stop ..
$ docker container ls ..
$ docker push ….
$ docker swarm ..
2
1
3
4
1. Search for the Container
2. Docker Daemon Sends the request to Hub
3. Downloads the image
4. Run the Container from the image
17

Docker Image structure
• Images are read-only.
• Multiple layers of image
gives the final Container.
• Layers can be sharable.
• Layers are portable.
• Debian Base image
• Emacs
• Apache
• Writable Container
18

Running a Docker Container
$ ID=$(docker container run -d ubuntu /bin/bash -c “while true; do date; sleep 1; done”)
Creates a Docker Container of Ubuntu OS and runs the container and execute bash shell with a script.
$ docker container logs $ID Shows output from the( bash script) container
$ docker container ls List the running Containers
$ docker pull ubuntu Docker pulls the image from the Docker Registry
When you copy the commands for testing change ”
quotes to proper quotes. Microsoft PowerPoint
messes with the quotes.
19

Anatomy of a Dockerfile
Command Description Example
FROM
The FROM instruction sets the Base Image for subsequent instructions. As such, a
valid Dockerfile must have FROM as its first instruction. The image can be any valid
image – it is especially easy to start by pulling an image from the Public repositories
FROM ubuntu
FROM alpine
MAINTAINER
The MAINTAINER instruction allows you to set the Author field of the generated
images. (Deprecated)
MAINTAINER John Doe
LABEL
The LABEL instruction adds metadata to an image. A LABEL is a key-value pair. To
include spaces within a LABEL value, use quotes and backslashes as you would in
command-line parsing.
LABEL version="1.0”
LABEL vendor=“M2”
RUN
The RUN instruction will execute any commands in a new layer on top of the current
image and commit the results. The resulting committed image will be used for the
next step in the Dockerfile.
RUN apt-get install -y
curl
ADD
The ADD instruction copies new files, directories or remote file URLs from <src> and
adds them to the filesystem of the container at the path <dest>.
ADD hom* /mydir/
ADD hom?.txt /mydir/
COPY
The COPY instruction copies new files or directories from <src> and adds them to the
filesystem of the container at the path <dest>.
COPY hom* /mydir/
COPY hom?.txt /mydir/
ENV
The ENV instruction sets the environment variable <key> to the value <value>. This
value will be in the environment of all "descendent" Dockerfile commands and can be
replaced inline in many as well.
ENV JAVA_HOME /JDK8
ENV JRE_HOME /JRE8
20

Anatomy of a Dockerfile
Command Description Example
VOLUME
The VOLUME instruction creates a mount point with the specified name and marks it as
holding externally mounted volumes from native host or other containers. The value can be a
JSON array, VOLUME ["/var/log/"], or a plain string with multiple arguments, such as VOLUME
/var/log or VOLUME /var/log
VOLUME /data/webapps
USER
The USER instruction sets the user name or UID to use when running the image and for any
RUN, CMD and ENTRYPOINT instructions that follow it in the Dockerfile.
USER johndoe
WORKDIR
The WORKDIR instruction sets the working directory for any RUN, CMD, ENTRYPOINT, COPY
and ADD instructions that follow it in the Dockerfile.
WORKDIR /home/user
CMD
There can only be one CMD instruction in a Dockerfile. If you list more than one CMD then only
the last CMD will take effect.
The main purpose of a CMD is to provide defaults for an executing container. These defaults
can include an executable, or they can omit the executable, in which case you must specify an
ENTRYPOINT instruction as well.
CMD echo "This is a test." |
wc -
EXPOSE
The EXPOSE instructions informs Docker that the container will listen on the
specified network ports at runtime. Docker uses this information to interconnect
containers using links and to determine which ports to expose to the host when
using the –P flag with docker client.
EXPOSE 8080
ENTRYPOINT
An ENTRYPOINT allows you to configure a container that will run as an executable. Command
line arguments to docker run <image> will be appended after all elements in an exec form
ENTRYPOINT, and will override all elements specified using CMD. This allows arguments to be
passed to the entry point, i.e., docker run <image> -d will pass the -d argument to the entry
point. You can override the ENTRYPOINT instruction using the docker run --entrypoint flag.
ENTRYPOINT ["top", "-b"]
21

Docker Image
• Dockerfile
• Docker Container Management
• Docker Images
22

Build Docker Containers as easy as 1-2-3
Create
Dockerfile
1
Build
Image
2
Run
Container
3
23

Build a Docker Java image
1. Create your Dockerfile
• FROM
• RUN
• ADD
• WORKDIR
• USER
• ENTRYPOINT
2. Build the Docker image
3. Run the Container
$ docker build -t org/java:8 .
$ docker container run –it org/java:8
24

Docker Container Management
$ ID=$(docker container run –d ubuntu /bin/bash)
$ docker container stop $ID Start the Container and Store ID in ID field
Stop the container using Container ID
$ docker container stop $(docker container ls –aq)
Stops all the containers
$ docker container rm $ID
Remove the Container
$ docker container rm $(docker container ls –aq)
Remove ALL the Container (in Exit status)
$ docker container prune
Remove ALL stopped Containers)
$ docker container run –restart=Policy –d –it ubuntu /sh Policies = NO / ON-FAILURE / ALWAYS
$ docker container run –restart=on-failure:3
–d –it ubuntu /sh
Will re-start container ONLY 3 times if a
failure happens
$ docker container start $ID
Start the container
25

Docker Container Management
$ ID=$(docker container run –d -i ubuntu)
$ docker container exec -it $ID /bin/bash
Start the Container and Store ID in ID field
Inject a Process into Running Container
$ ID=$(docker container run –d –i ubuntu)
$ docker container exec inspect $ID
Start the Container and Store ID in ID field
Read Containers MetaData
$ docker container run –it ubuntu /bin/bash
# apt-get update
# apt-get install—y apache2
# exit
$ docker container ls –a
$ docker container commit –author=“name” –
message=“Ubuntu / Apache2” containerId apache2
Docker Commit
• Start the Ubuntu Container
• Install Apache
• Exit Container
• Get the Container ID (Ubuntu)
• Commit the Container with new
name
$ docker container run –cap-drop=chown –it ubuntu /sh To prevent Chown inside the Container
Source: https://github.com/meta-magic/kubernetes_workshop
26

Docker Image Commands
$ docker login …. Log into the Docker Hub to Push images
$ docker push image-name Push the image to Docker Hub
$ docker image history image-name Get the History of the Docker Image
$ docker image inspect image-name Get the Docker Image details
$ docker image save –output=file.tar image-name Save the Docker image as a tar ball.
$ docker container export –output=file.tar c79aa23dd2 Export Container to file.
$ docker image rm image-name Remove the Docker Image
$ docker rmi $(docker images | grep '^<none>' | tr -s " " | cut -d " " -f 3)
27

Build Docker Apache image
• FROM alpine
• RUN
• COPY
• EXPOSE
• ENTRYPOINT
$ docker build -t org/apache2 .
$ docker container run –d –p 80:80 org/apache2
$ curl localhost
28

Build Docker Tomcat image
• FROM alpine
• RUN
• COPY
• EXPOSE
• ENTRYPOINT
$ docker build -t org/tomcat .
$ docker container run –d –p 8080:8080 org/tomcat
$ curl localhost:8080
29

Docker Images in the Github Workshop
Ubuntu
JRE 8 JRE 11
Tomcat 8 Tomcat 9
My App 1
Tomcat 9
My App 3
Spring Boot
My App 4
From Ubuntu
Build My Ubuntu
From My Ubuntu
Build My JRE8
From My Ubuntu
Build My JRE11
From My JRE 11
Build My Boot
From My Boot
Build My App 4
From My JRE8
Build My TC8
From My TC8
Build My App 1
My App 2
30

Docker Images in the Github Workshop
Alpine Linux
JRE 8 JRE 11
Tomcat 9 Tomcat 10
My App 1
Tomcat 10
My App 3
Spring Boot
My App 4
From Alpine
Build My Alpine
From My Alpine
Build My JRE8
From My Alpine
Build My JRE11
From My JRE 11
Build My Boot
From My Boot
Build My App 4
From My JRE8
Build My TC9
From My TC8
Build My App 1
My App 2
31

Docker Networking
• Docker Networking – Bridge / Host / None
• Docker Container sharing IP Address
• Docker Communication – Node to Node
• Docker Volumes
32

Docker Networking – Bridge / Host / None
$ docker network ls
$ docker container run --rm --network=host alpine brctl show
$ docker network create tenSubnet –subnet 10.1.0.0/16
33

Docker Networking – Bridge / Host / None
$ docker container run --rm -–net=host alpine ip address
$ docker container run --rm alpine ip address
$ docker container run –rm –net=none alpine ip address
No Network Stack
https://docs.docker.com/network/#network-drivers
34

Docker Containers
Sharing IP Address
$ docker container run --name ipctr –itd alpine
$ docker container run --rm --net container:ipctr alpine ip address
IP
(Container)
Service 1
(Container)
Service 3
(Container)
Service 2
(Container)
$ docker container exec ipctr ip address
35

Docker Networking: Node to Node
Same IP Addresses
for the Containers
across different
Nodes.
This requires NAT.
Container 1
172.17.3.2
Web Server 8080
Veth: eth0
Container 2
172.17.3.3
Microservice 9002
Veth: eth0
Container 3
172.17.3.4
Microservice 9003
Veth: eth0
Container 4
172.17.3.5
Microservice 9004
Veth: eth0
IP tables rules
eth0
10.130.1.101/24
Node 1
Docker0 Bridge 172.17.3.1/16
Veth0 Veth1 Veth2 Veth3
Container 1
172.17.3.2
Web Server 8080
Veth: eth0
Container 2
172.17.3.3
Microservice 9002
Veth: eth0
Container 3
172.17.3.4
Microservice 9003
Veth: eth0
Container 4
172.17.3.5
Microservice 9004
Veth: eth0
IP tables rules
eth0
10.130.1.102/24
Node 2
Veth: eth0
Veth0
Veth Pairs connected to the
container and the Bridge
36

Docker Volumes
$ docker volume create hostvolume
Data Volumes are special directory in the Docker Host.
$ docker volume ls
$ docker container run –it –rm –v hostvolume:/data alpine
# echo “This is a test from the Container” > /data/data.txt
Source:
https://github.com/meta-magic/kubernetes_workshop
37

Docker Volumes
$ docker container run - - rm –v $HOME/data:/data alpine Mount Specific File Path
Source:
38

Kubernetes
39
2

Deployment – Updates and rollbacks, Canary Release
D
ReplicaSet – Self Healing, Scalability, Desired State
R
Worker Node 1
Master Node (Control Plane)
Kubernetes
Architecture
POD
POD itself is a Linux
Container, Docker
container will run inside
the POD. PODs with single
or multiple containers
(Sidecar Pattern) will share
Cgroup, Volumes,
Namespaces of the POD.
(Cgroup / Namespaces)
Scheduler
Controller
Manager
Using yaml or json
declare the desired
state of the app.
State is stored in
the Cluster store.
Self healing is done by Kubernetes using watch loops if the desired state is changed.
POD POD POD
BE
1.2
10.1.2.34
BE
1.2
10.1.2.35
BE
1.2
10.1.2.36
BE
15.1.2.100
DNS: a.b.com 1.2
Service Pod IP Address is dynamic, communication should
be based on Service which will have routable IP
and DNS Name. Labels (BE, 1.2) play a critical role
in ReplicaSet, Deployment, & Services etc.
Cluster
Store
etcd
Key Value
Store
Pod Pod Pod
Label Selector selects pods based on the Labels.
Label
Selector
Label Selector
Label Selector
Node
Controller
End Point
Controller
Deployment
Controller
Pod
Controller
….
Labels
Internet
Firewall K8s Virtual
Cluster
Cloud Controller
For the cloud providers to manage
nodes, services, routes, volumes etc.
Kubelet
Node
Manager
Container
Runtime
Interface
Port 10255
gRPC
ProtoBuf
Kube-Proxy
Network Proxy
TCP / UDP Forwarding
IPTABLES / IPVS
Allows multiple
implementation of
containers from v1.7
RESTful yaml / json
$ kubectl ….
Port 443
API Server
Pod IP ...34 ...35 ...36
EP
• Declarative Model
• Desired State
Key Aspects
N1
N2
N3
Namespace
1
N1
N2
N3
Namespace
2
• Pods
• ReplicaSet
• Deployment
• Service
• Endpoints
• StatefulSet
• Namespace
• Resource Quota
• Limit Range
• Persistent
Volume
Kind
Secrets
Kind
• apiVersion:
• kind:
• metadata:
• spec:
Declarative Model
• Pod
• ReplicaSet
• Service
• Deployment
• Virtual Service
• Gateway, SE, DR
• Policy, MeshPolicy
• RbaConfig
• Prometheus, Rule,
• ListChekcer …
@
@
Annotations
Names
Cluster IP
Node
Port
Load
Balancer
External
Name
@
Ingress
40

Focus on the Declarative Model
41

Ubuntu Installation
Kubernetes Setup – Minikube
$ sudo snap install kubectl --classic Install Kubectl using Snap Package Manager
$ kubectl version Shows the Current version of Kubectl
• Minikube provides a developer environment with master and a single node
installation within the Minikube with all necessary add-ons installed like DNS,
Ingress controller etc.
• In a real world production environment you will have master installed (with a
failover) and ‘n’ number of nodes in the cluster.
• If you go with a Cloud Provider like Amazon EKS then the node will be created
automatically based on the load.
• Minikube is available for Linux / Mac OS and Windows.
$ curl -Lo minikube https://storage.googleapis.com/minikube/releases/v0.30.0/minikube-linux-amd64
$ chmod +x minikube && sudo mv minikube /usr/local/bin/
https://kubernetes.io/docs/tasks/tools/install-kubectl/
Source: https://github.com/meta-magic/kubernetes_workshop 42

Windows Installation
Kubernetes Setup – Minikube
C:> choco install kubernetes-cli Install Kubectl using Choco Package Manager
C:> kubectl version Shows the Current version of Kubectl
Mac OS Installation
$ brew install kubernetes-cli Install Kubectl using brew Package Manager
$ kubectl version Shows the Current version of Kubectl
C:> cd c:usersyouraccount
C:> mkdir .kube
Create .kube directory
$ curl -Lo minikube https://storage.googleapis.com/minikube/releases/latest/minikube-darwin-amd64
$ chmod +x minikube && sudo mv minikube /usr/local/bin/
C:> minikube-installer.exe Install Minikube using Minikube Installer
https://kubernetes.io/docs/tasks/tools/install-kubectl/
Source:
$ brew update; brew cask install minikube Install Minikube using Homebrew or using curl
43

Kubernetes Minikube - Commands
Commands
$ minikube status Shows the status of minikube installation
$ minikube start Start minikube
All workshop examples Source Code: https://github.com/meta-magic/kubernetes_workshop
$ minikube stop Stop Minikube
$ minikube ip Shows minikube IP Address
$ minikube addons list Shows all the addons
$ minikube addons enable ingress Enable ingress in minikube
$ minikube start --memory=8192 --cpus=4 --kubernetes-version=1.14.2 8 GB RAM and 4 Cores
$ minikube dashboard Access Kubernetes Dashboard in minikube
$ minikube start --network-plugin=cni --extra-config=kubelet.network-plugin=cni --memory=5120 With Cilium
Network
Driver
$ kubectl create -n kube-system -f https://raw.githubusercontent.com/cilium/cilium/v1.3/examples/kubernetes/addons/etcd/standalone-etcd.yaml
$ kubectl create -f https://raw.githubusercontent.com/cilium/cilium/v1.3/examples/kubernetes/1.12/cilium.yaml
44

K8s Setup – Master / Nodes : On Premise
Cluster Machine Setup
1. Switch off Swap
2. Set Static IP to Network interface
3. Add IP to Host file
$ k8s-1-cluster-machine-setup.sh
4. Install Docker
5. Install Kubernetes
Run the cluster setup script to install
the Docker and Kubernetes in all the
machines (master and worker node)
1
Master Setup
Setup kubernetes master with pod
network
1. Kubeadm init
2. Install CNI Driver
$ k8s-2-master-setup.sh
$ k8s-3-cni-driver-install.sh
$ k8s-3-cni-driver-uninstall.sh
$ kubectl get po --all-namespaces
Check Driver Pods
Uninstall the driver
2
Node Setup
n1$ kubeadm join --token t IP:Port
Add the worker node to Kubernetes
Master
$ kubectl get nodes
Check Events from namespace
3
$ kubectl get events –n namespace
Check all the nodes
$ sudo ufw enable
$ sudo ufw allow 31100
Source Code: https://github.com/meta-magic/metallb-baremetal-example
Only if the Firewall is blocking your Pod
Al the above-mentioned shell scripts are
available in the Source Code Repository
$ sudo ufw allow 443
45

Kubernetes Setup – Master / Nodes
$ kubeadm init node1$ kubeadm join --token enter-token-from-kubeadm-cmd Node-IP:Port Adds a Node
$ kubectl get nodes $ kubectl cluster-info
List all Nodes
$ kubectl run hello-world --replicas=7 --labels="run=load-balancer-example" --image=metamagic/hello:1.0 --port=8080
Creates a Deployment Object and a ReplicaSet object with 7 replicas of Hello-World Pod running on port 8080
$ kubectl expose deployment hello-world --type=LoadBalancer --name=hello-world-service
List all the Hello-World Deployments
$ kubectl get deployments hello-world
Describe the Hello-World Deployments
$ kubectl describe deployments hello-world
List all the ReplicaSet
$ kubectl get replicasets
Describe the ReplicaSet
$ kubectl describe replicasets
List the Service Hello-World-Service with
Custer IP and External IP
$ kubectl get services hello-world-service
Describe the Service Hello-World-Service
$ kubectl describe services hello-world-service
Creates a Service Object that exposes the deployment (Hello-World) with an external IP Address.
List all the Pods with internal IP Address
$ kubectl get pods –o wide
$ kubectl delete services hello-world-service
Delete the Service Hello-World-Service
$ kubectl delete deployment hello-world
Delete the Hello-Word Deployment
Create a set of Pods for Hello World App with an External IP Address (Imperative Model)
Shows the cluster details
$ kubectl get namespace
Shows all the namespaces
$ kubectl config current-context
Shows Current Context
46

Setup KinD (Kubernetes in Docker)
$ curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.11.1/kind-linux-amd64
$ chmod +x ./kind
$ mv ./kind /some-dir-in-your-PATH/kind
Linux
Source: https://kind.sigs.k8s.io/docs/user/quick-start/
$ brew install kind
Mac OS via Homebrew
c:> curl.exe -Lo kind-windows-amd64.exe https://kind.sigs.k8s.io/dl/v0.11.1/kind-windows-amd64
c:> Move-Item .kind-windows-amd64.exe c:some-dir-in-your-PATHkind.exe
Windows
o Kind is a tool for running
local Kubernetes clusters
using Docker container
“nodes”.
o Kind was primarily
designed for testing
Kubernetes itself but may
be used for local
development or CI.
47

KinD Creates Cluster(s)
$ kind create cluster $ kind create cluster - -name my2ndcluster
Create a default cluster named kind Create a cluster with a specific name
$ kind create cluster - -config filename.yaml
Create a cluster from a file
$ kind get cluster
List all Clusters
$ kind delete cluster
Delete a Cluster
$ kubectl cluster-info --context kind-kind $ kind load docker-image image1 image2 image3
Load Image into a Cluster (No Repository Needed)
48

KinD Cluster Setup
Single Node Cluster Setup 2 Node Cluster Setup
49

KinD Cluster Setup
3 Node Cluster Setup 5 Node Cluster Setup
50

KinD Cluster Setup + Network Driver
Single Node Cluster Setup
$ kind create cluster --config 1-clusters/alpha-1.yaml
2 Node Cluster Setup
$ kind create cluster --config 1-clusters/beta-2.yaml
$ kind create cluster --config 1-clusters/gama-3.yaml
$ kind create cluster --config 1-clusters/epsilon-5.yaml
$ kubectl apply --filename https://raw.githubusercontent.com/kubernetes/ingress-
nginx/master/deploy/static/provider/kind/deploy.yaml
NGINX Ingress Controller (This is part of shell scripts (Ex. ch1-create-epsilon-cluster ) provided in the GitHub Repo)
Creates 5 Node cluster and Adds NGINX Controller
$ ch1-create-epsilon-cluster
Install 4 Microservices and creates 7 instances
$ ch3-sigma-install-apps
51

KinD Cluster Setup Status
Setting up a 5 Node Cluster
o 2 Master Control Plane
o 3 Worker Node
o External Load Balancer
Setup Network Driver
o Adds NGINX Ingress
Controller
52

KinD Cluster Setup Status
5 Node Cluster Setup Status (Core DNS, API Server, Kube Proxy… ) Control Planes & Worker Nodes
53

KinD – Sigma App Installation Status
Install 4 Microservices and creates 7 instances
$ ch3-sigma-install-apps
54

KinD – Sigma Web App
55

3 Fundamental Concepts
1. Desired State
2. Current State
3. Declarative Model
56

Kubernetes Workload Portability
Goals
1. Abstract away Infrastructure
Details
2. Decouple the App Deployment
from Infrastructure (On-Premise
or Cloud)
To help Developers
1. Write Once, Run Anywhere
(Workload Portability)
2. Avoid Vendor Lock-In
Cloud
On-Premise
57

Kubernetes
Getting Started
• Namespace
• Pods / ReplicaSet / Deployment
• Service / Endpoints
• Ingress
• Rollout / Undo
• Auto Scale
58

Kubernetes Commands – Namespace
(Declarative Model)
$ kubectl config set-context $(kubectl config current-context) --namespace=your-ns
This command will let you switch
the namespace to your namespace
(your-ns).
$ kubectl get namespace
$ kubectl describe ns ns-name
$ kubectl create –f app-ns.yml
List all the Namespaces
Describe the Namespace
Create the Namespace
$ kubectl apply –f app-ns.yml
Apply the changes to the
Namespace
$ kubectl get pods –namespace= ns-name List the Pods from your
namespace
• Namespaces are used to group your teams and software’s in
logical business group.
• A definition of Service will add a entry in DNS with respect to
Namespace.
• Not all objects are there in Namespace. Ex. Nodes, Persistent
Volumes etc.
$ kubectl api-resources --namespaced=your-ns
59

• Pod is a shared environment for one of more
Containers.
• Pod in a Kubernetes cluster has a unique IP
address, even Pods on the same Node.
• Pod is a pause Container
Kubernetes Pods
$ kubectl create –f tc10-nr-Pod.yaml
$ kubectl get pods –o wide –n omega
Atomic Unit
Container
Pod
Virtual Server
Small
Big
60

Kubernetes Commands – Pods
(Declarative Model)
$ kubectl exec pod-name ps aux $ kubectl exec –it pod-name sh
$ kubectl exec –it –container container-name pod-name sh
By default kubectl executes the commands in the first container in the pod. If you are running multiple containers (sidecar
pattern) then you need to pass –container flag and give the name of the container in the Pod to execute your command.
You can see the ordering of the containers and its name using describe command.
$ kubectl get pods
$ kubectl describe pods pod-name
$ kubectl get pods -o json pod-name
$ kubectl create –f app-pod.yml
List all the pods
Describe the Pod details
List the Pod details in JSON format
Create the Pod (Imperative)
Execute commands in the first Container in the Pod Log into the Container Shell
$ kubectl get pods -o wide List all the Pods with Pod IP Addresses
$ kubectl apply –f app-pod.yml
Apply the changes to the Pod
$ kubectl replace –f app-pod.yml
Replace the existing config of the Pod
$ kubectl describe pods –l app=name Describe the Pod based on the
label value
$ kubectl logs pod-name container-name Source: https://github.com/MetaArivu/k8s-workshop
61

• Pods wrap around containers with benefits
like shared location, secrets, networking etc.
• ReplicaSet wraps around Pods and brings in
Replication requirements of the Pod
• ReplicaSet Defines 2 Things
• Pod Template
• Desired No. of Replicas
Kubernetes ReplicaSet
(Declarative Model)
What we want is the Desired State.
Game On!
62

Kubernetes Commands – ReplicaSet
(Declarative Model)
$ kubectl delete rs/app-rs cascade=false
$ kubectl get rs
$ kubectl describe rs rs-name
$ kubectl get rs/rs-name
$ kubectl create –f app-rs.yml
List all the ReplicaSets
Describe the ReplicaSet details
Get the ReplicaSet status
Create the ReplicaSet which will automatically create all the
Pods
Deletes the ReplicaSet. If the cascade=true then deletes all
the Pods, Cascade=false will keep all the pods running and
ONLY the ReplicaSet will be deleted.
$ kubectl apply –f app-rs.yml
Applies new changes to the ReplicaSet. For example, Scaling
the replicas from x to x + new value.
63

Kubernetes Commands – Deployment
(Declarative Model)
• Deployments manages
ReplicaSets and
• ReplicaSets manages
Pods
• Deployment is all about
Rolling updates and
• Rollbacks
• Canary Deployments
64

Kubernetes Commands – Deployment
(Declarative Model)
List all the Deployments
Describe the Deployment details
Show the Rollout status of the Deployment
Creates Deployment
Deployments contains Pods and its Replica information. Based on
the Pod info Deployment will start downloading the containers
(Docker) and will install the containers based on replication factor.
Updates the existing deployment.
Show Rollout History of the Deployment
$ kubectl get deploy app-deploy
$ kubectl describe deploy app-deploy
$ kubectl rollout status deployment app-deploy
$ kubectl rollout history deployment app-deploy
$ kubectl create –f app-deploy.yml
$ kubectl apply –f app-deploy.yml --record
$ kubectl rollout undo deployment app-deploy - -to-revision=1
$ kubectl rollout undo deployment app-deploy - -to-revision=2
Rolls back or Forward to a specific version number
of your app.
$ kubectl scale deployment app-deploy - -replicas=6 Scale up the pods to 6 from the initial 2 Pods.
65

Kubernetes Services
Why do we need Services?
• Accessing Pods from Inside the Cluster
• Accessing Pods from Outside
• Autoscale brings Pods with new IP
Addresses or removes existing Pods.
• Pod IP Addresses are dynamic.
Service Types
1. Cluster IP (Default)
2. Node Port
3. Load Balancer
4. External Name
Service will have a
stable IP Address.
Service uses Labels to
associate with a set
of Pods
66

Kubernetes Commands – Service / Endpoints
(Declarative Model)
$ kubectl delete svc app-service
$ kubectl create –f app-service.yml
List all the Services
Describe the Service details
List the status of the Endpoints
Create a Service for the Pods.
Service will focus on creating a
routable IP Address and DNS for
the Pods Selected based on the
labels defined in the service.
Endpoints will be automatically
created based on the labels in
the Selector.
Deletes the Service.
$ kubectl get svc
$ kubectl describe svc app-service
$ kubectl get ep app-service
$ kubectl describe ep app-service Describe the Endpoint Details
 Cluster IP (default) - Exposes the Service
on an internal IP in the cluster. This type
makes the Service only reachable from
within the cluster.
 Node Port - Exposes the Service on the
same port of each selected Node in the
cluster using NAT. Makes a Service
accessible from outside the cluster
using <NodeIP>:<NodePort>. Superset
of ClusterIP.
 Load Balancer - Creates an external load
balancer in the current cloud (if
supported) and assigns a fixed, external
IP to the Service. Superset of NodePort.
 External Name - Exposes the Service
using an arbitrary name (specified
by external Name in the spec) by
returning a CNAME record with the
name. No proxy is used. This type
requires v1.7 or higher of kube-dns.
67

Kubernetes Ingress
(Declarative Model)
An Ingress is a collection of rules
that allow inbound connections to
reach the cluster services.
Ingress Controllers are Pluggable.
Ingress Controller in AWS is linked to
AWS Load Balancer.
Source: https://kubernetes.io/docs/concepts/services-
networking/ingress/#ingress-controllers
68

Kubernetes Ingress
(Declarative Model)
An Ingress is a collection of rules
that allow inbound connections to
reach the cluster services.
Ingress Controllers are Pluggable.
Ingress Controller in AWS is linked to
AWS Load Balancer.
Source: https://kubernetes.io/docs/concepts/services-networking/ingress/#ingress-controllers
69

Kubernetes Auto Scaling Pods
(Declarative Model)
• You can declare the Auto scaling
requirements for every Deployment
(Microservices).
• Kubernetes will add Pods based on the
CPU Utilization automatically.
• Kubernetes Cloud infrastructure will
automatically add Nodes if it ran out of
available Nodes.
CPU utilization kept at 2% to demonstrate the auto
scaling feature. Ideally it should be around 80% - 90%
70

Kubernetes Horizontal Pod Auto Scaler
$ kubectl autoscale deployment appname --cpu-percent=50 --min=1 --max=10
$ kubectl run -it podshell --image=metamagicglobal/podshell
Hit enter for command prompt
$ while true; do wget -q -O- http://yourapp.default.svc.cluster.local; done
Deploy your app with auto scaling parameters
Generate load to see auto scaling in action
$ kubectl get hpa
$ kubectl attach podshell-name -c podshell -it
To attach to the running container
71

Kubernetes
App Setup
• Environment
• Config Map
• Pod Preset
• Secrets
72

Detach the Configuration information
of the App from the Container Image.
Config Map lets you create multiple
profiles for your Dev, QA and Prod
environment.
Config Map
All the Database configurations like
passwords, certificates, OAuth tokens,
etc., can be stored in secrets.
Secret
Helps you create common
configuration which can be injected to
Pod based on a Criteria (selected using
Label). For Ex. SMTP config, SMS
config.
Pod Preset
Environment option let you pass any
info to the pod thru Environment
Variables.
Environment
Container App Setup
73

Kubernetes Pod Environment Variables

Kubernetes Adding Config to Pod
Config Maps allow you to
decouple configuration artifacts
from image content to keep
containerized applications
portable.
Source: https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/
75

Kubernetes Pod Presets
A Pod Preset is an API resource for injecting
additional runtime requirements into a Pod
at creation time. You use label selectors to
specify the Pods to which a given Pod
Preset applies.
Using a Pod Preset allows pod template
authors to not have to explicitly provide all
information for every pod. This way,
authors of pod templates consuming a
specific service do not need to know all the
details about that service.
Source: https://kubernetes.io/docs/concepts/workloads/pods/podpreset/

Kubernetes Pod Secrets
Objects of type secret are intended to hold
sensitive information,
such as passwords,
OAuth tokens, and ssh keys.
Putting this information in a secret is safer
and more flexible than putting it verbatim
in a pod definition or in a docker
Source: https://kubernetes.io/docs/concepts/configuration/secret/

Infrastructure
Design Patterns
• API Gateway
• Load balancer
• Service discovery
• Circuit breaker
• Service Aggregator
• Let-it crash pattern
78

API Gateway Design Pattern – Software Stack
UI
Layer
WS
BL
DL
Database
Shopping Cart
Order
Customer
Product
Firewall
Users
API Gateway
Load
Balancer
Circuit
Breaker
UI
Layer
Web
Services
Business
Logic
Database
Layer
Product
SE
MySQL
DB
Product
Microservice
With 4 node
cluster
Load
Balancer
Circuit
Breaker
UI
Layer
Web
Services
Business
Logic
Database
Layer
Customer
Redis
DB
Customer
Microservice
With 2 node
cluster
Users
Access the
Monolithic
App
Directly
API Gateway (Reverse Proxy Server) routes the traffic
to appropriate Microservices (Load Balancers)
79

API Gateway – Kubernetes Implementation
/customer
/product
/cart
/order
API Gateway
Ingress
Deployment / Replica / Pod Nodes
Kubernetes Objects
Firewall
Customer Pod
Customer Pod
Customer Pod
Customer
Service
N1
N2
N2
EndPoints
Product Pod
Product Pod
Product Pod
Product
Service
N4
N3
MySQL
DB
EndPoints
Review Pod
Review Pod
Review Pod
Review
Service
N4
N3
N1
Service Call
Kube DNS
EndPoints
Internal
Load Balancers
Users
Routing based on Layer 3,4 and 7
Redis
DB
Mongo
DB
Load Balancer
80

API Gateway – Kubernetes / Istio
/customer
/product
/auth
/order
API Gateway
Virtual Service
Istio Sidecar - Envoy
Load Balancer
Firewall
P M C
Istio Control Plane
MySQL
Pod
N4
N3
Destination
Rule
Product Pod
Product Pod
Product Pod
Product
Service
Service Call
Kube DNS
EndPoints
Internal
Load Balancers
81
Kubernetes
Objects
Istio Objects
Users
Review Pod
Review Pod
Review Pod
Review
Service
N1
N4
N3
EndPoints
Customer Pod
Customer Pod
Customer Pod
Customer
Service
N1
N2
N2
Destination
Rule
EndPoints
Redis
DB
Mongo
DB
81

Load Balancer Design Pattern
Firewall
Users
API Gateway
Load
Balancer
Circuit
Breaker
UI
Layer
Web
Services
Business
Logic
Database
Layer
Product
SE
MySQL
DB
Product
Microservice
With 4 node
cluster
Load
Balancer
CB
=
Hystrix
UI
Layer
Web
Services
Business
Logic
Database
Layer
Customer
Redis
DB
Customer
Microservice
With 2 node
cluster
API Gateway (Reverse Proxy Server) routes
the traffic to appropriate Microservices
(Load Balancers)
Load Balancer Rules
1. Round Robin
2. Based on
Availability
3. Based on
Response Time
82

Ingress
Load Balancer – Kubernetes Model
Kubernetes
Objects
Firewall
Users
Product 1
Product 2
Product 3
Product
Service
N4
N3
N1
EndPoints
Internal
Load Balancers
DB
Load Balancer
API Gateway
N1
N2
N2
Customer 1
Customer 2
Customer 3
Customer
Service
EndPoints
DB
Internal
Load Balancers
Pods Nodes
• Load Balancer receives the (request) packet from the User and it picks up
a Virtual Machine in the Cluster to do the internal Load Balancing.
• Kube Proxy using IP Tables redirect the Packet using internal load
Balancing rules.
• Packet enters Kubernetes Cluster and reaches Node (of that specific Pod)
and Node handover the packet to the Pod.
/customer
/product
/cart
83

Service Discovery – NetFlix Network Stack Model
Firewall
Users
API Gateway
Load
Balancer
Circuit
Breaker
Product
MySQL
DB
Product
Microservice
With 4 node
cluster
Load
Balancer
Circuit
Breaker
UI
Layer
Web
Services
Business
Logic
Database
Layer
Customer
Redis
DB
Customer
Microservice
With 2 node
cluster
• In this model Developers write the
code in every Microservice to register
with NetFlix Eureka Service Discovery
Server.
• Load Balancers and API Gateway also
registers with Service Discovery.
• Service Discovery will inform the Load
Balancers about the instance details
(IP Addresses).
Service Discovery
84

Ingress
Service Discovery – Kubernetes Model
Kubernetes
Objects
Firewall
Users
Product 1
Product 2
Product 3
Product
Service
N4
N3
N1
EndPoints
Internal
Load Balancers
DB
API Gateway
N1
N2
N2
Customer 1
Customer 2
Customer 3
Customer
Service
EndPoints
DB
Internal
Load Balancers
Pods Nodes
• API Gateway (Reverse Proxy Server) doesn't know the instances (IP
Addresses) of News Pod. It knows the IP address of the Services
defined for each Microservice (Customer / Product etc.)
• Services handles the dynamic IP Addresses of the pods. Services
Endpoints will automatically discover the new Pods based on Labels.
Service Definition
from Kubernetes
Perspective
/customer
/product
/cart
Service Call
Kube DNS
85

Circuit Breaker Pattern
/ui
/productms
If Product Review is not
available Product service
will return the product
details with a message
review not available.
Reverse Proxy Server
Ingress
Kubernetes Objects
Firewall
UI Pod
UI Pod
UI Pod
UI Service
N1
N2
N2
EndPoints
Product Pod
Product Pod
Product Pod
Product
Service
N4
N3
MySQL
Pod
EndPoints
Internal
Load Balancers
Users
Routing based on Layer 3,4 and 7
Review Pod
Review Pod
Review Pod
Review
Service
N4
N3
N1
Service Call
Kube DNS
EndPoints
86

Service Aggregator Pattern
/newservice
Ingress
Kubernetes
Objects
Firewall
Service Call
Kube DNS
Users
Internal
Load Balancers
EndPoints News Pod
News Pod
News Pod
News
Service
N4
N3
N2
News Service Portal
• News Category wise
Microservices
• Aggregator Microservice to
aggregate all category of news.
Auto Scaling
• Sports Events (IPL / NBA) spikes
the traffic for Sports Microservice.
• Auto scaling happens for both
News and Sports Microservices.
N1
N2
N2
National
National
National
National
Service
EndPoints
Internal
Load Balancers
DB
N1
N2
N2
Politics
Politics
Politics
Politics
Service
EndPoints
DB
Sports
Sports
Sports
Sports
Service
N4
N3
N1
EndPoints
Internal
Load Balancers
DB
87

Music UI
Play Count
Discography
Albums
88

Service Aggregator Pattern
/artist
Ingress
Kubernetes
Objects
Firewall
Service Call
Kube DNS
Users
Internal
Load Balancers
EndPoints Artist Pod
Artist Pod
Artist Pod
Artist
Service
N4
N3
N2
Spotify Microservices
• Artist Microservice combines all
the details from Discography,
Play count and Playlists.
Auto Scaling
• Scaling of Artist and downstream
Microservices will automatically
scale depends on the load factor.
N1
N2
N2
Discography
Discography
Discography
Discography
Service
EndPoints
Internal
Load Balancers
DB
N1
N2
N2
Play Count
Play Count
Play Count
Play Count
Service
EndPoints
DB
Playlist
Playlist
Playlist
Playlist
Service
N4
N3
N1
EndPoints
Internal
Load Balancers
DB
89

Config Store – Spring Config Server
Firewall
Users
API Gateway
Load
Balancer
Circuit
Breaker
Product
MySQL
DB
Product
Microservice
With 4 node
cluster
Load
Balancer
Circuit
Breaker
UI
Layer
Web
Services
Business
Logic
Database
Layer
Customer
Redis
DB
Customer
Microservice
With 2 node
cluster
• In this model Developers write the
code in every Microservice to
download the required configuration
from a Central server (Ex. Spring
Config Server for the Java World).
• This creates an explicit dependency
order in which service to come up will
be critical.
Config Server
90

Software Network Stack Vs Network Stack
Pattern Software Stack Java Software Stack .NET Kubernetes
1 API Gateway Zuul Server SteelToe K8s Ingress / Istio Envoy
2 Service Discovery Eureka Server SteelToe Kube DNS
3 Load Balancer Ribbon Server SteelToe Istio Envoy
4 Circuit Breaker Hysterix SteelToe Istio
5 Config Server Spring Config SteelToe Secrets, Env - K8s Master
Web Site https://netflix.github.io/ https://steeltoe.io/ https://kubernetes.io/
The Developer needs to write code to integrate with the Software Stack
(Programming Language Specific. For Ex. Every microservice needs to subscribe to
Service Discovery when the Microservice boots up.
Service Discovery in Kubernetes is based on the Labels assigned to Pod and Services
and its Endpoints (IP Address) are dynamically mapped (DNS) based on the Label.
91

Let-it-Crash Design Pattern – Erlang Philosophy
• The Erlang view of the world is that everything is a process and that processes can
interact only by exchanging messages.
• A typical Erlang program might have hundreds, thousands, or even millions of processes.
• Letting processes crash is central to Erlang. It’s the equivalent of unplugging your router
and plugging it back in – as long as you can get back to a known state, this turns out to be
a very good strategy.
• To make that happen, you build supervision trees.
• A supervisor will decide how to deal with a crashed process. It will restart the process, or
possibly kill some other processes, or crash and let someone else deal with it.
• Two models of concurrency: Shared State Concurrency, & Message Passing Concurrency.
The programming world went one way (toward shared state). The Erlang community
went the other way.
• All languages such as C, Java, C++, and so on, have the notion that there is this stuff called
state and that we can change it. The moment you share something you need to bring
Mutex a Locking Mechanism.
• Erlang has no mutable data structures (that’s not quite true, but it’s true enough). No
mutable data structures = No locks. No mutable data structures = Easy to parallelize.
92

Let-it-Crash Design Pattern
1. The idea of Messages as the first class citizens of a system, has been
rediscovered by the Event Sourcing / CQRS community, along with a strong
focus on domain models.
2. Event Sourced Aggregates are a way to Model the Processes and NOT things.
3. Each component MUST tolerate a crash and restart at any point in time.
4. All interaction between the components must tolerate that peers can crash.
This mean ubiquitous use of timeouts and Circuit Breaker.
5. Each component must be strongly encapsulated so that failures are fully
contained and cannot spread.
6. All requests sent to a component MUST be self describing as is practical so
that processing can resume with a little recovery cost as possible after a
restart.
93

Let-it-Crash : Comparison Erlang Vs. Microservices Vs. Monolithic Apps
Erlang Philosophy Micro Services Architecture Monolithic Apps (Java, C++, C#, Node JS ...)
1 Perspective
Everything is a
Process
Event Sourced Aggregates are a way to
model the Process and NOT things.
Things (defined as Objects) and
Behaviors
2
Crash
Recovery
Supervisor will
decide how to
handle the
crashed process
Kubernetes Manager monitors all the
Pods (Microservices) and its Readiness
and Health. K8s terminates the Pod if
the health is bad and spawns a new
Pod. Circuit Breaker Pattern is used
handle the fallback mechanism.
Not available. Most of the monolithic
Apps are Stateful and Crash Recovery
needs to be handled manually and all
languages other than Erlang focuses
on defensive programming.
3 Concurrency
Message Passing
Concurrency
Domain Events for state changes within
a Bounded Context & Integration Events
for external Systems.
Mostly Shared State Concurrency
4 State
Stateless :
Mostly Immutable
Structures
Immutability is handled thru Event
Sourcing along with Domain Events and
Integration Events.
Predominantly Stateful with Mutable
structures and Mutex as a Locking
Mechanism
5 Citizen Messages
Messages are 1st class citizen by Event
Sourcing / CQRS pattern with a strong
focus on Domain Models
Mutable Objects and Strong focus on
Domain Models and synchronous
communication.
94

Summary
Setup
1. Setting up Kubernetes Cluster
• 1 Master and
• 2 Worker nodes
Getting Started
1. Create Pods
2. Create ReplicaSets
3. Create Deployments
4. Rollouts and Rollbacks
5. Create Service
6. Create Ingress
7. App Auto Scaling
App Setup
1. Secrets
2. Environments
3. ConfigMap
4. PodPresets
On Premise Setup
1. Setting up External Load
Balancer using Metal LB
2. Setting up nginx Ingress
Controller
1. API Gateway
2. Service Discovery
3. Load Balancer
4. Config Server
5. Circuit Breaker
6. Service Aggregator Pattern
7. Let It Crash Pattern
95

Kubernetes Pods
Advanced
• Jobs / Cron Jobs
• Quality of Service: Resource Quota and Limits
• Pod Disruption Range
• Pod / Node Affinity
• Daemon Set
• Container Level features
96

Kubernetes Pod Quality of Service
Source: https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/
QoS:
Guaranteed
Memory limit =
Memory Request
CPU Limit =
CPU Request
QoS:
Burstable
!= Guaranteed
and
Has either
Memory OR
CPU Request
QoS:
Best Effort
No
Memory OR
CPU Request /
limits
97

A probe is an indicator to a container's health. It judges the
health through periodically performing a diagnostic action
against a container via kubelet:
• Liveness probe: Indicates whether a container is alive or
not. If a container fails on this probe, kubelet kills it and
may restart it based on the restartPolicy of a pod.
• Readiness probe: Indicates whether a container is ready
for incoming traffic. If a pod behind a service is not
ready, its endpoint won't be created until the pod is
ready.
Kubernetes Pod in Depth
3 kinds of action handlers can be configured to perform
against a container:
exec: Executes a defined command inside the container.
Considered to be successful if the exit code is 0.
tcpSocket: Tests a given port via TCP, successful if the port
is opened.
httpGet: Performs an HTTP GET to the IP address of target
container. Headers in the request to be sent is
customizable. This check is considered to be healthy if the
status code satisfies: 400 > CODE >= 200.
Additionally, there are five parameters to define a probe's behavior:
initialDelaySeconds: How long kubelet should be waiting for before the first probing.
successThreshold: A container is considered to be healthy when getting consecutive times of probing successes
passed this threshold.
failureThreshold: Same as preceding but defines the negative side.
timeoutSeconds: The time limitation of a single probe action.
periodSeconds: Intervals between probe actions. Source: https://github.com/meta-magic/kubernetes_workshop
98

A job creates one or more pods and ensures that a
specified number of them successfully terminate.
As pods successfully complete, the job tracks the
successful completions. When a specified number
of successful completions is reached, the job itself
is complete. Deleting a Job will cleanup the pods it
created.
A simple case is to create one Job object in order to
reliably run one Pod to completion. The Job object
will start a new Pod if the first pod fails or is deleted
(for example due to a node hardware failure or a
node reboot).
A Job can also be used to run multiple pods in
parallel.
Kubernetes Jobs
Source: https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/
Command is wrapped for display purpose.
99

Kubernetes Cron Jobs
Source: https://kubernetes.io/docs/tasks/job/automated-tasks-with-cron-jobs//
Command is wrapped for display purpose.
You can use CronJobs to run jobs on a time-
based schedule. These automated jobs run
like Cron tasks on a Linux or UNIX system.
Cron jobs are useful for creating periodic and
recurring tasks, like running backups or sending
emails. Cron jobs can also schedule individual
tasks for a specific time, such as if you want to
schedule a job for a low activity period
100

• A resource quota, defined by a Resource
Quota object, provides constraints that
limit aggregate resource consumption per
namespace.
• It can limit the quantity of objects that can
be created in a namespace by type, as well
as the total amount of compute resources
that may be consumed by resources in
that project.
Kubernetes Resource Quotas
Source: https://kubernetes.io/docs/concepts/policy/resource-quotas/
101

• Limits specifies the Max resource a Pod
can have.
• If there is NO limit is defined, Pod will
be able to consume more resources
than requests. However, the eviction
chances of Pod is very high if other Pods
with Requests and Resource Limits are
defined.
Kubernetes Limit Range

• Liveness probe: Indicates
whether a container is alive
or not. If a container fails on
this probe, kubelet kills it
and may restart it based on
the restartPolicy of a pod.
Kubernetes
Pod Liveness Probe
Source: https://kubernetes.io/docs/tasks/configure-pod-
container/configure-liveness-readiness-probes/

• A PDB limits the number pods
of a replicated application that
are down simultaneously from
voluntary disruptions.
• Cluster managers and hosting
providers should use tools
which respect Pod Disruption
Budgets by calling the Eviction
API instead of directly deleting
pods.
Kubernetes Pod Disruption Range
Source: https://kubernetes.io/docs/tasks/run-application/configure-pdb/
$ kubectl drain NODE [options]

• You can constrain a pod to only be
able to run on particular nodes or
to prefer to run on particular
nodes. There are several ways to
do this, and they all use label
selectors to make the selection.
• Assign the label to Node
• Assign Node Selector to a Pod
Kubernetes Pod/Node Affinity / Anti-Affinity
Source: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
$ kubectl label nodes k8s.node1 disktype=ssd

Kubernetes Pod Configuration
Source: https://kubernetes.io/docs/user-journeys/users/application-developer/advanced/
Pod configuration
You use labels and annotations to attach metadata to your resources. To inject data into your
resources, you’d likely create ConfigMaps (for non-confidential data) or Secrets (for confidential data).
Taints and Tolerations - These provide a way for nodes to “attract” or “repel” your Pods. They are often
used when an application needs to be deployed onto specific hardware, such as GPUs for scientific
computing. Read more.
Pod Presets - Normally, to mount runtime requirements (such as environmental variables, ConfigMaps,
and Secrets) into a resource, you specify them in the resource’s configuration file. PodPresets allow you
to dynamically inject these requirements instead, when the resource is created. For instance, this
allows team A to mount any number of new Secrets into the resources created by teams B and C,
without requiring action from B and C.

Kubernetes DaemonSet
A DaemonSet ensures that all (or some) Nodes run a copy of a
Pod. As nodes are added to the cluster, Pods are added to them.
As nodes are removed from the cluster, those Pods are garbage
collected. Deleting a DaemonSet will clean up the Pods it created.
Some typical uses of a DaemonSet are:
• running a cluster storage daemon, such as glusterd, ceph, on
each node.
• running a logs collection daemon on every node, such
as fluentd or logstash.
• running a node monitoring daemon on every node, such
as Prometheus Node Exporter, collectd, Dynatrace OneAgent,
Datadog agent, New Relic agent, Ganglia gmond or Instana
agent.
107

Container-level features
Sidecar container: Although your Pod should still have a single main
container, you can add a secondary container that acts as a helper
(see a logging example). Two containers within a single Pod can
communicate via a shared volume.
Init containers: Init containers run before any of a Pod’s app
containers (such as main and sidecar containers)
Kubernetes Container Level Features
Source: https://kubernetes.io/docs/user-journeys/users/application-developer/advanced/
108

Kubernetes Volumes
• In-Tree and Out-Tree Volume Plugins
• Container Storage Interface – Components
• CSI – Volume Life Cycle
• Persistent Volume
• Persistent Volume Claims
• Storage Class
• Volume Snapshot
109

Kubernetes Workload Portability
Goals
1. Abstract away Infrastructure
Details
2. Decouple the App Deployment
from Infrastructure (On-Premise
or Cloud)
To help Developers
1. Write Once, Run Anywhere
(Workload Portability)
2. Avoid Vendor Lock-In
Cloud
On-Premise
110

K8s Volume Plugin – History
In-Tree Volume Plugins
• First set of Volume plugins with K8s.
• They are linked and compiled and
shipped with K8s releases.
• They were part of Core K8s libraries.
• Volume Driver Development is
tightly coupled with K8s releases.
• Bugs in the Volume Driver crashes
critical K8s components.
• Deprecated since K8s v1.8
Out-of-Tree Volume Plugins
• Flex Volume Driver
• Executable Binaries
• Worker Node communicates
with binaries in CLI.
• Need to access the Root File
System of the Worker Node
• Dependency issues
• CSI – Container Storage Interface
• Address the pain points of Flex
Volume Driver
111

Container Storage Interface
Source:https://blogs.vmware.com/cloudnative/2019/04/18/supercharging-kubernetes-storage-with-csi/
o CSI Spec is Container Orchestrator (CO) neutral
o Uses gRPC for inter-process communication
o Runs Outside CO Processes.
o CSI is control plane only Specs.
o Identity: Identity and capability of the Driver
o Controller: Volume operations such as
provisioning and attachment.
o Node: Mount / unmount ops must be executed
on the node where the volume is needed.
o Identity and Node are mandatory requirement
for the driver implementation.
Container Orchestrator (CO)
Cloud Foundry, Docker, Kubernetes,
Mesos
CSI
Driver
gRPC
Volume
Access
Storage API
Storage
System
112

CSI – Components – 3 gRPC Services on UDS
Controller Service
• Create Volume
• Delete Volume
• List Volume
• Controller Publish Volume
• Controller Unpublish Volume
• Validate Volume Capabilities
• Get Capacity
• Create Snapshot
• Delete Snapshot
• List Snapshots
• Controller Get Capabilities
Node Service
• Node Stage Volume
• Node Unstage Volume
• Node Publish Volume
• Node Unpublish Volume
• Node Get Volume Stats
• Node Get Info
• Node Get Capabilities
Identity Service
• Get Plugin Info
• Get Plugin Properties
• Probe (Probe Request)
Unix Domain Socket
113

StatefulSet Pod
Provisioner CSI
Driver
Attacher
Storage
System
Kubernetes & CSI Drivers
DaemonSet Pod
Registrar CSI
Driver
Kubelet
Worker Node
Master
API Server
etcd
gRPC
gRPC
gRPC
gRPC
Node Service
Identity Service
Controller Service
114

CSI – Volume Life cycle
Controller Service Node Service
CreateVolume ControllerPublishVolume NodeStageVolume
NodeUnStageVolume
NodePublishVolume
NodeUnPublishVolume
DeleteVolume ControllerUnPublishVolume
CREATED NODE_READY VOL_READY PUBLISHED
Volume Created Volume available for use Volume initialized in the
Node. One-time activity.
Volume attached to the Pod
115

Container Storage Interface Adoption
Container
Orchestrator
CO Version CSI Version
Kubernetes
1.10 0.2
1.13 0.3, 1.0
OpenShift 3.11 0.2
Mesos 1.6 0.2
Cloud Foundry 2.5 0.3
PKS 1.4 1.0
116

CSI – Drivers
Name
CSI Production Name
Provisioner
Ver Persistence Access Mode
Dynamic
Provisioning
Raw Block
Support
Volume
Snapshot
1 AWS EBS ebs.csi.aws.com v0.3, v1.0 Yes RW Single Pod Yes Yes Yes
2 AWS EFS efs.csi.aws.com v0.3 Yes RW Multi Pod No No No
3 Azure Disk disk.csi.azure.com v0.3, v1.0 Yes RW Single Pod Yes No No
4 Azure File file.csi.azure.com v0.3, v1.0 Yes RW Multi Pod Yes No No
5 CephFS cephfs.csi.ceph.com v0.3, v1.0 Yes RW Multi Pod Yes No No
6 Ceph RBD rbd.csi.ceph.com v0.3, v1.0 Yes RW Single Pod Yes Yes Yes
7 GCE PD pd.csi.storage.gke.io v0.3, v1.0 Yes RW Single Pod Yes No Yes
8 Nutanix Vol com.nutanix.csi v0.3, v1.0 Yes RW Single Pod Yes No No
9 Nutanix Files com.nutanix.csi v0.3, v1.0 Yes RW Multi Pod Yes No No
10 Portworx pxd.openstorage.org v0.3, v1.1 Yes RW Multi Pod Yes No Yes
Source: https://kubernetes-csi.github.io/docs/drivers.html
117

Kubernetes Volume Types
Host Based
o EmptyDir
o HostPath
o Local
Block Storage
o Amazon EBS
o OpenStack Cinder
o GCE Persistent Disk
o Azure Disk
o vSphere Volume
Others
o iScsi
o Flocker
o Git Repo
o Quobyte
Distributed File System
o NFS
o Ceph
o Gluster
o FlexVolume
o PortworxVolume
o Amazon EFS
o Azure File System
Life cycle of a
Persistent Volume
o Provisioning
o Binding
o Using
o Releasing
o Reclaiming

Ephemeral Storage
Volume Plugin: EmptyDir
o Scratch Space (Temporary) from the
Host Machine.
o Data exits only for the Life Cycle of
the Pod.
o Containers in the Pod can R/W to
mounted path.
o Can ONLY be referenced in-line from
the Pod.
o Can’t be referenced via Persistent
Volume or Claim.
119

Remote Storage Block Storage
o Amazon EBS
o OpenStack Cinder
o GCE Persistent Disk
o Azure Disk
o vSphere Volume
Distributed File System
o NFS
o Ceph
o Gluster
o FlexVolume
o PortworxVolume
o Amazon EFS
o Azure File System
o Remote Storage attached to the
Pod based on the requirement.
o Data persists beyond the life
cycle of the Pod.
o Two Types of Remote Storage
o Block Storage
o File System
o Referenced in the Pod either in-
line or PV/PVC
120

Remote Storage
Kubernetes will do the
following Automatically.
o Kubernetes will attach the
Remote (Block or FS)
Volume to the Node.
o Kubernetes will mount the
volume to the Pod.
This is NOT recommended because it breaks the
Kubernetes principle of workload portability.
121

Deployment and StatefulSet
Source: https://cloud.google.com/kubernetes-engine/docs/concepts/persistent-volumes#deployments_vs_statefulsets
Deployment
Kind: Deployment
• All Replicas of the Deployment share
the same Persistent volume Claim.
• ReadWriteOnce Volumes are NOT
recommended even with ReplicaSet 1
as it can fail or get into a deadlock
(when the Pod goes down and Master
tries to bring another Pod).
• Volumes with ReadOnlyMany &
ReadWriteMany are the best modes.
• Deployments are used for Stateless
Apps
For
Stateful
Apps
StatefulSet
Kind: StatefulSet
• StatefulSet is recommended for App
that need a unique volume per
ReplicaSet.
• ReadWriteOnce should be used with a
StatefulSet. RWO will create a unique
volume per ReplicaSet.
122

Node 3
Node 2
Deployment and StatefulSet
Storage GCE PD
Node 1
D Service1 Pod1
D Service1 Pod2
D Service1 Pod3
Test Case 1
Kind Deployment
Replica 3
Provisioning Storage Class
Volume GCE PD
Volume Type File System
Access Mode ReadWriteOnce (RWO)
Storage NFS
Node 1
D Service1 Pod1
D Service1 Pod2
D Service1 Pod3
Test Case 2
Kind Deployment
Replica 3
Provisioning Persistent Volume
Volume NFS
Access Mode RWX, ReadOnlyMany
Node 3
Node 2
Storage GCE PD
Node 1
S Service2 Pod1
Test Case 3
Kind StatefulSet
Replica 3
Provisioning Storage Class
Volume GCE PD
Access Mode ReadWriteOnce (RWO)
S Service2 Pod2
S Service2 Pod3
Node 3
Node 2
Storage NFS
Node 1
S Service2 Pod1
Test Case 4
Kind StatefulSet
Replica 3
Provisioning Persistent Volume
Volume NFS
Access Mode ReadWriteMany (RWX)
S Service2 Pod2
S Service2 Pod3
Mounted Storage System Mounted Storage System (Shared Drive) Mounted Storage System Mounted Storage System (Shared Drive)
Error Creating Pod
GCE – PD – 10 GB Storage GCE – PD – 10 GB Storage
Source: https://github.com/meta-magic/kubernetes_workshop/tree/master/yaml/volume-nfs-gcppd-scenarios
S1 S3
S2
S3
S2
S1
123

Node 3
Node 2
Deployment/StatefulSet – NFS Shared Disk – 4 PV & 4 PVC
Storage NFS
Node 1
D Service2 Pod1
D Service2 Pod2
D Service2 Pod3
Test Case 6
Kind Deployment
Replica 3
PVC pvc-3gb-disk
Volume NFS
Volume Type File System (ext4)
Node 3
Node 2
Storage NFS
Node 1
S Service4 Pod1
Test Case 8
Kind StatefulSet
Replica 3
PVC pvc-1gb-disk
Volume NFS
S Service4 Pod2
S Service4 Pod3
Mounted Storage System (Shared Drive) Mounted Storage System (Shared Drive)
Node 3
Node 2
Storage NFS
Node 1
D Service1 Pod1
D Service1 Pod2
D Service1 Pod3
Test Case 5
Kind Deployment
Replica 3
PVC pvc-2gb-disk
Volume NFS
Mounted Storage System (Shared Drive)
Node 3
Node 2
Storage NFS
Node 1
D Service3 Pod1
D Service3 Pod2
D Service3 Pod3
Test Case 7
Kind Deployment
Replica 3
PVC pvc-4gb-disk
Volume NFS
Mounted Storage System (Shared Drive)
GCE – PD – 2 GB Storage GCE – PD – 3 GB Storage GCE – PD – 4 GB Storage GCE – PD – 1 GB Storage
Source: https://github.com/meta-magic/kubernetes_workshop/tree/master/yaml/volume-nfs-gcppd-scenarios
PV, PVC mapping is 1:1
124

Volume Plugin: ReadWriteOnce, ReadOnlyMany, ReadWriteMany
Volume Plugin Kind: Deployment Kind: StatefulSet ReadWriteOnce ReadOnlyMany ReadWriteMany
AWS EBS Yes ✓ - -
AzureFile Yes Yes ✓ ✓ ✓
AzureDisk Yes ✓ - -
CephFS Yes Yes ✓ ✓ ✓
Cinder Yes ✓ - -
CSI depends on the driver depends on the driver depends on the driver
FC Yes Yes ✓ ✓ -
Flexvolume Yes Yes ✓ ✓ depends on the driver
Flocker Yes ✓ - -
GCEPersistentDisk Yes Yes ✓ ✓ -
Glusterfs Yes Yes ✓ ✓ ✓
HostPath Yes ✓ - -
iSCSI Yes Yes ✓ ✓ -
Quobyte Yes Yes ✓ ✓ ✓
NFS Yes Yes ✓ ✓ ✓
RBD Yes Yes ✓ ✓ -
VsphereVolume Yes ✓ - - (works when pods are collocated)
PortworxVolume Yes Yes ✓ - ✓
ScaleIO Yes Yes ✓ ✓ -
StorageOS Yes ✓ - -
Source: https://kubernetes.io/docs/concepts/storage/persistent-volumes/
125

Kubernetes Volumes for Stateful Pods
Provision
Network
Storage
Static / Dynamic
1
Request
Storage
2
Use
Storage
3
Static: Persistent Volume
Dynamic: Storage Class
Persistent Volume Claim
Claims are mounted
as Volumes inside the
Pod
126

Storage Class, PV, PVC and Pods
Physical Storage
AWS: EBS, EFS
GCP: PD
Azure: Disk
NFS: Path, Server
Dynamic
Storage Class
Static
Persistent Volume
Persistent Volume Claims
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
storageClassName:
csi-hp-sc
Pod
spec:
volumes
- name: my-csi-v
persisitentVolumeClaim
claimName: my-csi-pvc
Ref: https://rancher.com/blog/2018/2018-09-20-unexpected-kubernetes-part-1/.
127

Kubernetes Volume
Volume
• A Persistent Volume is the
physical storage available.
• Storage Class is used to configure
custom Storage option (nfs, cloud
storage) in the cluster. They are
the foundation of Dynamic
Provisioning.
• Persistent Volume Claim is used
to mount the required storage
into the Pod.
• ReadOnlyMany: Can be
mounted as read-only by many
nodes
• ReadWriteOnce: Can be
mounted as read-write by a
single node
• ReadWriteMany: Can be
mounted as read-write by many
nodes
Access Mode
Source: https://kubernetes.io/docs/concepts/storage/persistent-volumes/#claims-as-volumes
Persistent
Volume
Persistent
Volume Claim
Storage Class
Volume Mode
• There are two modes
• File System and or
• raw Storage Block.
• Default is File System.
Retain: The volume will need to
be reclaimed manually
Delete: The associated storage
asset, such as AWS EBS, GCE PD,
Azure disk, or OpenStack Cinder
volume, is deleted
Recycle: Delete content only (rm
-rf /volume/*) - Deprecated
Reclaim Policy
128

Kubernetes Persistent Volume – AWS EBS
• Use a Network File System or Block Storage for Pods to access
and data from multiple sources. AWS EBS is such a storage
system.
• A Volume is created and its linked with a storage provider. In
the following example the storage provider is AWS for the
EBS.
• Any PVC (Persistent Volume Claim) will be bound to the
Persistent Volume which matches the storage class.
1
Volume ID is auto generated
$ aws ec2 create-volume - -size 100
Storage class is mainly
meant for dynamic
provisioning of the
persistent volumes.
Persistent Volume is not
bound to any specific
namespace.

Persistent Volume – AWS EBS
Pod Access storage by issuing a
Persistent Volume Claim.
In the following example Pod
claims for 2Gi Disk space from
the network on AWS EBS.
• Manual Provisioning of
the AWS EBS supports
ReadWriteMany,
However all the pods
are getting scheduled
into a Single Node.
• For Dynamic
Provisioning use
ReadWriteOnce.
• Google Compute Engine
also doesn't support
ReadWriteMany for
dynamic provisioning.
2
3
https://cloud.google.com/kubernetes-engine/docs/concepts/persistent-volumes
Source:
130

Kubernetes Persistent Volume - hostPath
• HostPath option is to make the Volume available from the
Host Machine.
• A Volume is created and its linked with a storage provider. In
the following example the storage provider is Minikube for
the host path.
• Any PVC (Persistent Volume Claim) will be bound to the
Persistent Volume which matches the storage class.
• If it doesn't match a dynamic persistent volume will be
created.
Storage class is mainly
meant for dynamic
provisioning of the
persistent volumes.
Persistent Volume is not
bound to any specific
namespace.
Host Path is NOT Recommended in Production
1

Persistent Volume - hostPath
Pod Access storage by issuing a
Persistent Volume Claim.
In the following example Pod
claims for 2Gi Disk space from the
network on the host machine.
• Persistent Volume Claim
and Pods with
Deployment properties
are bound to a specific
namespace.
• Developer is focused on
the availability of
storage space using PVC
and is not bothered
about storage solutions
or provisioning.
• Ops Team will focus on
Provisioning of
Persistent Volume and
Storage class.
2
3

Persistent Volume - hostPath
Running the Yaml’s
from the Github
2
3
1
1. Create Static Persistent Volumes OR Dynamic Volumes (using Storage Class)
2. Persistent Volume Claim is created and bound static and dynamic volumes.
3. Pods refer PVC to mount volumes inside the Pod.

Kubernetes Commands
• Kubernetes Commands – Quick Help
• Kubernetes Commands – Field Selectors
134

Kubernetes Commands – Quick Help
$ kubectl create –f app-rs.yml
$ kubectl get rs/app-rs
$ kubectl get rs $ kubectl delete rs/app-rs cascade=false
$ kubectl describe rs app-rs
$ kubectl apply –f app-rs.yml Cascade=true will delete all the pods
$ kubectl get pods
$ kubectl describe pods pod-name
$ kubectl get pods -o json pod-name
$ kubectl create –f app-pod.yml
$ kubectl get pods –show-labels
$ kubectl exec pod-name ps aux
$ kubectl exec –it pod-name sh
Pods
ReplicaSet
(Declarative Model)
$ kubectl get pods –all-namespaces
$ kubectl apply –f app-pod.yml
$ kubectl replace –f app-pod.yml
$ kubectl replace –f app-rs.yml
135

Kubernetes Commands – Quick Help
$ kubectl create –f app-service.yml
$ kubectl get svc
$ kubectl describe svc app-service
$ kubectl get ep app-service
$ kubectl describe ep app-service
$ kubectl delete svc app-service
$ kubectl create –f app-deploy.yml
$ kubectl get deploy app-deploy
$ kubectl describe deploy app-deploy
$ kubectl rollout status deployment app-deploy
$ kubectl apply –f app-deploy.yml
$ kubectl rollout history deployment app-deploy
$ kubectl rollout undo deployment
app-deploy - -to-revision=1
Service
Deployment
(Declarative Model)
$ kubectl apply –f app-service.yml
$ kubectl replace –f app-service.yml
$ kubectl replace –f app-deploy.yml
136

Kubernetes Commands – Field Selectors
$ kubectl get pods --field-selector status.phase=Running Get the list of pods where status.phase = Running
Source: https://kubernetes.io/docs/concepts/overview/working-with-objects/field-selectors/
Field selectors let you select Kubernetes resources based on the value of one or
more resource fields. Here are some example field selector queries:
• metadata.name=my-service
• metadata.namespace!=default
• status.phase=Pending
Supported Operators
You can use the =, ==, and != operators with field selectors (= and == mean the
same thing). This kubectl command, for example, selects all Kubernetes Services
that aren’t in the default namespace:
$ kubectl get services --field-selector metadata.namespace!=default
137

Kubernetes Commands – Field Selectors
$ kubectl get pods --field-selector=status.phase!=Running,spec.restartPolicy=Always
Source: https://kubernetes.io/docs/concepts/overview/working-with-objects/field-selectors/
Chained Selectors
As with label and other selectors, field selectors can be chained together as a
comma-separated list. This kubectl command selects all Pods for which
the status.phase does not equal Running and the spec.restartPolicy field
equals Always:
Multiple Resource Type
You use field selectors across multiple resource types. This kubectl command
selects all Statefulsets and Services that are not in the default namespace:
$ kubectl get statefulsets,services --field-selector metadata.namespace!=default
138

K8s Packet Path
• Kubernetes Networking
• Compare Docker and Kubernetes Networking
• Pod to Pod Networking within the same Node
• Pod to Pod Networking across the Node
• Pod to Service Networking
• Ingress - Internet to Service Networking
• Egress – Pod to Internet Networking
139
3

Kubernetes Networking
Mandatory requirements for Network implementation
1. All Pods can communicate with All other Pods
without using Network Address Translation
(NAT).
2. All Nodes can communicate with all the Pods
without NAT.
3. The IP that is assigned to a Pod is the same IP the
Pod sees itself as well as all other Pods in the
cluster. Source: https://github.com/meta-magic/kubernetes_workshop
140

Docker Networking Vs. Kubernetes Networking
Container 1
172.17.3.2
Web Server 8080
Veth: eth0
Container 2
172.17.3.3
Microservice 9002
Veth: eth0
Container 3
172.17.3.4
Microservice 9003
Veth: eth0
Container 4
172.17.3.5
Microservice 9004
Veth: eth0
IP tables rules
eth0
10.130.1.101/24
Node 1
Container 1
172.17.3.2
Web Server 8080
Veth: eth0
Container 2
172.17.3.3
Microservice 9002
Veth: eth0
Container 3
172.17.3.4
Microservice 9003
Veth: eth0
Container 4
172.17.3.5
Microservice 9004
Veth: eth0
IP tables rules
eth0
10.130.1.102/24
Node 2
Pod 1
172.17.3.2
Web Server 8080
Veth: eth0
Pod 2
172.17.3.3
Microservice 9002
Veth: eth0
Pod 3
172.17.3.4
Microservice 9003
Veth: eth0
Pod 4
172.17.3.5
Microservice 9004
Veth: eth0
IP tables rules
eth0
10.130.1.101/24
Node 1
L2 Bridge 172.17.3.1/16
Same IP Range. NAT Required Uniq IP Range. netFilter, IP Tables / IPVS. No NAT required
Pod 1
172.17.3.6
Web Server 8080
Veth: eth0
Pod 2
172.17.3.7
Microservice 9002
Veth: eth0
Pod 3
172.17.3.8
Microservice 9003
Veth: eth0
Pod 4
172.17.3.9
Microservice 9004
Veth: eth0
IP tables rules
eth0
10.130.1.102/24
Node 2
L2 Bridge 172.17.3.1/16
141

3 Networks
Networks
1. Physical Network
2. Pod Network
3. Service Network
CIDR Range (RFC 1918)
1. 10.0.0.0/8
2. 172.0.0.0/11
3. 192.168.0.0/16
Keep the Address ranges separate – Best Practices
RFC 1918
1. Class A
2. Class B
3. Class C
142

3 Networks
eth0 10.130.1.102/24
Node 1
veth0
eth0
Pod 1
Container 1
172.17.4.1
eth0
Pod 2
Container 1
172.17.4.2
veth1
eth0
10.130.1.103/24
Node 2
veth1
eth0
Pod 1
Container 1
172.17.5.1
eth0
10.130.1.104/24
Node 3
veth1
eth0
Pod 1
Container 1
172.17.6.1
Service
EP EP EP
VIP
192.168.1.2/16
1. Physical Network
2. Pod Network
3. Service Network
End Points
handles
dynamic IP
Addresses of
the Pods
selected by a
Service based
on Pod Labels
Virtual IP doesn’t have any
physical network card or
system attached.
143

Kubernetes: Pod to Pod Networking inside a Node
By Default Linux has a Single Namespace and all the process in
the namespace share the Network Stack. If you create a new
namespace then all the process running in that namespace will
have its own Network Stack, Routes, Firewall Rules etc.
$ ip netns add namespace1
A mount point for namespace1 is created under /var/run/netns
Create Namespace
$ ip netns List Namespace
eth0 10.130.1.101/24
Node 1
Root NW Namespace
L2 Bridge 10.17.3.1/16
veth0 veth1
Forwarding
Tables
Bridge
implements
ARP
to
discover
link-
layer
MAC
Address
eth0
Container 1
10.17.3.2
Pod 1
Container 2
10.17.3.2
eth0
Pod 2
Container 1
10.17.3.3
1. Pod 1 sends packet to eth0 – eth0 is connected to
veth0
2. Bridge resolves the Destination with ARP protocol and
3. Bridge sends the packet to veth1
4. veth1 forwards the packet directly to Pod 2 thru eth0
of the Pod 2
1
2
4
3
This entire communication happens in localhost. So, Data
transfer speed will NOT be affected by Ethernet card speed.
Kube Proxy
144

eth0 10.130.1.102/24
Node 2
Root NW Namespace
L2 Bridge 10.17.4.1/16
veth0
Kubernetes: Pod to Pod Networking Across Node
eth0 10.130.1.101/24
Node 1
Root NW Namespace
L2 Bridge 10.17.3.1/16
veth0 veth1
Forwarding
Tables
eth0
Container 1
10.17.3.2
Pod 1
Container 2
10.17.3.2
eth0
Pod 2
Container 1
10.17.3.3
1. Pod 1 sends packet to eth0 –
eth0 is connected to veth0
2. Bridge will try to resolve the
Destination with ARP protocol
and ARP will fail because there
is no device connected to that
IP.
3. On Failure Bridge will send the
packet to eth0 of the Node 1.
4. At this point packet leaves eth0
and enters the Network and
network routes the packet to
Node 2.
5. Packet enters the Root
namespace and routed to the
L2 Bridge.
6. veth0 forwards the packet to
eth0 of Pod 3
1
2
4
3
eth0
Pod 3
Container 1
10.17.4.1
5
6
Kube Proxy
Kube Proxy
Src-IP:Port: Pod1:17711 – Dst-IP:Port: Pod3:80
145

eth0 10.130.1.102/24
Node 2
Root NW Namespace
L2 Bridge 10.17.4.1/16
veth0
Kubernetes: Pod to Service to Pod – Load Balancer
eth0 10.130.1.101/24
Node 1
Root NW Namespace
L2 Bridge 10.17.3.1/16
veth0 veth1
Forwarding
Tables
eth0
Container 1
10.17.3.2
Pod 1
Container 2
10.17.3.2
eth0
Pod 2
Container 1
10.17.3.3
1. Pod 1 sends packet to eth0 – eth0 is
connected to veth0
2. Bridge will try to resolve the Destination
with ARP protocol and ARP will fail
because there is no device connected to
that IP.
3. On Failure Bridge will give the packet to
Kube Proxy
4. it goes thru ip tables rules installed by
Kube Proxy and rewrites the Dst-IP with
Pod3-IP. IPVS has done the Cluster load
Balancing directly on the node and
packet is given to eth0 of the Node1.
5. Now packet leaves Node 1 eth0 and
enters the Network and network routes
the packet to Node 2.
6. Packet enters the Root namespace and
routed to the L2 Bridge.
7. veth0 forwards the packet to eth0 of
Pod 3
1
2
4
3
eth0
Pod 3
Container 1
10.17.4.1
5
6
Kube Proxy
Kube Proxy
7
SrcIP:Port: Pod1:17711 – Dst-IP:Port: Service1:80 Src-IP:Port: Pod1:17711 – Dst-IP:Port: Pod3:80
Order Payments
146

eth0 10.130.1.102/24
Node 2
Root NW Namespace
L2 Bridge 10.17.4.1/16
veth0
Kubernetes Pod to Service to Pod – Return Journey
eth0 10.130.1.101/24
Node 1
Root NW Namespace
L2 Bridge 10.17.3.1/16
veth0 veth1
Forwarding
Tables
eth0
Container 1
10.17.3.2
Pod 1
Container 2
10.17.3.2
eth0
Pod 2
Container 1
10.17.3.3
1. Pod 3 receives data from Pod 1 and
sends the reply back with Source as
Pod3 and Destination as Pod1
with ARP protocol and ARP will fail
because there is no device connected to
that IP.
3. On Failure Bridge will give the packet
Node 2 eth0
4. Now packet leaves Node 2 eth0 and
enters the Network and network routes
the packet to Node 1. (Dst = Pod1)
5. it goes thru ip tables rules installed by
Kube Proxy and rewrites the Src-IP with
Service-IP. Kube Proxy gives the packet
to L2 Bridge.
6. L2 bridge makes the ARP call and hand
over the packet to veth0
Pod1
1
2
4
3
eth0
Pod 3
Container 1
10.17.4.1
5
6
Kube Proxy
Kube Proxy
7
Src-IP: Pod3:80 – Dst-IP:Port: Pod1:17711
Src-IP:Port: Service1:80– Dst-IP:Port: Pod1:17711
Order
Payments
147

eth0 10.130.1.102/24
Node X
Root NW Namespace
L2 Bridge 10.17.4.1/16
veth0
Kubernetes: Internet to Pod 1. Client Connects to App published
Domain.
2. Once the Ingress Load Balancer
receives the packet it picks a VM (K8s
Node).
3. Once inside the VM IP Tables knows
how to redirect the packet to the Pod
using internal load Balancing rules
installed into the cluster using Kube
Proxy.
4. Traffic enters Kubernetes cluster and
reaches the Node X (10.130.1.102).
5. Node X gives the packet to the L2
Bridge
6. L2 bridge makes the ARP call and hand
over the packet to veth0
Pod 8
1
2
4
3
5
6
7
Src: Client IP –
Dst: App Dst
Src: Client IP –
Dst: Pod IP
Ingress
Load
Balancer
Client /
User
Src: Client IP –
Dst: VM-IP
eth0
Pod 8
Container 1
10.17.4.1
Kube Proxy
VM
VM
VM
148

Kubernetes: Pod to Internet
eth0 10.130.1.101/24
Node 1
Root NW Namespace
L2 Bridge 10.17.3.1/16
veth0 veth1
Forwarding
Tables
eth0
Container 1
10.17.3.2
Pod 1
Container 2
10.17.3.2
eth0
Pod 2
Container 1
10.17.3.3
1. Pod 1 sends packet to eth0 – eth0 is
connected to veth0
with ARP protocol and ARP will fail because
there is no device connected to that IP.
3. On Failure Bridge will give the packet to IP
Tables
4. The Gateway will reject the Pod IP as it will
recognize only the VM IP. So, source IP is
replaced with VM-IP (NAT)
5. Packet enters the network and routed to
Internet Gateway.
6. Packet reaches the GW and it replaces the
VM-IP (internal) with an External IP.
7. Packet Reaches External Site (Google)
1
2
4
3
5
6
Kube Proxy
7
Src: Pod1 – Dst: Google Src: VM-IP –
Dst: Google
Gateway
Google
Src: Ex-IP –
Dst: Google
On the way back the packet follows the same
path and any Src IP mangling is undone, and
each layer understands VM-IP and Pod IP within
Pod Namespace.
VM
149

Kubernetes
Networking Advanced
• Kubernetes IP Network
• OSI Layer | L2 | L3 | L4 | L7 |
• IP Tables | IPVS | BGP | VXLAN
• Kubernetes DNS
• Kubernetes Proxy
• Kubernetes Load Balancer, Cluster IP, Node Port
• Kubernetes Ingress
• Kubernetes Ingress – Amazon Load Balancer
• Kubernetes Ingress – Metal LB (On Premise)
150

Kubernetes Network Requirements
1. IPAM (IP Address Management & Life
cycle Management of Network
Devices
2. Connectivity and Container Network
3. Route Advertisement
151

OSI Layers
152

Networking Glossary
Netfilter – Packet Filtering in Linux
Software that does packet filtering, NAT and other
Packet mangling
IP Tables
It allows Admin to configure the netfilter for
managing IP traffic.
ConnTrack
Conntrack is built on top of netfilter to handle
connection tracking..
IPVS – IP Virtual Server
Implements a transport layer load balancing as part
of the Linux Kernel. It’s similar to IP Tables and
based on netfilter hook function and uses hash
table for the lookup.
Border Gateway Protocol
BGP is a standardized exterior gateway protocol
designed to exchange routing and reachability
information among autonomous systems (AS) on
the Internet. The protocol is often classified as a
path vector protocol but is sometimes also classed
as a distance-vector routing protocol. Some of the
well known & mandatory attributes are AS Path,
Next Hop Origin.
L2 Bridge (Software Switch)
Network devices, called switches (or bridges) are
responsible for connecting several network links to
each other, creating a LAN. Major components of a
network switch are a set of network ports, a control
plane, a forwarding plane, and a MAC learning
database. The set of ports are used to forward traffic
between other switches and end-hosts in the
network. The control plane of a switch is typically used
to run the Spanning Tree Protocol, that calculates a
minimum spanning tree for the LAN, preventing
physical loops from crashing the network. The
forwarding plane is responsible for processing input
frames from the network ports and making a
forwarding decision on which network port or ports
the input frame is forwarded to.
153

Networking Glossary
Layer 2 Networking
Layer 2 is the Data Link Layer (OSI Mode) providing Node to
Node Data Transfer. Layer 2 deals with delivery of frames
between 2 adjacent nodes on a network. Ethernet is an Ex.
Of Layer 2 networking with MAC represented as a Sub Layer.
Flannel uses L3 with VXLAN (L2) networking.
Layer 4 Networking
Transport layer controls the reliability of a given link
through flow control.
Layer 7 Networking
Application layer networking (HTTP, FTP etc.,) This is the
closet layer to the end user. Kubernetes Ingress Controller
is a L7 Load Balancer.
Layer 3 Networking
Layer 3’s primary concern involves routing packets between
hosts on top of the layer 2 connections. IPv4, IPv6, and ICMP
are examples of Layer 3 networking protocols. Calico uses L3
networking.
VXLAN Networking
Virtual Extensible LAN used to help large cloud
deployments by encapsulating L2 Frames within UDP
Datagrams. VXLAN is similar to VLAN (which has a
limitation of 4K network IDs). VXLAN is an encapsulation
and overlay protocol that runs on top of existing Underlay
networks. VXLAN can have 16 million Network IDs.
Overlay Networking
An overlay network is a virtual, logical network built on
top of an existing network. Overlay networks are often
used to provide useful abstractions on top of existing
networks and to separate and secure different logical
networks.
Source Network Address Translation
SNAT refers to a NAT procedure that modifies the source
address of an IP Packet.
Destination Network Address Translation
DNAT refers to a NAT procedure that modifies the
Destination address of an IP Packet.
154

eth0 10.130.1.102
Node / Server 1
172.17.4.1
VSWITCH
172.17.4.1
Customer 1
Customer 2
eth0 10.130.2.187
Node / Server 2
172.17.5.1
VSWITCH
172.17.5.1
Customer 1
Customer 2
VXLAN Encapsulation
10.130.1.0/24 10.130.2.0/24
Underlay Network
VSWITCH: Virtual Switch
Switch Switch
Router
155

eth0 10.130.1.102
Node / Server 1
172.17.4.1
VSWITCH
VTEP
172.17.4.1
Customer 1
Customer 2
eth0 10.130.2.187
Node / Server 2
172.17.5.1
VSWITCH
VTEP
172.17.5.1
Customer 1
Customer 2
VXLAN Encapsulation
Overlay Network
VSWITCH: Virtual Switch. | VTEP : Virtual Tunnel End Point
VXLAN encapsulate L2 into UDP
packets tunneling using L3. This
means no specialized hardware
required. So, the Overlay networks
could be created purely in
Software.
VLAN = 4094 (2 reserved) Networks
VNI = 16 Million Networks (24-bit ID)
156

eth0 10.130.1.102
Node / Server 1
172.17.4.1
VSWITCH
VTEP
172.17.4.1
Customer 1
Customer 2
eth0 10.130.2.187
Node / Server 2
172.17.5.1
VSWITCH
VTEP
172.17.5.1
Customer 1
Customer 2
VXLAN Encapsulation
Overlay Network
ARP Broadcast ARP Broadcast
ARP Broadcast
Multicast
VSWITCH: Virtual Switch. | VTEP : Virtual Tunnel End Point
ARP Unicast
157

eth0 10.130.1.102
Node / Server 1
172.17.4.1
B1 – MAC
VSWITCH
VTEP
172.17.4.1
Y1 – MAC
Customer 1
Customer 2
eth0 10.130.2.187
Node / Server 2
172.17.5.1
B2 – MAC
VSWITCH
VTEP
172.17.5.1
Y2 – MAC
Customer 1
Customer 2
VXLAN Encapsulation
Overlay Network
Src: 172.17.4.1
Src: B1 – MAC
Dst: 172.17.5.1
Dst: B2 - MAC
Src: 10.130.1.102
Dst: 10.130.2.187
Src UDP Port: Dynamic
Dst UDP Port: 4789
VNI: 100
Src: 172.17.4.1
Src: B1 – MAC
Dst: 172.17.5.1
Dst: B2 - MAC
Src: 172.17.4.1
Src: B1 – MAC
Dst: 172.17.5.1
Dst: B2 - MAC
VSWITCH: Virtual Switch. | VTEP : Virtual Tunnel End Point | VNI : Virtual Network Identifier
158

eth0 10.130.1.102
Node / Server 1
172.17.4.1
B1 – MAC
VSWITCH
VTEP
172.17.4.1
Y1 – MAC
Customer 1
Customer 2
eth0 10.130.2.187
Node / Server 2
172.17.5.1
B2 – MAC
VSWITCH
VTEP
172.17.5.1
Y2 – MAC
Customer 1
Customer 2
VXLAN Encapsulation
Overlay Network
Src: 10.130.2.187
Dst: 10.130.1.102
Dst UDP Port: 4789
VNI: 100
Src: 172.17.5.1
Src: B2 - MAC
Dst: 172.17.4.1
Dst: B1 – MAC
Src: 172.17.5.1
Src: B2 - MAC
Dst: 172.17.4.1
Dst: B1 – MAC
Src: 172.17.5.1
Src: B2 - MAC
Dst: 172.17.4.1
Dst: B1 – MAC
159

eth0 10.130.1.102
Node / Server 1
172.17.4.1
B1 – MAC
VSWITCH
VTEP
172.17.4.1
Y1 – MAC
Customer 1
Customer 2
eth0 10.130.2.187
Node / Server 2
172.17.5.1
B2 – MAC
VSWITCH
VTEP
172.17.5.1
Y2 – MAC
Customer 1
Customer 2
VXLAN Encapsulation
Overlay Network
Src: 172.17.4.1
Src: Y1 – MAC
Dst: 172.17.5.1
Dst: Y2 - MAC
Src: 10.130.1.102
Dst: 10.130.2.187
Dst UDP Port: 4789
VNI: 200
Src: 172.17.4.1
Src: Y1 – MAC
Dst: 172.17.5.1
Dst: Y2 - MAC
Src: 172.17.4.1
Src: Y1 – MAC
Dst: 172.17.5.1
Dst: Y2 - MAC
160

eth0 10.130.1.102
Node / Server 1
172.17.4.1
B1 – MAC
VSWITCH
VTEP
172.17.4.1
Y1 – MAC
Customer 1
Customer 2
eth0 10.130.2.187
Node / Server 2
172.17.5.1
B2 – MAC
VSWITCH
VTEP
172.17.5.1
Y2 – MAC
Customer 1
Customer 2
VXLAN Encapsulation
Overlay Network
VNI: 100
VNI: 200
161

Containers Docker Kind Kubernetes Istio

Containers Docker Kind Kubernetes Istio

Empfohlen

Empfohlen

Weitere ähnliche Inhalte

Was ist angesagt?

Was ist angesagt? (20)

Ähnlich wie Containers Docker Kind Kubernetes Istio

Ähnlich wie Containers Docker Kind Kubernetes Istio (20)

Mehr von Araf Karsh Hamid

Mehr von Araf Karsh Hamid (16)

Kürzlich hochgeladen

Kürzlich hochgeladen (20)

Containers Docker Kind Kubernetes Istio

Hinweis der Redaktion