This document provides an overview of Kubernetes and how it compares to VMware technologies. It begins with an analogy that containers are to operating systems what virtual machines are to server hardware. It then discusses how Kubernetes orchestrates multiple containers across nodes by splitting applications into smaller services. The remainder of the document discusses key Kubernetes concepts like pods, replica sets, deployments and services. It provides a mapping of how Kubernetes concepts compare to VMware concepts like vCenter and vSphere hosts. It also discusses considerations for installing Kubernetes and operating it at scale.
As a quick overview, we need to understand how containers are disrupting the status quo.
Virtual machines simplified operating systems by providing common virtual hardware that abstracted away the complexity of the underlying infrastructure. You can think of containers as abstracting operating system complexity away from the application. In other words, I can package up not only the application but all of its dependencies, regardless of the operating system it runs on.
There are plenty of websites out there that chart the trajectory of container adoption against VM adoption over time, and they show the shift coming at a rapid pace.
The driving force behind any shift like this can be traced back to the application. The container movement is happening because of new application architectures. We are hearing stories of companies deconstructing their monolithic applications to create small services that can be maintained and upgraded independently. This in turn allows a person or a team to own a particular service and be responsible for its communication and hooks into the rest of the application. It also allows experimental product sets or features to be implemented without affecting the core components. Containers exposing these services become the core construct.
I learn by using analogies: taking something I'm already familiar with and mapping it to a new idea.
Virtual machines took the constraints of physical hardware and made that hardware ubiquitous, giving the operating system virtualized hardware to run on. The operating system, in turn, owns the dependencies of the application. These dependencies could be tied to a certain version of ruby, node.js, or golang that the operating system needed to have installed. Of course, this often keeps multiple applications from running on a single VM because of version or even language dependencies. Once the dependencies are in place, the application can be deployed in a multitude of ways.
With containers, the abstraction layer moves to the operating system. The container host is your operating system, and the only dependency it requires is a container runtime. From there, your application and its dependencies are wrapped inside the container. Each container shares the kernel and its properties through the container engine, so we can have multiple containers, each running its own application with different dependencies as needed, such as golang 1.4 in one container and golang 1.12 in another.
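As a minimal sketch, you can see this side by side using the public golang images from DockerHub:

```
# Each container carries its own toolchain; the host only needs a container runtime.
docker run --rm golang:1.4 go version    # reports go1.4.x
docker run --rm golang:1.12 go version   # reports go1.12.x
```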
Before, when you needed to deploy an application, you needed a VM image as your base OS, then some sort of configuration management technique to configure the OS and install dependencies, and then you leaned on other configuration management tooling to install and run the application.
Now we can take something as simple as a Dockerfile, build our application through a series of RUN instructions, and push it to a container registry such as DockerHub, or, if you're running locally in your own datacenter, an open source project like Harbor. From there, the container host issues a docker run command that pulls the image from the registry and runs the application with all the dependencies it needs. This makes applications portable in a way that virtual machines can't match.
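As a hedged sketch of that workflow (the image name and registry path myrepo/myapp are placeholders):

```
# Build the image from the Dockerfile in the current directory
docker build -t myrepo/myapp:1.0 .

# Push it to a registry (DockerHub here; a Harbor endpoint works the same way)
docker push myrepo/myapp:1.0

# Any container host can now pull and run it, dependencies included
docker run -d myrepo/myapp:1.0
```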
When looking at what types of applications you can containerize, this chart is helpful for understanding the level of complexity involved. Going from left to right, we see the progression. In bucket 1, you may have software coming from an ISV in some form of binaries. You have no access to the source code, so you have to containerize it through trial and error. In bucket 2, you are in the same situation, except you look to the vendor to provide the best possible way of running it in a container by giving you the images to make it happen. Bucket 3 is for software that many enterprises struggle to containerize, such as .NET applications. This is getting better as time progresses, but Windows support is always tricky. Swiftly moving over to bucket 5, we have more modern types of applications that are purposely built with containers in mind.
We've talked a lot about how Docker makes this all possible. But Docker alone only gives us part of the functionality we need to successfully run at scale. If we were to take a single application that has multiple container components, it can be run, but we miss out on the higher-level pieces that give us more availability and easy scaling when needed. This is why we need an orchestrator.
Google has won the battle, and that's why we are here talking about Kubernetes. Kubernetes has emerged as the de facto container orchestrator, with every major container technology company supporting it. Kubernetes provides the ability to use the Docker container runtime but adds higher-level value such as scheduling, service discovery, scaling, resource management, and much more.
Now that we have an idea of why we need Kubernetes, let's look at the architectural components and how they translate into the vSphere environment we all know and love.
Kubernetes has two main pieces: the control plane, which is our master nodes, and the data plane, which is our worker nodes. We will take a look at each at a fairly high level.
The Kubernetes scheduler is policy-rich and topology-aware. It makes snap decisions that affect the availability, performance, and capacity of the cluster. The scheduler takes into account resource requirements, quality-of-service requirements, hardware, software, and policy constraints, affinity and anti-affinity specifications, data locality, inter-workload interference, deadlines, and so on.
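As a minimal sketch of the inputs the scheduler weighs, here is a hypothetical pod spec declaring resource requests and an anti-affinity rule (the web name, image, and sizes are all placeholders):

```
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: web
  labels:
    app: web
spec:
  containers:
  - name: web
    image: nginx
    resources:
      requests:            # the scheduler reserves this much capacity on a node
        cpu: "250m"
        memory: "128Mi"
  affinity:
    podAntiAffinity:       # prefer nodes not already running an app=web pod
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: web
          topologyKey: kubernetes.io/hostname
EOF
```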
The API server is the central communication hub. It provides REST-based services for the components to talk to one another, as well as for user interaction when deploying applications.
The Kubernetes controller manager is a service that watches the shared state of the cluster through the API server and makes changes to move the current state toward the desired state.
The cloud controller manager is a daemon that embeds the cloud-specific components shipped with Kubernetes, such as the pieces relating to AWS or vSphere.
These two controller managers make up everything needed to manage the running state.
The scheduler watches for new pods as they are requested and created, then decides where to place them.
etcd is essentially our database. It saves the current state of the cluster.
This is pretty similar to what happens in vCenter.
Instead of etcd, we have our database, which is some flavor of SQL. There is the scheduler that places VMs in certain places. There are all kinds of services built into vCenter, such as the web client, inventory, licensing, and more. The difference is that not everything communicates over a single API construct, but there is still an API available for these services as well.
The control plane of Kubernetes can scale as well. There is more complex configuration involved than this diagram shows, such as fronting the additional master nodes with a load balancer, but etcd will replicate changes across the master nodes, giving us a highly available solution.
The data plane is where our workloads run. We start with a base operating system that has a container runtime installed plus our two Kubernetes components. The kubelet is like a Kubernetes agent; it's responsible for issuing the commands on the local node that spin up pods. Each pod can contain one or more containers, and that's how our applications are packaged. The kube-proxy is exactly that, a network proxy. It can do simple or round-robin TCP, UDP, and SCTP stream forwarding across your choice of overlay networks. The kubelet is in constant communication with the API server for resource monitoring and heartbeating. This is analogous to our vSphere model: the ESXi host is not something we typically interact with. It runs workloads, but vCenter is its main source of communication and orchestration.
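A quick way to see the data plane from the control plane's point of view (the node name worker-1 is hypothetical):

```
kubectl get nodes -o wide         # lists worker nodes with their runtime and kubelet versions
kubectl describe node worker-1    # shows capacity, conditions, and the pods placed there
```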
Interacting with Kubernetes is a bit different from vSphere. Most of us are ingrained with the instinct to use the vSphere Web Client to perform everything we need. Then we learn how to use other tooling like PowerCLI to automate some tasks and gain a CLI-based control mechanism. Of course, we use vCenter as the main touch point here as well.
In Kubernetes, kubectl is a binary used on any computer to access the Kubernetes API server. It's what issues the commands to the API server that kick off application deployments. Today, 99% of the work is done through this command-line tool. Kubernetes does come with a GUI, but it's read-only and mostly used for resource consumption statistics. There are other GUIs being developed, like Scope from Weaveworks, but you will have to become comfortable with the CLI for a while.
If you're interested in what the CLI can do, I've highlighted the most common commands you will issue. Apply and create are very similar; these are what you will use most often when applying a policy or deployment to the API server. When you need more instances of an application, that's where scale comes into play. Looking to update your application? Use a rolling update to replace the pods in a fashion where there won't be any hiccups in the app. Lastly, if you need to get into a container for any reason, there is the exec command, which is similar to docker exec if you've used that in the past.
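A hedged sketch of those commands in action (the hello deployment, image names, and pod name are placeholders; for a deployment, kubectl set image is what triggers the rolling update):

```
kubectl apply -f deployment.yaml                            # apply a policy or deployment manifest
kubectl create deployment hello --image=myrepo/hello:1.0    # create the same thing imperatively
kubectl scale deployment hello --replicas=5                 # run more instances of the app
kubectl set image deployment/hello hello=myrepo/hello:1.1   # roll out a new version with no hiccups
kubectl exec -it hello-abc123 -- /bin/sh                    # shell into a running pod, like docker exec
```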
So, a final note on the architecture: the container wraps your application. The pod runs one or more containers for your application. The worker node runs the container runtime and the Kubernetes agents. The control plane holds your management components. All of this can run on top of vSphere as well.
Going down a bit deeper, we can map more components from Kubernetes to our infrastructure. Our application developer wants to provision a deployment and issues the apply command to the Kubernetes API server. The application has specific requirements for its resources and affinity policies, its security policy, how it is going to be accessed from the outside world using a load balancer, how the application's storage is managed through persistent volumes, and what application metrics are pushed out for continual monitoring.
As a vSphere admin, these can be tied back to components that exist today. The vSphere Cloud Provider within Kubernetes will help with workload placement. NSX-T can take care of network security profiles, as well as being one of the only on-premises solutions that provide Load Balancer primitives for Kubernetes. The vSphere Cloud Provider also manages where persistent volumes are stored by orchestrating all the steps needed to create, attach, and mount a VMDK to a worker node so data can be preserved after the pod's lifecycle has ended. Lastly, integrations with Wavefront and vRealize Operations can continually monitor the application and the infrastructure.
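As an illustrative sketch, a persistent volume claim like the one below is what triggers the vSphere Cloud Provider to create and attach a VMDK behind the scenes (the storage class name vsphere-standard is hypothetical and would be defined by your admin):

```
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: vsphere-standard   # hypothetical class backed by the vSphere Cloud Provider
  resources:
    requests:
      storage: 10Gi
EOF
```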
Building your own solution by selecting individual pieces is exciting, but where does the fun end?
Time spent researching integration and compatibility of components
Does the management or orchestration layer know how to interoperate with all its resources?
When an update is available, are there interdependency management matrices?
If there is a problem, where is the line of support?
What’s my organization’s level of maturity and willingness to spend time?
By contrast, a pre-assembled solution brings:
Quicker ROI
Updates and maintenance are verified by the assembler
Deterministic capabilities and feature set
Support becomes common instead of custom
Common components mean tighter integrations that develop enhanced capabilities
Easy, manufactured repeatability
A better overall user experience
Now that we know the architecture, how it maps to vSphere, and the level of difficulty in building a Kubernetes cluster on your own, let's examine the high-level constructs of deploying your first applications.
Labels are what help us tie components together. We can label particular volumes so only certain containers can access them. In addition, we can map labels out to higher levels, such as saying a load balancer needs to tie itself to the service we call Front End for multiple types of applications; in this case we have one called hello.
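In kubectl terms, labels are just key/value pairs you can attach and query (the pod name and the app=hello label are placeholders):

```
kubectl label pod hello-abc123 tier=front-end   # attach a label to a running pod
kubectl get pods -l app=hello                   # select everything carrying a label
```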
In the vSphere world we use labels as well. Probably the most notable example is storage policies: when we create a new VM, we can attach a storage policy to it that only allows datastores meeting that policy. In addition, there are tags and custom attributes that can be used by other applications.
A replica set makes sure multiple copies of an application are running. It's fairly simple to see how this functions, but you are never going to deploy a ReplicaSet, or even a pod, on its own.
That's why we look to higher-level constructs such as a deployment. The deployment takes these lower-level constructs and orchestrates the rollout based on our needs.
Then, when we need to access an application, we create services that expose pods based on the labels we created previously. These all end up tying back to each other in meaningful ways.
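As a small sketch of how these pieces tie together (the hello name and image are placeholders), the deployment below manages a replica set of three pods, and the service selects those pods purely by label:

```
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello
spec:
  replicas: 3                # the deployment creates a ReplicaSet to keep 3 pods running
  selector:
    matchLabels:
      app: hello
  template:
    metadata:
      labels:
        app: hello           # the label that ties everything together
    spec:
      containers:
      - name: hello
        image: myrepo/hello:1.0
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: front-end
spec:
  selector:
    app: hello               # the service exposes any pod carrying this label
  ports:
  - port: 80
    targetPort: 8080
EOF
```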
But we're only scratching the surface of the types of application deployments. There are far too many concepts to cover here, such as namespaces, ingress, autoscaling, and DaemonSets.