Adopting Docker for production applications and services used to be hard. You had to hand-roll a lot of the underlying infrastructure and write lots of custom code for service discovery, load balancing, orchestration, desired state, etc. Today, with the rise of open source container orchestration platforms and cloud-native offerings, it's a lot easier to get up and running.
GitHub repo for demo: https://github.com/elabor8/dockertalk
2. Agenda
• Motivations for containers
• Quick recap of Docker
• Docker: The past
• Docker: The present
• Operational concerns for adopting Docker in production
• Live Demo: Docker laptop to the cloud
• Docker platform choices
• Q&A
3. Motivations for containers
• Applications used to be:
• Nearly always monolithic
• Tightly coupled
• Slow to change
• Based on a single tech stack
• Today applications are:
• Decoupled
• Continuously changing
• Provisioned and scaled dynamically
• Cloud native
• Polyglot
4. Motivations for containers
• Challenges with modern architectures
• Supporting multiple tech stacks
• Onboarding teams/individuals
• Operational support
• Lots of moving parts - complex infrastructure
• Virtual Machines – not application centric, not portable, large and slow
• Dynamic scheduling and scaling
• Immutable infrastructure and Phoenix deployments
• Designing CI/CD pipelines
• Cloud infrastructure has made things a lot easier
5. Motivations for containers
• Enter containers…
• Lightweight virtualization in user space
• Share the underlying OS kernel
• Do not require a hypervisor
• Resource isolation and constraints
• Efficient use of resources on the host
• Not a new concept (see infographic)
• Were not really designed for application developer ease of use
• Docker changed this in 2013…
Source: Pivotal Software, Inc., “Moments in Container History” (https://content.pivotal.io/infographics/moments-in-container-history)
6. Quick recap of Docker
• Docker is a …
• … company (Docker, Inc.)
• … project (now Moby Project)
• … product/platform (Docker CE, Docker EE, etc.)
• … ecosystem (community, support, plugins, Docker Store)
• Founded in 2009 (formerly called dotCloud, Inc.)
• “Docker” was released as an open source project in 2013
• Initially a single monolithic binary; now made up of many components (runc, containerd, engine, client, etc.)
7. Quick recap of Docker
• What was missing from widespread container adoption?
• Containers were not easy to use for developers and operations
• Focus was not on simplicity and user experience
• Containers did not have a standard runtime or image format – no portability (see the Open Container Initiative)
• Standard tooling was missing (dev, ops, orchestration, registry, etc.)
• No emphasis on re-use of components (Images, Layers, Docker Hub, Store)
• No standard remote APIs
• Docker addressed many of these issues
• Docker reinforced the ideas behind microservices and 12-factor applications
• Docker supported a laptop to production development lifecycle
8. Quick recap of Docker
• Docker tools
• Docker Engine (CE/EE) – assembled from upstream Moby components
• Docker CLI
• Docker Compose
• Docker Machine
• Docker Swarm (replaced by Swarm mode, now native to Docker Engine)
• Docker Registry
• Docker Cloud
• Docker Enterprise Edition (EE)
• Docker Trusted Registry (Image Signing, Scanning, RBAC)
• Docker Universal Control Plane (UI to manage swarms, SSO, RBAC)
• Certified components and support
9. Docker: The past
• Single host
• Initially, no official multi-host support
• Use `docker run`, links, the docker0 bridge, docker-compose, etc. (see the sketch after this slide)
• Multi host
• Workarounds: ambassador pattern, services exposed on host port, legacy Swarm
• Later, Docker Networks released with overlay networking (v1.8) and Network plugin support
• No Windows support
• No official Docker support on Windows
• No native Windows containers with Docker
• No Volume management
• Workarounds: host directory mounting, Data-only containers
• Later, Volume plugin support added with 3pp support for multi-host volume management
• New `docker volume` commands added
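A rough sketch of what this single-host era looked like in practice (the `myorg/myapp` image is a placeholder; exact flag behaviour varied across early releases):

```bash
# Legacy wiring on one host: start a database, then link the app to it by name
docker run -d --name db postgres
docker run -d --name app --link db:db -p 8080:8080 myorg/myapp

# Volume workaround before volume management existed: mount a host directory
docker run -d -v /opt/pgdata:/var/lib/postgresql/data --name db2 postgres

# After the `docker volume` commands landed (v1.9+)
docker volume create --name pgdata    # --name was required on older releases
docker run -d -v pgdata:/var/lib/postgresql/data --name db3 postgres
```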
10. Docker: The past
• No Config and Secrets management
• Reliance upon environment variables, host file binding, config-only containers, or external K-V stores (see the sketch after this slide)
• No secrets management – rely on external tools
• No Automated Orchestration and Scheduling
• No Desired State Reconciliation
• No Service Discovery
• No Healthchecking
• No Load-Balancing
• Limited Security Features
• Underlying host security, e.g. SELinux
• Set Linux Kernel Capabilities on containers (division of root user actions)
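As a sketch of the two points above (image names are placeholders), capabilities could be trimmed per container, while config and secrets were typically smuggled in as environment variables:

```bash
# Drop all kernel capabilities, then add back only what the process needs
docker run -d --cap-drop=ALL --cap-add=NET_BIND_SERVICE nginx

# Config (and, riskily, secrets) passed as plain environment variables
docker run -d -e DB_HOST=db.internal -e DB_PASSWORD=changeme myorg/myapp
```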
21. Operational concerns for adopting Docker in production
• Platforms
• Choice of container scheduling and orchestration platforms
• Build vs. Buy, Hosted vs. DIY, Cloud vs. On-prem
• Linux vs. Windows, Multi-architecture (Linux + Windows)
• Base images
• Official, Custom, Minimal (e.g. alpine, scratch)
• Registry
• Self-hosted, Hosted 3pp, Docker Trusted Registry
• Logging
• SaaS (e.g. SumoLogic), Cloud native – e.g. CloudWatch Logs, DIY – ELK, Splunk, etc. (see the sketch after this slide)
• Monitoring
• SaaS (e.g. DataDog), Cloud native – e.g. CloudWatch Metrics/Alarms, DIY – Prometheus, Graphite, etc.
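As one hedged example of the cloud-native logging option, Docker's awslogs log driver can ship container output to CloudWatch Logs (the region and group names below are placeholders; the log group must already exist):

```bash
docker run -d \
  --log-driver=awslogs \
  --log-opt awslogs-region=ap-southeast-2 \
  --log-opt awslogs-group=myapp-logs \
  nginx
```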
22. Operational concerns for adopting Docker in production
• Security
• Secrets management, Image scanning and Signing, TLS
• Storage Management and Migration
• Volume drivers, Storage options
• High-Availability
• Scaling, Multi-instance, multi-region, Load-balancing, etc.
• Multi-host Networking and Service Discovery
• Deployment patterns
• CI/CD pipelines, Docker Compose to Stack, Kubernetes Deployments, single container per VM, etc. (see the sketch below)
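A minimal sketch of the Compose-to-Stack pattern, assuming a v3 docker-compose.yml and a Docker Engine with Swarm mode enabled:

```bash
docker-compose up -d                              # local development
docker swarm init                                 # one-time, on the target engine
docker stack deploy -c docker-compose.yml myapp   # same file, deployed as a swarm stack
docker stack services myapp                       # inspect the resulting services
```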
23. Live Demo
• We'll take an application from laptop to the cloud…
• …using Docker CE for AWS
24. Docker platform choices
• Docker CE/EE
• Docker Cloud (Standard mode, Swarm mode)
• Docker for Mac / Windows
• Docker for AWS / Azure / GCP
• Kubernetes – vanilla upstream (kops, kubeadm) or the Canonical Distribution of Kubernetes (conjure-up)
• DC/OS (Mesos + Marathon)
• HashiCorp Nomad (with its Docker driver)
• IBM Bluemix Container Service (Kubernetes)
• Google Container Engine (Kubernetes)
• AWS EC2 Container Service (ECS)
• AWS Elastic Beanstalk (single container – EC2, multi-container – ECS)
• Azure Container Service (DC/OS, Swarm, Kubernetes)
• Rancher (Cattle, Swarm, Kubernetes)
• Heroku (Docker container support)
• And more…
V1.0 – 18th Sep 2017 – Initial version
This is a 60-min talk (including demo and Q&A), so we can’t go into much detail about what Docker and containers are – some basic knowledge is assumed.
Objectives:
Create awareness around what is needed to run applications in Docker today
Contrast this with what it was like building applications in the early days of Docker
Demo how to deploy applications to Docker in production using a particular platform, Docker for AWS
Tools like Puppet, Chef, Ansible, etc. improved things a lot – repeatable processes, automation, infrastructure as code, desired-state configuration and reconciliation.
Still mutable state… which leads to configuration drift.
Immutable servers become a thing – baked images (AMIs, VMware templates, etc.)
VMs:
Still a lot of work, hard to do it right.
Lots of storage required, still to copy around, slow to boot.
Not application centric. Devs didn’t understand this well.
The underlying infrastructure needed to be in place/set up beforehand, with the implicit dependencies required by the application being deployed, e.g. runtime, packages, libraries, etc.
Did not support running multiple apps on the same server easily. This led to wasted resources and inefficient utilisation.
Whilst we now had immutable infra, application deployments tended to be static. Auto-scaling groups in AWS helped address this to some extent.
Workloads were still not portable though (between hypervisors, clouds, OS versions, etc.)
Resource isolation: cgroups, Linux kernel namespaces
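For illustration, Docker exposes these cgroup controls directly as run flags (`--cpus` needs Docker 1.13+; older releases used `--cpu-shares`/`--cpu-quota`):

```bash
# Memory and CPU constraints are enforced via cgroups; namespaces isolate
# the container's view of PIDs, network, mounts, etc.
docker run -d --memory=256m --cpus=0.5 nginx
```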
Docker commoditized container development and operations through focusing on user experience and simplicity.
No base images – you used Jails, chroot, or LXC and had to create your own base filesystem (debootstrap, etc.).
Networking was still difficult to set up. You needed to be an expert in Linux kernel and OS features.
Ops tool – devs didn’t know about it or prepare for it. Heroku (and similar PaaSes) started to use containers before Docker. This was hidden from developers: they relied on conventions and the platform took care of the rest. However, it only worked on the specific PaaS – no standards – though some components were extracted for reuse in other open source PaaSes, like Heroku’s buildpacks. These still supported only specific types of apps, built around the 12-factor model – not a complete solution for enterprises.
Image source: http://www.nebulaworks.com/blog/wp-content/uploads/2016/08/01-docker-container.jpg
The image format is based on layered filesystems. Each layer is immutable and has a content hash stored against it for sharing and security purposes.
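You can see the layers and their content hashes on any pulled image, e.g.:

```bash
docker pull nginx
docker history nginx                                      # one row per layer, with sizes
docker inspect --format '{{json .RootFS.Layers}}' nginx   # sha256 content hashes
```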
Docker EE
Docker EE for AWS / Azure / GCP (coming soon)
Docker EE for your datacentre
Docker Cloud
Links – deprecated in v1.9 in favour of Docker Networks (which appeared in v1.8); new linking features were then added back in v1.10
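A quick sketch of the networks-based replacement for links (the app image is a placeholder; name-based DNS resolution on user-defined networks arrived around v1.10):

```bash
docker network create appnet
docker run -d --net appnet --name db postgres
docker run -d --net appnet --name app myorg/myapp   # reaches "db" by container name
```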
No Docker Volumes (natively) – used data-only containers and config containers as an abstraction; they needed to exist on the host first and could be shared by containers – you still needed to manage backups, etc.
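The data-only container pattern looked roughly like this:

```bash
# Create a stopped container that exists only to own the volume...
docker create -v /var/lib/postgresql/data --name pgdata postgres /bin/true
# ...then share its volumes with the real container(s)
docker run -d --volumes-from pgdata --name db postgres
```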
3pp volume and network plugins (Flocker, Weave) – before plugins, these tools used to wrap or shim the Docker API
Multi-host – Consul, Registrator, Nginx, consul-template, host-port mapping; no desired state, no dynamic scheduling – early options included CoreOS
Swarm (not Swarm mode)
Compose – v2 file format – “services” – a client-side concept
Originally did not work with Swarm or Docker Networks
Updated to support legacy Swarm with multi-host deploy and Docker Networks
Consul is a tool for service discovery and configuration
Registrator is a service registry bridge for Docker
Service discovery: Consul + Registrator, Etcd, Kubernetes proxy
Load balancer/Reverse proxy: Traefik, Nginx, HAProxy
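A sketch of wiring this stack up with Docker (images and flags as per each project’s docs of the era; `-dev` mode is for demos only):

```bash
# Consul agent: 8500 = HTTP API, 8600 = DNS
docker run -d --name consul -p 8500:8500 -p 8600:8600/udp \
  consul agent -dev -client=0.0.0.0

# Registrator watches the Docker socket and registers published ports in Consul
docker run -d --name registrator --net host \
  -v /var/run/docker.sock:/tmp/docker.sock \
  gliderlabs/registrator consul://localhost:8500
```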
Static number of hosts
Pre-assign where containers will run
Use Bash or Ansible scripts to deploy
No desired-state maintenance, no dynamic scaling
Set up only after CD deployment
Rely on monitoring
Registrator was one of the first widespread instances of a service registry bridge for Docker – supporting pluggable backends like Consul, etcd, and SkyDNS 2
Consul service lookups could be done via the HTTP API or DNS.
Consul could also handle a form of load-balancing, since multiple container instances could register under the same service name.
Consul could run its own health checks against containers (specified via metadata) and would not return unhealthy instance IPs when queried.
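For example, assuming a service registered as “web” (an illustrative name), both interfaces skip failing instances:

```bash
curl 'http://localhost:8500/v1/health/service/web?passing'   # HTTP API: healthy instances only
dig @127.0.0.1 -p 8600 web.service.consul                    # DNS interface
```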
Swarm experimental rescheduling only on node failure
No multi-master (no HA)
No scaling
No healthcheck support
No auto-healing
No LB
A lot of work to set up
Not very reliable
Needed an external KV store
You set up TLS yourself
Other players: Mesos/Marathon, Kubernetes, Fleet, Rancher, CloudFoundry (not Docker)
Swarm supported spread, binpack, and random placement of containers.
It required an external discovery service to discover nodes.
This was a typical setup circa 2015 or so.
A lot of DIY heavy lifting.
Orchestration was still done via Bash / Ansible or similar tools.
Compose was still a tool for local development only.
An external load-balancer was typically configured from Consul (or etcd/SkyDNS). Traditional load balancers couldn’t dynamically update themselves; you had to use something like Confd or Consul-Template to listen for changes in the service registry, dynamically rewrite the config for Nginx/HAProxy/etc., and then reload the load-balancer (e.g. service nginx reload). This usually required baking Consul-Template/Confd into the same Docker image as HAProxy/Nginx.
This would work with static container placement (i.e. with Swarm) as well – for example, with Bash or Ansible script deployment of containers.
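A minimal Consul-Template sketch of that rewrite-and-reload loop (paths and the “web” service name are illustrative):

```bash
# Template: render one upstream entry per healthy "web" instance in Consul
cat > upstream.ctmpl <<'EOF'
upstream web {
{{- range service "web" }}
  server {{ .Address }}:{{ .Port }};
{{- end }}
}
EOF

# Watch the registry, rewrite the Nginx config, then reload Nginx on change
consul-template \
  -template "upstream.ctmpl:/etc/nginx/conf.d/upstream.conf:nginx -s reload"
```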
Better integrated security: Newer Linux Kernels, AppArmor profiles, Seccomp profiles, User Namespaces
Swarm mode: now a server-side concept
Managers use the Raft protocol to share state – one manager is the active leader; the others vote to elect a new one if it goes down.
You need a quorum of (N+1)/2 managers to elect a leader – usually 3 or 5 managers are recommended, with a max of 7. A single manager has no HA and is only for demo purposes.
The active manager makes the scheduling decisions, runs desired state reconciliation and allocates tasks to workers.
Workers just take orders from managers to run tasks and report back their status to the managers.
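Bootstrapping looks like this (the IP is a placeholder):

```bash
docker swarm init --advertise-addr 10.0.0.1   # first manager
docker swarm join-token manager               # prints the join command for more managers
docker swarm join-token worker                # ...and for workers
docker node ls                                # shows Leader/Reachable status per manager
```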
Config and Secrets are stored on the managers in the internal Raft store. Secrets are encrypted at rest and in transit.
Workers that need config or secrets get a copy over TLS – only whilst a service on the worker needs it.
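A minimal secrets sketch (names are placeholders):

```bash
echo 's3cret' | docker secret create db_password -
docker service create --name app --secret db_password myorg/myapp
# inside each task, the secret is mounted as the file /run/secrets/db_password
```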
TLS is automatically configured and rotated between all managers and workers.
You can join any Docker node into a swarm (with a secret token). All nodes in the swarm can be promoted to a manager or demoted to a worker.
Additionally, a node can be set to “Drain” where all service tasks are moved off the node – useful for managers or workers that need maintenance.
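For example (node names are placeholders):

```bash
docker node promote worker-1                      # worker -> manager
docker node demote manager-3                      # manager -> worker
docker node update --availability drain node-1    # move service tasks off for maintenance
docker node update --availability active node-1   # return the node to service
```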
Services that publish a port can be reached via that port on any node in the swarm (ingress load-balancing), even if no container for the service is running on that node.
Typically, all the manager IPs are added to an external load-balancer.
Internal to the swarm, services get automatic load-balancing and service discovery.
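A quick sketch of both behaviours:

```bash
docker service create --name web --replicas 3 --publish 8080:80 nginx
# Port 8080 now answers on every node in the swarm (routing mesh), wherever
# the three replicas land. Other services sharing an overlay network with
# "web" can reach it at http://web via the built-in VIP load-balancer.
```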
Docker EE environment on AWS (from https://github.com/aws-quickstart/quickstart-docker)
A virtual private cloud (VPC) that spans three Availability Zones and includes three public subnets.
Three Swarm controller nodes that run the DTR and UCP services.
A cluster of Swarm nodes in an Auto Scaling group, so the cluster can grow dynamically as the load on the instances increases.
Three Elastic Load Balancing (ELB) load balancers. Two of these load balancers provide inbound access to the management consoles for UCP and DTR, and the third provides inbound access to customer applications running on the Swarm nodes.
Amazon Simple Storage Service (Amazon S3) for backing up the root certificate authorities (CAs).
Windows servers in Swarm via UCP agent.
Kubernetes has more moving parts than Docker Swarm mode and a steeper learning curve.
Not everything you need is built in, unlike with Swarm.
You also need to install a CNI network plugin (Calico, Weave, Flannel, OVS, etc.)
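Roughly (the manifest URL is a placeholder – use your chosen CNI’s current one):

```bash
kubeadm init --pod-network-cidr=192.168.0.0/16     # CIDR depends on the CNI you pick
kubectl apply -f https://example.com/calico.yaml   # placeholder CNI manifest URL
kubectl get nodes                                  # nodes report Ready once networking is up
```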
Kubernetes is more widely adopted across cloud and enterprise but is more difficult to use.
It’s also not as easy for developers to use compared to Compose and Swarm mode.
Kubernetes is more ops-centric (suited to system engineers – build your own platform on top); Docker (Swarm, Compose) is more dev-centric.
Kubernetes is more for power users and has more features, like Network and Security Policies.
Docker has 3pp plugins to do some of this like Contiv from Cisco.
Also, Service meshes like Istio and Linkerd are filling in some of these gaps.
Many PaaSes build upon Docker and Kubernetes.
ECS uses AWS specific constructs and is therefore not portable.