An introduction to Docker native clustering: Swarm.
Deployment and configuration, integration with Consul, for a production-like cluster serving web applications with multiple containers on multiple hosts. #dockerops
2. DockerOps 2016 #dockerops Giovanni Toraldo ~ ClouDesire.com
About me
Lead developer at ClouDesire.com
Open Source Enthusiast with SuperCow Powers
Java/PHP/whatever developer
Writer of the OpenNebula book
devops
https://twitter.com/gionn
me@gionn.net
What is ClouDesire?
Application Marketplace that helps software vendors sell and provision applications
- Web Applications:
  - provision VMs
  - on multiple cloud providers
  - deploy/upgrade the application and its dependencies as Docker containers
  - application logging
  - resource monitoring
- With multi-tenant applications/SaaS:
  - expose REST hooks and API for billing lifecycle
  - manage subscriptions, billing, pay-per-use, invoicing, payments.
History of Docker networking support
- 2014-06-09: Docker 1.0 release, standard bridges, no multi-host support
- 2015-06-16: Docker 1.7 release, experimental volume plugins, networking rewritten and released as libnetwork
- 2015-07-24: libnetwork 0.4.0 release, experimental overlay driver and network plugins
- 2015-08-04: Docker Swarm 0.4 release
- 2015-10-13: Docker Swarm 1.0 release
- 2015-11-03: Docker 1.9 release, network feature exits experimental, multi-host networking using VXLAN based overlay driver
- 2016-02-04: Docker 1.10 release, DNS based discovery
Docker without Swarm
- Independent Docker hosts
  - Chef, Puppet, Ansible?
- Manual allocation of containers on multiple nodes
  - Non-linear resource usage
- No service discovery, hardcoded configurations
  - Consul?
- Manual reaction to failures
- Unhandled container data, bound to the local node
- Only third-party OSS "schedulers" available (without simplicity in mind)
  - Google Kubernetes
  - Apache Mesos
  - Spotify Helios
  - New Relic Centurion
And then, Swarm.
- Native clustering for Docker:
  - turns a pool of Docker hosts into a single, virtual host
- Standard Docker API
  - re-use existing tools
    - docker cli
    - compose
    - dokku
    - anything else
- Pluggable schedulers
Steps to bootstrap a Swarm cluster
Bootstrapping a cluster, the practical way:
- Launch a fleet of VMs, reachable via SSH
- Docker daemon running
- reachable via TCP port
- auth with TLS certificates
- external service discovery backend required
- Bootstrap swarm-manager
- Bootstrap swarm-agent on the remaining nodes
- Use the swarm-manager API
Docker-Machine for launching VM
Machine manager (like Vagrant)
https://github.com/docker/machine
(Win/Mac: distributed in Docker Toolbox)
- Launch VM somewhere
- Install Docker
- Generate and copy certificates
  - (password-less auth)
- Enable remote access via TCP
Docker-Machine help
- active: Print which machine is active
- config: Print the connection config for machine
- create: Create a machine
- env: Display the commands to set up the environment for the Docker client
- inspect: Inspect information about a machine
- ip: Get the IP address of a machine
- kill: Kill a machine
- ls: List machines
- provision: Re-provision existing machines
- regenerate-certs: Regenerate TLS Certificates for a machine
- restart: Restart a machine
- rm: Remove a machine
- ssh: Log into or run a command on a machine with SSH.
- scp: Copy files between machines
- start: Start a machine
- status: Get the status of a machine
- stop: Stop a machine
- upgrade: Upgrade a machine to the latest version of Docker
- url: Get the URL of a machine
- version: Show the Docker Machine version or a machine docker version
Docker-Machine backends
Where can nodes run?
- Generic backend
  - existing hosts with SSH access
- Local machine (virtualization)
  - VirtualBox
  - VMware Fusion
- Cloud providers
  - Amazon
  - GCE
  - Rackspace
  - DigitalOcean
  - ...
Interaction with a Docker-Machine node
$ docker-machine env default
export DOCKER_TLS_VERIFY="1"
export DOCKER_HOST="tcp://192.168.99.100:2376"
export DOCKER_CERT_PATH="/home/gionn/.docker/machine/machines/default"
export DOCKER_MACHINE_NAME="default"
# Run this command to configure your shell:
# eval "$(docker-machine env default)"
$ docker info
Kernel Version: 4.1.17-boot2docker
Operating System: Boot2Docker 1.10.0 (TCL 6.4.1); master : b09ed60 - Thu Feb 4 20:16:08 UTC 2016
Docker-Machine for launching a swarm-master
Using the Docker Hub discovery backend (best for testing/development):
$ docker run swarm create
a62518a837ed196550ec83442901dfad
$ docker-machine create \
    -d <backend-plugin> \
    --swarm \
    --swarm-master \
    --swarm-discovery token://<token> \
    swarm-master
or manually:
$ docker run -d -p 3375:2375 -t swarm manage token://<token>
Docker-Machine for launching swarm nodes
$ docker-machine create \
    -d <backend-plugin> \
    --swarm \
    --swarm-discovery token://<token> \
    swarm-node-00
or manually:
$ docker run -d swarm join --addr=<node-ip>:2375 token://<token>
(--addr is the address of the node being joined, as the manager will reach it)
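The master and node creation steps above can be scripted for a whole cluster. The following is a minimal sketch, assuming a token already obtained with `docker run swarm create`; the helper names (`run`, `create_master`, `create_node`, `bootstrap_cluster`) and the `DRY_RUN` switch are illustrative assumptions, not part of docker-machine. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: bootstrap a Swarm cluster with docker-machine.
# DRY_RUN and all function names below are hypothetical helpers.

# Print the command when DRY_RUN=1 (the default), execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# create_master <driver> <token>
create_master() {
  run docker-machine create -d "$1" \
    --swarm --swarm-master \
    --swarm-discovery "token://$2" \
    swarm-master
}

# create_node <driver> <token> <index>
create_node() {
  run docker-machine create -d "$1" \
    --swarm \
    --swarm-discovery "token://$2" \
    "$(printf 'swarm-node-%02d' "$3")"
}

# bootstrap_cluster <driver> <token> <node-count>
bootstrap_cluster() {
  create_master "$1" "$2"
  i=0
  while [ "$i" -lt "$3" ]; do
    create_node "$1" "$2" "$i"
    i=$((i + 1))
  done
}
```

Calling `bootstrap_cluster virtualbox <token> 2` prints three `docker-machine create` commands for review; setting `DRY_RUN=0` before the call executes them instead.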
Check running machine status
$ docker-machine ls
NAME            ACTIVE   DRIVER       STATE     URL                         SWARM                   DOCKER    ERRORS
swarm-master    -        virtualbox   Running   tcp://192.168.99.101:2376   swarm-master (master)   v1.10.0
swarm-node-00   -        virtualbox   Running   tcp://192.168.99.100:2376   swarm-node-00           v1.10.0
$ eval $(docker-machine env swarm-master) && docker ps
CONTAINER ID   IMAGE          COMMAND                  CREATED      STATUS          PORTS   NAMES
b664b357e999   swarm:latest   "/swarm join --advert"   2 days ago   Up 21 minutes           swarm-agent
52ddf6fbab43   swarm:latest   "/swarm manage --tlsv"   2 days ago   Up 21 minutes           swarm-agent-master
First lap with Docker Swarm
$ docker -H 192.168.99.101:3376 ps -a
CONTAINER ID   IMAGE          COMMAND                  CREATED         STATUS         PORTS   NAMES
c4067a2f176b   swarm:latest   "/swarm join --advert"   2 minutes ago   Up 2 minutes           swarm-node-00/swarm-agent
9623e4e94771   swarm:latest   "/swarm join --advert"   7 minutes ago   Up 7 minutes           swarm-master/swarm-agent
8576ffa755c4   swarm:latest   "/swarm manage --tlsv"   7 minutes ago   Up 7 minutes           swarm-master/swarm-agent-master
- agent running on every node
- master running on a single node
Service discovery backends for production
Swarm relies on a service discovery backend to know the endpoints of all the nodes.
- Docker Hub token (ok for testing, not intended for production)
- Static file with IP:port list or range (poor man's service discovery)
- etcd
- consul
- zookeeper
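Each backend listed above is selected through the scheme of the discovery URL passed to `swarm manage` and `swarm join`. A minimal sketch of that mapping follows; `discovery_url` is a hypothetical helper written for illustration, and the addresses in the usage note are placeholders.

```shell
#!/bin/sh
# Sketch: map a discovery backend name to the URL form used by
# `swarm manage` / `swarm join`. discovery_url() is a hypothetical helper.

# discovery_url <backend> <target>
#   token                  <cluster-token>
#   file                   <path to a file listing ip:port entries>
#   etcd|consul|zookeeper  <ip:port>[/<path>]
discovery_url() {
  case "$1" in
    token)     echo "token://$2" ;;
    file)      echo "file://$2" ;;
    etcd)      echo "etcd://$2" ;;
    consul)    echo "consul://$2" ;;
    zookeeper) echo "zk://$2" ;;
    *) echo "unknown backend: $1" >&2; return 1 ;;
  esac
}
```

For example, `discovery_url consul 192.168.99.102:8500` prints `consul://192.168.99.102:8500`, which can be used in place of `token://<token>` in the commands shown earlier.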
Service discovery with Consul
Consul is a distributed, highly available key/value store and service registry, with a simple API.
https://www.hashicorp.com/
https://www.consul.io
Consul features
- Agent based
- Key-Value Store
- Services discovery backend
- Services Health Checking
- Query interfaces
  - HTTP JSON API
  - DNS
- LAN communication
- WAN replication (Multi-DC)
- UI for browsing
- Agent
  - Health Checking
  - Query interface (HTTP, DNS)
- Server
  - Data storage and replication
  - Leader election
  - Query interface (HTTP, DNS)
Bootstrap Consul cluster with Docker-Machine
Initialize new node(s):
$ docker-machine create \
    -d <backend-plugin> \
    consul-1
Prepare for launch:
$ eval $(docker-machine env consul-1)
Service discovery with Consul
Single node bootstrap:
$ docker run --net=host progrium/consul -server -bootstrap
Multiple node bootstrap:
$ docker run --net=host progrium/consul -server -bootstrap-expect 3
$ docker run --net=host progrium/consul -server -join <existing-node-ip>
https://hub.docker.com/r/progrium/consul/
Bootstrap swarm-master backed by Consul
$ docker-machine create \
    -d <backend-plugin> \
    --swarm \
    --swarm-master \
    --swarm-discovery="consul://$(docker-machine ip consul-1):8500" \
    --engine-opt="cluster-store=consul://$(docker-machine ip consul-1):8500" \
    --engine-opt="cluster-advertise=eth1:2376" \
    swarm-master
- Node information is saved in the K/V store
- The master announces itself on the network so that agents can pick it up
Highly available Swarm master backed by Consul
The master is replaced automatically when its last advertised TTL expires
$ docker-machine create \
    -d virtualbox \
    --swarm \
    --swarm-master \
    --swarm-discovery="consul://$(docker-machine ip consul-1):8500" \
    --engine-opt="cluster-store=consul://$(docker-machine ip consul-1):8500" \
    --engine-opt="cluster-advertise=eth1:2376" \
    --swarm-opt="replication=true" \
    --swarm-opt="advertise=eth1:3376" \
    swarm-master
Multi-Host networking with Overlay driver
The default bridge network allows only single-host networking.
Overlay enables multi-host networking with a software-defined network.
- K/V store is required (e.g. Consul)
- Create a network with the overlay driver
$ docker -H 192.168.99.101:3376 network create --driver overlay --subnet=10.0.9.0/24 cloudesire
- Run containers within the new network
$ docker -H 192.168.99.101:3376 run -ti --net=cloudesire busybox
Multi-Host networking with Overlay driver (2)
- Example ip addr of a container attached to an overlay network
11: eth0@if12: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue
    link/ether 02:42:0a:00:09:02 brd ff:ff:ff:ff:ff:ff
    inet 10.0.9.2/24 scope global eth0
14: eth1@if15: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue
    link/ether 02:42:ac:12:00:02 brd ff:ff:ff:ff:ff:ff
    inet 172.18.0.2/16 scope global eth1
- Multiple overlay networks can be created
- Service discovery via DNS enabled
  - Forget about using links
  - No more starting order madness
  - No more restart parties
- Additional services registered via --publish-service=service.name
  - Multiple containers exposing the same service
Swarm Manager scheduler policies
Available strategies:
- spread
  - few containers on every node
- binpack
  - most containers on few nodes
- random
  - totally cpu/memory unaware
Tip: stopped containers count towards scheduler allocation
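The difference between spread and binpack can be illustrated with a toy node picker. This is a deliberately simplified sketch, not Swarm's actual ranking: it only counts containers per node, while the real scheduler also weighs available CPU and memory. The `pick_node` function, the input format and the per-node capacity limit are assumptions made for the example.

```shell
#!/bin/sh
# Sketch: choose a node for the next container.
# pick_node() and the "capacity" limit are hypothetical; real Swarm also
# ranks nodes by available CPU and RAM.

# pick_node <spread|binpack> <capacity>
# stdin: one node per line as "<name> <container-count>"
pick_node() {
  awk -v s="$1" -v cap="$2" '
    $2 >= cap { next }                  # node is full, skip it
    best == "" ||
    (s == "spread"  && $2 < n) ||
    (s == "binpack" && $2 > n) { best = $1; n = $2 }
    END { if (best != "") print best }
  '
}
```

Given three nodes holding 3, 1 and 4 containers with a capacity of 5, `spread` selects the node with 1 container and `binpack` selects the node with 4. Following the tip above, the counts fed to such a picker should include stopped containers.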
Scheduler filters
Filters enabled by default:
- Health
  - avoid starting containers on unhealthy hosts
- Constraints
  - by node name
  - by storage driver
  - by kernel version
  - by custom labels
$ docker run -e constraint:storage==ssd mysql
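A custom label such as `storage=ssd` has to be set on the Docker engine of a node before a constraint can match it, for instance with the `--engine-label storage=ssd` option of `docker-machine create`. The sketch below mimics how a constraint expression narrows the candidate nodes; it is an illustration only, `filter_nodes` and its input format are hypothetical, and only the `==` and `!=` operators with exact values are handled.

```shell
#!/bin/sh
# Sketch: filter candidate nodes by a constraint expression.
# filter_nodes() is a hypothetical helper, not Swarm code.

# filter_nodes <key==value | key!=value>
# stdin: one node per line as "<name> <key=value> <key=value> ..."
filter_nodes() {
  expr=$1
  case $expr in
    *'!='*) key=${expr%%!=*}; want=${expr#*!=}; negate=1 ;;
    *'=='*) key=${expr%%==*}; want=${expr#*==}; negate=0 ;;
    *) echo "unsupported constraint: $expr" >&2; return 1 ;;
  esac
  while read -r name labels; do
    match=0
    for label in $labels; do
      [ "$label" = "$key=$want" ] && match=1
    done
    # keep the node when the match result agrees with the operator
    [ "$match" != "$negate" ] && echo "$name"
  done
  return 0
}
```

With nodes labeled `storage=ssd` and `storage=disk`, the expression `storage==ssd` keeps only the first one, which is where the `mysql` container from the command above would be allowed to start.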
Container filters
- affinity
  - container: prefer scheduling near an existing container
    - -e affinity:container==frontend
  - image: prefer scheduling on a node that has already pulled the image
    - -e affinity:image==redis
  - label: prefer scheduling near labeled containers
    - --label com.example.type=frontend
    - -e affinity:com.example.type==frontend
- dependency
  - --volumes-from=N -> same node where the volume resides
  - --link=N:alias -> same node as the container to link to
  - --net=container:N -> same node as the container whose network stack is shared
- port
  - avoids port clashes when launching multiple containers on the same port
What about Storage?
- Docker 1.8 introduced volume plugins
- Docker 1.9 improved the usability of volume plugins
Available plugins (no particular Swarm support required):
- Flocker (move data along with containers)
- Netshare (NFS, CIFS, AWS EFS)
- Convoy (NFS, EBS, plus snapshot support)
- GlusterFS
- https://github.com/docker/docker/blob/master/docs/extend/plugins.md
$ docker run -d --volume-driver <driver> -v <src:dst_path> <image>
Gotchas -> Roadmap
- Too simple container rescheduling on node failure
- No stateful/stateless distinction
- No rebalancing across nodes
- No Global Scheduling (same container on every node, e.g. log collector)
- No persistence of status, no shared state
  - If the master goes offline, then a node goes offline, and the master comes back, there is no way to know what was running on that node
- Scalability up to hundreds of nodes
- Lacking integration with larger platforms: Mesos, Kubernetes