4. Jérôme Petazzoni
(@jpetazzo)
Grumpy French DevOps
- Go away or I will replace you
with a very small shell script
Goal in life:
run everything in containers
- Docker-in-Docker
- VPN-in-Docker
- KVM-in-Docker
- Xorg-in-Docker
- etc
9. Packaging format
It's easier to build a container image than a
traditional distro package (.deb, .rpm...)
- Dockerfile
Application dependencies are isolated from the host
- application can use Python X even if the host has Python Y
- application can use distro X even if the host has distro Y
Downside: increased disk usage
- but disks are cheap
10. Service abstraction*
Service installation → “docker pull”
- no more dependency issues
Service start → “docker run”
- no more fiddling with language-specific wrappers
Service stop → “docker stop/kill”
- no more runaway processes
*If using a standard container format, e.g. Docker
11. Service abstraction
Service check → “docker inspect”
- no more “is this thing really running or not?”
Service reset → “docker run”
- discard copy-on-write layer to return to original state
- no more “oops, I broke it, how do I fix it?”
- doesn't work in all situations, e.g. data corruption
(but recovery is easier because creating test copies is cheap & fast)
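The service lifecycle on the two slides above can be sketched as a thin wrapper that only builds Docker command lines. This is a minimal sketch, not a definitive tool: the function names are hypothetical, the functions return argument lists instead of running anything, and the "reset" step assumes the container holds no state worth keeping.

```python
from typing import List


def docker_pull(image: str) -> List[str]:
    """Service installation: fetch the image."""
    return ["docker", "pull", image]


def docker_run(image: str, name: str) -> List[str]:
    """Service start: run the image detached under a known name."""
    return ["docker", "run", "-d", "--name", name, image]


def docker_stop(name: str) -> List[str]:
    """Service stop: stop the named container."""
    return ["docker", "stop", name]


def docker_is_running(name: str) -> List[str]:
    """Service check: prints 'true' or 'false' when executed."""
    return ["docker", "inspect", "--format", "{{.State.Running}}", name]


def docker_reset(image: str, name: str) -> List[List[str]]:
    """Service reset: drop the container (and its copy-on-write layer),
    then start a fresh one from the unchanged image."""
    return [["docker", "rm", "-f", name], docker_run(image, name)]
```

On a host with Docker installed, each list could be passed to `subprocess.run(argv, check=True)`.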
12. No overhead*
CPU performance
= native performance
Memory performance
= a few % shaved off for (optional) accounting
Network and disk I/O performance
= small overhead; can be reduced to zero
*May require tuning!
13. Immutable infrastructure
What's an immutable infrastructure?
- re-create images each time you change a line of code
- prevent (or track) modifications of running images
Why is it useful?
- no more deviant servers after manual upgrades
- no more “oops, how do we roll back?” after catastrophic upgrade
- easier security audit (inspect images at rest)
How can containers help?
- container images are easier to create and manage than VM images
14. Micro-service architecture
What's a micro-service architecture?
- decompose big application into many small services
Why is it useful?
- it's easier to upgrade/refactor/replace a small service
- encourages having many small teams*, each owning a service
(*small teams are supposedly better, see Jeff Bezos “two-pizza rule”)
How can containers help?
- problem: 10 micro-services instead of 1 big application
= 10x more work to deploy everything
- solution: need extremely easy deployment; hello containers!
17. Virtual Machines
Emulate CPU instructions
(painfully slow)
Emulate hardware (storage, network...)
(painfully slow)
Run as a userland process on top of a kernel
(painfully slow)
18. Virtual Machines
Use native CPU
(fast!)
Paravirtualized storage, network...
(fast, but higher resource usage)
Run on top of a hypervisor
(faster, but still some overhead)
20. Virtual Machines vs Containers
Virtual machines:
- native CPU
- paravirtualized devices
- hypervisor
Containers:
- native CPU
- native syscalls
- native kernel
21. Inter-VM communication
Strong isolation, enforced by hypervisor + hardware
- no fast-path data transfer between virtual machines
- yes, there are PCI pass-throughs and things like xenbus,
but that's not easy to use, very specific, not portable
Most convenient method: network protocols (L2/L3)
But: huge advantage from a security POV
22. Inter-container communication
Tunable isolation
- each namespace can be isolated or shared
Allows normal Unix communication mechanisms
- network protocols on loopback interface
- UNIX sockets
- shared memory
- IPC...
Reuse techniques that we know and love (?)
25. Shared localhost
Multiple containers can share the same “localhost”
(by reusing the same network namespace)
Communication over localhost is very very fast
Also: localhost is a well-known address
26. Shared filesystem
A directory can be shared by multiple containers
(by using a bind-mount)
That directory can contain:
- named pipes (FIFOs)
- UNIX sockets
- memory-mapped files
Bind-mount = zero overhead
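As an illustration of the UNIX socket case, here is a minimal sketch in which two parties talk through a socket file inside a shared directory. Assumptions: a temporary directory stands in for the bind-mounted directory, a thread stands in for the second container, and the file name `app.sock` is made up for the example.

```python
import os
import socket
import tempfile
import threading


def serve_once(sock_path: str, ready: threading.Event) -> None:
    """Listen on a UNIX socket file and answer one request in upper case."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as srv:
        srv.bind(sock_path)          # creates the socket file in the shared dir
        srv.listen(1)
        ready.set()                  # tell the client the socket file exists
        conn, _ = srv.accept()
        with conn:
            conn.sendall(conn.recv(1024).upper())


def ask(sock_path: str, message: bytes) -> bytes:
    """Connect to the socket file, send one message, return the reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as cli:
        cli.connect(sock_path)
        cli.sendall(message)
        return cli.recv(1024)


def demo() -> bytes:
    # The temporary directory plays the role of the bind-mounted directory.
    with tempfile.TemporaryDirectory() as shared_dir:
        path = os.path.join(shared_dir, "app.sock")
        ready = threading.Event()
        server = threading.Thread(target=serve_once, args=(path, ready))
        server.start()
        ready.wait()
        reply = ask(path, b"ping")
        server.join()
        return reply
```

With real containers, both would mount the same host directory and only the socket path would need to be agreed upon.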
27. Shared IPC
Multiple containers can share IPC resources
(using the special IPC namespace)
Semaphores, Shared Memory, Message Queues...
Is anybody still using this?
28. Host networking
Containers can share the host's network stack
(by reusing its network namespace)
They can use the host's interfaces without penalty
(high speed, low latency, no overhead!)
Native performance when talking to external containers
29. Host filesystems
Containers can share a directory with the host
Example: use fast storage (SAN, SSD...) in container
- mount it on the host
- share it with the container
- done!
Native performance when using the I/O subsystem
30. Device nodes
Containers can use the host's devices (if allowed)
Access is granted through “devices” control group
Performance (throughput, latency, CPU usage)...
is the same as using the device on the host
Examples:
- /dev/sd*: raw block device in container
- /dev/video*: video capture device (e.g. webcam) in container
- /dev/kvm: VM in container
- and more!
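The sharing options from the last few slides (shared localhost, shared IPC, host networking, host directories, device nodes) map onto `docker run` flags. The following is a rough sketch of a command-line builder; the function name and its parameters are hypothetical, and nothing is executed.

```python
from typing import Iterable, List, Optional, Tuple


def docker_run_shared(
    image: str,
    *,
    net_of: Optional[str] = None,      # reuse another container's network namespace
    ipc_of: Optional[str] = None,      # reuse another container's IPC namespace
    host_net: bool = False,            # reuse the host's network stack
    volumes: Iterable[Tuple[str, str]] = (),  # (host_dir, container_dir) bind-mounts
    devices: Iterable[str] = (),       # host device nodes to expose
) -> List[str]:
    """Build a 'docker run' command line with the requested sharing options."""
    if host_net and net_of:
        raise ValueError("choose either host networking or a shared container namespace")
    argv = ["docker", "run", "-d"]
    if host_net:
        argv += ["--net", "host"]
    elif net_of:
        argv += ["--net", f"container:{net_of}"]
    if ipc_of:
        argv += ["--ipc", f"container:{ipc_of}"]
    for host_dir, container_dir in volumes:
        argv += ["-v", f"{host_dir}:{container_dir}"]
    for device in devices:
        argv += ["--device", device]
    argv.append(image)
    return argv
```

For example, `docker_run_shared("web", net_of="db")` yields a command whose container sees the same “localhost” as the container named `db` (both names are placeholders).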
33. Service discovery is not a new problem
Usually not important for …
- small architectures
- low server count
- static deployments
But with containers...
- container count can be very high
- deployments are very dynamic
35. Name resolution
Provide custom /etc/hosts in container
(or give a custom DNS resolver to the container)
Easy to integrate in code (just connect to e.g. “db”)
But:
- no way to push changes to the application
- some libraries won't re-resolve
- might need to restart containers when service address changes
- service must run on well-known port
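The “won't re-resolve” problem can be avoided on the application side by resolving the name on every connection attempt instead of caching an address. A minimal sketch, assuming a service name such as “db” and a well-known port; the function name is hypothetical.

```python
import socket


def connect_fresh(host: str, port: int, timeout: float = 2.0) -> socket.socket:
    """Resolve `host` again on every call, then try each returned address."""
    last_error = None
    # getaddrinfo is called each time, so a changed /etc/hosts or DNS entry
    # is picked up by the next connection attempt.
    for family, socktype, proto, _, addr in socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    ):
        sock = socket.socket(family, socktype, proto)
        sock.settimeout(timeout)
        try:
            sock.connect(addr)
            return sock
        except OSError as exc:
            last_error = exc
            sock.close()
    raise ConnectionError(f"could not reach {host}:{port}") from last_error
```

This only helps for new connections; a long-lived connection to the old address still has to notice the failure and reconnect.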
36. Environment variables (or SRV records)
Solves the “well-known port” requirement
Harder to integrate in code
Doesn't solve the other problems
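Reading the service location from the environment might look like the sketch below. The variable naming scheme (`<NAME>_HOST`, `<NAME>_PORT`) is an assumption made for this example, not a convention taken from the slides.

```python
import os
from typing import Mapping, Tuple


def service_endpoint(name: str, env: Mapping[str, str] = os.environ) -> Tuple[str, int]:
    """Look up host and port for a service from environment variables.

    Hypothetical scheme: service "frontdb" is described by
    FRONTDB_HOST and FRONTDB_PORT.
    """
    prefix = name.upper()
    host = env[f"{prefix}_HOST"]        # KeyError if the variable is missing
    port = int(env[f"{prefix}_PORT"])   # the port no longer has to be well-known
    return host, port
```

The values are read once at startup, which is why this approach still cannot push address changes to a running application.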
37. In-app discovery with config DB
Connect to zookeeper/etcd to find service location
Watch zookeeper/etcd for changes
If change occurs, reconnect
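The find / watch / reconnect loop can be sketched as follows. To keep the example self-contained, `FakeConfigStore` is an in-memory stand-in invented for illustration; it is not the zookeeper or etcd client API, and the key `/services/frontdb` is made up.

```python
from typing import Callable, Dict, List, Optional


class FakeConfigStore:
    """In-memory stand-in for a config DB (NOT a real zookeeper/etcd client)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self._watchers: Dict[str, List[Callable[[str], None]]] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def watch(self, key: str, callback: Callable[[str], None]) -> None:
        self._watchers.setdefault(key, []).append(callback)

    def put(self, key: str, value: str) -> None:
        changed = self._data.get(key) != value
        self._data[key] = value
        if changed:
            for callback in self._watchers.get(key, []):
                callback(value)


class ServiceClient:
    """Finds a service address in the store and reconnects when it changes."""

    def __init__(self, store: FakeConfigStore, key: str) -> None:
        self.address: Optional[str] = None
        self.reconnects = 0
        store.watch(key, self._on_change)      # watch for changes
        current = store.get(key)               # find the current location
        if current is not None:
            self._on_change(current)

    def _on_change(self, address: str) -> None:
        # A real client would close the old connection and open a new one here.
        self.address = address
        self.reconnects += 1
```

Usage: create the store, `put` an address under the key, build a `ServiceClient`, then `put` a new address and observe that `client.address` follows it.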
38. In-app discovery with config DB
Works well, but requires deep code changes
If you support multiple languages, it's a lot of work
Transitioning from one system to another is hard
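41. database host / web host
[Diagram: two hosts, each running an application container and a wiring container]
database host:
- database container: “I'm frontdb!”
- wiring container: “I actually talk to frontdb!”
web host:
- web container: “I want to talk to frontdb!”
- wiring container: “I pretend I'm frontdb!”
Each application container uses a local connection to its wiring container.
The link between the two wiring containers is marked “?”.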
44. database host / web host
[Same diagram as slide 41; the link between the two wiring containers is now labeled “UNICORNS”]
45. But! We are adding extra layers!
Yes.
But service/ambassador communication can be:
- over localhost
- over UNIX socket (even better but not always possible)
- over iptables or IPVS (to bypass the ambassador process)
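To make the “wiring container” idea concrete, here is a minimal sketch of an ambassador as a plain TCP forwarder: it listens locally, pretends to be the service, and relays every connection to the real backend. The function names are hypothetical, the backend address is supplied by the caller (in practice it would come from whatever discovery mechanism is in use), and error handling is kept to the bare minimum.

```python
import socket
import threading
from typing import Tuple

Address = Tuple[str, int]


def pipe(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes from src to dst until src reaches EOF or fails."""
    try:
        while True:
            chunk = src.recv(4096)
            if not chunk:
                break
            dst.sendall(chunk)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)   # propagate EOF to the other side
        except OSError:
            pass


def handle(client: socket.socket, backend_addr: Address) -> None:
    """Relay one client connection to the real service, in both directions."""
    with client, socket.create_connection(backend_addr) as backend:
        back = threading.Thread(target=pipe, args=(backend, client), daemon=True)
        back.start()
        pipe(client, backend)
        back.join()


def start_ambassador(backend_addr: Address,
                     listen_addr: Address = ("127.0.0.1", 0)) -> socket.socket:
    """Start forwarding in background threads; returns the listening socket."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(listen_addr)
    srv.listen()

    def accept_loop() -> None:
        while True:
            try:
                client, _ = srv.accept()
            except OSError:            # listening socket was closed
                return
            threading.Thread(target=handle, args=(client, backend_addr),
                             daemon=True).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return srv
```

The application only ever connects to the ambassador's local address, which it can read from `srv.getsockname()`; changing the backend means restarting or reconfiguring the ambassador, not the application.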
46. Ambassador implementations
A few examples
- Registrator
https://github.com/progrium/registrator
- Grand Ambassador
https://github.com/cpuguy83/docker-grand-ambassador
- AirBNB SmartStack
https://github.com/airbnb/nerve
https://github.com/airbnb/synapse
Or roll your own
- some HA KV store + HAProxy, stunnel, iptables, IPVS...
- serverless, ad-hoc discovery (avahi, multicast...)
49. Thanks to containers, we can...
Build, ship, and run our applications more easily
Decouple application code from “plumbing”
(service discovery, load balancing)
Which allows us to implement micro-service architectures
efficiently (hopefully without the usual drawbacks!)
Which yields higher agility, shorter dev cycles, etc.
Win!