Docker for Mac and Docker for Windows were released in beta in March and provide many features that users have been clamouring for, including file system notifications, simpler file sharing, and no VirtualBox hassles.
During this talk, I will give the inside guide to how these products work. We will look at all the major components and how they fit together to make up the product. This includes a technical deep dive covering the hypervisors for OSX and Windows, the custom file sharing code, the networking, the embedded Alpine Linux distribution, and more.
5. Make it native
• Install Docker and it is just there and working
• Make everything work just like a Linux install
• Make file notifications work
• No VirtualBox
• Get out of the way
• Make the powerful simple
6. People liked it!
• 30,000 signups on the first day!
• 70,000 by DockerCon
• The team building it used it from the start
10. Hyperkit: A toolkit for embedding hypervisor capabilities in your application
• Only used on the Mac; Windows uses Hyper-V
• Based on xhyve, which in turn is based on bhyve from FreeBSD
• The hypervisor is just a single ordinary process, with one thread per emulated CPU core
• https://github.com/docker/hyperkit
11. Hyperkit
• Sparse block device, using the qcow2 format
• Virtio devices: network, block, 9p, socket, rng
• Configure the amount of memory and the number of CPU cores in the preferences
13. Git database
• Can do really interesting stuff: it is Git for data structures
• Simple use cases here, for storing configuration
• Yes, there is a Git tree at ~/Library/Containers/com.docker.docker/Data/database
• There is a filesystem view inside the VM using 9p
• https://github.com/docker/datakit
15. VSock and HVSock
• Socket families for communication with VMs, like Unix sockets
• VSock originally developed by VMware
• HVSock developed by Microsoft, similar design, but different addressing
• New socket families, so not well supported yet, e.g. no Go support, so we have to use C bindings
• Communication is not over the network, so it works even if there are network issues
• Used to transport the Docker socket to the VM (rather than going via HTTPS), among other uses
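To make the "similar design, but different addressing" point concrete, here is a minimal Python sketch. It is not code from Docker for Mac or Windows. It assumes the usual addressing models: a VSock endpoint is a (context ID, port) pair of integers, while an HVSock endpoint is a pair of GUIDs (VM ID and service ID). The class names, the connect_vsock helper and the port number in the usage note are hypothetical, and AF_VSOCK is only present in Python builds on Linux hosts that support it, so the helper checks for it at run time.

```python
import socket
import uuid
from dataclasses import dataclass

# Well-known context ID that a guest uses to address its host (VMADDR_CID_HOST).
VMADDR_CID_HOST = 2


@dataclass(frozen=True)
class VsockAddress:
    """VSock endpoint: integer context ID (which VM or host) plus integer port."""
    cid: int
    port: int

    def sockaddr(self):
        # Python represents AF_VSOCK addresses as a (CID, port) tuple.
        return (self.cid, self.port)


@dataclass(frozen=True)
class HvsockAddress:
    """HVSock endpoint: both the VM and the service are identified by GUIDs."""
    vm_id: uuid.UUID
    service_id: uuid.UUID


def connect_vsock(addr, timeout=5.0):
    """Open a stream connection to a VSock endpoint (hypothetical helper).

    Raises OSError if this Python build or platform has no AF_VSOCK support.
    """
    family = getattr(socket, "AF_VSOCK", None)
    if family is None:
        raise OSError("AF_VSOCK is not available on this platform")
    sock = socket.socket(family, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(addr.sockaddr())
    except OSError:
        sock.close()
        raise
    return sock
```

From inside a guest, something like connect_vsock(VsockAddress(VMADDR_CID_HOST, 2375)) would then behave like an ordinary stream socket; the port here is made up for illustration.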
17. Alpine Linux is a security-oriented, lightweight Linux distribution based on musl libc and busybox.
18. Stateless
• Alpine was designed to boot from an init ramdisk
• Reboot and everything is back to the same state, except /var, which holds the Docker state
• Configuration is stored in the datakit database, currently mounted on /Database
• Phoenix-like: just rebuild a new image, no upgrades
• Only runs Docker and supporting infrastructure
19. No user serviceable parts?
• Designed to "just work" and keep Docker running
• Database sets all the Docker and system config (UI coming soon)
• Use privileged containers or capabilities to make other host changes, and persist them with --restart always
• e.g. install the sysdig kernel module
• Root shell: docker run -it --privileged --pid=host debian nsenter -t 1 -m -u -n -i sh
20. Kernel
• Currently a stable 4.4.x series kernel with aufs, VSock and HVSock patches
• No modules, everything built in for fast boot, but you can add modules
• Supports aufs and overlay storage drivers (overlay2 soon)
• Supports NFS, SMB, CRIU, and many other things requested by users on the forum
• binfmt_misc is set up so you can run ARM and other binaries with qemu emulation
• Config in /proc/config.gz, patches in /etc/kernel-patches
21. Userspace
• Alpine 3.4
• Docker: statically linked build with seccomp
• Reasonable set of utility programs, not just the base system, e.g. Kubernetes had some requirements which were added
• You can install new packages; again, they will not persist over a reboot, but we sometimes use this to debug Docker
• Some diagnostics code, setup code, file sharing, time synchronisation, ...
• Will be open sourced when a bit more stable, after GA
23. Come to Mindy Preston's talk tomorrow at 14:25 in Ballroom 6C for more about VPNkit
24. VPNkit
• https://github.com/docker/vpnkit
• Currently being used on Mac, being tested for Windows
• Take the Ethernet traffic from the Linux VM
• Reconstruct it as application traffic on OSX
• So socket(); ... listen() ... send() in Linux is reconstructed from the Ethernet traffic as the same series of calls on OSX
• No OS interfaces needed, complete control; it appears just like another application, so it works with VPNs and firewalls
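The last step of that pipeline, replaying a reconstructed flow as ordinary host socket calls, can be sketched as follows. This is an illustration only: VPNkit itself is written in OCaml on the Mirage network stack, and the Connect, Send and Close event types and the replay function are hypothetical names standing in for whatever the frame decoder would produce. Decoding Ethernet frames into these events is not shown.

```python
import socket
from dataclasses import dataclass
from typing import Iterable, List, Optional, Union


@dataclass
class Connect:
    """The guest opened a TCP connection to host:port."""
    host: str
    port: int


@dataclass
class Send:
    """The guest sent these payload bytes on the connection."""
    data: bytes


@dataclass
class Close:
    """The guest finished sending (its side of the stream is closed)."""


FlowEvent = Union[Connect, Send, Close]


def replay(events: Iterable[FlowEvent], timeout: float = 5.0) -> bytes:
    """Replay one reconstructed TCP flow using plain host sockets.

    Returns whatever the remote peer sent back after the guest closed
    its side, which a real proxy would re-encode as frames for the VM.
    """
    sock: Optional[socket.socket] = None
    received: List[bytes] = []
    try:
        for event in events:
            if isinstance(event, Connect):
                sock = socket.create_connection(
                    (event.host, event.port), timeout=timeout)
            elif sock is None:
                raise RuntimeError("flow event arrived before Connect")
            elif isinstance(event, Send):
                sock.sendall(event.data)
            elif isinstance(event, Close):
                # Half-close, then drain the peer's reply until EOF.
                sock.shutdown(socket.SHUT_WR)
                while True:
                    chunk = sock.recv(4096)
                    if not chunk:
                        break
                    received.append(chunk)
    finally:
        if sock is not None:
            sock.close()
    return b"".join(received)
```

Because the host side only ever makes normal socket calls, the traffic looks like that of any other application, which is why VPNs and firewalls treat it normally.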
25. VPNkit
• VPNkit uses the network stack from the Mirage unikernel
• Mirage is a set of low level system libraries
• Written in OCaml, a functional programming language
• Easy to repurpose for completely different use cases
27. OSXfs
• Currently only being used on the Mac; SMB is used on Windows
• Uses FUSE on the Linux side to get the system calls
• Transports these over VSock
• Converts them into OSX filesystem calls
• Listens for filesystem notifications in OSX and replays the events in Linux
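The host half of this design, turning a filesystem request received from the VM into a local filesystem call, might look roughly like the sketch below. It is not osxfs code: the Request message shape, the operation names and the HostFs class are hypothetical, and the FUSE front end and the VSock transport are left out.

```python
import os
import tempfile
from dataclasses import dataclass


@dataclass
class Request:
    """One filesystem operation forwarded from the Linux side (hypothetical shape)."""
    op: str
    path: str
    data: bytes = b""
    offset: int = 0
    size: int = 0


class HostFs:
    """Serve requests against one shared directory on the host."""

    def __init__(self, root):
        self.root = os.path.realpath(root)

    def _resolve(self, path):
        # Map the guest path into the shared root and refuse escapes via "..".
        full = os.path.realpath(os.path.join(self.root, path.lstrip("/")))
        if os.path.commonpath([self.root, full]) != self.root:
            raise PermissionError(path)
        return full

    def handle(self, req):
        full = self._resolve(req.path)
        if req.op == "getattr":
            st = os.stat(full)
            return {"size": st.st_size, "mode": st.st_mode}
        if req.op == "readdir":
            return sorted(os.listdir(full))
        if req.op == "read":
            with open(full, "rb") as f:
                f.seek(req.offset)
                return f.read(req.size)
        if req.op == "write":
            # Create the file if needed, but do not truncate existing data.
            fd = os.open(full, os.O_WRONLY | os.O_CREAT, 0o644)
            try:
                os.lseek(fd, req.offset, os.SEEK_SET)
                return os.write(fd, req.data)
            finally:
                os.close(fd)
        raise ValueError("unsupported operation: " + req.op)
```

A quick way to try it is HostFs(tempfile.mkdtemp()).handle(Request("write", "/a.txt", data=b"hello")), followed by a "read" request for the same path.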
28. OSXfs
• As you may see, it is a somewhat similar model to VPNkit, but more complex
• Filesystems differ more than network stacks do
• Filesystem notifications differ a lot between systems, as they are not well standardised
• e.g. OSX does not provide read notifications, but Linux does
• Soon you will be able to specify which parts of the filesystem to share
29. Windows file sharing
• Uses an SMB mount from Linux to Windows
• The SMB protocol supports filesystem notifications, but unfortunately Linux does not
• Will have to solve this by adding notification support or porting OSXfs
31. User interface
• Minimal at the moment, but more plans for the future
• Native code: Swift on OSX, C# on Windows
• Communicates through the database for configuration
• Plans to integrate Kitematic or a similar graphical interface later
36. --net=host
• With VPNkit, when you --publish a port in a container, we use a custom userland proxy
• This communicates through the /port 9p filesystem to the Mac
• The Mac opens a listening socket on the host, so you can connect to your container on localhost:nnn
• This is also used for docker service in Swarm mode
• But listening on a port with --net=host does not notify Docker, so we cannot easily intercept it
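The host side of a published port, a listening socket that relays each connection onward, can be sketched in Python as below. This is a generic TCP forwarder under stated assumptions, not Docker's proxy: the publish function and its helpers are hypothetical names, the target is reached over plain TCP, and the /port 9p control channel that tells the Mac which ports to open is not modelled.

```python
import socket
import threading


def _pump(src, dst):
    """Copy bytes from src to dst until src hits EOF, then half-close dst."""
    try:
        while True:
            chunk = src.recv(4096)
            if not chunk:
                break
            dst.sendall(chunk)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def _handle(client, target):
    """Relay one accepted connection to the target in both directions."""
    try:
        upstream = socket.create_connection(target)
    except OSError:
        client.close()
        return
    with client, upstream:
        back = threading.Thread(
            target=_pump, args=(upstream, client), daemon=True)
        back.start()
        _pump(client, upstream)
        back.join()


def publish(listen_addr, target):
    """Listen on listen_addr and forward every connection to target.

    Returns the listening socket; its getsockname() gives the bound port.
    """
    listener = socket.create_server(listen_addr)

    def accept_loop():
        while True:
            try:
                client, _ = listener.accept()
            except OSError:
                return  # listener was closed
            threading.Thread(
                target=_handle, args=(client, target), daemon=True).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return listener
```

The proxy only learns about a port because it is told to publish it. A process that simply calls listen() inside a --net=host container produces no such request, which is the interception problem described in the last bullet.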
37. I cannot connect to container ports
• On OSX there is no direct routing to the Linux VM, so you cannot reach the machines on a bridge network either
• If you want to debug, do it from another container, or use docker run --net=container:name
• You need to do this for overlay networks anyway, so it is a good idea to get used to it
38. Unix socket between host and container doesn't work
• Well yes, they are different computers...
• But we do plan to make it work via osxfs, transparently proxying over VSock for you
• In the meantime, make do with a TCP socket or run everything in containers
39. Sound and Vision
• X Windows, RDP, VNC, or HTTP graphical interfaces can be made to work
• Audio should be possible too
• Quite a few people want this, please help out!
41. Stability first, features second
• Stability and bug fixing are the priorities for GA
• People are using it every day for their work, including us
• Features driven by what you want
• More components will be open sourced
43. The team, many are here at DockerCon
Many, many more people at Docker who contributed
A very long list of contributors to open source projects, especially bhyve, xhyve, Mirage, Alpine Linux