Docker and the Future of Containers in Production
Bryan Cantrill
CTO
bryan@joyent.com
@bcantrill
Prehistory: Virtualization as cloud catalyst
• In the 1960s — shortly after the dawn of computing! —
pundits foresaw a compute utility that would be public
and multi-tenant
• The vision was four decades too early: it took the
internet + commodity computing + virtualization to yield
cloud computing
• Virtualization is the essential ingredient for multi-tenant
operation — but where in the stack to virtualize?
• Choices around virtualization capture tensions between
elasticity, tenancy, and performance
• tl;dr: Virtualization choices drive economic tradeoffs
Hardware-level virtualization?
• The historical answer, since the 1960s, has been to
virtualize at the level of the hardware:
• A virtual machine is presented upon which each
tenant runs an operating system of their choosing
• There are as many operating systems as tenants
• The singular advantage of hardware virtualization: it can
run entire legacy stacks unmodified
• However, hardware virtualization exacts a heavy price:
operating systems are not designed to share resources
like DRAM, CPU, I/O devices or the network
• Hardware virtualization limits tenancy, elasticity and
performance
Platform-level virtualization?
• Virtualizing at the application platform layer addresses
the tenancy challenges of hardware virtualization
• Added advantage of a much more nimble
(& developer-friendly!) abstraction…
• ...but at the cost of dictating abstraction to the developer
• This creates the “Google App Engine problem”:
developers are in a straitjacket where toy programs
are easy, but sophisticated apps are impossible
• Virtualizing at the application platform layer poses many
other challenges with respect to security, containment
and scalability
OS-level virtualization!
• Virtualizing at the OS level hits the sweet spot:
• Single OS (i.e., single kernel) allows for efficient use of
hardware resources, maximizing tenancy and
performance
• Disjoint instances are securely compartmentalized by
the operating system
• Gives users what appears to be a virtual machine (albeit
a very fast one) on which to run higher-level software
• The ease of a PaaS with the generality of IaaS
• Model was pioneered by FreeBSD jails and taken to
its logical extreme by Solaris zones, and then aped
by Linux containers
OS-level virtualization in the cloud
• Joyent runs OS containers in the cloud via SmartOS
(our illumos derivative) — and we have run containers in
multi-tenant production since ~2006
• Core SmartOS facilities are container-aware and
optimized: Zones, ZFS, DTrace, Crossbow, SMF, etc.
• SmartOS also supports hardware-level virtualization —
but we have long advocated OS-level virtualization for
new build out
• We emphasized their operational characteristics
(performance, elasticity, tenancy), and for many years
we were a lone voice...
Containers as PaaS foundation?
• Some saw the power of OS containers to facilitate
up-stack platform-as-a-service abstractions
• For example, dotCloud (a platform-as-a-service
provider) built its PaaS on OS containers
• Hearing that many were interested in their container
orchestration layer (but not their PaaS), dotCloud open
sourced their container-based orchestration layer...
...and Docker was born
Docker revolution
• Docker has used the rapid provisioning + shared
underlying filesystem of containers to allow developers
to think operationally
• Developers can encode dependencies and deployment
practices into an image
• Images can be layered, allowing for swift development
• Images can be quickly deployed — and re-deployed
• Docker will do to apt what apt did to tar
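As a rough illustration of what layering buys, an image can be modeled as an ordered stack of layers in which later layers shadow earlier ones. This is a toy model only, not Docker's actual storage mechanism, and the layer contents and paths below are hypothetical.

```python
from collections import ChainMap

# Toy model: each layer maps a file path to its contents.
# These layers and paths are made up for illustration.
base_layer = {"/etc/os-release": "ubuntu 14.04", "/bin/sh": "<shell>"}
runtime_layer = {"/usr/bin/node": "<node binary>"}
app_layer = {"/app/server.js": "console.log('hi')", "/etc/os-release": "patched"}


def flatten(layers):
    """Return the merged view of an image; later layers win on conflicts."""
    # ChainMap searches its maps left to right, so put the newest layer first.
    return dict(ChainMap(*reversed(layers)))


image = flatten([base_layer, runtime_layer, app_layer])
```

Because the base and runtime layers are unchanged when only the application changes, a rebuild in this model touches just the top layer, which is the intuition behind swift development and quick redeployment.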
Docker’s challenges
• The Docker model is the future of containers
• Docker’s challenges are largely around production
deployment: security, network virtualization, persistence
• Security concerns are real enough that for multi-tenancy,
OS containers are currently running in hardware VMs (!!)
• With SmartOS, we have spent a decade addressing these
concerns, and it is proven in production…
• Could we combine the best of both worlds?
• Could we somehow deploy Docker containers as
SmartOS zones?
Docker + SmartOS: Linux binaries?
• First (obvious) problem: while it has been designed to
be cross-platform, Docker is Linux-centric
• While Docker could be ported, the encyclopedia of
Docker images will likely forever remain Linux binaries
• SmartOS is Unix — but it isn’t Linux…
• Could we somehow natively emulate Linux — and run
Linux binaries directly on the SmartOS kernel?
OS emulation: An old idea
• Operating systems have long employed system call
emulation to allow binaries from one operating system
to run on another on the same instruction set architecture
• Combines the binary footprint of the emulated system
with the operational advantages of the emulating system
• Sun first did this with SunOS 4.x binaries on Solaris 2.x
• In the mid-2000s, Sun developed zone-based OS emulation
for Solaris: branded zones
• Several brands were developed — notably including an
LX brand that allowed for Linux emulation
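The idea of system call emulation can be sketched as a translation table: the foreign binary issues a call by its own system's number, and the host routes it to a native implementation. The sketch below is a conceptual toy in user-space Python, and the table and handler names are hypothetical; it says nothing about how the LX brand itself is implemented. The two numbers shown are the Linux x86-64 values for write and getpid.

```python
import os

# Linux x86-64 system call numbers for two common calls.
LINUX_SYS_WRITE = 1
LINUX_SYS_GETPID = 39

ENOSYS = 38  # Linux errno value for "function not implemented"


def emul_write(fd, data):
    # Forward to the host's native write.
    return os.write(fd, data)


def emul_getpid():
    # Forward to the host's native getpid.
    return os.getpid()


# Hypothetical translation table: foreign syscall number -> native handler.
SYSCALL_TABLE = {
    LINUX_SYS_WRITE: emul_write,
    LINUX_SYS_GETPID: emul_getpid,
}


def dispatch(number, *args):
    """Route an emulated binary's system call to the host implementation."""
    handler = SYSCALL_TABLE.get(number)
    if handler is None:
        # Unimplemented calls fail the way the emulated system would report them.
        return -ENOSYS
    return handler(*args)
```

The hard part in practice is not the table but the semantics behind each entry (signals, /proc, ioctls), which is where the slides locate most of the effort.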
LX-branded zones: Life and death
• The LX-branded zone worked for RHEL 3 (!): glibc 2.3.2
+ Linux 2.4
• Remarkable amount of work was done to handle device
pathing, signal handling, /proc — and arcana like TTY
ioctls, ptrace, etc.
• Worked for a surprising number of binaries!
• But support was only for 2.4 kernels and only for 32-bit;
2.6 + 64-bit appeared daunting…
• Support was ripped out of the system on June 11, 2010
• Fortunately, this was after the system was open sourced
in June 2005 — and the source was out there...
LX-branded zones: Resurrection!
• In January 2014, David Mackay, an illumos community
member, announced that he was able to resurrect the
LX brand, and that it appeared to work!
Linked below is a webrev which restores LX branded zones
support to Illumos:
http://cr.illumos.org/~webrev/DavidJX8P/lx-zones-restoration/
I have been running OpenIndiana, using it daily on my
workstation for over a month with the above webrev applied to
the illumos-gate and built by myself.
It would definitely raise interest in Illumos. Indeed, I have
seen many people who are extremely interested in LX zones.
The LX zones code is minimally invasive on Illumos itself, and
is mostly segregated out.
I hope you find this of interest.
LX-branded zones: Revival
• Encouraged that the LX-branded work was salvageable,
Joyent engineer Jerry Jelinek reintegrated the LX brand
into SmartOS on March 20, 2014...
• ...and started the (substantial) work to modernize it
• Guiding principles for LX-branded zone work:
• Do it all in the open
• Do it all on SmartOS master (illumos-joyent)
• Add base illumos facilities wherever possible
• Aim to upstream to illumos when we’re done
LX-branded zones: Progress
• With assiduous work over the course of 2014, progress
was difficult but steady:
• Ubuntu 10.04 booted in April
• Ubuntu 12.04 booted in May
• Ubuntu 14.04 booted in July
• 64-bit Ubuntu 14.04 booted in October (!)
• Going into 2015, it was becoming increasingly difficult to
find Linux software that didn’t work...
LX-branded zones: Working well...
...and, um, well received
Docker + SmartOS: Provisioning?
• With the binary problem being tackled, focus turned to
the mechanics of integrating Docker with the SmartOS
facilities for provisioning
• Provisioning a SmartOS zone operates via the global
zone that represents the control plane of the machine
• docker is a single binary that functions as both client
and server — and with too much surface area to run in
the global zone, especially for a public cloud
• docker has also embedded Go- and Linux-isms that
we did not want in the global zone; we needed to find a
different approach...
Docker Remote API
• While docker is a single binary that can run on the
client or the server, it does not run as both at once…
• docker (the client) communicates with docker (the
server) via the Docker Remote API
• The Docker Remote API is expressive, modern and
robust (i.e. versioned), allowing for docker to
communicate with Docker backends that aren’t docker
• The clear approach was therefore to implement a
Docker Remote API endpoint for SmartDataCenter
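To make the client/server split concrete, here is a minimal sketch of the HTTP requests a client might build for such an endpoint. The paths (`/containers/json`, `/containers/create`, `/containers/{id}/start`) and the `Image` and `Cmd` body fields are part of the Docker Remote API; the version prefix chosen here and the helper names are assumptions for illustration, and `send` is a hypothetical helper that works against any host and port you supply.

```python
import http.client
import json

# Assumed API version prefix; use whatever version your client and
# endpoint actually agree on.
API_VERSION = "v1.16"


def list_containers_request():
    """Build the request that lists running containers."""
    return ("GET", f"/{API_VERSION}/containers/json", None)


def create_container_request(image, cmd):
    """Build the request that creates (but does not start) a container."""
    body = json.dumps({"Image": image, "Cmd": cmd})
    return ("POST", f"/{API_VERSION}/containers/create", body)


def start_container_request(container_id):
    """Build the request that starts a previously created container."""
    return ("POST", f"/{API_VERSION}/containers/{container_id}/start", None)


def send(host, port, request):
    """Send one request tuple to an endpoint (hypothetical helper)."""
    method, path, body = request
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        headers = {"Content-Type": "application/json"} if body else {}
        conn.request(method, path, body=body, headers=headers)
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()
```

Nothing in these requests assumes that the server is the docker binary, which is why any backend that answers them correctly can stand in for it.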
Aside: SmartDataCenter
• Orchestration software for SmartOS-based clouds
• Unlike other cloud stacks, not designed to run arbitrary
hypervisors, sell legacy hardware or get 160 companies
to agree on something
• SmartDataCenter is designed to leverage the SmartOS
differentiators: ZFS, DTrace and (esp.) zones
• Runs both the Joyent Public Cloud and business-critical
on-premises clouds at well-known brands
• Born proprietary — but made entirely open source on
November 6, 2014: http://github.com/joyent/sdc
SmartDataCenter: Architecture
[Architecture diagram. Head-node (SmartOS kernel, flash booted): SDC 7 core services, Booter (DHCP/TFTP), AMQP broker, Binder (DNS), public API (public HTTP), customer portal, operator portal, firewall. Compute nodes (SmartOS kernel, network booted; tens/hundreds per head-node): AMQP agents (Provisioner, Instrumenter, Heartbeater), ZFS-based multi-tenant filesystem, and virtual OS or machine instances, each with virtual NICs: virtual SmartOS (OS virt.), Linux guest (HW virt.), Windows guest (HW virt.).]
SmartDataCenter: Core Services
[Diagram of SDC7 core services on the head-node: analytics aggregator, Key/Value Service (Moray), Firewall API (FWAPI), Virtual Machine API (VMAPI), Directory Service (UFDS), Designation API (DAPI), Workflow API, Network API (NAPI), Compute-Node API (CNAPI), Image API, Alerts & Monitoring (Amon), Packaging API (PAPI), Service API (SAPI), Booter (DHCP/TFTP), AMQP broker, Binder (DNS), public API (public HTTP), customer portal, operator portal, operator services, with links to Manta and other DCs.]
Note: Service interdependencies not shown for readability. Other core services may be provisioned on compute nodes.
SmartDataCenter + Docker
• Implementing an SDC-wide endpoint for the Docker
Remote API allows us to build in terms of our established
core services: UFDS, CNAPI, VMAPI, Image API, etc.
• Has the welcome side-effect of virtualizing the notion of
Docker host machine: Docker containers can be placed
anywhere within the data center
• From a developer perspective, one less thing to manage
• From an operations perspective, allows for a flexible
layer of management and control: Docker API endpoints
become a potential administrative nexus
• As such, virtualizing the Docker host is somewhat
analogous to the way ZFS virtualized the filesystem...
SmartDataCenter + Docker: Challenges
• Some Docker constructs have (implicitly) encoded
co-locality of Docker containers on a physical machine
• Some of these constructs (e.g., --volumes-from) we
will discourage but accommodate by co-scheduling
• Others (e.g., host directory-based volumes) we are
implementing via NFS backed by Manta, our (open
source!) distributed object storage service
• Moving forward, we are working with Docker to help
assure that the Docker Remote API doesn’t create new
implicit dependencies on physical locality
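The co-scheduling accommodation can be pictured as a placement constraint: a container is normally free to land on any compute node, but one that shares volumes with an existing container must follow it. The sketch below is a toy model; `ComputeNode`, `place`, and the most-free-memory policy are hypothetical and are not taken from sdc-docker or any SDC service.

```python
from dataclasses import dataclass, field


@dataclass
class ComputeNode:
    name: str
    free_mb: int
    containers: set = field(default_factory=set)


def place(container, mem_mb, nodes, volumes_from=None):
    """Pick a compute node for a container and record the placement.

    If volumes_from names an existing container, the new container must
    land on the same node; otherwise any node with enough memory will do.
    """
    if volumes_from is not None:
        candidates = [n for n in nodes if volumes_from in n.containers]
    else:
        candidates = list(nodes)
    candidates = [n for n in candidates if n.free_mb >= mem_mb]
    if not candidates:
        raise RuntimeError(f"no compute node can host {container}")
    # Simple policy for this sketch: prefer the node with the most free memory.
    chosen = max(candidates, key=lambda n: n.free_mb)
    chosen.free_mb -= mem_mb
    chosen.containers.add(container)
    return chosen.name
```

The constrained case shows why such constructs are discouraged: it shrinks the candidate set to a single machine, and placement fails outright if that machine is full.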
SmartDataCenter + Docker: Networking
• Parallel to our SmartOS and Docker work, we have
been working on next-generation software-defined
networking for SmartOS and SmartDataCenter
• Goal was to use standard encapsulation/decapsulation
protocols (i.e., VXLAN) for overlay networks
• We have taken a kernel-based (and ARP-inspired)
approach to assure scale
• Complements SDC’s existing in-kernel, API-managed
firewall facilities
• All done in the open: on the dev-overlay branch of
SmartOS (illumos-joyent) and as sdc-portolan
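For readers unfamiliar with VXLAN, the encapsulation step amounts to prefixing the tenant's Ethernet frame with an 8-byte header that carries a 24-bit network identifier (VNI), as laid out in RFC 7348. The sketch below shows only that header; the outer UDP/IP headers are omitted, the function names are hypothetical, and it does not reflect how the SmartOS kernel or sdc-portolan implement it.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag from RFC 7348


def vxlan_header(vni):
    """Return the 8-byte VXLAN header for a 24-bit network identifier."""
    if not 0 <= vni < (1 << 24):
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags in the top byte, followed by 24 reserved bits.
    # Word 2: VNI in the top 24 bits, followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def encapsulate(vni, inner_frame):
    """Prefix an inner Ethernet frame with a VXLAN header."""
    return vxlan_header(vni) + inner_frame


def decapsulate(packet):
    """Split a VXLAN payload into (vni, inner_frame)."""
    flags_word, vni_word = struct.unpack("!II", packet[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag not set")
    return vni_word >> 8, packet[8:]
```

What the header does not say is which physical host a given tenant address lives on; answering that lookup at scale is the part the slide describes as ARP-inspired.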
Putting it all together: sdc-docker
• Our Docker engine for SDC, sdc-docker, implements
the endpoints for the Docker Remote API
• Work is young (started in earnest in early fall 2014), but
because it takes advantage of a proven orchestration
substrate, progress has been very quick…
• We will be deploying it into early access production in
the Joyent Public Cloud in Q1CY15
• It’s open source: http://github.com/joyent/sdc-docker;
you can install SDC (either on hardware or on VMware)
and check it out for yourself!
• A demo is worth a thousand slides...
Future of containers in production
• For nearly a decade, we at Joyent have believed that
OS-virtualized containers are the future of computing
• While the efficiency gains are tremendous, they alone
have not been enough to propel containers into the
mainstream
• We believe that the developer ease of Docker combined
with the proven production substrate of SmartOS and
SmartDataCenter yields the best of all worlds
• The future of containers is one without compromise:
developer efficiency, operational elasticity, multi-tenant
security and on-the-metal performance!
Thank you!
• Jerry Jelinek, @pfmooney, @jmclulow and @jperkin for
their work on LX branded zones
• @joshwilsdon, @trentmick, @cachafla and @orlandov
for their work on sdc-docker
• @rmustacc, @wayfaringrob, @fredfkuo and @notmatt
for their work on SDC overlay networking
• The countless engineers who have worked on or with
illumos because they believed in OS-based virtualization

Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 

Docker and the Future of Containers in Production

  • 1. Docker and the Future of Containers in Production • Bryan Cantrill (@bcantrill), CTO • bryan@joyent.com
  • 2. Prehistory: Virtualization as cloud catalyst • In the 1960s — shortly after the dawn of computing! — pundits foresaw a compute utility that would be public and multi-tenant • The vision was four decades too early: it took the internet + commodity computing + virtualization to yield cloud computing • Virtualization is the essential ingredient for multi-tenant operation — but where in the stack to virtualize? • Choices around virtualization capture tensions between elasticity, tenancy, and performance • tl;dr: Virtualization choices drive economic tradeoffs
  • 3. Hardware-level virtualization? • The historical answer — since the 1960s — has been to virtualize at the level of the hardware: • A virtual machine is presented upon which each tenant runs an operating system of their choosing • There are as many operating systems as tenants • The singular advantage of hardware virtualization: it can run entire legacy stacks unmodified • However, hardware virtualization exacts a heavy price: operating systems are not designed to share resources like DRAM, CPU, I/O devices or the network • Hardware virtualization limits tenancy, elasticity and performance
  • 4. Platform-level virtualization? • Virtualizing at the application platform layer addresses the tenancy challenges of hardware virtualization • Added advantage of a much more nimble (& developer-friendly!) abstraction… • ...but at the cost of dictating abstraction to the developer • This creates the “Google App Engine problem”: developers are in a straitjacket where toy programs are easy — but sophisticated apps are impossible • Virtualizing at the application platform layer poses many other challenges with respect to security, containment and scalability
  • 5. OS-level virtualization! • Virtualizing at the OS level hits the sweet spot: • Single OS (i.e., single kernel) allows for efficient use of hardware resources, maximizing tenancy and performance • Disjoint instances are securely compartmentalized by the operating system • Gives users what appears to be a virtual machine (albeit a very fast one) on which to run higher-level software • The ease of a PaaS with the generality of IaaS • Model was pioneered by FreeBSD jails and taken to their logical extreme by Solaris zones — and then aped by Linux containers
  • 6. OS-level virtualization in the cloud • Joyent runs OS containers in the cloud via SmartOS (our illumos derivative) — and we have run containers in multi-tenant production since ~2006 • Core SmartOS facilities are container-aware and optimized: Zones, ZFS, DTrace, Crossbow, SMF, etc. • SmartOS also supports hardware-level virtualization — but we have long advocated OS-level virtualization for new build out • We emphasized their operational characteristics (performance, elasticity, tenancy), and for many years we were a lone voice...
  • 7. Containers as PaaS foundation? • Some saw the power of OS containers to facilitate up-stack platform-as-a-service abstractions • For example, dotCloud — a platform-as-a-service provider — built their PaaS on OS containers • Hearing that many were interested in their container orchestration layer (but not their PaaS), dotCloud open sourced their container-based orchestration layer...
  • 9. Docker revolution • Docker has used the rapid provisioning + shared underlying filesystem of containers to allow developers to think operationally • Developers can encode dependencies and deployment practices into an image • Images can be layered, allowing for swift development • Images can be quickly deployed — and re-deployed • Docker will do to apt what apt did to tar
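The layering idea on this slide can be illustrated with a toy model. The sketch below is only an illustration under stated assumptions: the layer contents and the `flatten` helper are hypothetical, and real Docker storage drivers work on filesystems rather than dictionaries. It shows only the core rule, that a higher layer shadows the files of the layers beneath it.

```python
from collections import ChainMap

# Hypothetical layers: each maps a file path to its contents.
base = {"/bin/sh": "shell", "/etc/os-release": "ubuntu 14.04"}
deps = {"/usr/lib/libssl.so": "openssl"}
app = {"/app/server": "v2", "/etc/os-release": "ubuntu 14.04 (patched)"}


def flatten(layers):
    """Merge layers given bottom-to-top; upper layers shadow lower ones."""
    # ChainMap looks keys up left to right, so the topmost layer goes first.
    return dict(ChainMap(*reversed(layers)))


image = flatten([base, deps, app])
```

Because `base` and `deps` are untouched by the merge, a second image could reuse them and add only its own top layer, which is what makes rebuilding and redeploying quick.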
  • 10. Docker’s challenges • The Docker model is the future of containers • Docker’s challenges are largely around production deployment: security, network virtualization, persistence • Security concerns are real enough that for multi-tenancy, OS containers are currently running in hardware VMs (!!) • With SmartOS, we have spent a decade addressing these concerns — and are proven in production… • Could we combine the best of both worlds? • Could we somehow deploy Docker containers as SmartOS zones?
  • 11. Docker + SmartOS: Linux binaries? • First (obvious) problem: while it has been designed to be cross-platform, Docker is Linux-centric • While Docker could be ported, the encyclopedia of Docker images will likely forever remain Linux binaries • SmartOS is Unix — but it isn’t Linux… • Could we somehow natively emulate Linux — and run Linux binaries directly on the SmartOS kernel?
  • 12. OS emulation: An old idea • Operating systems have long employed system call emulation to allow binaries from one operating system to run on another on the same instruction set architecture • Combines the binary footprint of the emulated system with the operational advantages of the emulating system • Sun first did this with SunOS 4.x binaries on Solaris 2.x • In the mid-2000s, Sun developed zone-based OS emulation for Solaris: branded zones • Several brands were developed — notably including an LX brand that allowed for Linux emulation
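System call emulation comes down to a dispatch table: the foreign binary issues a system call by the foreign system's number, and the host maps that number to its own implementation. The sketch below is a user-space toy and not the LX brand's in-kernel mechanism. The `emulate` function and the table name are hypothetical; the two numbers are the Linux x86-64 values for `write` (1) and `getpid` (39).

```python
import errno
import os


def native_write(fd, data):
    """Host implementation backing the emulated write(2)."""
    return os.write(fd, data)


def native_getpid():
    """Host implementation backing the emulated getpid(2)."""
    return os.getpid()


# Foreign (Linux x86-64) system call number -> host handler.
LINUX_X86_64_SYSCALLS = {
    1: native_write,
    39: native_getpid,
}


def emulate(sysno, *args):
    """Dispatch a foreign system call, or fail the way a kernel would."""
    handler = LINUX_X86_64_SYSCALLS.get(sysno)
    if handler is None:
        # Unimplemented calls return -ENOSYS rather than crashing the caller.
        return -errno.ENOSYS
    return handler(*args)
```

The hard part of a real emulation layer is not the table but the semantics behind each entry (signals, /proc, ioctls), which is where the slides say most of the effort went.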
  • 13. LX-branded zones: Life and death • The LX-branded zone worked for RHEL 3 (!): glibc 2.3.2 + Linux 2.4 • Remarkable amount of work was done to handle device pathing, signal handling, /proc — and arcana like TTY ioctls, ptrace, etc. • Worked for a surprising number of binaries! • But support was only for 2.4 kernels and only for 32-bit; 2.6 + 64-bit appeared daunting… • Support was ripped out of the system on June 11, 2010 • Fortunately, this was after the system was open sourced in June 2005 — and the source was out there...
  • 14. LX-branded zones: Resurrection! • In January 2014, David Mackay, an illumos community member, announced that he was able to resurrect the LX brand — and that it appeared to work! “Linked below is a webrev which restores LX branded zones support to Illumos: http://cr.illumos.org/~webrev/DavidJX8P/lx-zones-restoration/ I have been running OpenIndiana, using it daily on my workstation for over a month with the above webrev applied to the illumos-gate and built by myself. It would definitely raise interest in Illumos. Indeed, I have seen many people who are extremely interested in LX zones. The LX zones code is minimally invasive on Illumos itself, and is mostly segregated out. I hope you find this of interest.”
  • 15. LX-branded zones: Revival • Encouraged that the LX-branded work was salvageable, Joyent engineer Jerry Jelinek reintegrated the LX brand into SmartOS on March 20, 2014... • ...and started the (substantial) work to modernize it • Guiding principles for LX-branded zone work: • Do it all in the open • Do it all on SmartOS master (illumos-joyent) • Add base illumos facilities wherever possible • Aim to upstream to illumos when we’re done
  • 16. LX-branded zones: Progress • Working assiduously over the course of 2014, progress was difficult but steady: • Ubuntu 10.04 booted in April • Ubuntu 12.04 booted in May • Ubuntu 14.04 booted in July • 64-bit Ubuntu 14.04 booted in October (!) • Going into 2015, it was becoming increasingly difficult to find Linux software that didn’t work...
  • 18. ...and, um, well received
  • 19. Docker + SmartOS: Provisioning? • With the binary problem being tackled, focus turned to the mechanics of integrating Docker with the SmartOS facilities for provisioning • Provisioning a SmartOS zone operates via the global zone that represents the control plane of the machine • docker is a single binary that functions as both client and server — and with too much surface area to run in the global zone, especially for a public cloud • docker has also embedded Go- and Linux-isms that we did not want in the global zone; we needed to find a different approach...
  • 20. Docker Remote API • While docker is a single binary that can run on the client or the server, it does not run in both at once… • docker (the client) communicates with docker (the server) via the Docker Remote API • The Docker Remote API is expressive, modern and robust (i.e. versioned), allowing for docker to communicate with Docker backends that aren’t docker • The clear approach was therefore to implement a Docker Remote API endpoint for SmartDataCenter
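To make "Docker backends that aren't docker" concrete, here is a minimal sketch of a request handler that answers two Remote API routes, `GET /_ping` and `GET /containers/json`, with or without a version prefix such as `/v1.16`. It is an assumption-laden toy: the `handle` function, the `CONTAINERS` inventory, the trimmed response fields and the version number in the usage are illustrative, and nothing here is taken from sdc-docker.

```python
import json
import re

# Optional version prefix, e.g. "/v1.16/containers/json".
VERSIONED = re.compile(r"^/v(\d+\.\d+)(/.*)$")

# Hypothetical inventory standing in for whatever the backend really manages.
CONTAINERS = [{"Id": "abc123", "Image": "ubuntu:14.04", "Status": "Up 2 minutes"}]


def handle(method, path):
    """Return (status, body) for a Docker-style HTTP request."""
    match = VERSIONED.match(path)
    route = match.group(2) if match else path
    route = route.split("?", 1)[0]  # this sketch ignores query parameters
    if method == "GET" and route == "/_ping":
        return 200, "OK"
    if method == "GET" and route == "/containers/json":
        return 200, json.dumps(CONTAINERS)
    return 404, ""
```

A client only sees HTTP status codes and JSON, so whether the containers behind `handle` are Linux cgroups on one host or zones spread across a data center is invisible to it.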
  • 21. Aside: SmartDataCenter • Orchestration software for SmartOS-based clouds • Unlike other cloud stacks, not designed to run arbitrary hypervisors, sell legacy hardware or get 160 companies to agree on something • SmartDataCenter is designed to leverage the SmartOS differentiators: ZFS, DTrace and (esp.) zones • Runs both the Joyent Public Cloud and business-critical on-premises clouds at well-known brands • Born proprietary — but made entirely open source on November 6, 2014: http://github.com/joyent/sdc
  • 22. SmartDataCenter: Architecture [Diagram; labels: Head-node running the SDC 7 core services (Booter, AMQP broker, Binder, Public API, Customer portal, Operator portal, Firewall; DHCP/TFTP, AMQP, DNS, Public HTTP) • Compute node, tens/hundreds per head-node, with AMQP agents (Provisioner, Instrumenter, Heartbeater), a ZFS-based multi-tenant filesystem, and virtual NICs attached to Virtual SmartOS (OS virt.), Linux Guest (HW virt.), Windows Guest (HW virt.) and Virtual OS or Machine • SmartOS kernel (network booted) • SmartOS kernel (flash booted)]
  • 23. SmartDataCenter: Core Services [Diagram of the SDC7 core services on the head-node; labels: Analytics aggregator • Key/Value Service (Moray) • Firewall API (FWAPI) • Virtual Machine API (VMAPI) • Directory Service (UFDS) • Designation API (DAPI) • Workflow API • Network API (NAPI) • Compute-Node API (CNAPI) • Image API • Alerts & Monitoring (Amon) • Packaging API (PAPI) • Service API (SAPI) • Booter (DHCP/TFTP) • AMQP broker (AMQP) • Binder (DNS) • Public API and Customer portal (Public HTTP) • Operator portal • Operator Services • Manta • Other DCs] Note: Service interdependencies not shown for readability. Other core services may be provisioned on compute nodes.
  • 24. SmartDataCenter + Docker • Implementing an SDC-wide endpoint for the Docker remote API allows us to build in terms of our established core services: UFDS, CNAPI, VMAPI, Image API, etc. • Has the welcome side-effect of virtualizing the notion of Docker host machine: Docker containers can be placed anywhere within the data center • From a developer perspective, one less thing to manage • From an operations perspective, allows for a flexible layer of management and control: Docker API endpoints become a potential administrative nexus • As such, virtualizing the Docker host is somewhat analogous to the way ZFS virtualized the filesystem...
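Once the Docker host is virtualized, something has to decide which physical machine each container lands on. The sketch below shows one simple policy (choose the compute node with the most free memory) purely as an illustration. The `place` function, the node names and the policy are assumptions, not a description of how SDC's designation service actually chooses.

```python
def place(container_mb, free_mb_by_node):
    """Pick a compute node for a container and reserve its memory.

    free_mb_by_node maps a node name to free memory in MB and is updated
    in place. Hypothetical policy: the node with the most free memory wins.
    """
    candidates = {
        node: free for node, free in free_mb_by_node.items()
        if free >= container_mb
    }
    if not candidates:
        raise RuntimeError("no compute node has enough free memory")
    chosen = max(candidates, key=candidates.get)
    free_mb_by_node[chosen] -= container_mb
    return chosen


nodes = {"cn1": 1024, "cn2": 4096}
first = place(2048, nodes)   # lands on cn2, leaving it 2048 MB free
second = place(1024, nodes)  # cn2 still has the most free memory
```

The caller never names a host, which is the developer-facing simplification the slide describes; the operator is free to change the policy behind the endpoint.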
  • 25. SmartDataCenter + Docker: Challenges • Some Docker constructs have (implicitly) encoded co-locality of Docker containers on a physical machine • Some of these constructs (e.g., --volumes-from) we will discourage but accommodate by co-scheduling • Others (e.g., host directory-based volumes) we are implementing via NFS backed by Manta, our (open source!) distributed object storage service • Moving forward, we are working with Docker to help assure that the Docker Remote API doesn’t create new implicit dependencies on physical locality
  • 26. SmartDataCenter + Docker: Networking • Parallel to our SmartOS and Docker work, we have been working on next-generation software-defined networking for SmartOS and SmartDataCenter • Goal was to use standard encapsulation/decapsulation protocols (i.e., VXLAN) for overlay networks • We have taken a kernel-based (and ARP-inspired) approach to assure scale • Complements SDC’s existing in-kernel, API-managed firewall facilities • All done in the open: on the dev-overlay branch of SmartOS (illumos-joyent) and as sdc-portolan
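For readers unfamiliar with VXLAN, the encapsulation itself is small: an 8-byte header carrying a 24-bit virtual network identifier (VNI) is prepended to the tenant's Ethernet frame, and the result travels in an outer UDP datagram. The sketch below packs and unpacks only that 8-byte header as laid out in RFC 7348. The function names are hypothetical, the outer UDP/IP layers are omitted, and it says nothing about how the SmartOS kernel implementation is structured.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag: the VNI field is valid


def vxlan_header(vni):
    """Build the 8-byte VXLAN header for a 24-bit VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 0: flags (8 bits) + reserved (24 bits).
    # Word 1: VNI (24 bits) + reserved (8 bits).
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def encapsulate(vni, inner_frame):
    """Prefix an inner Ethernet frame with a VXLAN header."""
    return vxlan_header(vni) + inner_frame


def decapsulate(packet):
    """Split a VXLAN payload into (vni, inner_frame)."""
    word0, word1 = struct.unpack("!II", packet[:8])
    if not (word0 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag not set")
    return word1 >> 8, packet[8:]
```

The 24-bit VNI is what allows each tenant's overlay network to be kept apart on shared physical links; mapping an inner destination to the right remote host is the lookup problem that the ARP-inspired approach on the slide addresses.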
  • 27. Putting it all together: sdc-docker • Our Docker engine for SDC, sdc-docker, implements the endpoints for the Docker Remote API • Work is young (started in earnest in early fall 2014), but because it takes advantage of a proven orchestration substrate, progress has been very quick… • We will be deploying it into early access production in the Joyent Public Cloud in Q1 CY15 • It’s open source: http://github.com/joyent/sdc-docker; you can install SDC (either on hardware or on VMware) and check it out for yourself! • A demo is worth a thousand slides...
  • 28. Future of containers in production • For nearly a decade, we at Joyent have believed that OS-virtualized containers are the future of computing • While the efficiency gains are tremendous, they have not alone been enough to propel containers into the mainstream • We believe that the developer ease of Docker combined with the proven production substrate of SmartOS and SmartDataCenter yields the best of all worlds • The future of containers is one without compromise: developer efficiency, operational elasticity, multi-tenant security and on-the-metal performance!
  • 29. Thank you! • Jerry Jelinek, @pfmooney, @jmclulow and @jperkin for their work on LX branded zones • @joshwilsdon, @trentmick, @cachafla and @orlandov for their work on sdc-docker • @rmustacc, @wayfaringrob, @fredfkuo and @notmatt for their work on SDC overlay networking • The countless engineers who have worked on or with illumos because they believed in OS-based virtualization