This presentation discussed persistent storage options for Docker containers using VMware technologies:
- VMware has released a vSphere driver for Flocker that lets containers and their data volumes move from one vSphere VM to another while the data stays on VMware vSphere storage such as Virtual SAN.
- VMware's vSphere Integrated Containers and Photon Platform provide unified hybrid platforms for running containers with persistence and leveraging existing VMware infrastructure investments.
- Future developments may include a distributed file system for cloud-native applications to provide high availability, scalability and access control across multiple backend storage clusters.
2. Disclaimer
• This presentation may contain product features that are currently under development.
• This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.
• Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
• Technical feasibility and market demand will affect final delivery.
• Pricing and packaging for any new technologies or features discussed or presented have not been determined.
CONFIDENTIAL 2
3. Agenda
1 Intro to Cloud-Native Applications
2 Containers 101
3 Persistent Data in Containers
4 VMware Solutions for Cloud-Native Storage
5 Looking Towards the Future
4. What are Cloud-Native Applications?
• Developer access via APIs
• Microservices, not monolithic stacks
• Continuous integration and deployment
• App-defined availability
• Built for scale
• Decoupled from infrastructure
[Diagram: these traits surround the application]
5. Linux Containers
[Diagram: stack of Hardware, OS Kernel, OS File system, and Userspace; in userspace, two containers each group several app processes]
• App-level Isolation
– Isolation at individual kernel subsystem level (e.g. filesystem, CPU resources, etc.)
– User-level process (LXC, libcontainer) orchestrates these subsystems to create a container
• Existed for Many Years
– Solaris Zones, FreeBSD Jails, etc.
• Why?
– Process isolation
– Reproducible environment
– Enables management at scale
6. Docker is a “Shipping Container” for Code
For developers:
• Frictionless deployment and maximum portability
• A natural fit for 3rd Platform, 12 factor, microservices
• It makes DevOps much, much easier
On developer laptop:
~# docker build -t my_app .
~# docker push my_app
Then on server:
~# docker pull my_app
~# docker run my_app
That’s it!!
7. Containers Are Stateless “Cattle”
Source: “CERN Data Centre Evolution”
http://www.slideshare.net/gmccance/cern-data-centre-evolution
8. But…What about Your Data?
• If you start a new container, you might lose all the data from the old one!
• “Stateful” data needs to be accessed and protected separately
• Original model: Persist stateful data to non-containerized managed storage
• But, problems arise:
– No local control over storage management
– Latency/access issues
– Issues at scale
9. Containerized Storage Apps are Rapidly Increasing
Thousands of DB apps
Millions of downloads
10. Bring in the Container Data Volumes!
• Usage
– Contains persistent data for local containers
– Appears as directory within host file system (e.g. “docker run -v /mount/mydata/”)
– Can store on external storage and mount/unmount from a host
• Benefits
– Manage and preserve your stateful data
– Utilize storage platform data services
Volumes open up new possibilities for containerized applications!
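The difference between a container's own file system and a data volume can be illustrated with a small toy model. This is only a sketch and does not call Docker: `ToyContainer`, its methods, and `demo` are hypothetical names invented for illustration, and temporary directories stand in for the container layer and the host-side volume.

```python
import shutil
import tempfile
from pathlib import Path


class ToyContainer:
    """Toy stand-in for a container: a private scratch directory that is
    deleted on removal, plus optional host directories mounted as volumes."""

    def __init__(self, volumes=None):
        # Private writable layer: lives and dies with the container.
        self.rootfs = Path(tempfile.mkdtemp(prefix="toy-rootfs-"))
        # Mapping of in-container mount point -> host directory (the "volume").
        self.volumes = {m: Path(h) for m, h in (volumes or {}).items()}

    def _resolve(self, path):
        # Paths under a mount point go to the host volume, all others to rootfs.
        for mount, host in self.volumes.items():
            if path == mount or path.startswith(mount.rstrip("/") + "/"):
                return host / path[len(mount):].lstrip("/")
        return self.rootfs / path.lstrip("/")

    def write(self, path, text):
        target = self._resolve(path)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text)

    def read(self, path):
        target = self._resolve(path)
        return target.read_text() if target.exists() else None

    def remove(self):
        # Removing the container discards its private layer only.
        shutil.rmtree(self.rootfs)


def demo():
    host_volume = tempfile.mkdtemp(prefix="toy-volume-")
    old = ToyContainer(volumes={"/data": host_volume})
    old.write("/tmp/scratch.txt", "in the container layer")
    old.write("/data/db.txt", "in the volume")
    old.remove()

    # A replacement container sees the volume but not the old scratch file.
    new = ToyContainer(volumes={"/data": host_volume})
    result = (new.read("/tmp/scratch.txt"), new.read("/data/db.txt"))
    new.remove()
    shutil.rmtree(host_volume)
    return result
```

In this model, `demo()` returns `(None, "in the volume")`: the file written to the container's own layer is gone after replacement, while the file written under the mounted directory survives.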
[Diagram: Host with two Containers and their Data Volumes, backed by a Storage Platform]
11. Container Storage Use Cases
• Unshared Volumes
– Use Case: Running container-based SQL or NoSQL DB
• Shared Volumes
– Use Case: Sharing a set of tools or content across app instances
• Persist to External Storage
– Use Case: Object store for retention/archival, DBaaS for config/transactions
[Diagrams: hosts running containers (C) attached to a storage platform or, through APIs, to cloud storage]
12. Containerized Storage in Practice
• Application composed of a series of loosely coupled microservices
– Scheduled by an application orchestrator (e.g. Kubernetes, Mesos)
– Microservices coordinate via REST API
• Each microservice consists of multiple stateless and stateful containers
– API frontend container
– Database engine container
– Actual data is persisted to a container data volume (supported by external storage)
[Diagram: an application orchestrator above several microservices, each with an API front-end, a database engine, and a data volume]
19. ClusterHQ Flocker Roadmap
• Integrate “upwards” to Orchestration Frameworks
– Docker Swarm and Compose via Flocker Docker Plugin
– Mesosphere/Marathon
– Kubernetes (coming later), CoreOS
• Integrate “downwards” to storage vendors
– vSphere driver enables awesome coverage
– Growing ClusterHQ partner network
– Also support OpenStack Cinder
– Integrate with public cloud storage backends
• Add depth of storage capabilities
– Make data portable: enable migration of data volumes between clouds, different stages of software development lifecycle
– Enable snapshotting, cloning, backup/restore, HA, DR…
Get in touch! clusterhq.com/contact @clusterhq
20. vSphere Driver for Flocker Details
• Run containerized stateful apps on your current vSphere deployment using open-source Flocker software
• Move containers + data volumes from one ESX VM to another
• Compatible with ALL vSphere storage (VSAN, VVOL, VMFS, NFS)
• Straightforward install/configure/deploy process
• Free!
• Available at https://github.com/vmware/vsphere-flocker-driver
21. Native Docker on vSphere vs. vSphere + Flocker
[Diagram: two panels, each moving a DB app container from ESX VM1 to ESX VM2]
• Native Docker on vSphere: When container moves, data volume stays on host VMDK. Database starts on new VM without any of its data.
• vSphere + Flocker: Data Volume stored on separate VMDK. When container moves, VMDK moves with it. Database keeps its data!
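The two outcomes on this slide can be sketched as a toy model. This is not the Flocker API or the vSphere driver: `VM`, `restart_without_volume`, and `move_with_volume` are hypothetical names, and a dictionary of lists stands in for VMDKs and their contents.

```python
from dataclasses import dataclass, field


@dataclass
class VM:
    name: str
    # Disks attached to this VM: disk name -> records stored on that disk.
    disks: dict = field(default_factory=dict)


def restart_without_volume(src: VM, dst: VM, disk: str) -> list:
    """Container restarts on dst; the data stays behind on src's disk."""
    dst.disks.setdefault(disk, [])  # fresh, empty volume on the new VM
    return dst.disks[disk]


def move_with_volume(src: VM, dst: VM, disk: str) -> list:
    """Detach the separate data disk from src and attach it to dst."""
    dst.disks[disk] = src.disks.pop(disk)
    return dst.disks[disk]
```

With `vm1 = VM("ESX VM1", {"data": ["row-1"]})` and an empty `vm2`, the first function hands the restarted database an empty list while `row-1` remains on `vm1`; the second hands it `["row-1"]` and leaves nothing behind on `vm1`.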
23. VMware Virtual SAN 6.1
Radically Simple Hyperconverged Storage for VMs
Overview
• Software-defined storage optimized for VMs
• Hypervisor-converged architecture
• Runs on any standard x86 server
• Pools HDD/SSD into a shared datastore
• Delivers enterprise-level scalability and performance
• Managed through per-VM storage policies
• Deeply integrated with the VMware stack
[Diagram: vSphere + Virtual SAN hosts, each contributing SSDs and hard disks to a single Virtual SAN datastore]
24. But Don’t Take Our Word for It…
“ClusterHQ and VMware supported our evaluation of Flocker with vSphere and Virtual SAN, providing us building blocks for container persistence in our Docker-based PaaS.”
26. Unified Hybrid Platform
vSphere Integrated Containers
• Give developers the flexibility, portability and speed containers deliver
• Leverage existing investments in VMware infrastructure, people, processes and management tools
• DRS, vMotion, HA/DR
• Storage and Network Integration (VSAN and NSX)
• Apply same isolation, data persistence, networking, management and robust service levels you have today
• No rebuilding or re-architecture required
• Full compatibility with broad ecosystem of existing tools
[Diagram: vSphere Integrated Containers (Instant Clone, Project Bonneville, Photon OS) on vSphere, alongside NSX, VSAN, and vRealize]
27. Cloud-Native Platform
VMware Photon Platform
• Photon Controller (host controller & scheduler): high-scale distributed control plane, includes Lightwave
• Photon Machine (compute host): lightweight hypervisor, based on ESX; includes Photon OS
• Deep integration with modern, open source frameworks & app platforms
28. Looking Towards the Future of Storage
• Distributed infrastructure
– Global management
– Automation friendly
• Scalable troubleshooting
– Decentralized analytics
– Information gathering, processing, prediction
• IT-friendly GUI, scripting
– Infrastructure operations
• Dev-friendly APIs, CLI
– Application integration
[Diagram: physical servers, infrastructures, and pools of resources; magnetic and flash devices pooled into a Virtual SAN datastore beneath a storage abstraction; a distributed storage platform providing storage infrastructure management, distributed monitoring, and UI/APIs]
29. A Distributed File System for Cloud-Native Apps
• Hyper-converged scale-out file system
• Relies on Object Storage backend
– Hardware management
– Resource provisioning, discovery
– Distributed parallel data path
– High availability
• Backend: VSAN, others…
– Can span multiple backend “clusters”
• Design Requirements
– POSIX file system
– Cloud scale: files, clients, clones
– Per file / directory access control
– O(1) snapshot / clone creation
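The slide does not say how constant-time snapshots and clones would be achieved. One common technique is layered copy-on-write, sketched below as an assumption rather than a description of the actual design; `Layer` and `CowFileSystem` are hypothetical names, and deletions are left out to keep the sketch short.

```python
class Layer:
    """One layer of files; points at the older layer it was built on."""

    def __init__(self, parent=None):
        self.files = {}
        self.parent = parent


class CowFileSystem:
    """Toy copy-on-write store: reads fall through a chain of layers."""

    def __init__(self, base=None):
        # Writable top layer, private to this instance.
        self._top = Layer(parent=base)

    def write(self, path, data):
        self._top.files[path] = data

    def read(self, path):
        layer = self._top
        while layer is not None:
            if path in layer.files:
                return layer.files[path]
            layer = layer.parent
        raise FileNotFoundError(path)

    def clone(self):
        """Constant-time clone: no file data is copied.

        The current top layer is frozen and shared; the original and the
        clone each get a fresh, empty top layer above it.
        """
        frozen = self._top
        self._top = Layer(parent=frozen)
        return CowFileSystem(base=frozen)
```

A snapshot is then simply a clone that nobody writes to. The trade-off in this sketch is that reads may walk several layers, so clone creation stays cheap at the cost of lookup time as the chain grows.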
[Diagram: workloads such as VDI files and NoSQL on top of the distributed file system, which sits on VSAN (object) stores under a distributed storage management service]
31. What’s Now?
• vSphere driver for Flocker 1.0 available now for free at: https://github.com/vmware/vsphere-flocker-driver
What’s Next?
• Extending functionality for VMware-based container data volumes
• Storage integration with vSphere Integrated Containers
• Storage integration with Photon Platform
Follow along on Twitter: @theVSaraswat
Customer Question: What type of storage applications and databases do you see becoming more popular among your cloud-native developers and DevOps teams?
What do I mean by the future…
Showcased at VMworld 2015 as a Technology Preview, VMware vSphere Integrated Containers will enable customers to run any application, including containerized applications, in production, either on-premises with VMware vSphere or in the public cloud with VMware vCloud Air. The new offering will allow IT operations teams to apply to containers the same secure isolation, data persistence, networking, management and robust service levels that they have with virtual machines today. VMware vSphere Integrated Containers will boost customer container initiatives by taking advantage of their existing investments in VMware infrastructure, people, processes and management tools while giving developers the flexibility, portability and speed containers deliver. VMware’s new offering is open and integrated with CoreOS, Docker, Kubernetes, Mesosphere Datacenter Operating System, and Pivotal CF.
By bringing together all the necessary capabilities in a single offering to run enterprise-class containers in production, VMware will help customers to accelerate time to value, reduce risk and assure long-term viability. Additionally, customers will have tremendous choice and flexibility over the hardware, operating systems and third-party software to use in support of their containers initiatives.
With VMware vSphere Integrated Containers, VMware vSphere can run any application, including containerized applications. Some customers looking to run containerized applications at scale need a container-optimized solution built for high churn of workloads and featuring an API-first model.
Demonstrated at VMworld 2015 as a Technology Preview, VMware Photon Platform is designed for customers planning to build out large greenfield pools of computing capacity that solely run cloud-native applications as well as dynamic continuous integration environments, sizable data analytics clusters running Hadoop or Spark, Software as a Service (SaaS) or Platform as a Service (PaaS) deployments.
The VMware Photon Platform includes:
VMware Photon Controller – a multi-tenant, API-driven controller optimized for scale, churn and high-availability. Automation-savvy DevOps teams will be able to speed the creation of thousands of new containers per minute, and support hundreds of thousands of total simultaneous workloads. The controller will be released as an open source project to help encourage broad input, testing and adoption from customers, partners, and the community at large. The controller will also incorporate Project Lightwave which provides enterprise-grade trust and security for containers.
VMware Photon Machine – a combination of Project Photon OS, a lightweight Linux operating system for containerized applications and optimized for VMware vSphere and VMware vCloud Air, and the ESX Microvisor – a core compute hypervisor – based on the proven core of VMware ESXi featuring “just the right level of functionality” to run containerized applications at cloud-scale.
What are those requirements…
1) Storage infrastructure management.
Reference ESX Cloud (Photon Controller) and the use cases it has been architected for. Show some impressive (hopefully) material from the ESX Cloud launch, including specific examples of use cases.
We are talking about infrastructures of thousands of hosts where one can run hundreds of thousands of VMs or a million containers. The infrastructure includes both compute and storage resources. How do we manage such massive infrastructures in a scalable yet effective way? The key ideas are:
- Provide a global view of the infrastructure configuration and health
- Allow for fast and effective “zoom in” to any problem areas and issues
- Do all the above through sophisticated data collection and analysis
- However, such collection and analysis cannot be done in a centralized way (as is the case with traditional virtual environments). It has to happen in a distributed way, wherein status data (logs, traces, performance statistics, etc.) is extracted on every host, then analyzed and stored locally. Depending on what the user is looking for, that data can be aggregated and synthesized using distributed protocols, and only the final result is presented to users through a “bird’s-eye view”.
- Also, all this work should be possible not only through a UI for admins but also programmatically, so that it can be part of automated infrastructure management services and even of the application logic that runs on such infrastructure.
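The "analyze locally, aggregate only results" idea can be sketched as follows. This is a minimal single-process illustration, not the actual management service: `Summary`, `summarize_locally`, and `global_view` are hypothetical names, and the metric (latency samples) is an assumed example. In a real deployment each digest would be computed on its own host and merged through a distributed protocol.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Summary:
    """Fixed-size digest of one host's latency samples."""
    count: int
    total: float
    worst: float

    def merge(self, other: "Summary") -> "Summary":
        # Merging is associative, so digests can be combined in any order
        # or grouping (for example up a tree of hosts).
        return Summary(self.count + other.count,
                       self.total + other.total,
                       max(self.worst, other.worst))

    @property
    def mean(self) -> float:
        return self.total / self.count if self.count else 0.0


def summarize_locally(samples: list) -> Summary:
    """Runs on each host; raw samples never leave the machine.

    Assumes non-negative samples, so 0.0 is a safe 'worst' for no data.
    """
    return Summary(len(samples), sum(samples), max(samples, default=0.0))


def global_view(per_host_samples: dict) -> Summary:
    """Combine per-host digests into the bird's-eye view."""
    digests = [summarize_locally(s) for s in per_host_samples.values()]
    return reduce(Summary.merge, digests, Summary(0, 0.0, 0.0))
```

Because only the fixed-size `Summary` objects are merged, the amount of data combined grows with the number of hosts rather than with the number of raw samples, which is the property the notes above ask for.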
Let’s look at a distributed file system that we are building and experimenting with. A system designed with the above requirements in mind.
First of all, it is still an HCI architecture, with the DFS software running on every compute host in your infrastructure.
The DFS scales to tens of thousands of hosts and millions of clients (such as VMs and containers).
It relies on an object store like VSAN and the corresponding distributed storage management service I talked about a few minutes ago, to perform:
Hardware and overall infrastructure management
Discovery of resources such as clusters of storage. In fact, a single DFS instance may span many VSAN clusters even across geographic locations.
It uses objects on VSAN to store both data and metadata for the FS.
Availability, endurance and performance of those objects are ensured by VSAN. This greatly simplifies the design and scalability of the FS.
The DFS utilizes the highly parallel IO path of VSAN to implement extremely scalable data and metadata operations – practically unlimited scale as long as you add physical resources and objects to the backend.
The use cases of such a DFS are very broad, ranging from file services (home directories and file shares) for VDI deployments, without having to invest in special-purpose file services and appliances, to container deployments and NoSQL services, to big data and analytics.
There are four key requirements…
Let’s see a quick demo of an experimental DFS that we are kicking the tires on in the lab. We will use a container use case with a hypothetical new cloud-native application that Rawlinson seems to be very excited about.