Chip Childers is the VP of Apache CloudStack and Principal Engineer at SunGard Availability Services.
Apache CloudStack is open source software that can deploy and manage large networks of virtual machines as a scalable IaaS cloud platform. It is a top-level project at the Apache Software Foundation.
CloudStack enables cloud operators to design, install, support, upgrade and scale diverse cloud environments. It also allows application owners to easily consume infrastructure services so that infrastructure does not get in the way of delivering applications to end users.
3. What’s Apache CloudStack?
Apache CloudStack is open source software designed to deploy and manage large networks of virtual machines, as a highly available, highly scalable Infrastructure as a Service (IaaS) cloud computing platform.
CloudStack is a Top-Level Project at the Apache Software Foundation.
4. We Enable
Cloud Operators
To design, install, support, upgrade and scale their diverse cloud environments
Application Owners
To easily consume infrastructure services, so that infrastructure gets out of the way of delivering applications to the end users
5. So They Can Enable
The Application Users
Your end users want access to their applications, all the time from anywhere.
They couldn’t care less about the environment supporting the apps they use…
They care about the business results they achieve using these applications.
6. Benefits of CloudStack
Capital Leverage, Workforce Leverage
Self Service: Remove IT as a service delivery critical path
Management Automation: Reduce IT operational costs
Workload Standardization: Consistent application and service deployment
Usage Metering: Visibility into user and line of business usage
Centralized Management: Manage complete infrastructure, regardless of scale
Smarter Virtualization: Drive reduced capital requirements
8. Why do we care about the users?
We are the users
Builds the next generation of developers
Drives project sustainability
Improves quality
9. Users Driving the Project leads to
Strong support for both traditional and cloud-era workloads
Flexible deployment options and infrastructure choice
Real-world experiences with scale
Upgrades that work
New technology integrations by and for the operators
Testing of our APIs from diverse consumer tools
11. Layer 3 Networking (EC2 Style)
[Diagram: Web VMs grouped into a Web Security Group and DB VMs grouped into a DB Security Group]
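The security-group model in this slide can be sketched as a default-deny check between group members. This is only an illustration, not CloudStack code: the class names, the VM names and the port number 3306 are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class IngressRule:
    """Allow traffic to `port` from members of `source_group` (hypothetical model)."""
    port: int
    source_group: str


@dataclass
class SecurityGroup:
    name: str
    rules: list = field(default_factory=list)
    members: set = field(default_factory=set)


def is_allowed(groups, src_vm, dst_vm, port):
    """Return True if any group containing dst_vm admits src_vm on port."""
    for group in groups.values():
        if dst_vm not in group.members:
            continue
        for rule in group.rules:
            source = groups.get(rule.source_group)
            if rule.port == port and source is not None and src_vm in source.members:
                return True
    return False  # nothing matched: default deny


# Assumed layout: web VMs may reach DB VMs on port 3306, nothing else is open.
groups = {
    "web": SecurityGroup("web", members={"web-vm-1", "web-vm-2"}),
    "db": SecurityGroup("db", rules=[IngressRule(3306, "web")], members={"db-vm-1"}),
}

print(is_allowed(groups, "web-vm-1", "db-vm-1", 3306))  # web to db on the open port
print(is_allowed(groups, "db-vm-1", "web-vm-1", 3306))  # db to web has no rule
```

Filtering here is keyed on group membership rather than on a per-tenant network, which is what makes the layer 3 model different from the guest layer-2 networks on the next slide.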
12. Guest Virtual Layer-2 Network
[Diagram: two isolated guest networks, each with its own virtual router, connected to the Internet]
Guest 1 Virtual Network: 10.1.1.0/24, gateway 10.1.1.1, Guest 1 Virtual Router; Guest 1 VM 1, VM 2 and VM 3 (guest addresses 10.1.1.2, 10.1.1.3, 10.1.1.4)
Guest 2 Virtual Network: 10.1.1.0/24, gateway 10.1.1.1, Guest 2 Virtual Router; Guest 2 VM 1, VM 2 and VM 3 (guest addresses 10.1.1.2, 10.1.1.3, 10.1.1.4)
Public IPs shown: 65.37.141.24, 65.37.141.80; 65.37.141.11, 65.37.141.36
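Both guest networks in this slide use the same private range, which only works because each one is an isolated layer-2 network. A small sketch with Python's standard `ipaddress` module makes the point; the variable names are illustrative, and the addresses are the ones shown in the diagram.

```python
import ipaddress

# Each guest gets its own isolated network, so the same CIDR can be reused.
guest1 = ipaddress.ip_network("10.1.1.0/24")
guest2 = ipaddress.ip_network("10.1.1.0/24")

gateway = ipaddress.ip_address("10.1.1.1")
guest_ips = [ipaddress.ip_address(f"10.1.1.{n}") for n in (2, 3, 4)]
public_ip = ipaddress.ip_address("65.37.141.24")

# The two ranges overlap completely, so they could not share one L2 segment.
print("ranges overlap:", guest1.overlaps(guest2))

# Gateway and guest addresses all fall inside the guest range.
print("gateway in range:", gateway in guest1)
print("guests in range:", all(ip in guest1 for ip in guest_ips))

# The public address lives outside the guest range.
print("public IP in guest range:", public_ip in guest1)
```

Because the ranges are identical, traffic leaving either network has to be distinguished by something other than its private address, which is the role the per-guest public IPs play in the diagram.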
13. Multi-tier Network
[Diagram: three-tier guest network with one VLAN-backed virtual network per tier]
Web tier: Virtual Network 10.1.1.0/24, VLAN 100; Web VM 1 to Web VM 4 (addresses shown: 10.1.1.1, 10.1.1.3, 10.1.1.4, 10.1.1.5)
App tier: Virtual Network 10.1.2.0/24, VLAN 1001; App VM 1 and App VM 2 (addresses shown: 10.1.2.18, 10.1.2.21, 10.1.2.24, 10.1.2.31, 10.1.2.38, 10.1.2.39)
DB tier: Virtual Network 10.1.3.0/24, VLAN 141; DB VM 1 (addresses shown: 10.1.3.21, 10.1.3.24, 10.1.3.45)
Edge devices: Netscaler Load Balancer, Juniper SRX Firewall
Private IPs shown: 10.1.1.111, 10.1.1.112
Public IPs shown: 65.37.141.111, 65.37.141.112, 65.37.141.115
Virtual Routers (three shown): two provide DHCP, DNS and User-data; one provides DHCP, DNS, User-data, Source-NAT and VPN
14. Unified Multi-tier Network
[Diagram: a single virtual router serving all three tiers, connected to the customer premises over the Internet]
Web tier: Virtual Network 10.1.1.0/24, VLAN 100; Web VM 1 to Web VM 4 (addresses shown: 10.1.1.1, 10.1.1.3, 10.1.1.4, 10.1.1.5)
App tier: Virtual Network 10.1.2.0/24, VLAN 1001; App VM 1 and App VM 2 (addresses shown: 10.1.2.31, 10.1.2.24)
DB tier: Virtual Network 10.1.3.0/24, VLAN 141; DB VM 1 (address shown: 10.1.3.24)
Also shown: Load Balancer, Monitoring VLAN
Virtual Router to Customer Premises: IPSec or SSL site-to-site VPN over the Internet
Virtual Router Services
• IPAM
• DNS
• LB [intra]
• S-2-S VPN
• Static Routes
• ACLs
• NAT, PF
• FW [ingress & egress]
• BGP
16. Infrastructure Model
Management Server Farm: Management and provisioning tasks
Zone: Collection of pods, network offerings and secondary storage
Pod: Collection of clusters in the same failure boundary
Cluster: A grouping of hosts and their associated storage
Hosts: Servers onto which services will be provisioned
Primary Storage: VM disk storage
Network: Logical network associated with service offerings
Secondary Storage: Template, snapshot and ISO storage
[Diagram: a Zone containing CloudStack Pods and Secondary Storage; each Pod contains Clusters; a Cluster contains Hosts running VMs, a Network and Primary Storage]
17. Deployment Architecture
Hypervisor is the basic unit of scale.
Cluster consists of one or more hosts of the same hypervisor.
All hosts in a cluster have access to shared (primary) storage.
Pod is one or more clusters, usually with L2 switches.
Availability Zone has one or more pods and has access to secondary storage.
One or more zones represent a cloud.
[Diagram: Management Server Cluster reachable from the Internet; Zone 1 with L3, Secondary Storage and Pod 1 … Pod N; each Pod with L2 and Cluster 1 … Cluster N; each Cluster with Host 1, Host 2 and Primary Storage]
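The containment rules on this slide (zone, pod, cluster, host) can be sketched as a small data model. This is an illustration only and not CloudStack's own schema: the class names, field names and host names are assumptions, and the only rule enforced is the one stated above, that a cluster holds one or more hosts of the same hypervisor.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Host:
    name: str
    hypervisor: str


@dataclass
class Cluster:
    name: str
    hosts: List[Host] = field(default_factory=list)

    def hypervisor(self) -> str:
        """Return the cluster's hypervisor, rejecting empty or mixed clusters."""
        kinds = {h.hypervisor for h in self.hosts}
        if len(kinds) != 1:
            raise ValueError(
                f"cluster {self.name} needs one or more hosts of a single "
                f"hypervisor, found {sorted(kinds)}"
            )
        return kinds.pop()


@dataclass
class Pod:
    name: str
    clusters: List[Cluster] = field(default_factory=list)


@dataclass
class Zone:
    name: str
    pods: List[Pod] = field(default_factory=list)

    def host_count(self) -> int:
        return sum(len(c.hosts) for p in self.pods for c in p.clusters)


# Hypothetical zone: one pod, two clusters, each with its own hypervisor type.
zone = Zone("zone-1", pods=[
    Pod("pod-1", clusters=[
        Cluster("cluster-1", hosts=[Host("host-1", "KVM"), Host("host-2", "KVM")]),
        Cluster("cluster-2", hosts=[Host("host-3", "XenServer")]),
    ]),
])

print(zone.host_count())
print([c.hypervisor() for p in zone.pods for c in p.clusters])
```

Different hypervisors can coexist in one pod or zone in this model; the homogeneity constraint applies per cluster only.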
18. Management Server Cluster
MS is stateless. MS can be deployed as a physical server or a VM.
A single MS node can manage up to 10K hosts. Multiple nodes can be deployed for scale or redundancy.
RHEL 5.4+, Ubuntu 10.04, Fedora 16
[Diagram: User API and Admin API requests pass through a Load Balancer to multiple Management Servers, which manage the Infrastructure Resources; the MySQL database has a Replica maintained by MySQL Replication]
19. Software Architecture
[Diagram: Management Server software architecture]
Clients: UI, Cloud Portal, CLI, Other Clients
REST API: OAM&P API, End User API, EC2 API, Other APIs; Pluggable Service API Engine
ACL & Authentication: Accounts, Domains, and Projects; ACL, limits checking
Security Adapters, Account Management, Connectors
Services API: Console Proxy Management, Template Access, HA, Usage Calculations, Additional Services
Orchestration Engine: drives long running VM operations; syncs between resources managed and DB; generates events
Management Server components: Resource Management, Cluster Management, Job Management, Deployment Planning, Network Gurus, Network Elements, Hypervisor Gurus, Database Access, Alert & Event Management, DB
Plugin API, Resource API
Resources: Hypervisor, Network, Storage, Image, Snapshot
Event Bus, Message Bus, Usage Server
20. Got Scale?
Running in production at >30,000 physical hosts, supported by only 4 management server instances
Even greater scale by using CloudStack in a Regional model
Remember: The separation of control, management and data planes is critical for cloud platform scale.
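As a rough sanity check, the production figure can be set against the limit of up to 10K hosts per management server node quoted on the Management Server Cluster slide. The sketch below treats ">30,000" as exactly 30,000 and assumes hosts are spread evenly, neither of which the slides state.

```python
import math

hosts = 30_000            # lower bound taken from ">30,000 physical hosts"
nodes = 4                 # management server instances in that deployment
per_node_limit = 10_000   # "up to 10K hosts" per management server node

average_per_node = hosts / nodes
minimum_nodes = math.ceil(hosts / per_node_limit)

print(average_per_node)                      # 7500.0
print(minimum_nodes)                         # 3
print(average_per_node <= per_node_limit)    # True
```

Under these assumptions four nodes leave one node of headroom above the minimum, consistent with deploying multiple nodes for redundancy as well as scale.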
21. Upcoming Releases
4.1.0 - currently being voted on for release
4.2.0 - entering feature freeze this week
22. What’s in 4.1?
Lots of internal architectural changes (we want to speed up development of new integrations)
20 new features
24 “improvements”
155 bug fixes
23. Architectural Changes
Converted from custom injection framework to Spring
Lots of refactoring:
Storage plugin model
Network plugin model
API implementation refactored (remains compatible)
There’s a theme here: We are making CloudStack more flexible, both for developers and operators…
24. New Features in 4.1
API, UI and Integration Options:
AWS style regions
Event pub-sub framework (RabbitMQ implementation)
Advanced search within the UI
API Server request throttling
API Discoverer Service
Users resetting their own passwords
Users directly changing their API keys
EC2 query API
Cloudmonkey CLI
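To illustrate what API request throttling means in general, here is a minimal fixed-window limiter keyed by account. It is a generic sketch, not CloudStack's implementation; the class name, parameters and account names are hypothetical, and the clock is passed in explicitly so the behaviour is easy to follow.

```python
class FixedWindowThrottle:
    """Allow at most `max_calls` per account within each `window_seconds` window."""

    def __init__(self, max_calls, window_seconds):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._windows = {}  # account -> (window_start, calls_in_window)

    def allow(self, account, now):
        start, count = self._windows.get(account, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # window expired: start a fresh one
        if count >= self.max_calls:
            self._windows[account] = (start, count)
            return False  # over the limit, reject the call
        self._windows[account] = (start, count + 1)
        return True


throttle = FixedWindowThrottle(max_calls=2, window_seconds=1.0)
results = [
    throttle.allow("alice", 0.0),  # first call in the window
    throttle.allow("alice", 0.1),  # second call, still within the limit
    throttle.allow("alice", 0.2),  # third call in the same window is rejected
    throttle.allow("bob", 0.2),    # other accounts have their own counters
    throttle.allow("alice", 1.0),  # a new window has started for alice
]
print(results)
```

A fixed window is the simplest policy to reason about; a sliding window or token bucket would smooth out bursts at window boundaries at the cost of a little more state.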
25. New Features in 4.1
Networking:
Nicira integration adds L3 functionality
Persistent networks without a running VM
Autoscale (Netscaler implementation)
Juniper SRX & F5 Big IP inline mode
Egress FW rules for guest networks
Open vSwitch support for KVM
26. New Features in 4.1
Compute:
Support for passing custom VMX settings to vSphere on VM create
Adding and removing Virtual NICs from VMs
Reset SSH key within a VM
Storage:
Volume resizing
S3-backed secondary storage
27. Great, That’s 4.1…
What about 4.2?
Many features proposed for 4.1 were moved to 4.2
(Remember that time-based release thing?)
58 New Features scheduled for 4.2!
16 of them are already finished, including testing
Some of them won’t make the release (again, time-based!)
40 Improvements (same caveats as features)
28. Sample 4.2 Features
Midonet and BigSwitch SDN integrations
Enhanced Baremetal provisioning
VM I/O Throttling
Hyper-V 2012 Support
LXC Support
Cisco VSG integration
Cisco ASA 1000V
VM Affinity Rules
Eliminate NFS layer for S3 secondary storage
Zone-wide primary storage
Security group isolation in Advanced Network zones
Dedicating resources to domains and accounts
IP Address reservation w/o a vNIC allocation
Improved synchronization between CloudStack and what’s actually running on the hosts
vSphere DVS support
UI Plugin framework
29. And we’re just getting started…
Come join us at
http://cloudstack.apache.org
How many of you are using IaaS today? How many of those are using public cloud? Anyone using anything other than Amazon? Anyone work for a service provider? Any interesting use cases that folks want to talk about?
Cloud Operators (Small Teams, Enterprise IT Department, Public Cloud Operators)
Care about the underlying architecture
Owe their users an SLA
The goal is to support their users in just the right ways
Cloud Users (Application Developers, IT Operations, DevOps Teams)
Differing workload styles
Care about speed and flexibility
Focused on supporting the real end users…
The value of any infrastructure should be tied to the value that the applications running on it provide to the end users. Our project’s job is to make the operator happy, in service of the application owners, in service of the end users. Lots of hype surrounds the tech in this space, but that hype frequently does a disservice to the IT community by distracting from the end goal.
Empower users to “serve themselves”, removing IT from the critical path of service delivery
Automate previously labour-intensive tasks, helping to reduce IT operation costs and deliver faster
Reduce complexity and variability by using standard workloads, which ensures consistency with each application and service deployment
Retain visibility into resource allocation and line of business usage on a real-time level
Increase the server/admin ratio and deliver the benefits of scale, even if deployed globally
Don’t consider these numbers absolute in terms of installed clouds; instead, focus on the correlation between the development and user community growth.
Users in this context are primarily the operators.
This level of correlation is a strong indicator of project longevity.
Dual Workload Support means:
Choice of Hypervisor (or even Bare Metal servers)
Choice in networking models
Choice in storage types
Choice in Availability Levels
Strong ACL models to protect important applications from the “power” of automation to turn it off