1. 3 years of OpenStack with Intel IT
Das Kamhout – Principal Engineer, Cloud Architect @dkamhout
Greg Bunce – Automation and Integration Lead
Sridhar Mahankali – Cloud Architect
2. 6,500 IT Employees
59 IT sites globally
150,000 Connected Systems
40,000 Handheld Devices
100,000 Intel Employees
164 Intel Sites across 63 Countries
68 Data Centers
25% reduction with virtualization
inspire employees
IT is business
changing traditional thinking
service reliability
Intel Confidential
3. Intel Data Center Profile
Intel has five major groups currently driving individual data center requirements (DOMES):
• D, Design (Design Computing): supports the chip design community
• O, Office (General Purpose): supports typical IT and customer services
• M, Manufacturing: supports fabrication and assembly
• E, Enterprise: enterprise applications supporting eBusiness and ERP
• S, Services: external-facing applications
4. Cloud: Experience from our Design Computing Grid
IT Learnings
• Abstracted the hardware
• Abstracted the location
• Service management
• Service provisioning
1 Source: Intel IT internal analysis. Savings from DCV expected to deliver net present value over 8 years. Intel IT white paper: “Intel IT Data Center Solutions: Strategies to Improve Efficiency” http://communities.intel.com/docs/DOC-4220
2 Source: Intel IT white paper on projected Intel net present value. “Realizing Data Center Savings with an Accelerated Server Refresh Strategy” http://communities.intel.com/docs/DOC-3489
[Chart: Data Center Virtualization; Offloading Design Workloads to Virtual Linux* Clusters]
Periods: 2006-1H, 2006, 2007, 2008
Data labels: 0.99M, 1.18M, 1.85M, 2.31M; 59%, 63%, 70%, 78%
Dollar labels: $0M, $25M, $33M
80% Utilization and an estimated $200M Value
5. Hosting Business Goals
Increase Velocity, Zero Downtime, Grow with Flat Budget
Velocity
<1hr for VMs
Reduce Incidents
Scheduled Downtimes the norm
Sustain Operations
Velocity
Idea to Production in <1 day
Zero Downtime
“Always On”
for Apps/Services
Grow with Flat Budget
Increase in Engineer:Server and TB
Ratio
6. Server Landing Process (before Q4 2010)
[Flowchart; boxes listed in extraction order]
Swimlanes: Requestor/Customer; Account Manager (AM); MAS Technologist; DISIHS; ADS; Procurement; DC Operations
Entry point: RADAR @ http://hosting.intel.com (AM Updates Customer of Status)
Steps:
• Customer enters Request in RADAR
• Reassess Requirements
• Enter KCDB escalation info
• Request Backup setup for VM if needed
• Server Request Fulfilled
• Assign to Site DC Ops Representative
• Request Network Addresses (Primary/Backup)
• Install OS Using Altiris
• Configure Backup NIC on all VMs
• Post Build Verification
• Enroll VM in ISD Care Patching (Sat 8-2)
• Install Heartbeat Monitoring for the VM
• Grant User Permissions
• Close IPRO Ticket
• Notify AM
• Pick-up Approved Dedicated Server Requests
• Assign Existing or Purchase Server
• Create IPRO Requests for Dedicated Server Landing
• Pick-up VM Requests
• Check Capacity
• Validate Capacity in SHERPA
• Assign LUNs; Create cutsheet for VM
• Sherpa/CPA Forecast
• Capacity Mgmt Worksheet
• Analyze Further and Design Solution Location (Customer, AM, Technologist Involvement)
• Validate Configuration in SHERPA
• Create Engagement Agreement (EA)
• Notify Customer of Server Availability
• Address Server Issues
• Implement EA
• Decommission Request (IPRO)
• Review all Requests in HUM meeting. Assign server requests (Virtual & Dedicated) to SERVER AM
• Gather App and Server Requirements at Discovery Meeting with Customer; discuss EA/Costs
• Enter forecast in SHERPA/CPS Forecasting Tool
• Pickup/Create IPRO and/or Cutsheet Landing Request
Decision points (Yes/No):
• Got Capacity?
• Physical or Virtual?
• Solution Possible?
• Standard Request?
• Customer Accepts
• Related Decommission
2009: 90 days physical, 24 days virtual
2010-11: <3hrs virtual, 2 weeks for networks
2012-2013: <30 minutes for compute, storage and network
Next up: Idea to Production Service in < 1 day
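To put the lead times quoted above side by side, here is a small worked calculation. It is only a sketch: the dictionary keys and the `speedup` helper are illustrative names, not from the slides, and because the later figures are quoted as "<" bounds, the ratios are lower bounds rather than measured values.

```python
# Lead times from the timeline, converted to hours.
# The "<3hrs" and "<30 minutes" figures are upper bounds, so every
# ratio computed from them is a lower bound on the real improvement.
HOURS_PER_DAY = 24

lead_time_hours = {
    "2009 physical": 90 * HOURS_PER_DAY,  # 90 days
    "2009 virtual": 24 * HOURS_PER_DAY,   # 24 days
    "2010-11 virtual": 3,                 # <3 hours
    "2012-2013 all": 0.5,                 # <30 minutes
}


def speedup(before: str, after: str) -> float:
    """Return how many times shorter the `after` lead time is than `before`."""
    return lead_time_hours[before] / lead_time_hours[after]


if __name__ == "__main__":
    for before, after in [
        ("2009 virtual", "2010-11 virtual"),
        ("2009 virtual", "2012-2013 all"),
        ("2009 physical", "2012-2013 all"),
    ]:
        print(f"{before} -> {after}: at least {speedup(before, after):.0f}x faster")
```

Under these assumptions, virtual provisioning improved by at least 192x between 2009 and 2010-11, and by at least 1152x by 2012-2013.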
8. Intel IT Cloud Strategic Direction
Deliver the necessary changes in
how we expose applications/data
to improve end user productivity
Drive the transformation to a
large-scale automated
Hybrid Cloud infrastructure
Accelerate the transformation of
the Enterprise IT industry to
Cloud
10. Intel IT Cloud Quick History
Design Grid since 1990s
60k servers across 60+
datacenters
Cloud’s Uncle
Enterprise Private Cloud
2010
13k VMs across 10
datacenters
75% of Enterprise
Server Requests
80% virtualized
Open Source Private
Cloud 2012
1.5k VMs across 2
datacenters
Running cloud-aware
and some traditional
apps
11. Vision: Federated, Interoperable, Open Hybrid Cloud
• Federated: IT manages one set of identities,
authorizations, and set of security review
processes. Users get seamless integration with
systems and apps.
• Interoperable: Standard service orchestration
and management, enabling elastic operation and
flexibility, while minimizing lock-in.
• Open: Includes open source and open
standards. Common APIs and abstraction layers
to rapidly consume cloud services among
providers.
App components will run across public and private clouds
Public Cloud
Service Provider
Public Cloud
Service Provider
Enterprise
Private Cloud
IaaS, PaaS, and/or SaaS
12. Enterprise Adoption Roadmap - Path to Open Cloud Ecosystem
Year 1 Year 2 Year 3 Year 4 Year 5
End
User
App
Dev
App
Owner
IT Ops
Federated,
Interoperable,
and Open Cloud
Simple SaaS
Enterprise
Legacy Apps
Compute,
Storage, and
Network
Simple
Compute
IaaS
Simple SaaS
Enterprise
Legacy Apps
Cloud Aware
Apps
Complex
Compute
IaaS
Simple
Compute
IaaS
Compute,
Storage, and
Network
Complex
SaaS
Hybrid SaaS
Full Private
IaaS
Hybrid IaaS
Cloud Aware
Apps
Legacy Apps
Private PaaS Hybrid PaaS
Cloud Aware
Apps
Legacy Apps
Consumers
Start: Legacy Applications on dedicated Infrastructure
13. Intel IT Pre-OpenStack – Private Cloud Gen 1
Year 1 Year 2
End
User
App
Dev
App
Owner
IT Ops
Enterprise
Legacy Apps
Compute,
Storage, and
Network
Simple
Compute
IaaS
Enterprise
Legacy Apps
Cloud Aware
Apps
Complex
Compute
IaaS
Simple
Compute
IaaS
Compute,
Storage, and
Network
Consumers
Start: Legacy Applications on dedicated Infrastructure
• Provides Self-Service to App Teams
• Connect to ALL available infrastructure
• $14M savings through resource pooling
• Internal code for logic/GUI
But…
• Cloud-aware app teams needed more
• Too much technical debt to create full
IaaS
14. Intel IT Post-OpenStack Private Cloud Gen2
Year 1 Year 2 Year 3
End
User
App
Dev
App
Owner
IT Ops
Enterprise
Legacy Apps
Compute,
Storage, and
Network
Simple
Compute
IaaS
Enterprise
Legacy Apps
Cloud Aware
Apps
Complex
Compute
IaaS
Simple
Compute
IaaS
Compute,
Storage, and
Network
Full Private
IaaS
Cloud Aware
Apps
Legacy Apps
Private PaaS
Consumers
Start: Legacy Applications on dedicated Infrastructure
• 2011: investigated all open and proprietary solutions
• Analysis led to decision: OpenStack for Private IaaS
• June 2012: online for production cloud-aware apps
But…
• Need a public cloud solution
• Legacy apps need love too
15. Intel IT OpenStack – Hybrid Cloud and the future
Year 3 Year 4 Year 5
End
User
App
Dev
App
Owner
IT Ops
Federated,
Interoperable,
and Open Cloud
Full Private
IaaS
Hybrid IaaS
Cloud Aware
Apps
Legacy Apps
Private PaaS Private PaaS
Cloud Aware
Apps
Legacy Apps
Consumers
Start: Legacy Applications on dedicated Infrastructure
• Live Migration Enabled
• Single Control Plane
• 2 POCs for Hybrid
OpenStack in progress
Very close to our year 5 goal
17. Key Concepts
• Abstract users from underlying Cloud providers while exposing key HW features
• Support multiple cloud providers, both private and public
• Common identity and entitlement services for reuse across interfaces
• Open Source first, minimize proprietary API lock-in
• Minimize internal technical debt, be part of the community to scale
• Stay pragmatic, as we expand – not always 100% greenfield
• Support cloud-aware and traditional apps
18. Technical Strategy (AS IS)
IaaS
Public Clouds
Internal Network Exclave
App Owner/
Developer
PaaS & DBaaS
IaaS
• Started in 2010
• Use our own capacity before paying an external
provider
• Intel IT at Service Provider size
• Use public cloud for specific purpose (SaaS, some IaaS)
PaaS & DBaaS
On Premise
Firewall
19. Technical Strategy (TO BE)
Public Clouds
Internal Network Exclave
IaaS
Smart orchestration layer
• Move apps/data among clouds via policies
• Deliver security, capacity and cost optimization
Orchestration
Burst
Firewall
On Premise
App Owner/
Developer
PaaS & DBaaS
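The policy idea on these two slides (use our own capacity first, burst to a public cloud when needed, and keep security constraints in force) can be sketched in a few lines. This is a minimal illustration, not Intel IT's orchestration layer: the `Cloud` and `Workload` classes, their fields, and the `place` function are all hypothetical names, and vCPUs stand in for capacity in general.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cloud:
    name: str
    is_private: bool
    free_vcpus: int
    cost_per_vcpu_hour: float


@dataclass
class Workload:
    name: str
    vcpus: int
    must_stay_on_premise: bool  # hypothetical security policy flag


def place(workload: Workload, clouds: List[Cloud]) -> Optional[str]:
    """Pick a cloud for the workload, or return None if no cloud qualifies.

    Policy order: capacity check, security filter, then prefer private
    capacity and break ties on cost (the "burst" case).
    """
    candidates = [c for c in clouds if c.free_vcpus >= workload.vcpus]
    if workload.must_stay_on_premise:
        candidates = [c for c in candidates if c.is_private]
    if not candidates:
        return None
    # Private clouds sort first (False < True), then the cheapest wins.
    best = min(candidates, key=lambda c: (not c.is_private, c.cost_per_vcpu_hour))
    best.free_vcpus -= workload.vcpus  # reserve the capacity
    return best.name


if __name__ == "__main__":
    clouds = [
        Cloud("private", True, 4, 0.00),
        Cloud("public-a", False, 100, 0.05),
        Cloud("public-b", False, 100, 0.03),
    ]
    print(place(Workload("web", 4, False), clouds))    # fills private capacity
    print(place(Workload("batch", 2, False), clouds))  # bursts to cheapest public
    print(place(Workload("hr-db", 2, True), clouds))   # no private room left
```

A real orchestration layer would also weigh data location and migration cost; the sketch only shows how ordering the policy checks produces the "own capacity first, then burst" behavior.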
20. Why Intel IT Selected OpenStack for its IaaS Control Plane
• Velocity:
− Yields direct control over the capabilities that business demands and is forward-leaning in terms of application / service development, delivery, and operations
− Geared toward Agile Methodologies, DevOps, and Continuous Integration / Continuous Delivery (CI/CD) & Deployment
• Capability:
− OpenStack is an automation platform defined by its APIs
− Provides granular on-demand services which seed innovation by satisfying simple-to-complex use cases, delivering at the pace business demands
• Efficiency & Quality:
− We leverage the same tool-chain used by the OpenStack community for developing, building, validating, and deploying our data center operating system
21. OpenStack Control Plane
[Architecture diagram]
Interfaces: API, GUI, CLI; Self-Serve and Admin UI
Control plane services: Nova, Cinder, Swift, Heat, Neutron, Ceilometer, Keystone Auth
Integrations: Active Directory, Service Management
Managed Infrastructure: Open Source HW/SW Stack (KVM, Ceph)
Existing Infrastructure: Hypervisor A, SDN, SN/NAS
Footprint: 10 Internal Data Centers, 2 External Data Centers; all VMs controlled by OpenStack
Phase 2014:
1. OpenStack Control Plane manages Mixed Infrastructure
2. Absorbing all existing VM Lifecycle management
22. Areas to Close for Enterprise
Keep VMs up for traditional/legacy apps:
1. Shared Block Storage – for boot volumes, and data
2. Live Migration for maintenance of hosts – working in some implementations
3. Restart of instances when host fails
4. Disaster Recovery
5. Connect to Infrastructure where this already works
Enable a federated Hybrid cloud environment:
1. End users interface allowing for seamless use across zones, regions, and across clouds
2. Identity federated across instances and clouds
3. Orchestration across global/multiple instances
Highly Available Infrastructure Services (cloud built as cloud)
Rolling Upgrades – initial improvements in Icehouse
Secure, Auditable – Role Based Access, Regulatory Compliance, Audit Trails
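Item 3 in the first list above, restarting instances when a host fails, comes down to a rescheduling decision. The sketch below is an illustration under stated assumptions, not OpenStack's own logic: `restart_on_failure` and its data shapes are hypothetical, capacity is reduced to free vCPUs, and it assumes boot volumes sit on shared block storage (item 1) so an instance can start on another host.

```python
from typing import Dict, List, Tuple


def restart_on_failure(
    failed_host: str,
    free_vcpus: Dict[str, int],
    instances: Dict[str, Tuple[str, int]],
) -> Tuple[Dict[str, str], List[str]]:
    """Reassign instances from a failed host to healthy hosts with room.

    `free_vcpus` maps host name to spare vCPUs; `instances` maps instance
    name to (current host, vCPUs needed). Returns (moved, stranded), where
    `moved` maps instance to its new host and `stranded` lists instances
    that no healthy host could fit.
    """
    healthy = {h: free for h, free in free_vcpus.items() if h != failed_host}
    moved: Dict[str, str] = {}
    stranded: List[str] = []

    # Place the largest instances first so they are not crowded out.
    victims = sorted(
        (name for name, (host, _) in instances.items() if host == failed_host),
        key=lambda name: instances[name][1],
        reverse=True,
    )
    for name in victims:
        need = instances[name][1]
        # Pick the healthy host with the most spare capacity.
        target = max(healthy, key=healthy.get, default=None)
        if target is None or healthy[target] < need:
            stranded.append(name)
            continue
        healthy[target] -= need
        moved[name] = target
    return moved, stranded


if __name__ == "__main__":
    free = {"h1": 0, "h2": 4, "h3": 2}
    vms = {"a": ("h1", 2), "b": ("h1", 4), "c": ("h2", 1), "d": ("h1", 2)}
    print(restart_on_failure("h1", free, vms))
```

The stranded list is the part that matters operationally: it shows how much spare capacity the remaining hosts need to hold in reserve for a failure to be absorbed.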
23. Items for 2013 Completion
Compute
• Always on VMs
– Boot From Volume (Block) ☑
– Live Migration ☑
– Restart on Failure ☐
• API Endpoint Encryption (SSL for all API communication) ☑
• Highly Available 99.999% APIs ☐
Storage
• Object Storage Proxy Highly Available ☑
• Harden open distributed block storage solution ☐
Networking
• Self-Service Network Services ☑
• SDN Network Integration ☑
• Load Balancer as a Service ☐ (temp internal only solution in place)
Support Enterprise and Cloud Aware Workloads
Transforming entire Datacenter to Software Exposed
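For scale, the 99.999% API availability target listed under Compute translates into a very small downtime budget. The calculation below is a sketch that assumes a 365-day year and counts all downtime, planned or not; the function name is illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for a given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999):
        print(f"{target}% -> {allowed_downtime_minutes(target):.2f} minutes per year")
```

At 99.999% the budget is roughly 5.26 minutes per year, which is why the slides pair this target with rolling upgrades and highly available infrastructure services.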
24. 2014 Focus Areas
• Rolling Upgrades – no tenant downtime for resources or services
• Connection into ALL existing infrastructure – Single Control Plane
• Disaster Recovery between sites for VM tenants
• Restart of VM when host fails
• Hybrid Cloud enabled through Horizon
• Use OpenStack to do traditional work – Backup and Recovery, Bare Metal Provisioning, LB, FW, and more
• Use OpenStack to replace internal code – DBaaS, LBaaS
27. Major Workforce Shifts
• Training
− IT Sysadmins retrained for CLI and Scripting fundamentals
− All developers put into the ops fire… take tickets, root cause, and learn hands on
− Key technologies taught broadly: OpenStack*, Linux*, Python*
• Scope
− From Technical Depth to Technical Breadth
− Sysadmins understand and can solve issues in compute, storage, network and tenant operations/tasks
− DevOps as the working model
− Small team of experts
− Automate everything vs. Knowledge Base articles
• IT shifts away from being the STOP sign bearers
Broad changes to skills and methods
29. Intel IT Open Cloud: Result
Agility Automation Efficiency
30. Are you involved?
• Join us on Wednesday at 2pm in Room B407 for the Enterprise BoF Kick-Off
• Hear more from Intel IT at 5:20pm on Wednesday in B312
• Help us create blueprints – Go Community!!!!
31. Wrap Up - Summary
• Our Direction = Federated, Interoperable and Open Cloud
− Strong success with our Enterprise Private Cloud (Gen1)
− Open Cloud (Gen2) in production
− Connecting our existing infrastructure to single control plane (OpenStack)
− Lots of space and opportunity for us all to contribute
• Changes required to run cloud at scale
− Culture
− Skills
− Business processes
− Technology