Learning about NetflixOSS
For Oct 2013 @TriangleDevops

Andrew Spyker
@aspyker
Some content from @ma4jpb
Agenda
• How did I get here?
• Netflix and Netflix OSS platform overview
• Runtime components
• Management components
• Build components
• Automated test and cleanliness components
2
About me …
• IBM STSM of Performance Architect and Strategy
• Eleven years in performance in WebSphere
– Led the App Server Performance team for years
– Small sabbatical focused on IBM XML technology
– Work in Emerging Technology Institute and CTO Office
– Starting to look at cloud service operations

• Email: aspyker@us.ibm.com
– Blog: http://ispyker.blogspot.com/
– Linkedin: http://www.linkedin.com/in/aspyker
– Twitter: http://twitter.com/aspyker
– Github: http://www.github.com/aspyker

• Triangle dad who enjoys technology as well as running, wine and poker
3
Develop or maintain a service today?
• Develop – starting
• Maintain – starting
• More on this later ….
http://www.flickr.com/photos/stevendepolo/

4
What qualifies me to talk?
• My shirt?
• Of the ~25 Cloud Prize nominees
– Personally
• Best example mash-up sample

– My IBM team
• Best portability enhancement

– More on this coming …

http://techblog.netflix.com/2013/09/netflixoss-meetup-s1e4-cloud-prize.html
5
Seriously, how did I get here?
• Plenty of experience with performance and scale on
standardized benchmarks (SPEC/TPC)
– Not representative of how to (web) scale
• Pinning, biggest monolithic DB “wins”, hand tuned for fixed size

– Out of date on modern architecture for mobile/cloud

• Created Acme Air
– http://bit.ly/acmeairblog

• Demonstrated that we could achieve (web) scale runs
– 4B+ mobile/browser requests/day
– With modern mobile and cloud best practices

6
Demo

7
What was shown?
• Peak performance and scale – You betcha!
• Operational visibility – Only during the run via
nmon collection and post-run visualization
• True operational visibility - nope
• Devops - nope
• HA and DR - nope
• Manual and automatic elastic scaling - nope
8
What next?
• Went looking for the best industry practices for
devops and high availability at web scale
– Many have documented via research papers and on
highscalability.com – Google, Twitter, Facebook, Linkedin,
etc.

• Why Netflix?
– Documented not only on their tech blog, but also
released as working OSS on github
– Also, given dependence on Amazon, they are a clear
bellwether of web scale public cloud availability
9
Steps to NetflixOSS understanding
• Recoded Acme Air application to make use of NetflixOSS
runtime components
• Worked to implement a NetflixOSS devops and high
availability setup around Acme Air (on EC2) run at previous
levels of scale and performance
• Worked to port NetflixOSS runtime and devops/high
availability servers to IBM Cloud (SoftLayer) and RightScale

• Through public collaboration with Netflix technical team
– Google groups, github and meetups
10
Why?
• To prove that an advanced cloud high availability
and devops platform wasn’t “tied” to Amazon
• To understand how we can advance IBM cloud
platforms for our customers
• To understand how we can host our IBM
public cloud services better
11
Agenda
• How did I get here?
• Netflix and Netflix OSS platform overview
• Runtime components
• Management components
• Build components
• Automated test and cleanliness components
12
My view of Netflix goals
• As a business
– Be the best streaming media provider in the world
– Make best content deals based on real data/analysis

• Technology wise
– Have the most availability possible
– Measure all things by “stream starts per unit of time”
• Any dip in that relates back to the business

– Do this at web scale
13
Standing on the shoulders of giants
• Public Cloud (Amazon)
– When adding streaming, Netflix decided they
• Shouldn’t invest in building data centers worldwide
• Had to plan for the streaming business to be very big

– Embraced cloud architecture paying only for what they need

• Open Source
– Many parts of runtime depend on open source
• Linux, Apache Tomcat, Apache Cassandra, etc.

– Realized that Amazon wasn’t enough
• Started a cloud platform on top that would
eventually be open sourced - NetflixOSS
http://en.wikipedia.org/wiki/File:Andre_in_the_late_%2780s.jpg

14
Failure
• What is failing?
– Underlying IaaS problems
• Instances, racks, availability zones, regions

– Software issues
• Operating system, servers, application code

– Surrounding services
• Other application services, DNS, user registries, etc.

• How is a component failing?
– Fails and disappears altogether
– Intermittently fails
– Works, but is responding slowly
– Works, but is causing users a poor experience
15
Overview of Amazon EC2
• Amazon launches instances into availability zones
– Instances of various sizes (compute, storage, etc.)

• Organized into regions and availability zones
– Regions independent of each other
– Regions only connected over the Internet
– Regions contain availability zones
– Availability zones are isolated from each other
– Availability zones are connected with low-latency links

• This gives a high level of resilience to outages
– Unlikely to affect multiple availability zones or regions

• Amazon requires customers to be aware of this
topology to take advantage of its benefits within
their application

[Diagram: EC2 Region (US East) and EC2 Region (US West), each containing three Availability Zones, connected over the Internet]

16
NetflixOSS
• “Technical
indigestion as a
service” - @adrianco
• netflix.github.io
• 30+ OSS projects
• Expanding every day

17
NetflixOSS – for today
• For today
– Focus on mid tier web
app and micro service
servers
– Devops servers and tools
– Skipping some just for
simplicity

• For another time
– Big data
– Data tier
– Caching

18
Agenda
• How did I get here?
• Netflix and Netflix OSS platform overview
• Runtime components
• Management components
• Build components
• Automated test and cleanliness components
19
Acme Air As A Sample

[Diagram: ELB, Web App Front End (REST services), App Service (Authentication), Data Tier]

Greatly simplified …

20
Micro-services architecture
• Decompose system into isolated services that can be developed
separately
• Why?
– They can fail independently vs. fail together monolithically
– They can be developed and released with different velocities by
different teams

• To show this we created separate “auth service” for Acme Air
• In a typical customer facing application any single front end
invocation could spawn 20-30 calls to services and data sources

21
How do services advertise themselves?
• Upon web app startup, Karyon server is started
– Karyon will configure (via Archaius) the application
– Karyon will register the location of the instance with Eureka
• Others can know of the existence of the service
• Lease based so instances continue to check in updating list of available instances

– Karyon will also expose a JMX console, healthcheck URL
• Devops can change things about the service via JMX
• The system can monitor the health of the instance

[Diagram: App Service (Authentication), running Karyon on Tomcat, registers its name, port, IP address and healthcheck URL with the Eureka server(s); configuration comes from config.properties, auth-service.properties or remote Archaius stores]
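The lease-based registration described on this slide can be sketched in a few lines. This is a toy in-memory registry, not the Eureka API: the class name, method names and the lease length are assumptions made for illustration, and time is passed in explicitly so the behaviour is easy to follow.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Toy lease-based registry; NOT the Eureka API, all names are hypothetical. */
class LeaseRegistry {
    private final long leaseMillis;
    // service name -> (instance address -> time of last heartbeat)
    private final Map<String, Map<String, Long>> leases = new ConcurrentHashMap<>();

    LeaseRegistry(long leaseMillis) {
        this.leaseMillis = leaseMillis;
    }

    /** Register or renew: an instance "checks in" by calling this repeatedly. */
    void heartbeat(String service, String address, long nowMillis) {
        leases.computeIfAbsent(service, s -> new ConcurrentHashMap<>())
              .put(address, nowMillis);
    }

    /** Instances of the service whose lease has not expired at the given time. */
    List<String> lookup(String service, long nowMillis) {
        List<String> live = new ArrayList<>();
        Map<String, Long> instances = leases.getOrDefault(service, Collections.emptyMap());
        for (Map.Entry<String, Long> e : instances.entrySet()) {
            if (nowMillis - e.getValue() < leaseMillis) {
                live.add(e.getKey());
            }
        }
        Collections.sort(live);   // stable order, only to make the result predictable
        return live;
    }
}
```

An instance that stops sending heartbeats simply drops out of the lookup result once its lease runs out, which is the point of the lease model.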
22
How do consumers find services?
• Service consumers query eureka at startup and
periodically to determine location of dependencies
– Can query based on availability zone and cross
availability zone
[Diagram: Web App Front End (REST services) on Tomcat uses the Eureka client to ask the Eureka server(s): what “auth-service” instances exist?]

23
Demo

24
How does the consumer call the service?
• Protocol implementations have Eureka-aware load balancing support built in
– In-client load balancing -- does not require a separate LB tier

• Ribbon – REST client
– Pluggable load balancing scheme
– Built in failure recovery support (retry next server, mark instance as failing, etc.)

• Other Eureka-enabled clients – memcached (EVCache), Astyanax coming
(Priam and Cassandra)
[Diagram: Web App Front End (REST services) calls “auth-service” through the Ribbon REST client and Eureka client, which pick one of several App Service (Authentication) instances]
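The idea of in-client load balancing with "retry next server, mark instance as failing" can be sketched as follows. This is a minimal illustration, not the Ribbon API: the class and method names are hypothetical, round robin stands in for the pluggable scheme, and a failing instance is never un-marked here, which a real client would have to handle.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Toy client-side load balancer; NOT the Ribbon API, all names are hypothetical. */
class RoundRobinClient {
    private final List<String> servers;            // e.g. the result of a registry lookup
    private final Set<String> failing = ConcurrentHashMap.newKeySet();
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinClient(List<String> servers) {
        this.servers = servers;
    }

    /** Try each server at most once, starting at the round-robin position. */
    <T> T call(Function<String, T> request) {
        int start = next.getAndIncrement();
        RuntimeException last = null;
        for (int i = 0; i < servers.size(); i++) {
            String server = servers.get(Math.floorMod(start + i, servers.size()));
            if (failing.contains(server)) {
                continue;                           // skip instances marked as failing
            }
            try {
                return request.apply(server);
            } catch (RuntimeException e) {
                failing.add(server);                // mark instance, then retry next server
                last = e;
            }
        }
        throw new IllegalStateException("no healthy server available", last);
    }
}
```

Because the choice of server happens inside the caller, no separate load balancer tier sits between the front end and the service.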

25
How to deploy this with HA?
Instances?
• Deploy across AZs
• Using AutoScalingGroups in
EC2 managed by Asgard
– ASG manages recovery

Eureka?
• DNS and Elastic IP trickery
• Deployed across AZs
• For clients to find eureka servers
– DNS TXT record for domain lists AZ TXT
records
– AZ TXT records have list of Eureka servers
• For new eureka servers
– Look for list of eureka servers IP’s for the AZ
it’s coming up in
– Look for unassigned elastic IP’s, grab one and
assign it to itself
– Sync with other already assigned IP’s that
likely are hosting Eureka server instances
• Simpler configurations with less HA are
available
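The two-level TXT record lookup that clients use to find Eureka servers can be sketched like this. A plain map stands in for DNS, and the record names, the class name and the "own zone first" ordering are assumptions for illustration rather than Eureka's actual naming or behaviour.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Toy two-level TXT lookup; the map stands in for DNS, all names are hypothetical. */
class EurekaDnsLookup {
    private final Map<String, List<String>> txtRecords;

    EurekaDnsLookup(Map<String, List<String>> txtRecords) {
        this.txtRecords = txtRecords;
    }

    /** Level 1: the domain's TXT record lists one TXT record name per availability zone. */
    List<String> zoneRecords(String domainRecord) {
        return txtRecords.getOrDefault(domainRecord, Collections.emptyList());
    }

    /** Level 2: each AZ TXT record lists the Eureka servers in that zone. */
    List<String> serversFor(String domainRecord, String ownZoneRecord) {
        List<String> zones = zoneRecords(domainRecord);
        List<String> ordered = new ArrayList<>();
        // Assumption: servers in the caller's own zone are tried first.
        for (String zone : zones) {
            if (zone.equals(ownZoneRecord)) {
                ordered.addAll(txtRecords.getOrDefault(zone, Collections.emptyList()));
            }
        }
        for (String zone : zones) {
            if (!zone.equals(ownZoneRecord)) {
                ordered.addAll(txtRecords.getOrDefault(zone, Collections.emptyList()));
            }
        }
        return ordered;
    }
}
```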
26
Protect yourself from unhealthy services
• Wrap all calls to services with Hystrix command pattern
– Hystrix implements circuit breaker pattern
– Executes command using semaphore or separate thread
pool to guarantee return within finite time to caller
– If an unhealthy service is detected, start to call fallback
implementation (broken circuit) and periodically check if
main implementation works (reset circuit)

[Diagram: Web App Front End (REST services) calls “auth-service” through Hystrix, which either executes the auth-service call via the Ribbon REST client against the App Service (Authentication) instances or runs the fallback implementation]
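The circuit breaker behaviour described above (broken circuit, fallback, periodic reset check) can be sketched as below. This is not the Hystrix API: the class name, the consecutive-failure threshold and the retry interval are assumptions, and the thread pool or semaphore isolation that bounds the caller's wait time is left out entirely.

```java
import java.util.function.Supplier;

/** Toy circuit breaker; NOT the Hystrix API, all names and thresholds are hypothetical. */
class SimpleCircuitBreaker {
    private final int failureThreshold;    // consecutive failures before the circuit opens
    private final long retryAfterMillis;   // wait before probing the main call again
    private int consecutiveFailures = 0;
    private long openedAt = -1;            // -1 means the circuit is closed

    SimpleCircuitBreaker(int failureThreshold, long retryAfterMillis) {
        this.failureThreshold = failureThreshold;
        this.retryAfterMillis = retryAfterMillis;
    }

    synchronized <T> T execute(Supplier<T> main, Supplier<T> fallback, long nowMillis) {
        boolean open = openedAt >= 0;
        if (open && nowMillis - openedAt < retryAfterMillis) {
            return fallback.get();             // broken circuit: do not touch the service
        }
        try {
            T result = main.get();             // closed circuit, or a periodic probe
            consecutiveFailures = 0;
            openedAt = -1;                     // reset circuit
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (open || consecutiveFailures >= failureThreshold) {
                openedAt = nowMillis;          // open (or re-open) the circuit
            }
            return fallback.get();
        }
    }

    synchronized boolean isOpen() {
        return openedAt >= 0;
    }
}
```

While the circuit is open the unhealthy service receives no traffic at all, which gives it room to recover and keeps callers from queuing up behind it.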
27
Does Hystrix do more?
• Main reason for Hystrix is to
protect yourself from
dependencies, but …
• Once you have a layer of
indirection take advantage of it,
Hystrix can provide
– Caching
– Visualization
• Aggregated via Turbine

– Request collapsing

• Programming models
– Sync, Async, Reactive (RxJava)
28
Agenda
• How did I get here?
• Netflix and Netflix OSS platform overview
• Runtime components
• Management components

• Build components
• Automated test and cleanliness components
29
Ability to reconfigure - Archaius
• Using dynamic properties, can
easily change properties across
cluster of applications, either
– NetflixOSS named props
• Hystrix timeouts for example
– Custom dynamic props

• High throughput achieved by
polling approach
• HA of configuration source
dependent on what source you
use
– HTTP server, database, etc.

[Diagram: application property hierarchy; labels: Runtime, URL, JMX, Karyon Console, Persisted DB, Application Props, Libraries, Container]
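import com.netflix.config.DynamicIntProperty;
import com.netflix.config.DynamicPropertyFactory;

// DEFAULT_VALUE is an int constant defined by the application
DynamicIntProperty prop =
    DynamicPropertyFactory.getInstance().getIntProperty("myProperty", DEFAULT_VALUE);
int value = prop.get(); // value will change over time based on configuration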

30
ASGard
[Diagram: EC2 Region (US East) with three Availability Zones, each running Web App (REST Services) and App Service (Authentication) instances; Asgard tells EC2 to start these instances and keep this many instances running]

• Asgard is the missing EC2 console for AutoScalingGroup mgmt.
– EC2 only has CLI for ASG management
31
Asgard creates an “application”
• Enforces common practices for deploying code
– Common approach to linking auto scaling groups to launch configs,
ELB’s, security groups, scaling policies and AMIs

• Adds missing concept to the EC2 domain model – “application”
– Extends clustering to applications vs. AMI’s

• Example
– Application – app1
– Cluster – app1-env
– Autoscaling group version n – app1-env-v009
– Autoscaling group version n+1 – app1-env-v010
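The naming pattern in the example above (cluster name plus a three-digit version suffix) can be sketched as a small helper. This is not Asgard code: the class, the regular expression and the behaviour for unversioned names are assumptions inferred only from the example names on the slide.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy helper for the naming pattern on the slide; NOT Asgard code, names are hypothetical. */
class AsgNames {
    // <cluster>-v<three digit version>, e.g. app1-env-v009
    private static final Pattern VERSIONED = Pattern.compile("^(.+)-v(\\d{3})$");

    /** Cluster part of an auto scaling group name, e.g. app1-env-v009 -> app1-env. */
    static String clusterOf(String asgName) {
        Matcher m = VERSIONED.matcher(asgName);
        return m.matches() ? m.group(1) : asgName;
    }

    /** Name of the next group in the same cluster, e.g. app1-env-v009 -> app1-env-v010. */
    static String nextVersion(String asgName) {
        Matcher m = VERSIONED.matcher(asgName);
        if (!m.matches()) {
            return asgName + "-v000";   // assumption: unversioned groups start at v000
        }
        int next = Integer.parseInt(m.group(2)) + 1;   // rollover past v999 is not handled
        return String.format("%s-v%03d", m.group(1), next);
    }
}
```

Keeping version n and version n+1 as separate groups inside one cluster is what makes fast rollback possible: the old group still exists while the new one takes traffic.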

32
Asgard devops procedures
• Fast rollback
• Canary testing
• Red/Black pushes
• More through REST interfaces
– Adhoc processes but enforced through Asgard model

• More coming using Glisten and Amazon SWF

33
Demo

34
Augmenting the ELB tier - Zuul
• Zuul adds devops support in the front tier routing
– Stress testing (squeeze testing)
– Canary testing
– Dynamic routing
– Load Shedding
– Debugging

• And some common function
– Authentication
– Security
– Static response handling
– Multi-region resiliency (DR for ELB tier)
– Insight

[Diagram: Amazon ELB in front of several Zuul instances, each running filters, which route to the Edge Services]

• Through dynamically deployable filters (written in Groovy)
• Eureka aware using Ribbon and Archaius, as shown in the runtime section
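The filter idea can be sketched as a chain that each request passes through before it reaches the origin service. This is not the Zuul API: the class and interface names are hypothetical, requests and responses are reduced to strings, and filters are plain Java lambdas rather than Groovy scripts loaded at runtime.

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Function;

/** Toy edge filter chain; NOT the Zuul API, all names are hypothetical. */
class FilterChain {
    /** A filter may short-circuit a request by returning a response for its path. */
    interface Filter extends Function<String, Optional<String>> { }

    // Copy-on-write so filters can be deployed while requests are being routed.
    private final List<Filter> filters = new CopyOnWriteArrayList<>();

    void deploy(Filter filter) {
        filters.add(filter);
    }

    /** Run the filters in order; fall through to the origin if none answers. */
    String route(String path, Function<String, String> origin) {
        for (Filter filter : filters) {
            Optional<String> response = filter.apply(path);
            if (response.isPresent()) {
                return response.get();     // e.g. static response handling or load shedding
            }
        }
        return origin.apply(path);
    }
}
```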
35
Monitoring - Servo
• Annotation based publishing through JMX of
application metrics
• Filters, Observers, and Pollers to publish metrics
– Can export metrics to CloudWatch and other monitors

• The entire Netflix monitoring infrastructure
hasn’t been open sourced due to complexity and
priority

36
A note on the next three projects
• I haven’t personally worked with these projects
• Given the audience, I included them as I believe
they will be of interest

37
Edda
• Polls Amazon config and stores the data in a
queryable database
• Provides a searchable view of Amazon
deployments
– Searchable in ways not possible from Amazon API’s

• Provides a historical view
– For correlation of problems to changes
– Likely less of an issue in clouds that expose all changes
38
Ice
• Cloud spend and usage analytics
• Communicates with billing API to give a
bird’s-eye view of cloud spend with drill-down
to region, availability zone, and
service team through application groups
• Watches on-demand, used and unused
reserved instances and instance sizes to
help optimize
• Not point in time
– Shows trends to help predict future
optimizations
39
Denominator
• Java Library and CLI for cross DNS configuration
• Allows for common, quicker (than using various
DNS provider UI) and automated DNS updates
• Plugins have been developed by various DNS
providers

40
Agenda
• How did I get here?
• Netflix and Netflix OSS platform overview
• Runtime components
• Management components

• Build components
• Automated test and cleanliness components
41
Get baked!
• Caution: Flame/troll bait ahead!!
• Netflix takes the approach of baking images as part of build such that
– Instance boot-up doesn’t depend on outside servers
– Instance boot-up only starts servers already set to run
– New code = new instances (never update instances in place)

• Why?
– Critical when launching hundreds of servers at a time
– Goal to reduce the failure points in places where dynamic system
configuration doesn’t provide value
– Speed of elastic scaling, boot and go
– Discourages ad hoc changes to server instances

• Criticism – “Netflix is ruining the cloud”
– Overhead of AMI’s for every code version
– Ties to Amazon AMI’s (would this work for containers – I think yes)

42
AMInator
• Starting image/volume
– Foundational image created (maybe via loopback),
base AMI with common software created/tested
independently

• Aminator running – Bakery
– Bakery obtains a known EBS volume of the base
image from a pool
– Bakery mounts volume and provisions the
application (apt/deb or yum/rpm)
– Bakery snapshots and registers snapshot

• Recent work to add other provisioning such as chef
as plugins
• I have used hand built AMI’s thus far, but blog
states developers can go through CI builds and
have running test instances within 15 minutes of
code being checked in

43
Agenda
• How did I get here?
• Netflix and Netflix OSS platform overview
• Runtime components
• Management components
• Build components

• Automated test and cleanliness components
44
The Simian Army
• A bunch of automated “monkeys” that
perform automated system administration
tasks
• Anything that is done by a human more than
once can and should be automated
• Absolutely necessary at web scale
45
Good Monkeys
• Janitor Monkey
– Somewhat a mitigation for baking approach
– Will mark and sweep unused resources
(instances, volumes, snapshots, ASG’s,
launch configs, images, etc.)
– Owners are notified, then the resources are removed

• Conformity Monkey
– Check instances are conforming to rules
around security, ASG/ELB, age, status/health
check, etc.

http://www.flickr.com/photos/sonofgroucho/5852049290
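Janitor Monkey's mark, notify and sweep cycle can be sketched as follows. This is not Simian Army code: the class, the grace period and the way "unused" is passed in as a flag are assumptions for illustration, and removal is modelled by returning the ids that would be deleted.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy mark-and-sweep janitor; NOT Simian Army code, all names are hypothetical. */
class Janitor {
    private final long gracePeriodMillis;
    private final Map<String, Long> markedAt = new HashMap<>();   // resource id -> time marked
    final List<String> notifications = new ArrayList<>();

    Janitor(long gracePeriodMillis) {
        this.gracePeriodMillis = gracePeriodMillis;
    }

    /** Mark phase: flag unused resources and notify owners; unmark resources back in use. */
    void mark(Map<String, Boolean> resourceInUse, long nowMillis) {
        for (Map.Entry<String, Boolean> e : resourceInUse.entrySet()) {
            if (e.getValue()) {
                markedAt.remove(e.getKey());
            } else if (!markedAt.containsKey(e.getKey())) {
                markedAt.put(e.getKey(), nowMillis);
                notifications.add("owner notified about " + e.getKey());
            }
        }
    }

    /** Sweep phase: return (and forget) resources marked for at least the grace period. */
    List<String> sweep(long nowMillis) {
        List<String> removed = new ArrayList<>();
        markedAt.entrySet().removeIf(e -> {
            boolean expired = nowMillis - e.getValue() >= gracePeriodMillis;
            if (expired) {
                removed.add(e.getKey());
            }
            return expired;
        });
        Collections.sort(removed);
        return removed;
    }
}
```

The gap between mark and sweep is what gives owners a chance to react to the notification before anything is deleted.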

46
Back to high availability
• Failure is inevitable. Don’t try to avoid it!
• How do you know if your backup is good?
– Try to restore from your backup every so often
– Better to ensure backup works before you have a crashed
system and find out your backup is broken

• How do you know if your system is HA?
– Try to force failures every so often
– Better to force those failures during office hours
– Better to ensure HA before you have a down system and
angry users
– Best to learn from failures and add automated tests
47
Bad Monkeys
• Open Sourced – Chaos Monkey
– Originally only terminated instances at random
– Now can also block network, burn cpu, kill
processes, fail amazon api, fail dns, fail
dynamo, fail s3, introduce network
errors/latency, detach volumes, fill disk,
burn I/O
http://www.flickr.com/photos/27261720@N00/132750805

• Not yet open sourced
– Chaos Gorilla
• Kill all instances in an availability zone

– Chaos Kong
• Kill all instances in an entire region

– Latency Monkey
• Introduce latency into service calls directly
(ribbon server side)
48
Agenda
• Blah, blah, blah
• How can I learn more?
• How do I play with this?
• Let’s write some code!
49
Want to play?
• NetflixOSS blog and github
– http://techblog.netflix.com
– http://github.com/Netflix

• Acme Air, NetflixOSS AMI’s
– Try Asgard/Eureka with a real application
– http://bit.ly/aa-AMIs

• See what we ported to IBM Cloud (video)
– http://bit.ly/noss-sl-blog

• Fork and submit pull requests to Acme Air
– http://github.com/aspyker/acmeair-netflix

50

More Related Content

What's hot

Kubernetes One-Click Deployment: Hands-on Workshop (Munich)
Kubernetes One-Click Deployment: Hands-on Workshop (Munich)Kubernetes One-Click Deployment: Hands-on Workshop (Munich)
Kubernetes One-Click Deployment: Hands-on Workshop (Munich)QAware GmbH
 
Kubernetes Concepts And Architecture Powerpoint Presentation Slides
Kubernetes Concepts And Architecture Powerpoint Presentation SlidesKubernetes Concepts And Architecture Powerpoint Presentation Slides
Kubernetes Concepts And Architecture Powerpoint Presentation SlidesSlideTeam
 
DCEU 18: 5 Patterns for Success in Application Transformation
DCEU 18: 5 Patterns for Success in Application TransformationDCEU 18: 5 Patterns for Success in Application Transformation
DCEU 18: 5 Patterns for Success in Application TransformationDocker, Inc.
 
On-the-Fly Containerization of Enterprise Java & .NET Apps by Amjad Afanah
On-the-Fly Containerization of Enterprise Java & .NET Apps by Amjad AfanahOn-the-Fly Containerization of Enterprise Java & .NET Apps by Amjad Afanah
On-the-Fly Containerization of Enterprise Java & .NET Apps by Amjad AfanahDocker, Inc.
 
Microservices + Events + Docker = A Perfect Trio by Docker Captain Chris Rich...
Microservices + Events + Docker = A Perfect Trio by Docker Captain Chris Rich...Microservices + Events + Docker = A Perfect Trio by Docker Captain Chris Rich...
Microservices + Events + Docker = A Perfect Trio by Docker Captain Chris Rich...Docker, Inc.
 
Container orchestration overview
Container orchestration overviewContainer orchestration overview
Container orchestration overviewWyn B. Van Devanter
 
GCP - Continuous Integration and Delivery into Kubernetes with GitHub, Travis...
GCP - Continuous Integration and Delivery into Kubernetes with GitHub, Travis...GCP - Continuous Integration and Delivery into Kubernetes with GitHub, Travis...
GCP - Continuous Integration and Delivery into Kubernetes with GitHub, Travis...Oleg Shalygin
 
What Is Kubernetes | Kubernetes Introduction | Kubernetes Tutorial For Beginn...
What Is Kubernetes | Kubernetes Introduction | Kubernetes Tutorial For Beginn...What Is Kubernetes | Kubernetes Introduction | Kubernetes Tutorial For Beginn...
What Is Kubernetes | Kubernetes Introduction | Kubernetes Tutorial For Beginn...Edureka!
 
OpenShift Enterprise 3.1 vs kubernetes
OpenShift Enterprise 3.1 vs kubernetesOpenShift Enterprise 3.1 vs kubernetes
OpenShift Enterprise 3.1 vs kubernetesSamuel Terburg
 
Enabling Production Grade Containerized Applications through Policy Based Inf...
Enabling Production Grade Containerized Applications through Policy Based Inf...Enabling Production Grade Containerized Applications through Policy Based Inf...
Enabling Production Grade Containerized Applications through Policy Based Inf...Docker, Inc.
 
Kubernetes Networking 101
Kubernetes Networking 101Kubernetes Networking 101
Kubernetes Networking 101Kublr
 
Introduction into Docker Containers, the Oracle Platform and the Oracle (Nati...
Introduction into Docker Containers, the Oracle Platform and the Oracle (Nati...Introduction into Docker Containers, the Oracle Platform and the Oracle (Nati...
Introduction into Docker Containers, the Oracle Platform and the Oracle (Nati...Lucas Jellema
 
Container World 2017 - Characterizing and Contrasting Container Orchestrators
Container World 2017 - Characterizing and Contrasting Container OrchestratorsContainer World 2017 - Characterizing and Contrasting Container Orchestrators
Container World 2017 - Characterizing and Contrasting Container OrchestratorsLee Calcote
 
Velocity NYC 2016 - Containers @ Netflix
Velocity NYC 2016 - Containers @ NetflixVelocity NYC 2016 - Containers @ Netflix
Velocity NYC 2016 - Containers @ Netflixaspyker
 
WSO2Con US 2015 Kubernetes: a platform for automating deployment, scaling, an...
WSO2Con US 2015 Kubernetes: a platform for automating deployment, scaling, an...WSO2Con US 2015 Kubernetes: a platform for automating deployment, scaling, an...
WSO2Con US 2015 Kubernetes: a platform for automating deployment, scaling, an...Brian Grant
 
Simple tweaks to get the most out of your JVM
Simple tweaks to get the most out of your JVMSimple tweaks to get the most out of your JVM
Simple tweaks to get the most out of your JVMJamie Coleman
 
ContainerDays NYC 2015: "Container Orchestration Compared: Kubernetes and Doc...
ContainerDays NYC 2015: "Container Orchestration Compared: Kubernetes and Doc...ContainerDays NYC 2015: "Container Orchestration Compared: Kubernetes and Doc...
ContainerDays NYC 2015: "Container Orchestration Compared: Kubernetes and Doc...DynamicInfraDays
 
Netflix0SS Services on Docker
Netflix0SS Services on DockerNetflix0SS Services on Docker
Netflix0SS Services on DockerDocker, Inc.
 

What's hot (20)

Kubernetes One-Click Deployment: Hands-on Workshop (Munich)
Kubernetes One-Click Deployment: Hands-on Workshop (Munich)Kubernetes One-Click Deployment: Hands-on Workshop (Munich)
Kubernetes One-Click Deployment: Hands-on Workshop (Munich)
 
Kubernetes Concepts And Architecture Powerpoint Presentation Slides
Kubernetes Concepts And Architecture Powerpoint Presentation SlidesKubernetes Concepts And Architecture Powerpoint Presentation Slides
Kubernetes Concepts And Architecture Powerpoint Presentation Slides
 
On Prem Container Cloud - Lessons Learned
On Prem Container Cloud - Lessons LearnedOn Prem Container Cloud - Lessons Learned
On Prem Container Cloud - Lessons Learned
 
Docker Kubernetes Istio
Docker Kubernetes IstioDocker Kubernetes Istio
Docker Kubernetes Istio
 
DCEU 18: 5 Patterns for Success in Application Transformation
DCEU 18: 5 Patterns for Success in Application TransformationDCEU 18: 5 Patterns for Success in Application Transformation
DCEU 18: 5 Patterns for Success in Application Transformation
 
On-the-Fly Containerization of Enterprise Java & .NET Apps by Amjad Afanah
On-the-Fly Containerization of Enterprise Java & .NET Apps by Amjad AfanahOn-the-Fly Containerization of Enterprise Java & .NET Apps by Amjad Afanah
On-the-Fly Containerization of Enterprise Java & .NET Apps by Amjad Afanah
 
Microservices + Events + Docker = A Perfect Trio by Docker Captain Chris Rich...
Microservices + Events + Docker = A Perfect Trio by Docker Captain Chris Rich...Microservices + Events + Docker = A Perfect Trio by Docker Captain Chris Rich...
Microservices + Events + Docker = A Perfect Trio by Docker Captain Chris Rich...
 
Container orchestration overview
Container orchestration overviewContainer orchestration overview
Container orchestration overview
 
GCP - Continuous Integration and Delivery into Kubernetes with GitHub, Travis...
GCP - Continuous Integration and Delivery into Kubernetes with GitHub, Travis...GCP - Continuous Integration and Delivery into Kubernetes with GitHub, Travis...
GCP - Continuous Integration and Delivery into Kubernetes with GitHub, Travis...
 
What Is Kubernetes | Kubernetes Introduction | Kubernetes Tutorial For Beginn...
What Is Kubernetes | Kubernetes Introduction | Kubernetes Tutorial For Beginn...What Is Kubernetes | Kubernetes Introduction | Kubernetes Tutorial For Beginn...
What Is Kubernetes | Kubernetes Introduction | Kubernetes Tutorial For Beginn...
 
OpenShift Enterprise 3.1 vs kubernetes
OpenShift Enterprise 3.1 vs kubernetesOpenShift Enterprise 3.1 vs kubernetes
OpenShift Enterprise 3.1 vs kubernetes
 
Enabling Production Grade Containerized Applications through Policy Based Inf...
Enabling Production Grade Containerized Applications through Policy Based Inf...Enabling Production Grade Containerized Applications through Policy Based Inf...
Enabling Production Grade Containerized Applications through Policy Based Inf...
 
Kubernetes Networking 101
Kubernetes Networking 101Kubernetes Networking 101
Kubernetes Networking 101
 
Introduction into Docker Containers, the Oracle Platform and the Oracle (Nati...
Introduction into Docker Containers, the Oracle Platform and the Oracle (Nati...Introduction into Docker Containers, the Oracle Platform and the Oracle (Nati...
Introduction into Docker Containers, the Oracle Platform and the Oracle (Nati...
 
Container World 2017 - Characterizing and Contrasting Container Orchestrators
Container World 2017 - Characterizing and Contrasting Container OrchestratorsContainer World 2017 - Characterizing and Contrasting Container Orchestrators
Container World 2017 - Characterizing and Contrasting Container Orchestrators
 
Velocity NYC 2016 - Containers @ Netflix
Velocity NYC 2016 - Containers @ NetflixVelocity NYC 2016 - Containers @ Netflix
Velocity NYC 2016 - Containers @ Netflix
 
WSO2Con US 2015 Kubernetes: a platform for automating deployment, scaling, an...
WSO2Con US 2015 Kubernetes: a platform for automating deployment, scaling, an...WSO2Con US 2015 Kubernetes: a platform for automating deployment, scaling, an...
WSO2Con US 2015 Kubernetes: a platform for automating deployment, scaling, an...
 
Simple tweaks to get the most out of your JVM
Simple tweaks to get the most out of your JVMSimple tweaks to get the most out of your JVM
Simple tweaks to get the most out of your JVM
 
ContainerDays NYC 2015: "Container Orchestration Compared: Kubernetes and Doc...
ContainerDays NYC 2015: "Container Orchestration Compared: Kubernetes and Doc...ContainerDays NYC 2015: "Container Orchestration Compared: Kubernetes and Doc...
ContainerDays NYC 2015: "Container Orchestration Compared: Kubernetes and Doc...
 
Netflix0SS Services on Docker
Netflix0SS Services on DockerNetflix0SS Services on Docker
Netflix0SS Services on Docker
 

Viewers also liked

Devops at Netflix (re:Invent)
Devops at Netflix (re:Invent)Devops at Netflix (re:Invent)
Devops at Netflix (re:Invent)Jeremy Edberg
 
Netflix Cloud Platform Building Blocks
Netflix Cloud Platform Building BlocksNetflix Cloud Platform Building Blocks
Netflix Cloud Platform Building BlocksSudhir Tonse
 
Spring Cloud Netflix OSS
Spring Cloud Netflix OSSSpring Cloud Netflix OSS
Spring Cloud Netflix OSSSteve Hall
 
Optimizing the Ops in DevOps
Optimizing the Ops in DevOpsOptimizing the Ops in DevOps
Optimizing the Ops in DevOpsGordon Haff
 
Netflix IT Ops 2014 Roadmap
Netflix IT Ops 2014 RoadmapNetflix IT Ops 2014 Roadmap
Netflix IT Ops 2014 Roadmapmike d. kail
 
(ENT209) Netflix Cloud Migration, DevOps and Distributed Systems | AWS re:Inv...
(ENT209) Netflix Cloud Migration, DevOps and Distributed Systems | AWS re:Inv...(ENT209) Netflix Cloud Migration, DevOps and Distributed Systems | AWS re:Inv...
(ENT209) Netflix Cloud Migration, DevOps and Distributed Systems | AWS re:Inv...Amazon Web Services
 
Disruption of Enterprise IT and DevOps
Disruption of Enterprise IT and DevOpsDisruption of Enterprise IT and DevOps
Disruption of Enterprise IT and DevOpsmike d. kail
 
Consumer Science and Product Development at Netflix - OSCON 2012
Consumer Science and Product Development at Netflix - OSCON 2012Consumer Science and Product Development at Netflix - OSCON 2012
Consumer Science and Product Development at Netflix - OSCON 2012Matt Marenghi
 
(DVO203) The Life of a Netflix Engineer Using 37% of the Internet
(DVO203) The Life of a Netflix Engineer Using 37% of the Internet(DVO203) The Life of a Netflix Engineer Using 37% of the Internet
(DVO203) The Life of a Netflix Engineer Using 37% of the InternetAmazon Web Services
 
Shepherding change: leading your DevOps transformation
Shepherding change: leading your DevOps transformationShepherding change: leading your DevOps transformation
Shepherding change: leading your DevOps transformationMike McGarr
 
Netflix Open Source Meetup Season 3 Episode 2
Netflix Open Source Meetup Season 3 Episode 2Netflix Open Source Meetup Season 3 Episode 2
Netflix Open Source Meetup Season 3 Episode 2aspyker
 
How Netflix thinks of DevOps. Spoiler: we don’t.
How Netflix thinks of DevOps. Spoiler: we don’t.How Netflix thinks of DevOps. Spoiler: we don’t.
How Netflix thinks of DevOps. Spoiler: we don’t.Dianne Marsh
 
Netflix oss season 2 episode 1 - meetup Lightning talks
Netflix oss   season 2 episode 1 - meetup Lightning talksNetflix oss   season 2 episode 1 - meetup Lightning talks
Netflix oss season 2 episode 1 - meetup Lightning talksRuslan Meshenberg
 
Microservices: What's Missing - O'Reilly Software Architecture New York
Microservices: What's Missing - O'Reilly Software Architecture New YorkMicroservices: What's Missing - O'Reilly Software Architecture New York
Microservices: What's Missing - O'Reilly Software Architecture New YorkAdrian Cockcroft
 
Spring Boot + Netflix Eureka
Spring Boot + Netflix EurekaSpring Boot + Netflix Eureka
Spring Boot + Netflix Eureka心 谷本
 
20140708 - Jeremy Edberg: How Netflix Delivers Software
20140708 - Jeremy Edberg: How Netflix Delivers Software20140708 - Jeremy Edberg: How Netflix Delivers Software
20140708 - Jeremy Edberg: How Netflix Delivers SoftwareDevOps Chicago
 
Yow Conference Dec 2013 Netflix Workshop Slides with Notes
Yow Conference Dec 2013 Netflix Workshop Slides with NotesYow Conference Dec 2013 Netflix Workshop Slides with Notes
Yow Conference Dec 2013 Netflix Workshop Slides with NotesAdrian Cockcroft
 
ARC301 Intro to Chaos Monkey & the Simian Army - AWS re: Invent 2012
ARC301 Intro to Chaos Monkey & the Simian Army - AWS re: Invent 2012ARC301 Intro to Chaos Monkey & the Simian Army - AWS re: Invent 2012
ARC301 Intro to Chaos Monkey & the Simian Army - AWS re: Invent 2012Amazon Web Services
 
Beyond DevOps - How Netflix Bridges the Gap
Beyond DevOps - How Netflix Bridges the GapBeyond DevOps - How Netflix Bridges the Gap
Beyond DevOps - How Netflix Bridges the GapJosh Evans
 

NetflixOSS for Triangle Devops Oct 2013

  • 1. Learning about NetflixOSS For Oct 2013 @TriangleDevops Andrew Spyker @aspyker Some content from @ma4jpb
  • 2. Agenda • How did I get here? • Netflix and Netflix OSS platform overview • Runtime components • Management components • Build components • Automated test and cleanliness components
  • 3. About me … • IBM STSM of Performance Architect and Strategy • Eleven years in performance in WebSphere – Led the App Server Performance team for years – Small sabbatical focused on IBM XML technology – Work in Emerging Technology Institute and CTO Office – Starting to look at cloud service operations • Email: aspyker@us.ibm.com – Blog: http://ispyker.blogspot.com/ – Linkedin: http://www.linkedin.com/in/aspyker – Twitter: http://twitter.com/aspyker – Github: http://www.github.com/aspyker • Triangle dad that enjoys technology as well as running, wine and poker
  • 4. Develop or maintain a service today? • Develop – starting • Maintain – starting • More on this later … (Photo: http://www.flickr.com/photos/stevendepolo/)
  • 5. What qualifies me to talk? • My shirt? • Of cloud prize ~ 25 nominees – Personally • Best example mash-up sample – My IBM team • Best portability enhancement – More on this coming … • http://techblog.netflix.com/2013/09/netflixoss-meetup-s1e4-cloud-prize.html
  • 6. Seriously, how did I get here? • Plenty of experience with performance and scale on standardized benchmarks (SPEC/TPC) – Not representative of how to (web) scale • Pinning, biggest monolithic DB “wins”, hand tuned for fixed size – Out of date on modern architecture for mobile/cloud • Created Acme Air – http://bit.ly/acmeairblog • Demonstrated that we could achieve (web) scale runs – 4B+ Mobile/Browser requests/day – With modern mobile and cloud best practices
  • 8. What was shown? • Peak performance and scale – You betcha! • Operational visibility – Only during the run via nmon collection and post-run visualization • True operational visibility - nope • Devops – nope • HA and DR – nope • Manual and automatic elastic scaling - nope
  • 9. What next? • Went looking for existing industry best practices around devops and high availability at web scale – Many have documented theirs via research papers and on highscalability.com – Google, Twitter, Facebook, Linkedin, etc. • Why Netflix? – Documented not only on their tech blog, but they have also released working OSS on github – Also, given their dependence on Amazon, they are a clear bellwether of web scale public cloud availability
  • 10. Steps to NetflixOSS understanding • Recoded Acme Air application to make use of NetflixOSS runtime components • Worked to implement a NetflixOSS devops and high availability setup around Acme Air (on EC2) run at previous levels of scale and performance • Worked to port NetflixOSS runtime and devops/high availability servers to IBM Cloud (SoftLayer) and RightScale • Through public collaboration with Netflix technical team – Google groups, github and meetups 10
  • 11. Why? • To prove that an advanced cloud high availability and devops platform wasn’t “tied” to Amazon • To understand how we can advance IBM cloud platforms for our customers • To understand how we can host our IBM public cloud services better
  • 12. Agenda • How did I get here? • Netflix and Netflix OSS platform overview • Runtime components • Management components • Build components • Automated test and cleanliness components
  • 13. My view of Netflix goals • As a business – Be the best streaming media provider in the world – Make best content deals based on real data/analysis • Technology wise – Have the most availability possible – Measure all things by “stream starts per unit of time” • Any dip in that relates back to the business – Do this at web scale 13
  • 14. Standing on the shoulders of giants • Public Cloud (Amazon) – When adding streaming, Netflix decided they • Shouldn’t invest in building data centers worldwide • Had to plan for the streaming business to be very big – Embraced cloud architecture, paying only for what they need • Open Source – Many parts of the runtime depend on open source • Linux, Apache Tomcat, Apache Cassandra, etc. – Realized that Amazon wasn’t enough • Started a cloud platform on top that would eventually be open sourced - NetflixOSS http://en.wikipedia.org/wiki/File:Andre_in_the_late_%2780s.jpg
  • 15. Faleure • What is failing? – Underlying IaaS problems • Instances, racks, availability zones, regions – Software issues • Operating system, servers, application code – Surrounding services • Other application services, DNS, user registries, etc. • How is a component failing? – Fails and disappears altogether – Intermittently fails – Works, but is responding slowly – Works, but is causing users a poor experience [Image caption: Inspiration]
  • 16. Overview of Amazon EC2 • Amazon launches instances into availability zones – Instances of various sizes (compute, storage, etc.) • Organized into regions and availability zones – Regions independent of each other – Regions only connected over the Internet – Regions contain availability zones – Availability zones are isolated from each other – Availability zones are connected w/ low-latency links • This gives a high level of resilience to outages – Unlikely to affect multiple availability zones or regions • Amazon requires customers to be aware of this topology to take advantage of its benefits within their application [Diagram: EC2 Region (US East) and EC2 Region (US West), each containing three availability zones, connected over the Internet]
  • 17. NetflixOSS • “Technical indigestion as a service” - @adrianco • netflix.github.io • 30+ OSS projects • Expanding every day 17
  • 18. NetflixOSS – for today • For today – Focus on mid tier web app and micro service servers – Devops servers and tools – Skipping some just for simplicity • For another time – Big data – Data tier – Caching 18
  • 19. Agenda • How did I get here? • Netflix and Netflix OSS platform overview • Runtime components • Management components • Build components • Automated test and cleanliness components 19
  • 20. Acme Air As A Sample • Greatly simplified … [Diagram: ELB, Web App Front End (REST services), App Service (Authentication), Data Tier]
  • 21. Micro-services architecture • Decompose system into isolated services that can be developed separately • Why? – They can fail independently vs. fail together monolithically – They can be developed and released with different velocities by different teams • To show this we created a separate “auth service” for Acme Air • In a typical customer facing application any single front end invocation could spawn 20-30 calls to services and data sources
  • 22. How do services advertise themselves? • Upon web app startup, Karyon server is started – Karyon will configure (via Archaius) the application – Karyon will register the location of the instance with Eureka • Others can know of the existence of the service • Lease based so instances continue to check in updating list of available instances – Karyon will also expose a JMX console and healthcheck URL • Devops can change things about the service via JMX • The system can monitor the health of the instance [Diagram: App Service (Authentication) running Karyon on Tomcat registers its name, port, IP address and healthcheck URL with the Eureka server(s); configuration comes from config.properties, auth-service.properties or remote Archaius stores]
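The lease-based registration on slide 22 can be illustrated with a minimal, self-contained sketch. This is not the Eureka or Karyon API: the class `LeaseRegistry`, its method names, the 30-second lease and the sample addresses are all hypothetical, chosen only to show how a check-in renews a lease and how a silent instance drops out of lookups.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical lease-based service registry; an illustration, not the Eureka API. */
class LeaseRegistry {
    /** What an instance advertises, plus the time of its latest check-in. */
    private static final class Lease {
        final String service;
        final String healthUrl;
        final long renewedAt;

        Lease(String service, String healthUrl, long renewedAt) {
            this.service = service;
            this.healthUrl = healthUrl;
            this.renewedAt = renewedAt;
        }
    }

    private final Map<String, Lease> leases = new ConcurrentHashMap<>(); // key: "host:port"
    private final long leaseMillis;
    private final LongSupplier clock; // injectable so expiry can be demonstrated without sleeping

    LeaseRegistry(long leaseMillis, LongSupplier clock) {
        this.leaseMillis = leaseMillis;
        this.clock = clock;
    }

    /** Register or renew: both simply record the time of the latest check-in. */
    void heartbeat(String service, String hostPort, String healthUrl) {
        leases.put(hostPort, new Lease(service, healthUrl, clock.getAsLong()));
    }

    /** Addresses of the instances of a service whose lease is still current. */
    List<String> lookup(String service) {
        long now = clock.getAsLong();
        leases.values().removeIf(lease -> now - lease.renewedAt > leaseMillis); // evict expired leases
        List<String> live = new ArrayList<>();
        for (Map.Entry<String, Lease> entry : leases.entrySet()) {
            if (entry.getValue().service.equals(service)) {
                live.add(entry.getKey());
            }
        }
        return live;
    }

    public static void main(String[] args) {
        long[] now = {0L};
        LeaseRegistry registry = new LeaseRegistry(30_000L, () -> now[0]);
        registry.heartbeat("auth-service", "10.0.0.5:8080", "http://10.0.0.5:8080/healthcheck");
        System.out.println(registry.lookup("auth-service")); // prints [10.0.0.5:8080]
        now[0] = 60_000L; // the instance stopped checking in
        System.out.println(registry.lookup("auth-service")); // prints []
    }
}
```

The clock is passed in rather than read from the system so the expiry path can be exercised deterministically; a real registry would also replicate its state across servers, which this sketch leaves out.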
  • 23. How do consumers find services? • Service consumers query Eureka at startup and periodically to determine location of dependencies – Can query based on availability zone and cross availability zone [Diagram: Web App Front End (REST services) on Tomcat uses the Eureka client to ask the Eureka server(s) what “auth-service” instances exist]
  • 25. How does the consumer call the service? • Protocol implementations have Eureka-aware load balancing support built in – In-client load balancing -- does not require separate LB tier • Ribbon – REST client – Pluggable load balancing scheme – Built in failure recovery support (retry next server, mark instance as failing, etc.) • Other Eureka enabled clients – memcached (EVCache), Astyanax coming (Priam and Cassandra) [Diagram: Web App Front End (REST services) calls “auth-service” through the Ribbon REST client and Eureka client, which choose among several App Service (Authentication) instances]
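A rough sketch of the in-client load balancing idea from slide 25: the caller holds the server list itself, rotates through it, and retries on the next server when one fails. The class `RoundRobinClient` and the server names are hypothetical and this is not the Ribbon API; it shows only the "retry next server" behavior, not marking instances as failing or pluggable schemes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical client-side round-robin balancer with retry; not the Ribbon API. */
class RoundRobinClient {
    private final List<String> servers; // in practice refreshed periodically from the registry
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinClient(List<String> servers) {
        this.servers = servers;
    }

    /** Try each server at most once, starting from the current round-robin position. */
    <T> T call(Function<String, T> request) {
        if (servers.isEmpty()) {
            throw new IllegalStateException("no servers available");
        }
        int start = Math.floorMod(next.getAndIncrement(), servers.size());
        RuntimeException last = null;
        for (int i = 0; i < servers.size(); i++) {
            String server = servers.get((start + i) % servers.size());
            try {
                return request.apply(server);
            } catch (RuntimeException e) {
                last = e; // this server failed: retry on the next one
            }
        }
        throw last; // every server failed; surface the most recent error
    }

    public static void main(String[] args) {
        RoundRobinClient client = new RoundRobinClient(Arrays.asList("auth-1", "auth-2", "auth-3"));
        Function<String, String> request = server -> {
            if (server.equals("auth-1")) {
                throw new IllegalStateException(server + " is down");
            }
            return "200 OK from " + server;
        };
        for (int i = 0; i < 4; i++) {
            System.out.println(client.call(request));
        }
    }
}
```

Because the balancing happens inside the caller, no separate load balancer tier sits between the web app and the auth service, which is the point the slide makes.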
  • 26. How to deploy this with HA? • Instances? – Deploy across AZs – Using AutoScalingGroups in EC2 managed by Asgard – ASG manages recovery • Eureka? – DNS and Elastic IP trickery – Deployed across AZs – For clients to find Eureka servers • DNS TXT record for domain lists AZ TXT records • AZ TXT records have list of Eureka servers – For new Eureka servers • Look for list of Eureka server IPs for the AZ it’s coming up in • Look for unassigned elastic IPs, grab one and assign it to itself • Sync with other already assigned IPs that likely are hosting Eureka server instances – Simpler configurations with less HA are available
  • 27. Protect yourself from unhealthy services • Wrap all calls to services with Hystrix command pattern – Hystrix implements circuit breaker pattern – Executes command using semaphore or separate thread pool to guarantee return within finite time to caller – If an unhealthy service is detected, start to call fallback implementation (broken circuit) and periodically check if main implementation works (reset circuit) [Diagram: Web App Front End (REST services) executes the auth-service call through Hystrix, which either calls “auth-service” instances via the Ribbon REST client or runs the fallback implementation]
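The circuit breaker behavior on slide 27 can be sketched in plain Java. The class `SimpleCircuitBreaker`, its threshold and retry interval are hypothetical, and this is not the Hystrix API; in particular it leaves out the semaphore or thread pool isolation that bounds how long the caller waits, and shows only the broken-circuit, fallback and periodic-retry logic.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical minimal circuit breaker; an illustration, not the Hystrix API. */
class SimpleCircuitBreaker {
    private final int failureThreshold;  // consecutive failures that break the circuit
    private final long retryAfterMillis; // how long to wait before a trial call
    private final LongSupplier clock;    // injectable so the demo needs no sleeping
    private int consecutiveFailures = 0;
    private long openedAt = 0L;

    SimpleCircuitBreaker(int failureThreshold, long retryAfterMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.retryAfterMillis = retryAfterMillis;
        this.clock = clock;
    }

    /** Run the primary call unless the circuit is broken; use the fallback otherwise. */
    synchronized <T> T execute(Supplier<T> primary, Supplier<T> fallback) {
        // Holding the lock during the call keeps the sketch simple but serializes callers.
        boolean broken = consecutiveFailures >= failureThreshold;
        if (broken && clock.getAsLong() - openedAt < retryAfterMillis) {
            return fallback.get(); // broken circuit: do not touch the unhealthy service
        }
        try {
            T result = primary.get(); // normal call, or the periodic trial call
            consecutiveFailures = 0;  // success resets the circuit
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                openedAt = clock.getAsLong(); // break (or keep broken) the circuit
            }
            return fallback.get();
        }
    }

    public static void main(String[] args) {
        long[] now = {0L};
        boolean[] healthy = {false};
        SimpleCircuitBreaker breaker = new SimpleCircuitBreaker(2, 5_000L, () -> now[0]);
        Supplier<String> authCall = () -> {
            if (!healthy[0]) {
                throw new IllegalStateException("auth-service down");
            }
            return "token";
        };
        Supplier<String> fallback = () -> "anonymous";
        for (int i = 0; i < 3; i++) {
            System.out.println(breaker.execute(authCall, fallback)); // prints anonymous
        }
        healthy[0] = true;
        now[0] = 5_000L; // retry interval elapsed: the next call is a trial
        System.out.println(breaker.execute(authCall, fallback)); // prints token
    }
}
```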
  • 28. Does Hystrix do more? • Main reason for Hystrix is to protect yourself from dependencies, but … • Once you have a layer of indirection, take advantage of it. Hystrix can provide – Caching – Visualization • Aggregated via Turbine – Request collapsing • Programming models – Sync, Async, Reactive (RxJava)
  • 29. Agenda • How did I get here? • Netflix and Netflix OSS platform overview • Runtime components • Management components • Build components • Automated test and cleanliness components 29
  • 30. Ability to reconfigure - Archaius • Using dynamic properties, can easily change properties across cluster of applications, either – NetflixOSS named props • Hystrix timeouts for example – Custom dynamic props • High throughput achieved by polling approach • HA of configuration source dependent on what source you use – HTTP server, database, etc. • Example: DynamicIntProperty prop = DynamicPropertyFactory.getInstance().getIntProperty("myProperty", DEFAULT_VALUE); int value = prop.get(); // value will change over time based on configuration [Diagram labels: Application, Runtime, Hierarchy, URL, JMX, Karyon Console, Persisted DB, Application Props, Libraries, Container]
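To make the polling approach on slide 30 concrete, here is a small sketch of a property handle whose value follows a periodically re-read source. The class `PolledConfig` and the property name `hystrix.timeout` are hypothetical, and this is not how Archaius is implemented; it only illustrates why reads stay cheap (they hit an in-memory snapshot) while changes arrive on the next poll.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical polled configuration; an illustration, not the Archaius implementation. */
final class PolledConfig {
    private final Supplier<Map<String, String>> source; // e.g. reads a URL or a database
    private volatile Map<String, String> snapshot = Collections.emptyMap();

    PolledConfig(Supplier<Map<String, String>> source) {
        this.source = source;
        poll();
    }

    /** Re-read the source; a real system would call this on a schedule. */
    void poll() {
        snapshot = Collections.unmodifiableMap(new HashMap<>(source.get()));
    }

    /** A handle whose value is looked up on every get(), so it follows each poll. */
    IntProperty intProperty(String name, int defaultValue) {
        return new IntProperty(name, defaultValue);
    }

    final class IntProperty {
        private final String name;
        private final int defaultValue;

        private IntProperty(String name, int defaultValue) {
            this.name = name;
            this.defaultValue = defaultValue;
        }

        /** Reads only the in-memory snapshot, never the remote source. */
        int get() {
            String raw = snapshot.get(name);
            if (raw == null) {
                return defaultValue;
            }
            try {
                return Integer.parseInt(raw.trim());
            } catch (NumberFormatException e) {
                return defaultValue; // ignore malformed values
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>(); // stand-in for a remote config store
        PolledConfig config = new PolledConfig(() -> store);
        PolledConfig.IntProperty timeout = config.intProperty("hystrix.timeout", 1000);
        System.out.println(timeout.get()); // prints 1000 (default)
        store.put("hystrix.timeout", "250");
        config.poll();
        System.out.println(timeout.get()); // prints 250
    }
}
```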
  • 31. ASGard • Asgard is the missing EC2 console for AutoScalingGroup mgmt. – EC2 only has CLI for ASG management [Diagram: within EC2 Region (US East), Asgard tells EC2 to start these instances and keep this many instances running; each of three availability zones runs Web App (REST Services) and App Service (Authentication) instances]
  • 32. Asgard creates an “application” • Enforces common practices for deploying code – Common approach to linking auto scaling groups to launch configs, ELBs, security groups, scaling policies and AMIs • Adds missing concept to the EC2 domain model – “application” – Extends clustering to applications vs. AMIs • Example – Application – app1 – Cluster – app1-env – Autoscaling group version n – app1-env-v009 – Autoscaling group version n+1 – app1-env-v010
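The naming example on slide 32 (cluster `app1-env`, groups `app1-env-v009` and `app1-env-v010`) suggests a simple convention that can be sketched as below. The helper class `AsgNames` is hypothetical and is not Asgard code; the three-digit suffix pattern is taken from the slide, while starting an unversioned group at `v000` and ignoring rollover past `v999` are assumptions of this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for the cluster / versioned-ASG naming shown on the slide. */
final class AsgNames {
    // "<cluster>-v<three digits>", e.g. app1-env-v009
    private static final Pattern VERSIONED = Pattern.compile("^(.+)-v(\\d{3})$");

    private AsgNames() {
    }

    /** Cluster name of an auto scaling group: the name without its version suffix. */
    static String clusterOf(String asgName) {
        Matcher m = VERSIONED.matcher(asgName);
        return m.matches() ? m.group(1) : asgName;
    }

    /** Name of the next auto scaling group to push in the same cluster. */
    static String next(String asgName) {
        Matcher m = VERSIONED.matcher(asgName);
        if (!m.matches()) {
            return asgName + "-v000"; // assumption: first versioned push starts at v000
        }
        int version = Integer.parseInt(m.group(2)) + 1; // rollover past v999 is not handled
        return String.format("%s-v%03d", m.group(1), version);
    }

    public static void main(String[] args) {
        System.out.println(clusterOf("app1-env-v009")); // prints app1-env
        System.out.println(next("app1-env-v009"));      // prints app1-env-v010
    }
}
```

Keeping both version n and version n+1 under one cluster name is what lets the two groups run side by side during a push and makes rolling back a matter of switching traffic between them.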
  • 33. Asgard devops procedures • Fast rollback • Canary testing • Red/Black pushes • More through REST interfaces – Ad hoc processes but enforced through Asgard model • More coming using Glisten and Amazon SWF
  • 35. Augmenting the ELB tier - Zuul • Zuul adds devops support in the front tier routing – Stress testing (squeeze testing) – Canary testing – Dynamic routing – Load shedding – Debugging • And some common function – Authentication – Security – Static response handling – Multi-region resiliency (DR for ELB tier) – Insight • Through dynamically deployable filters (written in Groovy) • Eureka aware using Ribbon and Archaius, as shown in the runtime section [Diagram: Amazon ELB in front of Zuul instances running filters, which route to Edge Services]
  • 36. Monitoring - Servo • Annotation based publishing through JMX of application metrics • Filters, Observers, and Pollers to publish metrics – Can export metrics to CloudWatch and other monitors • The entire Netflix monitoring infrastructure hasn’t been open sourced due to complexity and priority 36
  • 37. A note on the next three projects • I haven’t personally worked with these projects • Given the audience, I included them as I believe they will be of interest
  • 38. Edda • Polls Amazon config and stores the data in a queryable database • Provides a searchable view of Amazon deployments – Searchable in ways not possible from Amazon APIs • Provides a historical view – For correlation of problems to changes – Likely less of an issue in clouds that expose all changes
  • 39. Ice • Cloud spend and usage analytics • Communicates with billing API to give a bird’s-eye view of cloud spend with drill down to region, availability zone, and service team through application groups • Watches on-demand, used and unused reserved instances and instance sizes to help optimize • Not point in time – Shows trends to help predict future optimizations
  • 40. Denominator • Java Library and CLI for cross DNS configuration • Allows for common, quicker (than using various DNS provider UI) and automated DNS updates • Plugins have been developed by various DNS providers 40
  • 41. Agenda • How did I get here? • Netflix and Netflix OSS platform overview • Runtime components • Management components • Build components • Automated test and cleanliness components
  • 42. Get baked! • Caution: Flame/troll bait ahead!! • Netflix takes the approach of baking images as part of build such that – Instance boot-up doesn’t depend on outside servers – Instance boot-up only starts servers already set to run – New code = new instances (never update instances in place) • Why? – Critical when launching hundreds of servers at a time – Goal to reduce the failure points in places where dynamic system configuration doesn’t provide value – Speed of elastic scaling, boot and go – Discourages ad hoc changes to server instances • Criticism – “Netflix is ruining the cloud” – Overhead of AMI’s for every code version – Ties to Amazon AMI’s (would this work for containers – I think yes) 42
  • 43. AMInator • Starting image/volume – Foundational image created (maybe via loopback), base AMI with common software created/tested independently • Aminator running – Bakery – Bakery obtains a known EBS volume of the base image from a pool – Bakery mounts volume and provisions the application (apt/deb or yum/rpm) – Bakery snapshots and registers snapshot • Recent work to add other provisioning such as chef as plugins • I have used hand built AMI’s thus far, but blog states developers can go through CI builds and have running test instances within 15 minutes of code being checked in 43
  • 44. Agenda • How did I get here? • Netflix and Netflix OSS platform overview • Runtime components • Management components • Build components • Automated test and cleanliness components
  • 45. The Simian Army • A bunch of automated “monkeys” that perform automated system administration tasks • Anything that is done by a human more than once can and should be automated • Absolutely necessary at web scale 45
  • 46. Good Monkeys • Janitor Monkey – Somewhat a mitigation for baking approach – Will mark and sweep unused resources (instances, volumes, snapshots, ASGs, launch configs, images, etc.) – Owners notified, then removed • Conformity Monkey – Check instances are conforming to rules around security, ASG/ELB, age, status/health check, etc. http://www.flickr.com/photos/sonofgroucho/5852049290
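The "mark and sweep, owners notified, then removed" flow described for Janitor Monkey can be sketched as a two-phase pass. The class `MarkAndSweep`, the grace period and the resource ids are hypothetical and this is not Janitor Monkey's code; the sketch assumes that a resource found unused is first marked (and its owner notified), and is only returned for deletion if it is still unused once the grace period has passed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical mark-and-sweep cleaner; an illustration, not Janitor Monkey's code. */
final class MarkAndSweep {
    private final Map<String, Long> markedAt = new HashMap<>(); // resource id -> time first marked
    private final long graceMillis;
    final List<String> notifications = new ArrayList<>();       // stand-in for notifying owners

    MarkAndSweep(long graceMillis) {
        this.graceMillis = graceMillis;
    }

    /** One pass: unmark reused resources, mark newly unused ones, return ids to delete. */
    List<String> run(Set<String> unusedNow, long now) {
        markedAt.keySet().retainAll(unusedNow); // back in use: drop the mark
        List<String> toDelete = new ArrayList<>();
        for (String id : unusedNow) {
            Long since = markedAt.get(id);
            if (since == null) {
                markedAt.put(id, now);                         // mark
                notifications.add("scheduled for cleanup: " + id); // notify the owner
            } else if (now - since >= graceMillis) {
                toDelete.add(id);                              // sweep
            }
        }
        markedAt.keySet().removeAll(toDelete);
        return toDelete;
    }

    public static void main(String[] args) {
        MarkAndSweep janitor = new MarkAndSweep(3L * 24 * 60 * 60 * 1000); // 3-day grace period
        Set<String> unused = new TreeSet<>(Arrays.asList("vol-1", "snap-7"));
        System.out.println(janitor.run(unused, 0L));                       // prints []
        System.out.println(janitor.notifications.size());                  // prints 2
        System.out.println(janitor.run(unused, 4L * 24 * 60 * 60 * 1000)); // prints [snap-7, vol-1]
    }
}
```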
  • 47. Back to high availability • Failure is inevitable. Don’t try to avoid it! • How do you know if your backup is good? – Try to restore from your backup every so often – Better to ensure backup works before you have a crashed system and find out your backup is broken • How do you know if your system is HA? – Try to force failures every so often – Better to force those failures during office hours – Better to ensure HA before you have a down system and angry users – Best to learn from failures and add automated tests 47
  • 48. Bad Monkeys • Open Sourced – Chaos Monkey – Used to randomly terminate instances – Now block network, burn cpu, kill processes, fail amazon api, fail dns, fail dynamo, fail s3, introduce network errors/latency, detach volumes, fill disk, burn I/O • Not yet open sourced – Chaos Gorilla • Kill all instances in an availability zone – Chaos Kong • Kill all instances in an entire region – Latency Monkey • Introduce latency into service calls directly (ribbon server side) http://www.flickr.com/photos/27261720@N00/132750805
  • 49. Agenda • Blah, blah, blah • How can I learn more? • How do I play with this? • Let’s write some code! 49
  • 50. Want to play? • NetflixOSS blog and github – http://techblog.netflix.com – http://github.com/Netflix • Acme Air, NetflixOSS AMI’s – Try Asgard/Eureka with a real application – http://bit.ly/aa-AMIs • See what we ported to IBM Cloud (video) – http://bit.ly/noss-sl-blog • Fork and submit pull requests to Acme Air – http://github.com/aspyker/acmeair-netflix 50