SlideShare ist ein Scribd-Unternehmen logo
1 von 37
Downloaden Sie, um offline zu lesen
and containers
Andrew Spyker (@aspyker) - Engineering Manager
Not
About Netflix
● 86.7M members
● A few thousand employees
● 190+ countries
● > ⅓ NA internet download traffic
● 500+ Microservices
● Many 10’s of thousands VM’s
● 3 regions across the world
Netflix has a elastic, cloud native, immutable microservice
architecture using full devops built on VM’s!
3
Why are we messing around with containers?
Technical motivating factors for containers
● Simpler management of compute resources
● Simpler deployment packaging artifacts for compute jobs
● Need for a consistent local developer environment
Sampling of realized container benefits
Media Encoding - encoding research development time
● VM’s platform to container platform - 1 month vs. 1 week
Continuous Integration Testing
● Build all Netflix codebases in hours
● Saves development 100’s of hours of debugging
Edge Re-architecture using NodeJS
● Focus returns to app development
● Simplifies, speeds test and deployment
5
Batch applications
Multi-tenant (cgroups/Mesos)
historically used for batch
Linux cgroups
What do batch users want?
● Simple shared resources, run till done, job files
● NOT
○ EC2 Instance sizes, autoscaling, AMI OS’s
● WHY
○ Offloads resource management ops, Simpler
Introducing Titus
Batch
Job Management
Resource Management & Optimization
Container Execution
Integration
Workflow, Data Analysis, Adhoc
Upstream Systems
Netflix Batch Job Examples
● Algorithm Model Training (with GPU’s)
Netflix Batch Job Examples
● Media Encoding
● Digital Watermarking
1 1
Netflix Batch Job Examples
Open Connect
CDN Reporting
Adhoc
Reporting
● Docker helped generalize use cases
● Scheduling required (GPU, elastic)
● Initially ignored failures (with retries)
● Time sensitive batch came later
Lessons Learned from Batch
Current Container Usage - Batch
● 1000’s of container hosts (g2, m4, r3 instances)
● 1000’s containers / hour average
● Large spikes of CI testing and Digital Watermarking
From day of 10/26
● 47K containers
● Bursts of 1000
containers in a
minute
Service applications
Why Services in containers?
Theory Reality
Developer
Opportunities to evolve our baking
● Java focused supported AMI, baking works well!
● However, wanted to allow
○ other stacks to evolve independent of OS updates
○ simplified builds (vs. Java and OS based tooling)
○ reliable smaller instances for dynamic languages
○ ability to develop locally with same image
Services are just long running batch?
Services
Job Management
Resource Management & Optimization
Container Execution
Integration
Service Apps
Batch
19
Nope, not that easy - Titus Details
19
Titus UITitus UI
Docker
Registry
Docker
Registry
Rhea
container
container
container
docker
Titus Agent
metrics agent
Titus executor
logging agent
zfs
Mesos agent
docker
RheaTitus API
Cassandra
Titus Master
Job Management &
Scheduler
S3
Zookeeper
Docker
Registry
EC2 Autocaling
API
Mesos Master
Titus UI
Fenzo
container
Pod & VPC network
drivers
containercontainer
AWS
metadata proxy
Integration
AWS VM’sCI/CD
Services more complex
● Services resize constantly and run forever
○ Autoscaling
○ Hard to upgrade underlying hosts
● Require IPC integration
○ Routable IPs, service discovery
○ Ready for traffic vs. just started/stopped
● Existing well defined dev, deploy, runtime & ops tools
Real networking is hard
Multi-tenant
Need IP per container - in VPC
Using security groups
Using IAM roles
Considering network resource isolation
Enabling VPC Networking
No IP, SecGrp A
Task 0
SecGrp Y,Z
Task 1 Task 2 Task 3
Titus EC2 Host VMeth1
ENI1
SecGrp=A
eth2
ENI2
SecGrp=X
eth3
ENI3
SecGrp=Y,Z
IP 1
IP 2
IP 3
pod root
veth<id>
app
SecGrp X
pod root
veth<id>
app
SecGrp X
pod root
veth<id>
appapp
veth<id>
Linux Policy Based
Routing + Traffic Control
Titus
EC2
Metadata
Proxy
169.254.169.254
IPTables NAT (*)
* **
169.254.169.254
Non-routable IP
*
Solutions
● VPC Networking driver
○ Supports ENI’s - full IP functionality
○ Scheduled security groups
○ Support traffic control (resource isolation)
● EC2 Metadata proxy
○ Adds container “node” identity
○ Delivers IAM roles
Reuse existing infrastructure services
VMVM
EC2
AWSAutoScaler
VMs
App
Cloud Platform
(metrics, IPC, health)
VPC
Netflix Cloud Infrastructure (VM’s + Containers)
Atlas Eureka Edda
Enable them for containers
VMVM
EC2
AWSAutoScaler
VMs
App
Cloud Platform
(metrics, IPC, health)
VPC
Netflix Cloud Infrastructure (VM’s + Containers)
VMVM
Atlas
TitusJobControl
Containers
App
Cloud Platform
(metrics, IPC, health)
Eureka Edda
VMVM
Batch
Containers
Spinnaker
Deploy based
on new images
tags
Basic resource
requirements
IAM Roles & Sec
Groups per
container
Deploy
Strategies
Same as VM’s
Easily see
health &
discovery
Secure Multi-tenancy
Common to VM’s and tiered security needed
● Protect the reduced host IAM role, Allow containers to have specific IAM
roles
● Needed to support same security groups in container networking as VM’s
User namespacing
● Docker 1.10 - Introduced User Namespaces
● Didn’t work /w shared networking NS
● Docker 1.11 - Fixed shared networking NS’s
● But, namespacing is per daemon, Not per container, as hoped
● Waiting on Linux
● Considering mass chmod / ZFS clones
Titus Advanced Scheduling
● Support for AZ balancing
● Multiple instance types selected based on workload
● Elastic underlying common resource pool
○ Bin packing managed transparently across all apps
● Hard and soft constraints
● Resource affinity and task affinity
● Capacity guarantees (critical tier)
34
Fenzo - Keep resource scheduling extensible
Fenzo - Extensible Scheduling Library
Features:
● Heterogeneous resources & tasks
● Autoscaling of mesos cluster
○ Multiple instance types
● Plugins based scheduling objectives
○ Bin packing, etc.
● Plugins based constraints evaluator
○ Resource affinity, task locality, etc.
● Scheduling actions visibility
Current Container Usage - Service
● Still small ~ 2000 long running containers
● NodeJS Device UI Scripts Apps
● Stream Processing Jobs - Flink
● Various Internal Dashboards
Questions?
Andrew Spyker (@aspyker) - Engineering Manager

Weitere ähnliche Inhalte

Was ist angesagt?

CS80A Foothill College Open Source Talk
CS80A Foothill College Open Source TalkCS80A Foothill College Open Source Talk
CS80A Foothill College Open Source Talkaspyker
 
Netflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open SourceNetflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open Sourceaspyker
 
Netflix Open Source: Building a Distributed and Automated Open Source Program
Netflix Open Source:  Building a Distributed and Automated Open Source ProgramNetflix Open Source:  Building a Distributed and Automated Open Source Program
Netflix Open Source: Building a Distributed and Automated Open Source Programaspyker
 
Netflix Open Source Meetup Season 3 Episode 2
Netflix Open Source Meetup Season 3 Episode 2Netflix Open Source Meetup Season 3 Episode 2
Netflix Open Source Meetup Season 3 Episode 2aspyker
 
Velocity NYC 2016 - Containers @ Netflix
Velocity NYC 2016 - Containers @ NetflixVelocity NYC 2016 - Containers @ Netflix
Velocity NYC 2016 - Containers @ Netflixaspyker
 
QConSF18 - Disenchantment: Netflix Titus, its Feisty Team, and Daemons
QConSF18 - Disenchantment: Netflix Titus, its Feisty Team, and DaemonsQConSF18 - Disenchantment: Netflix Titus, its Feisty Team, and Daemons
QConSF18 - Disenchantment: Netflix Titus, its Feisty Team, and Daemonsaspyker
 
Netflix oss season 1 episode 3
Netflix oss season 1 episode 3 Netflix oss season 1 episode 3
Netflix oss season 1 episode 3 Ruslan Meshenberg
 
Netflix oss season 2 episode 1 - meetup Lightning talks
Netflix oss   season 2 episode 1 - meetup Lightning talksNetflix oss   season 2 episode 1 - meetup Lightning talks
Netflix oss season 2 episode 1 - meetup Lightning talksRuslan Meshenberg
 
CMP376 - Another Week, Another Million Containers on Amazon EC2
CMP376 - Another Week, Another Million Containers on Amazon EC2CMP376 - Another Week, Another Million Containers on Amazon EC2
CMP376 - Another Week, Another Million Containers on Amazon EC2aspyker
 
NetflixOSS Meetup S6E2 - Spinnaker, Kayenta
NetflixOSS Meetup S6E2 - Spinnaker, KayentaNetflixOSS Meetup S6E2 - Spinnaker, Kayenta
NetflixOSS Meetup S6E2 - Spinnaker, Kayentaaspyker
 
Timed Text At Netflix
Timed Text At NetflixTimed Text At Netflix
Timed Text At NetflixRohit Puri
 
Netflix OSS Meetup Season 5 Episode 1
Netflix OSS Meetup Season 5 Episode 1Netflix OSS Meetup Season 5 Episode 1
Netflix OSS Meetup Season 5 Episode 1aspyker
 
Season 7 Episode 1 - Tools for Data Scientists
Season 7 Episode 1 - Tools for Data ScientistsSeason 7 Episode 1 - Tools for Data Scientists
Season 7 Episode 1 - Tools for Data Scientistsaspyker
 
Netflix OSS Meetup Season 4 Episode 4
Netflix OSS Meetup Season 4 Episode 4Netflix OSS Meetup Season 4 Episode 4
Netflix OSS Meetup Season 4 Episode 4aspyker
 
Series of Unfortunate Netflix Container Events - QConNYC17
Series of Unfortunate Netflix Container Events - QConNYC17Series of Unfortunate Netflix Container Events - QConNYC17
Series of Unfortunate Netflix Container Events - QConNYC17aspyker
 
Monitoring, the Prometheus Way - Julius Voltz, Prometheus
Monitoring, the Prometheus Way - Julius Voltz, Prometheus Monitoring, the Prometheus Way - Julius Voltz, Prometheus
Monitoring, the Prometheus Way - Julius Voltz, Prometheus Docker, Inc.
 
KURMA - A Containerized Container Platform - KubeCon 2016
KURMA - A Containerized Container Platform - KubeCon 2016KURMA - A Containerized Container Platform - KubeCon 2016
KURMA - A Containerized Container Platform - KubeCon 2016Apcera
 
Cloudsolutionday 2016: Docker & FAAS at getvero.com
Cloudsolutionday 2016: Docker & FAAS at getvero.comCloudsolutionday 2016: Docker & FAAS at getvero.com
Cloudsolutionday 2016: Docker & FAAS at getvero.comAWS Vietnam Community
 
How Docker EE Helps Open Doors at Assa Abloy
How Docker EE Helps Open Doors at Assa AbloyHow Docker EE Helps Open Doors at Assa Abloy
How Docker EE Helps Open Doors at Assa AbloyDocker, Inc.
 

Was ist angesagt? (20)

CS80A Foothill College Open Source Talk
CS80A Foothill College Open Source TalkCS80A Foothill College Open Source Talk
CS80A Foothill College Open Source Talk
 
Netflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open SourceNetflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open Source
 
Netflix Open Source: Building a Distributed and Automated Open Source Program
Netflix Open Source:  Building a Distributed and Automated Open Source ProgramNetflix Open Source:  Building a Distributed and Automated Open Source Program
Netflix Open Source: Building a Distributed and Automated Open Source Program
 
Netflix Open Source Meetup Season 3 Episode 2
Netflix Open Source Meetup Season 3 Episode 2Netflix Open Source Meetup Season 3 Episode 2
Netflix Open Source Meetup Season 3 Episode 2
 
Velocity NYC 2016 - Containers @ Netflix
Velocity NYC 2016 - Containers @ NetflixVelocity NYC 2016 - Containers @ Netflix
Velocity NYC 2016 - Containers @ Netflix
 
QConSF18 - Disenchantment: Netflix Titus, its Feisty Team, and Daemons
QConSF18 - Disenchantment: Netflix Titus, its Feisty Team, and DaemonsQConSF18 - Disenchantment: Netflix Titus, its Feisty Team, and Daemons
QConSF18 - Disenchantment: Netflix Titus, its Feisty Team, and Daemons
 
Netflix oss season 1 episode 3
Netflix oss season 1 episode 3 Netflix oss season 1 episode 3
Netflix oss season 1 episode 3
 
Netflix oss season 2 episode 1 - meetup Lightning talks
Netflix oss   season 2 episode 1 - meetup Lightning talksNetflix oss   season 2 episode 1 - meetup Lightning talks
Netflix oss season 2 episode 1 - meetup Lightning talks
 
CMP376 - Another Week, Another Million Containers on Amazon EC2
CMP376 - Another Week, Another Million Containers on Amazon EC2CMP376 - Another Week, Another Million Containers on Amazon EC2
CMP376 - Another Week, Another Million Containers on Amazon EC2
 
NetflixOSS Meetup S6E2 - Spinnaker, Kayenta
NetflixOSS Meetup S6E2 - Spinnaker, KayentaNetflixOSS Meetup S6E2 - Spinnaker, Kayenta
NetflixOSS Meetup S6E2 - Spinnaker, Kayenta
 
The new Netflix API
The new Netflix APIThe new Netflix API
The new Netflix API
 
Timed Text At Netflix
Timed Text At NetflixTimed Text At Netflix
Timed Text At Netflix
 
Netflix OSS Meetup Season 5 Episode 1
Netflix OSS Meetup Season 5 Episode 1Netflix OSS Meetup Season 5 Episode 1
Netflix OSS Meetup Season 5 Episode 1
 
Season 7 Episode 1 - Tools for Data Scientists
Season 7 Episode 1 - Tools for Data ScientistsSeason 7 Episode 1 - Tools for Data Scientists
Season 7 Episode 1 - Tools for Data Scientists
 
Netflix OSS Meetup Season 4 Episode 4
Netflix OSS Meetup Season 4 Episode 4Netflix OSS Meetup Season 4 Episode 4
Netflix OSS Meetup Season 4 Episode 4
 
Series of Unfortunate Netflix Container Events - QConNYC17
Series of Unfortunate Netflix Container Events - QConNYC17Series of Unfortunate Netflix Container Events - QConNYC17
Series of Unfortunate Netflix Container Events - QConNYC17
 
Monitoring, the Prometheus Way - Julius Voltz, Prometheus
Monitoring, the Prometheus Way - Julius Voltz, Prometheus Monitoring, the Prometheus Way - Julius Voltz, Prometheus
Monitoring, the Prometheus Way - Julius Voltz, Prometheus
 
KURMA - A Containerized Container Platform - KubeCon 2016
KURMA - A Containerized Container Platform - KubeCon 2016KURMA - A Containerized Container Platform - KubeCon 2016
KURMA - A Containerized Container Platform - KubeCon 2016
 
Cloudsolutionday 2016: Docker & FAAS at getvero.com
Cloudsolutionday 2016: Docker & FAAS at getvero.comCloudsolutionday 2016: Docker & FAAS at getvero.com
Cloudsolutionday 2016: Docker & FAAS at getvero.com
 
How Docker EE Helps Open Doors at Assa Abloy
How Docker EE Helps Open Doors at Assa AbloyHow Docker EE Helps Open Doors at Assa Abloy
How Docker EE Helps Open Doors at Assa Abloy
 

Ähnlich wie Netflix and Containers: Not A Stranger Thing

Re:invent 2016 Container Scheduling, Execution and AWS Integration
Re:invent 2016 Container Scheduling, Execution and AWS IntegrationRe:invent 2016 Container Scheduling, Execution and AWS Integration
Re:invent 2016 Container Scheduling, Execution and AWS Integrationaspyker
 
AWS re:Invent 2016: Netflix: Container Scheduling, Execution, and Integration...
AWS re:Invent 2016: Netflix: Container Scheduling, Execution, and Integration...AWS re:Invent 2016: Netflix: Container Scheduling, Execution, and Integration...
AWS re:Invent 2016: Netflix: Container Scheduling, Execution, and Integration...Amazon Web Services
 
Scheduling a fuller house - Talk at QCon NY 2016
Scheduling a fuller house - Talk at QCon NY 2016Scheduling a fuller house - Talk at QCon NY 2016
Scheduling a fuller house - Talk at QCon NY 2016Sharma Podila
 
Kubernetes: від знайомства до використання у CI/CD
Kubernetes: від знайомства до використання у CI/CDKubernetes: від знайомства до використання у CI/CD
Kubernetes: від знайомства до використання у CI/CDStfalcon Meetups
 
AWS re:Invent 2016: Introduction to Container Management on AWS (CON303)
AWS re:Invent 2016: Introduction to Container Management on AWS (CON303)AWS re:Invent 2016: Introduction to Container Management on AWS (CON303)
AWS re:Invent 2016: Introduction to Container Management on AWS (CON303)Amazon Web Services
 
Moving Applications into Azure Kubernetes
Moving Applications into Azure KubernetesMoving Applications into Azure Kubernetes
Moving Applications into Azure KubernetesHussein Salman
 
Kubernetes 101
Kubernetes 101Kubernetes 101
Kubernetes 101Vishwas N
 
OSDC 2018 | Three years running containers with Kubernetes in Production by T...
OSDC 2018 | Three years running containers with Kubernetes in Production by T...OSDC 2018 | Three years running containers with Kubernetes in Production by T...
OSDC 2018 | Three years running containers with Kubernetes in Production by T...NETWAYS
 
Pivotal Container Service Overview
Pivotal Container Service Overview Pivotal Container Service Overview
Pivotal Container Service Overview VMware Tanzu
 
Structured Container Delivery by Oscar Renalias, Accenture
Structured Container Delivery by Oscar Renalias, AccentureStructured Container Delivery by Oscar Renalias, Accenture
Structured Container Delivery by Oscar Renalias, AccentureDocker, Inc.
 
DockerCon SF 2015 : Reliably shipping containers in a resource rich world usi...
DockerCon SF 2015 : Reliably shipping containers in a resource rich world usi...DockerCon SF 2015 : Reliably shipping containers in a resource rich world usi...
DockerCon SF 2015 : Reliably shipping containers in a resource rich world usi...Docker, Inc.
 
Open shift and docker - october,2014
Open shift and docker - october,2014Open shift and docker - october,2014
Open shift and docker - october,2014Hojoong Kim
 
Pivotal Container Service (PKS) at SF Cloud Foundry Meetup
Pivotal Container Service (PKS) at SF Cloud Foundry MeetupPivotal Container Service (PKS) at SF Cloud Foundry Meetup
Pivotal Container Service (PKS) at SF Cloud Foundry Meetupcornelia davis
 
Building and Deploying a Static Application using Jenkins and Docker in AWS
Building and Deploying a Static Application using Jenkins and Docker in AWSBuilding and Deploying a Static Application using Jenkins and Docker in AWS
Building and Deploying a Static Application using Jenkins and Docker in AWSijtsrd
 
AWS re:Invent 2016: Development Workflow with Docker and Amazon ECS (CON302)
AWS re:Invent 2016: Development Workflow with Docker and Amazon ECS (CON302)AWS re:Invent 2016: Development Workflow with Docker and Amazon ECS (CON302)
AWS re:Invent 2016: Development Workflow with Docker and Amazon ECS (CON302)Amazon Web Services
 
Netflix Titus WASP October 2017
Netflix Titus WASP October 2017Netflix Titus WASP October 2017
Netflix Titus WASP October 2017Andrew Leung
 
AWS re:Invent 2016: Running Microservices on Amazon ECS (CON309)
AWS re:Invent 2016: Running Microservices on Amazon ECS (CON309)AWS re:Invent 2016: Running Microservices on Amazon ECS (CON309)
AWS re:Invent 2016: Running Microservices on Amazon ECS (CON309)Amazon Web Services
 
SRV409 Deep Dive on Microservices and Docker
SRV409 Deep Dive on Microservices and DockerSRV409 Deep Dive on Microservices and Docker
SRV409 Deep Dive on Microservices and DockerAmazon Web Services
 
Kubernetes is all you need
Kubernetes is all you needKubernetes is all you need
Kubernetes is all you needVishwas N
 

Ähnlich wie Netflix and Containers: Not A Stranger Thing (20)

Re:invent 2016 Container Scheduling, Execution and AWS Integration
Re:invent 2016 Container Scheduling, Execution and AWS IntegrationRe:invent 2016 Container Scheduling, Execution and AWS Integration
Re:invent 2016 Container Scheduling, Execution and AWS Integration
 
AWS re:Invent 2016: Netflix: Container Scheduling, Execution, and Integration...
AWS re:Invent 2016: Netflix: Container Scheduling, Execution, and Integration...AWS re:Invent 2016: Netflix: Container Scheduling, Execution, and Integration...
AWS re:Invent 2016: Netflix: Container Scheduling, Execution, and Integration...
 
Scheduling a fuller house - Talk at QCon NY 2016
Scheduling a fuller house - Talk at QCon NY 2016Scheduling a fuller house - Talk at QCon NY 2016
Scheduling a fuller house - Talk at QCon NY 2016
 
Kubernetes: від знайомства до використання у CI/CD
Kubernetes: від знайомства до використання у CI/CDKubernetes: від знайомства до використання у CI/CD
Kubernetes: від знайомства до використання у CI/CD
 
Openshift Workshop
Openshift Workshop Openshift Workshop
Openshift Workshop
 
AWS re:Invent 2016: Introduction to Container Management on AWS (CON303)
AWS re:Invent 2016: Introduction to Container Management on AWS (CON303)AWS re:Invent 2016: Introduction to Container Management on AWS (CON303)
AWS re:Invent 2016: Introduction to Container Management on AWS (CON303)
 
Moving Applications into Azure Kubernetes
Moving Applications into Azure KubernetesMoving Applications into Azure Kubernetes
Moving Applications into Azure Kubernetes
 
Kubernetes 101
Kubernetes 101Kubernetes 101
Kubernetes 101
 
OSDC 2018 | Three years running containers with Kubernetes in Production by T...
OSDC 2018 | Three years running containers with Kubernetes in Production by T...OSDC 2018 | Three years running containers with Kubernetes in Production by T...
OSDC 2018 | Three years running containers with Kubernetes in Production by T...
 
Pivotal Container Service Overview
Pivotal Container Service Overview Pivotal Container Service Overview
Pivotal Container Service Overview
 
Structured Container Delivery by Oscar Renalias, Accenture
Structured Container Delivery by Oscar Renalias, AccentureStructured Container Delivery by Oscar Renalias, Accenture
Structured Container Delivery by Oscar Renalias, Accenture
 
DockerCon SF 2015 : Reliably shipping containers in a resource rich world usi...
DockerCon SF 2015 : Reliably shipping containers in a resource rich world usi...DockerCon SF 2015 : Reliably shipping containers in a resource rich world usi...
DockerCon SF 2015 : Reliably shipping containers in a resource rich world usi...
 
Open shift and docker - october,2014
Open shift and docker - october,2014Open shift and docker - october,2014
Open shift and docker - october,2014
 
Pivotal Container Service (PKS) at SF Cloud Foundry Meetup
Pivotal Container Service (PKS) at SF Cloud Foundry MeetupPivotal Container Service (PKS) at SF Cloud Foundry Meetup
Pivotal Container Service (PKS) at SF Cloud Foundry Meetup
 
Building and Deploying a Static Application using Jenkins and Docker in AWS
Building and Deploying a Static Application using Jenkins and Docker in AWSBuilding and Deploying a Static Application using Jenkins and Docker in AWS
Building and Deploying a Static Application using Jenkins and Docker in AWS
 
AWS re:Invent 2016: Development Workflow with Docker and Amazon ECS (CON302)
AWS re:Invent 2016: Development Workflow with Docker and Amazon ECS (CON302)AWS re:Invent 2016: Development Workflow with Docker and Amazon ECS (CON302)
AWS re:Invent 2016: Development Workflow with Docker and Amazon ECS (CON302)
 
Netflix Titus WASP October 2017
Netflix Titus WASP October 2017Netflix Titus WASP October 2017
Netflix Titus WASP October 2017
 
AWS re:Invent 2016: Running Microservices on Amazon ECS (CON309)
AWS re:Invent 2016: Running Microservices on Amazon ECS (CON309)AWS re:Invent 2016: Running Microservices on Amazon ECS (CON309)
AWS re:Invent 2016: Running Microservices on Amazon ECS (CON309)
 
SRV409 Deep Dive on Microservices and Docker
SRV409 Deep Dive on Microservices and DockerSRV409 Deep Dive on Microservices and Docker
SRV409 Deep Dive on Microservices and Docker
 
Kubernetes is all you need
Kubernetes is all you needKubernetes is all you need
Kubernetes is all you need
 

Mehr von aspyker

SRECon Lightning Talk
SRECon Lightning TalkSRECon Lightning Talk
SRECon Lightning Talkaspyker
 
Netflix Open Source Meetup Season 4 Episode 3
Netflix Open Source Meetup Season 4 Episode 3Netflix Open Source Meetup Season 4 Episode 3
Netflix Open Source Meetup Season 4 Episode 3aspyker
 
Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2aspyker
 
Netflix Container Runtime - Titus - for Container Camp 2016
Netflix Container Runtime - Titus - for Container Camp 2016Netflix Container Runtime - Titus - for Container Camp 2016
Netflix Container Runtime - Titus - for Container Camp 2016aspyker
 
Netflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open SourceNetflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open Sourceaspyker
 
Ibm cloud nativenetflixossfinal
Ibm cloud nativenetflixossfinalIbm cloud nativenetflixossfinal
Ibm cloud nativenetflixossfinalaspyker
 
Docker Demo IBM Impact 2014
Docker Demo IBM Impact 2014Docker Demo IBM Impact 2014
Docker Demo IBM Impact 2014aspyker
 
Netflix s2e1lightningtalk
Netflix s2e1lightningtalkNetflix s2e1lightningtalk
Netflix s2e1lightningtalkaspyker
 
Going Cloud Native with IBM Cloud and NetflixOSS for Dev@Pulse
Going Cloud Native with IBM Cloud and NetflixOSS for Dev@PulseGoing Cloud Native with IBM Cloud and NetflixOSS for Dev@Pulse
Going Cloud Native with IBM Cloud and NetflixOSS for Dev@Pulseaspyker
 

Mehr von aspyker (9)

SRECon Lightning Talk
SRECon Lightning TalkSRECon Lightning Talk
SRECon Lightning Talk
 
Netflix Open Source Meetup Season 4 Episode 3
Netflix Open Source Meetup Season 4 Episode 3Netflix Open Source Meetup Season 4 Episode 3
Netflix Open Source Meetup Season 4 Episode 3
 
Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2
 
Netflix Container Runtime - Titus - for Container Camp 2016
Netflix Container Runtime - Titus - for Container Camp 2016Netflix Container Runtime - Titus - for Container Camp 2016
Netflix Container Runtime - Titus - for Container Camp 2016
 
Netflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open SourceNetflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open Source
 
Ibm cloud nativenetflixossfinal
Ibm cloud nativenetflixossfinalIbm cloud nativenetflixossfinal
Ibm cloud nativenetflixossfinal
 
Docker Demo IBM Impact 2014
Docker Demo IBM Impact 2014Docker Demo IBM Impact 2014
Docker Demo IBM Impact 2014
 
Netflix s2e1lightningtalk
Netflix s2e1lightningtalkNetflix s2e1lightningtalk
Netflix s2e1lightningtalk
 
Going Cloud Native with IBM Cloud and NetflixOSS for Dev@Pulse
Going Cloud Native with IBM Cloud and NetflixOSS for Dev@PulseGoing Cloud Native with IBM Cloud and NetflixOSS for Dev@Pulse
Going Cloud Native with IBM Cloud and NetflixOSS for Dev@Pulse
 

Kürzlich hochgeladen

Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 

Kürzlich hochgeladen (20)

Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 

Netflix and Containers: Not A Stranger Thing

  • 1. and containers Andrew Spyker (@aspyker) - Engineering Manager Not
  • 2. About Netflix ● 86.7M members ● A few thousand employees ● 190+ countries ● > ⅓ NA internet download traffic ● 500+ Microservices ● Many 10’s of thousands VM’s ● 3 regions across the world
  • 3. Netflix has a elastic, cloud native, immutable microservice architecture using full devops built on VM’s! 3 Why are we messing around with containers?
  • 4. Technical motivating factors for containers ● Simpler management of compute resources ● Simpler deployment packaging artifacts for compute jobs ● Need for a consistent local developer environment
  • 5. Sampling of realized container benefits Media Encoding - encoding research development time ● VM’s platform to container platform - 1 month vs. 1 week Continuous Integration Testing ● Build all Netflix codebases in hours ● Saves development 100’s of hours of debugging Edge Re-architecture using NodeJS ● Focus returns to app development ● Simplifies, speeds test and deployment 5
  • 8. What do batch users want? ● Simple shared resources, run till done, job files ● NOT ○ EC2 Instance sizes, autoscaling, AMI OS’s ● WHY ○ Offloads resource management ops, Simpler
  • 9. Introducing Titus Batch Job Management Resource Management & Optimization Container Execution Integration Workflow, Data Analysis, Adhoc Upstream Systems
  • 10. Netflix Batch Job Examples ● Algorithm Model Training (with GPU’s)
  • 11. Netflix Batch Job Examples ● Media Encoding ● Digital Watermarking 1 1
  • 12. Netflix Batch Job Examples Open Connect CDN Reporting Adhoc Reporting
  • 13. ● Docker helped generalize use cases ● Scheduling required (GPU, elastic) ● Initially ignored failures (with retries) ● Time sensitive batch came later Lessons Learned from Batch
  • 14. Current Container Usage - Batch ● 1000’s of container hosts (g2, m4, r3 instances) ● 1000’s containers / hour average ● Large spikes of CI testing and Digital Watermarking From day of 10/26 ● 47K containers ● Bursts of 1000 containers in a minute
  • 16. Why Services in containers? Theory Reality Developer
  • 17. Opportunities to evolve our baking ● Java focused supported AMI, baking works well! ● However, wanted to allow ○ other stacks to evolve independent of OS updates ○ simplified builds (vs. Java and OS based tooling) ○ reliable smaller instances for dynamic languages ○ ability to develop locally with same image
  • 18. Services are just long running batch? Services Job Management Resource Management & Optimization Container Execution Integration Service Apps Batch
  • 19. 19 Nope, not that easy - Titus Details 19 Titus UITitus UI Docker Registry Docker Registry Rhea container container container docker Titus Agent metrics agent Titus executor logging agent zfs Mesos agent docker RheaTitus API Cassandra Titus Master Job Management & Scheduler S3 Zookeeper Docker Registry EC2 Autocaling API Mesos Master Titus UI Fenzo container Pod & VPC network drivers containercontainer AWS metadata proxy Integration AWS VM’sCI/CD
  • 20. Services more complex ● Services resize constantly and run forever ○ Autoscaling ○ Hard to upgrade underlying hosts ● Require IPC integration ○ Routable IPs, service discovery ○ Ready for traffic vs. just started/stopped ● Existing well defined dev, deploy, runtime & ops tools
  • 22. Multi-tenant Need IP per container - in VPC Using security groups Using IAM roles Considering network resource isolation
  • 23. Enabling VPC Networking No IP, SecGrp A Task 0 SecGrp Y,Z Task 1 Task 2 Task 3 Titus EC2 Host VMeth1 ENI1 SecGrp=A eth2 ENI2 SecGrp=X eth3 ENI3 SecGrp=Y,Z IP 1 IP 2 IP 3 pod root veth<id> app SecGrp X pod root veth<id> app SecGrp X pod root veth<id> appapp veth<id> Linux Policy Based Routing + Traffic Control Titus EC2 Metadata Proxy 169.254.169.254 IPTables NAT (*) * ** 169.254.169.254 Non-routable IP *
  • 24. Solutions ● VPC Networking driver ○ Supports ENI’s - full IP functionality ○ Scheduled security groups ○ Support traffic control (resource isolation) ● EC2 Metadata proxy ○ Adds container “node” identity ○ Delivers IAM roles
  • 25. Reuse existing infrastructure services VMVM EC2 AWSAutoScaler VMs App Cloud Platform (metrics, IPC, health) VPC Netflix Cloud Infrastructure (VM’s + Containers) Atlas Eureka Edda
  • 26. Enable them for containers VMVM EC2 AWSAutoScaler VMs App Cloud Platform (metrics, IPC, health) VPC Netflix Cloud Infrastructure (VM’s + Containers) VMVM Atlas TitusJobControl Containers App Cloud Platform (metrics, IPC, health) Eureka Edda VMVM Batch Containers
  • 28. Deploy based on new images tags
  • 29. Basic resource requirements IAM Roles & Sec Groups per container Deploy Strategies Same as VM’s
  • 31.
  • 32.
  • 33. Secure Multi-tenancy Common to VM’s and tiered security needed ● Protect the reduced host IAM role, Allow containers to have specific IAM roles ● Needed to support same security groups in container networking as VM’s User namespacing ● Docker 1.10 - Introduced User Namespaces ● Didn’t work /w shared networking NS ● Docker 1.11 - Fixed shared networking NS’s ● But, namespacing is per daemon, Not per container, as hoped ● Waiting on Linux ● Considering mass chmod / ZFS clones
  • 34. Titus Advanced Scheduling ● Support for AZ balancing ● Multiple instance types selected based on workload ● Elastic underlying common resource pool ○ Bin packing managed transparently across all apps ● Hard and soft constraints ● Resource affinity and task affinity ● Capacity guarantees (critical tier) 34
  • 35. Fenzo - Keep resource scheduling extensible Fenzo - Extensible Scheduling Library Features: ● Heterogeneous resources & tasks ● Autoscaling of mesos cluster ○ Multiple instance types ● Plugins based scheduling objectives ○ Bin packing, etc. ● Plugins based constraints evaluator ○ Resource affinity, task locality, etc. ● Scheduling actions visibility
  • 36. Current Container Usage - Service ● Still small ~ 2000 long running containers ● NodeJS Device UI Scripts Apps ● Stream Processing Jobs - Flink ● Various Internal Dashboards
  • 37. Questions? Andrew Spyker (@aspyker) - Engineering Manager