Netflix
Massively Scalable
Highly Available
Immutable
Infrastructure
Amer Ather
Netflix Performance Engineering
Netflix Facts
❖ Leading Video streaming Service
❖ 140+ million paid subscribers globally
❖ 190 countries
❖ Millions of hours watched per month
❖ $13 billion spent on content per year
❖ 15% of the world’s Internet bandwidth
❖ 1998 - Netflix was founded
❖ 1999 - DVD distribution launched
❖ 2007 - Video streaming launched
❖ 2010 - Expanded into Canada
❖ 2014 - Expanded into Europe
❖ 2016 - Globally Launched
Load Balancing across AWS Regions
❖ Multiple active AWS regions
❖ Traffic load balanced across 3
AWS regions
❖ Takes into account geographical
location of subscriber
❖ Enough capacity to handle region
failures gracefully
❖ Region failover is handled via
Netflix Gateway (Zuul) and DNS
steering
Note: Netflix avoids unbalanced regions by
shifting a portion of local traffic to remote regions
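The geography-aware steering with regional failover described above can be sketched as follows. This is a minimal illustration and not Netflix's actual DNS or Zuul logic: the geography names, the preference ordering, and the `pick_region` helper are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Set

# Hypothetical preference order per subscriber geography (closest region first).
PREFERRED_REGIONS: Dict[str, List[str]] = {
    "north-america-west": ["us-west-2", "us-east-1", "eu-west-1"],
    "north-america-east": ["us-east-1", "us-west-2", "eu-west-1"],
    "europe": ["eu-west-1", "us-east-1", "us-west-2"],
}


def pick_region(geo: str, healthy: Set[str]) -> Optional[str]:
    """Return the closest healthy region for a subscriber, or None if none is healthy."""
    for region in PREFERRED_REGIONS[geo]:
        if region in healthy:
            return region
    return None


if __name__ == "__main__":
    # Normal operation: a European subscriber lands in eu-west-1.
    print(pick_region("europe", {"us-west-2", "eu-west-1", "us-east-1"}))
    # eu-west-1 has failed: the same subscriber is steered to the next region.
    print(pick_region("europe", {"us-west-2", "us-east-1"}))
```

Failover only works if the remaining regions hold enough spare capacity, which is the point of the capacity bullet above.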
[Diagram: Netflix control plane spanning us-west-2, eu-west-1, and us-east-1]
Netflix Cloud
Gateway
Zuul
Zuul - Front Door to Netflix Ecosystem
Self Service Routing
❖ Traffic sharding
❖ Gradual migration
❖ Canary and
squeeze testing
❖ Authentication
❖ Security rules to
reject traffic from
bad devices
Resiliency and LB
❖ Failover around server
failures: slow response,
GC
❖ Graceful traffic ramp up
to newly launched
instances
❖ Blacklist bad instances
❖ Prevent overloading
❖ Track server utilization
Anomaly Detection
❖ Aggregate error rates
to detect if service is
in trouble
❖ Contextual alerting
about anomalies to
support and service
teams
❖ Helps with root cause
and correlation
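Two of the load-balancing ideas on this slide, graceful ramp-up for newly launched instances and blacklisting of bad instances, can be sketched with a weighted random choice. This is only an illustration under stated assumptions: the linear warm-up curve, the 300-second warm-up period, the 0.1 weight floor, and the function names are hypothetical and are not taken from Zuul.

```python
import random
from typing import Dict, Optional, Set


def ramp_weight(age_s: float, warmup_s: float = 300.0, floor: float = 0.1) -> float:
    """Traffic weight that grows linearly from `floor` to 1.0 over the warm-up period."""
    return max(floor, min(1.0, age_s / warmup_s))


def choose_instance(ages_s: Dict[str, float], blacklist: Set[str],
                    rng: random.Random) -> Optional[str]:
    """Pick an instance at random, favouring warmed-up ones and skipping blacklisted ones."""
    candidates = [name for name in ages_s if name not in blacklist]
    if not candidates:
        return None
    weights = [ramp_weight(ages_s[name]) for name in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(42)
    fleet = {"i-old": 3600.0, "i-new": 30.0, "i-bad": 3600.0}
    picks = [choose_instance(fleet, {"i-bad"}, rng) for _ in range(1000)]
    # The new instance receives only a small share of traffic while it warms up.
    print({name: picks.count(name) for name in ("i-old", "i-new", "i-bad")})
```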
Netflix
Edge Services
API and Edge PaaS
API - Netflix Edge Services
❖ Tier 1 Service
❖ Serves Netflix devices
❖ Compose calls to mid tier services
required to construct a response
❖ Orchestrate UI request to mid tier
services
❖ Fallback logic to avert customer
facing outages
❖ Abstract away mid tier changes
from UI development
❖ Facade over the entirety of Netflix
mid-tier services
❖ Proxy device requests to reduce
network chattiness and latency
❖ Promotes request/response model
that best fits device unique
requirement
Edge PaaS - Netflix Edge Services
Decouple
device UI
development
from API and
mid Tier service
changes
Per device
endpoints
customized for
device type for
a richer
experience
Each endpoint is
isolated in
container for
better visibility
and debugging
Node Quark platform
for ease of node.js
development and
integration with Netflix
platform
Titus Container
Platform
Cloud
Deployment via
Spinnaker CI/CD
Platform
RSL
Remote
Service
Layer
for data
access to
API tier via
remote calls
Device specific
instead of
traditional REST
API
(Device
code is
mostly
written in
javascript)
nodejs
Resiliency
and
Concurrency
Load Shedding (server)
❖ When service is running in steady state:
concurrency = service time x service rate
❖ Requests in excess of this concurrency
limit cannot be serviced immediately.
❖ Service has two options: queue or reject
❖ Netflix services reject requests over the
set limit to avoid oversaturation
❖ Server-side throttling is performed by
setting a cap on the concurrent requests a
service can handle
Netflix microservices use a servlet filter mechanism, part of the
platform library, to intercept requests of interest and throttle
them based on the current load on the server
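The steady-state relationship (concurrency = service time x service rate) and the reject-over-limit policy can be sketched as below. This is a minimal, framework-free illustration rather than the Netflix servlet filter itself: the `ConcurrencyLimiter` class, the `handle` wrapper, and the use of HTTP 503 for rejected requests are assumptions for the example.

```python
import threading
from typing import Any, Callable, Tuple


def steady_state_concurrency(service_time_s: float, service_rate_rps: float) -> float:
    """Concurrency = service time x service rate (Little's law)."""
    return service_time_s * service_rate_rps


class ConcurrencyLimiter:
    """Caps in-flight requests; excess requests are rejected instead of queued."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self._in_flight = 0
        self._lock = threading.Lock()

    def try_acquire(self) -> bool:
        with self._lock:
            if self._in_flight >= self.limit:
                return False
            self._in_flight += 1
            return True

    def release(self) -> None:
        with self._lock:
            if self._in_flight > 0:
                self._in_flight -= 1


def handle(limiter: ConcurrencyLimiter, work: Callable[[], Any]) -> Tuple[int, Any]:
    """Run `work` under the limiter; return (503, None) when the cap is reached."""
    if not limiter.try_acquire():
        return 503, None
    try:
        return 200, work()
    finally:
        limiter.release()


if __name__ == "__main__":
    # A service with 50 ms service time at 200 requests/s needs about 10 concurrent slots.
    print(steady_state_concurrency(0.05, 200))
    limiter = ConcurrencyLimiter(limit=10)
    print(handle(limiter, lambda: "ok"))
```

Rejecting rather than queueing keeps latency bounded for the requests that are admitted, at the cost of returning errors to the excess callers.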
Fault Tolerance (Client)
❖ Services protect themselves from latency and failure
conditions:
➢ 5xx responses, refused connections, and timeouts
❖ A retried request can be routed to the next server by the load
balancer until max retries is reached
❖ Fail fast and recover rapidly
❖ Fallback and graceful degradation
❖ Fallback to failure paths to avert outages
❖ Stop cascading failures
❖ AWS zone-aware load balancing (zone affinity)
Netflix microservices use the Ribbon RestClient and Hystrix
libraries to set up latency and failure tolerance toward downstream
dependency services.
Netflix microservices based on gRPC
do not use the Ribbon and Hystrix
libraries, as the features they offer are
already provided in gRPC.
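The retry-on-next-server and fallback behaviour listed above can be sketched in a few lines. This is a simplified stand-in and not the Ribbon or Hystrix API: `DownstreamError`, `call_with_retry`, and the server names are hypothetical, and a real client would also apply timeouts and circuit breaking.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


class DownstreamError(Exception):
    """Stands in for a 5xx response, a refused connection, or a timeout."""


def call_with_retry(servers: Sequence[str],
                    request: Callable[[str], T],
                    fallback: Callable[[], T],
                    max_retries: int = 2) -> T:
    """Try the request on successive servers, then degrade to the fallback."""
    attempts = min(max_retries + 1, len(servers))
    for server in servers[:attempts]:
        try:
            return request(server)
        except DownstreamError:
            continue  # route the retry to the next server in the list
    return fallback()  # fail fast with a degraded response instead of an outage


if __name__ == "__main__":
    def request(server: str) -> str:
        if server == "srv-a":
            raise DownstreamError("timeout")
        return f"response from {server}"

    print(call_with_retry(["srv-a", "srv-b", "srv-c"], request, lambda: "cached default"))
```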
Auto Discover Concurrency Limits
❖ Setting concurrency limits manually in a changing environment is challenging
❖ Manual limits require constant care and monitoring as load characteristics change
❖ Better approach: Identify concurrency limits dynamically and throttle requests before service
degrades
❖ Concept is borrowed from TCP congestion control algorithms:
➢ congestion window to determine packets transferred without incurring timeouts
➢ Tracks the ratio of minimum latency to time-sampled latency: RTTnoload/RTTactual
➢ Grow the window (increase request rate) if ratio = 1
➢ Shrink the window (decrease request rate) if ratio < 1
❖ Limit is adjusted using the formula: newLimit = currentLimit x (RTTnoload/RTTactual) + queueSize
Netflix microservices are in the process of
migrating to gRPC from the internal Ribbon IPC
mechanism. Netflix has open sourced a gRPC
library for dynamically auto-detecting the
concurrency limits of a service
queueSize is a tunable that determines how fast the queue can grow
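The limit-update formula on this slide can be worked through numerically. The sketch below implements only that formula; it is not the API of the open-sourced library, and the function name and sample latencies are assumptions for illustration.

```python
def new_limit(current_limit: float, rtt_noload: float, rtt_actual: float,
              queue_size: float) -> float:
    """newLimit = currentLimit x (RTTnoload / RTTactual) + queueSize."""
    gradient = rtt_noload / rtt_actual
    return current_limit * gradient + queue_size


if __name__ == "__main__":
    # No queuing observed (ratio = 1): the limit grows by queueSize.
    print(new_limit(100, rtt_noload=10.0, rtt_actual=10.0, queue_size=4))
    # Latency has doubled (ratio = 0.5): the limit shrinks sharply.
    print(new_limit(100, rtt_noload=10.0, rtt_actual=20.0, queue_size=4))
```

With a limit of 100 and queueSize of 4, an unloaded service moves to 104, while a service whose latency has doubled drops to 54.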
Netflix
Microservices
Microservice Architecture
An architecture that decomposes one large monolithic application into a suite
of small services, where each service:
❖ Implements different sets of business logic
❖ Is a software module exposed on network via web API
❖ Interacts via some form of RPC mechanism; Netflix Ribbon, gRPC
❖ RPC is a thin layer over standards: HTTP/1.1, HTTP/2.0 transports
❖ Exchanges data via: JSON, Protocol Buffers
❖ Builds, deploys, upgrades and scales independently
❖ Can be developed in different languages: java, python, go, nodejs..
❖ Is free to choose its own datastore for persistence: cassandra,
memcache, redis, elasticsearch, mongoDB..
❖ Relies on platform libraries that support: retries, timeouts, load balancing, fallback
❖ Massively scalable due to loose coupling, stateless model and data
sharding
Microservices Design Rules
❖ Services should not share data or database
❖ Services expose their data and functionality only through well defined service interface
❖ Transactions should not span multiple services, as that violates their autonomy
❖ One service should not lock resources of another service
❖ API first: the design takes into account upstream service requirements (clients) and dependencies on
downstream services. The API should be externalizable (open to the public) without major effort
❖ Split service into multiple microservices when functions performed by a service have no strong
relationship with one another.
Ideally, each service team should own the release cycle as well as the production
operations (DevOps) of their service
Monolithic vs MicroServices
Monolithic
➢ Data Center Architecture
➢ Design for predictable scalability
➢ Relational DB: Oracle, MySQL
➢ Strong consistency
➢ Shared database
➢ Serial and synchronized processing
➢ Design to avoid failures
➢ Infrequent and slower updates
➢ Manual management
➢ Failures may result in outage
➢ Limited scalability due to stateful design
Microservices
➢ Cloud Architecture
➢ Decomposed and decentralized
➢ Design for elastic scale
➢ Polyglot persistence (mix of datastores)
➢ Eventual consistency
➢ Sharded datasets
➢ Parallel and async processing
➢ Design for failure
➢ Frequent updates (more features)
➢ Self-management (DevOps, CI/CD)
➢ Massively scalable due to stateless design goals
➢ Immutable infrastructure
RESTful Service (Web API)
❖ A platform that exposes data as a resource on which to operate
All client actions on a resource (identified by URI) are represented by HTTP methods that map to CRUD operations:
➢ POST: Create | GET: Read | PUT/PATCH: Update | DELETE: Delete
➢ HTTP status codes (2xx, 3xx, 4xx, 5xx) are returned with response.
❖ Server response is sent in JSON
❖ A simple client (curl) can be used to invoke REST methods
❖ Each request/response is stateless and thus can be cached and massively scaled.
❖ Client maintains state and furnishes to server at every request
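The mapping from HTTP methods to CRUD operations, with JSON bodies and status codes, can be sketched without a web framework. The `TitleResource` class, the URIs, and the choice of specific status codes (201, 204, 404, 405, 409) are assumptions for illustration; a real service would sit behind an HTTP server.

```python
import json
from typing import Dict, Optional, Tuple


class TitleResource:
    """In-memory resource store keyed by URI; each call is stateless."""

    def __init__(self) -> None:
        self._items: Dict[str, dict] = {}

    def handle(self, method: str, uri: str, body: Optional[str] = None) -> Tuple[int, str]:
        if method == "POST":  # Create
            if uri in self._items:
                return 409, json.dumps({"error": "already exists"})
            self._items[uri] = json.loads(body or "{}")
            return 201, json.dumps(self._items[uri])
        if uri not in self._items:
            return 404, json.dumps({"error": "not found"})
        if method == "GET":  # Read
            return 200, json.dumps(self._items[uri])
        if method == "PUT":  # Update (full replacement)
            self._items[uri] = json.loads(body or "{}")
            return 200, json.dumps(self._items[uri])
        if method == "DELETE":  # Delete
            del self._items[uri]
            return 204, ""
        return 405, json.dumps({"error": "method not allowed"})


if __name__ == "__main__":
    api = TitleResource()
    print(api.handle("POST", "/titles/1", '{"name": "The Crown"}'))
    print(api.handle("GET", "/titles/1"))
    print(api.handle("DELETE", "/titles/1"))
    print(api.handle("GET", "/titles/1"))
```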
“REST-ish” API - Netflix Falcor
❖ REST interface works well for large hypermedia resources
❖ A web app deals with structured data that can be a large number of
small resources, e.g. video metadata
❖ Latency becomes a major constraint when fetching these small
resources via REST calls on mobile networks.
➢ Rendering Netflix home page on device may require 20-30
REST calls to server
❖ A REST-ish API suits developers who want to do more with a single REST call
❖ A REST-ish API is less RESTful and more RPC
➢ The URL of a resource is used to invoke a procedure call
➢ The URL query string becomes the RPC parameters
❖ Falcor represents data as one giant JSON model and offers an async
API that allows data to be pushed to the model via callback
❖ Same benefits as REST (cache consistency, loose coupling)
❖ Batching multiple requests results in a single network request.
❖ Falcor represents data as a JSON graph by using references. This
avoids duplicates and stale data by storing each value in one place
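How references keep each value in one place can be shown with a small path resolver over a JSON Graph shaped document. Falcor itself is a JavaScript library; the Python below only mimics the idea, and the `get_path` helper and the sample data are assumptions for illustration. The `{"$type": "ref", "value": [...]}` shape follows the JSON Graph reference convention.

```python
from typing import Any, Dict, List


def get_path(graph: Dict[str, Any], path: List[str]) -> Any:
    """Walk `path` from the graph root, following any reference encountered."""
    node: Any = graph
    for key in path:
        node = node[key]
        # A reference is replaced by the node it points to before continuing.
        while isinstance(node, dict) and node.get("$type") == "ref":
            node = get_path(graph, node["value"])
    return node


if __name__ == "__main__":
    graph = {
        "titlesById": {"42": {"name": "The Crown", "rating": 5}},
        "myList": {"0": {"$type": "ref", "value": ["titlesById", "42"]}},
        "continueWatching": {"0": {"$type": "ref", "value": ["titlesById", "42"]}},
    }
    print(get_path(graph, ["myList", "0", "name"]))
    # Updating the single stored copy is visible through every list that references it.
    graph["titlesById"]["42"]["rating"] = 4
    print(get_path(graph, ["continueWatching", "0", "rating"]))
```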
gRPC for Microservices (Benefits)
❖ RPC framework for building microservices that uses HTTP/2 transport to support advanced features:
➢ Request multiplexing (streams) , pipelining, Server push; Binary protocol
❖ Protocol Buffers are used for defining and serializing structured data into an efficient binary format. No JSON schema.
❖ A client can invoke a method on a different machine as if it were a local object. RPC methods become RPC endpoints
❖ Netty transport provides async and non-blocking IO
❖ Strongly typed and versioned. Simplified API (struct in, struct out)
❖ Decouples the interface from any specific programming language via IDL
❖ Automated code generation to implement service interfaces (API): clients, server, data models, metrics, logging, tracing,
failover, retry, deadline, cancellation, etc. Supports plugins to extend features
❖ HttpRule in service definition to define mapping of an RPC method to HTTP REST methods
Netflix
Caching
And
Persistence Tier
Netflix Global Cache (EVCache)
❖ Stateless microservices often maintain state in caches or persistent tier
❖ Caches offer loose coupling by maintaining states for stateless services
❖ EVCache is a RAM+NVMe based key-value store (memcache) that offers low latency and
scalable caching solution. Optimized for Cloud and Netflix use cases.
❖ Maintains state in-region and across regions via a global replication design to serve
requests originating from any region, using Kafka-based cross-region replication
➢ Tunable eventual consistency model that tolerates inconsistency for some time
➢ Asynchronous replication keeps local cache operations from being affected by transient
failures in updating caches in other regions
➢ Avoids the “thundering herd” scenario that may result from cold caches after region failover.
❖ Caching tier is used for caching computed data and data retrieved from persistence
stores such as Cassandra, S3, DB…
❖ EVCache tier is also used for replica and instance cache warming:
➢ To recover data from lost EVCache instances
➢ To scale up the caching tier for more storage and network capacity
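The asynchronous, eventually consistent replication described above can be illustrated with an in-memory queue standing in for the Kafka replication stream. This is a toy model and not the EVCache client: `RegionCache`, `Replicator`, and the key names are hypothetical.

```python
from collections import deque
from typing import Any, Deque, Dict, List, Tuple


class RegionCache:
    """A single region's key-value cache."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data: Dict[str, Any] = {}


class Replicator:
    """Applies writes locally at once and queues them for the other regions."""

    def __init__(self, regions: List[RegionCache]) -> None:
        self.regions = regions
        self.pending: Deque[Tuple[RegionCache, str, Any]] = deque()

    def set(self, origin: RegionCache, key: str, value: Any) -> None:
        origin.data[key] = value  # the local write does not wait on remote regions
        for region in self.regions:
            if region is not origin:
                self.pending.append((region, key, value))

    def drain(self) -> int:
        """Deliver queued updates; returns how many were applied."""
        applied = 0
        while self.pending:
            region, key, value = self.pending.popleft()
            region.data[key] = value
            applied += 1
        return applied


if __name__ == "__main__":
    east, west = RegionCache("us-east-1"), RegionCache("us-west-2")
    replicator = Replicator([east, west])
    replicator.set(east, "bookmark:user1", 1234)
    print(west.data.get("bookmark:user1"))  # not yet replicated
    replicator.drain()
    print(west.data.get("bookmark:user1"))  # now consistent
```

Keeping the remote caches warm in this way is what avoids a cold-cache stampede when traffic fails over to another region.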
Netflix
Immutable
Infrastructure
What is Immutable Infrastructure
❖ Servers are never modified in production, merely replaced with new, updated ones
❖ No reboots or individual server changes in production during a server's lifespan
❖ Changes are made to the base image and then deployed on new server instances
➢ Older server instances are terminated after a successful deployment
❖ Roll back changes in case of problems
❖ Guarantees a known stable state if instances are frequently destroyed and redeployed. No configuration drift
❖ Follows the infrastructure-as-code methodology, which rebuilds the whole environment from scratch
using easy-to-adjust manifests.
Netflix microservices offer “fast property” that allows enabling/disabling limited features dynamically while
service is running in production.
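The replace-rather-than-modify rule, including rollback, can be sketched as a function that returns a new fleet instead of mutating the old one. This is a conceptual illustration only: the `Instance` type, the image identifiers, and the health-check callback are assumptions, and no cloud API is involved.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)  # frozen: an instance's image cannot be changed after launch
class Instance:
    image_id: str


def deploy(current: List[Instance], new_image: str,
           healthy: Callable[[Instance], bool]) -> List[Instance]:
    """Launch a replacement fleet from the new image; keep the old fleet on failure."""
    replacement = [Instance(new_image) for _ in current]
    if all(healthy(instance) for instance in replacement):
        return replacement  # the old instances would now be terminated
    return current  # rollback: the untouched old fleet keeps serving


if __name__ == "__main__":
    fleet = [Instance("image-v1"), Instance("image-v1")]
    print(deploy(fleet, "image-v2", healthy=lambda i: True))
    print(deploy(fleet, "image-v3", healthy=lambda i: i.image_id != "image-v3"))
```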
Immutable infrastructure (CI/CD Platform)
Updates, canary analysis, and deployments
are fully automated and orchestrated via a
Continuous Integration and Continuous
Deployment or Delivery (CI/CD) platform
Public Cloud as Immutable Infrastructure
❖ Disposable or throw away cloud instances
❖ Decide what infrastructure and services to manage. Public cloud providers
offer a number of useful managed services
❖ Cloud deployable entities: VM, Firecracker, Containers, Fargate, Lambda
❖ No hardware to repair or troubleshoot. Just provision a new one
❖ Failures are non-events. Health check failures result in redirected traffic
❖ Bad or terminated instances are replaced without human intervention
❖ No service downtime, thanks to massive deployment and fault tolerance
❖ Elastic capacity and pay-as-you-go model
❖ Auto scaling rules keep enough resources available to meet load demand
❖ Global reach and availability to execute disaster recovery plans
My presentation on Public Cloud Computing Workshop
Netflix
Chaos
And
Fault Injection
Planning for Failure
Regional Failures (Nimble)
❖ Drop in SPS (stream starts per second) metrics
triggers regional failover
❖ Regional failover is executed in 7 Minutes
❖ Failover efficiency is achieved by keeping dark
capacity online in each region
❖ Dark capacity is whitelisted to take production
traffic at failover time.
Limited Scope Failures (ChAP)
❖ ChAP tests service resilience to
failures and validates fallbacks behave
as expected.
❖ ChAP helps uncover systemic
weaknesses that may occur when
higher latency is induced
❖ ChAP service uses the FIT (Failure
Injection Testing) framework for fine-grained
control over a failure and its impact
❖ Zuul gateway updates requests with
FIT metadata that provides failure
context to the microservices involved.
❖ Microservices check the FIT context to
determine if a particular request should
be impacted
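The per-request check of failure context can be sketched as follows. The real FIT metadata format is not shown in the slides, so the `fail_points` key, the `InjectedFailure` exception, and the function names are purely hypothetical; the sketch only shows a request carrying a context that a service consults before calling a dependency.

```python
from typing import Any, Callable, Dict, Optional


class InjectedFailure(Exception):
    """Raised in place of a real dependency call when a failure is injected."""


def should_inject(fit_context: Optional[Dict[str, Any]], injection_point: str) -> bool:
    """True only when this request's context names the given injection point."""
    if not fit_context:
        return False
    return injection_point in fit_context.get("fail_points", ())


def call_dependency(name: str, fit_context: Optional[Dict[str, Any]],
                    real_call: Callable[[], Any], fallback: Callable[[], Any]) -> Any:
    """Call a dependency, simulating its failure when the context asks for it."""
    try:
        if should_inject(fit_context, name):
            raise InjectedFailure(name)
        return real_call()
    except InjectedFailure:
        return fallback()  # the experiment validates that this path behaves as expected


if __name__ == "__main__":
    tagged = {"fail_points": ["ratings-service"]}
    print(call_dependency("ratings-service", tagged, lambda: "live ratings", lambda: "default ratings"))
    print(call_dependency("ratings-service", None, lambda: "live ratings", lambda: "default ratings"))
```

Because the context travels with the request, only tagged requests are impacted and all other traffic takes the normal path.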
Netflix
Encoding
and
Content Delivery
Open Connect Appliances (OCA)
Features
❖ Netflix Managed CDN
❖ Directed Cached Appliances
❖ Deployed at IXP and ISP
colocations globally
❖ Serve subscribers from location
closer to them for optimal
viewing experience
❖ Reduce ISP and Netflix cost of
transporting content
❖ Local caching leads to reduced
and responsible use of the Internet
❖ Network capacity of ~100Gbps
❖ Stores portion of Netflix Catalog
❖ 7x24 monitoring
Periodic Fill and Allocation
❖ Push Fill Methodology
❖ Download popular content
during non-peak bandwidth
❖ Incremental and Tiered Filling
❖ ML models - compute content
popularity to decide which titles to
cache by aggregating title/file
usage and viewing history
❖ HCA algorithm for content
distribution that offers efficient
use of server resources
❖ Adapt algorithms to deal with
the dynamics of regional member
preferences, evolving network
conditions, and new markets
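Popularity-driven fill can be illustrated with a simple greedy plan: take titles in order of predicted popularity until the appliance is full. This is not the HCA algorithm or the production ML pipeline; the scores, sizes, and the `plan_fill` helper are invented for the example.

```python
from typing import List, Tuple

# (title, predicted popularity score, size in GB) - all values are hypothetical.
Title = Tuple[str, float, float]


def plan_fill(titles: List[Title], capacity_gb: float) -> List[str]:
    """Greedily select the most popular titles that still fit on the appliance."""
    chosen: List[str] = []
    used_gb = 0.0
    for name, _popularity, size_gb in sorted(titles, key=lambda t: t[1], reverse=True):
        if used_gb + size_gb <= capacity_gb:
            chosen.append(name)
            used_gb += size_gb
    return chosen


if __name__ == "__main__":
    catalog = [
        ("title-a", 0.90, 40.0),
        ("title-b", 0.70, 80.0),
        ("title-c", 0.40, 30.0),
        ("title-d", 0.05, 20.0),
    ]
    # With 100 GB available, title-b does not fit after title-a and is skipped.
    print(plan_fill(catalog, capacity_gb=100.0))
```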
Adaptive Streaming
Adaptive Bitrate (ABR)
❖ Maximize video quality without rebuffer events
❖ Adapt to network events by picking a different bitrate
❖ Multiple profiles (format) for every title encoded
❖ Supported Bitrate: 235 kbps - 5 Mb/s (4K video)
❖ Cache one or more files for each quadruple:
➢ title, profile, bitrate, language
➢ E.g: one episode of Crown = 1200 files
❖ Video Codec: H.264/AVC, HEVC, VP9, AV1
❖ Audio Codec: AAC, DD+, ATMOS
❖ Max Resolution: 1080p, 2160p, 4K, HDR, HFR
Per-shot encoding
❖ Dynamic Optimizer Encoding
❖ Allocate bits optimally for best overall quality
❖ Selects best encoding recipe per-shot
❖ Remove redundancy in video stream via
Spatial and temporal prediction/correlation
❖ 64% fewer bits for the same quality
❖ Good streaming experience at < 200 kbps
❖ 4 GB Plan = 30 Hours of quality viewing
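Two points from this slide lend themselves to a short worked example: choosing a bitrate that the measured throughput can sustain, and the average bitrate implied by 30 hours of viewing on a 4 GB plan. Only the 235 kbps and 5 Mb/s endpoints come from the slide; the intermediate ladder steps, the 0.8 safety factor, and the function names are assumptions, and 1 GB is taken as 10^9 bytes.

```python
from typing import List

# Hypothetical bitrate ladder in kbps; only the first and last values appear in the slides.
LADDER_KBPS: List[int] = [235, 375, 560, 750, 1050, 1750, 2350, 3000, 4300, 5000]


def select_bitrate(throughput_kbps: float, safety: float = 0.8) -> int:
    """Highest ladder bitrate within a safety margin of measured throughput."""
    budget = throughput_kbps * safety
    candidates = [b for b in LADDER_KBPS if b <= budget]
    return max(candidates) if candidates else LADDER_KBPS[0]


def average_kbps(plan_gb: float, hours: float) -> float:
    """Average bitrate that spends `plan_gb` gigabytes over `hours` of viewing."""
    bits = plan_gb * 1e9 * 8
    seconds = hours * 3600
    return bits / seconds / 1000


if __name__ == "__main__":
    print(select_bitrate(2000))          # throughput drops, so a lower profile is chosen
    print(select_bitrate(100))           # below the ladder: fall back to the lowest profile
    print(round(average_kbps(4, 30)))    # roughly 296 kbps
```

The second result shows that 30 hours on 4 GB works out to just under 300 kbps on average, which is in the same range as the sub-200 kbps figure quoted for per-shot encoding.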
Netflix
Data Pipeline
Event Stream Processing at Cloud Scale
❖ Collects, aggregates, processes, and moves data at
cloud scale
➢ 500 billion events or 1.3 PB data per day
➢ Peak traffic : 8 million events/sec or 24 GB/s
❖ Type of event streams flowing into the pipeline:
➢ Video viewing and UI activities
➢ Device error logs, diagnostics, and perf events
❖ Kafka as a replicated persistent message queue
➢ Multiple copies with 12-24 hour retention period
❖ Data is ingested into kafka fronting clusters via
Java Library or via Kafka REST endpoint.
❖ Route events from Kafka to various sinks:
➢ Elasticsearch - for near real time analysis
➢ S3 bucket - imported into Hive for data
warehouse and big data analytics
➢ Consumer Kafka tier - used by streaming
services like Mantis and Spark Streaming
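The volume figures on this slide can be cross-checked with a little arithmetic. The calculation below uses only the numbers quoted above and assumes decimal units (1 PB = 10^15 bytes, 1 GB = 10^9 bytes); the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60

events_per_day = 500e9        # 500 billion events per day
bytes_per_day = 1.3e15        # 1.3 PB per day
peak_events_per_s = 8e6       # 8 million events per second at peak
peak_bytes_per_s = 24e9       # 24 GB/s at peak

avg_events_per_s = events_per_day / SECONDS_PER_DAY
avg_bytes_per_s = bytes_per_day / SECONDS_PER_DAY
avg_event_bytes = bytes_per_day / events_per_day
peak_event_bytes = peak_bytes_per_s / peak_events_per_s
peak_to_avg = peak_events_per_s / avg_events_per_s

if __name__ == "__main__":
    print(f"average rate: {avg_events_per_s / 1e6:.2f} million events/s")
    print(f"average throughput: {avg_bytes_per_s / 1e9:.2f} GB/s")
    print(f"average event size: {avg_event_bytes:.0f} bytes")
    print(f"event size at peak: {peak_event_bytes:.0f} bytes")
    print(f"peak-to-average ratio: {peak_to_avg:.2f}")
```

The daily totals imply an average of about 5.8 million events per second at roughly 2.6 KB per event, so the quoted peak of 8 million events per second is around 1.4 times the average rate.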
Business Insight
Consumer Insight
❖ Predicts viewing habits
❖ Fuels recommendation
engines
❖ Qualitative research
❖ A/B testing
❖ Adaptive row ordering
❖ Title Placement
❖ ..
Events gathered
❖ Time of day content is
watched
❖ Time spent selecting
content
❖ Playback stopped by
user or network
congestion
❖ Bookmarking
❖ ..
Anomaly Detection
❖ Device firmware
differences
❖ Real time diagnostics
❖ Device health check
❖ Network throughput and
congestion differences
across ISP networks
❖ ..
Netflix
Monitoring
And
Tracing
Self Service Monitoring and Debugging
Monitoring: Build custom dashboards for better correlation and root cause analysis
❖ Service monitoring (Atlas/Lumen) - Telemetry system for microservice health and auto-scaling
❖ Device monitoring (Mantis) - Event streams from devices filtered for: health check, perf, analytics..
❖ Host monitoring - (Vector) - Web UI for on-demand system monitoring
❖ Ad hoc monitoring - (Abyss) - Low level deep dive performance analysis for escalated issues
Anomaly Detection: Detect slower performing instances, infrastructure issues and faulty hardware
❖ Alerts - Set up alerts on Atlas metrics and filters on Mantis event streams
❖ Chronos - Tracks infrastructure changes (service or OS). Changes are logged to aid root cause analysis
❖ jvmquake - Terminate nodes with abnormal Garbage Collection Time (GC)
❖ aws_io_detection - Monitor and terminate nodes with IO errors
❖ BaseAMI is frequently updated to detect and remedy known cloud infrastructure issues
Tracing: Low level analysis and distributed tracing
❖ FlameGraph - Aggregates cpu profiling data. Help identify hot stacks
❖ FlameScope - identify cause of cpu usage variation at sub-second granularity
❖ Zipkin | Slalom - Distributed tracing and dependency graphing for microservices
❖ Java Flight Recorder - A profiling and event collection framework available in OpenJDK
Netflix Tech Blogs Resources
❖ Netflix Edge Load Balancing
❖ API - making API Resilient to Failures
❖ Cache Warming for Stateful Service
❖ Netflix Falcor and Json Graph
❖ Regional Failover in 7 minutes
❖ FIT: Failure Injection Testing
❖ ChAP: Netflix Chaos Automation Platform
❖ Distributing Content to Netflix CDN
❖ Machine Learning to Improve Streaming Quality
❖ Netflix Playback and Downloads
❖ Stream-processing with Mantis
❖ Netflix Stream Data Pipeline
❖ Vector: on-host performance monitoring
❖ Extending Vector with eBPF
❖ Atlas: Netflix Telemetry Platform
❖ Flamegraph: visualize cpu profiling data
❖ FlameScope: Trace Event, Chrome and More Profile Formats
❖ Titus: Running Containers at Scale at Netflix
❖ Spinnaker: Global Continuous Delivery
Thank you

Weitere ähnliche Inhalte

Was ist angesagt?

Introduction to Amazon Web Services
Introduction to Amazon Web ServicesIntroduction to Amazon Web Services
Introduction to Amazon Web ServicesAmazon Web Services
 
Introduction to Amazon Web Services (AWS)
Introduction to Amazon Web Services (AWS)Introduction to Amazon Web Services (AWS)
Introduction to Amazon Web Services (AWS)Garvit Anand
 
Microservice Architecture
Microservice ArchitectureMicroservice Architecture
Microservice ArchitectureNguyen Tung
 
성공적인 클라우드 마이그레이션을 위한 디지털 트랜스포메이션 전략 - Gregor Hophe :: AWS 클라우드 마이그레이션 온라인
성공적인 클라우드 마이그레이션을 위한 디지털 트랜스포메이션 전략 - Gregor Hophe :: AWS 클라우드 마이그레이션 온라인성공적인 클라우드 마이그레이션을 위한 디지털 트랜스포메이션 전략 - Gregor Hophe :: AWS 클라우드 마이그레이션 온라인
성공적인 클라우드 마이그레이션을 위한 디지털 트랜스포메이션 전략 - Gregor Hophe :: AWS 클라우드 마이그레이션 온라인Amazon Web Services Korea
 
Microservices, Kubernetes and Istio - A Great Fit!
Microservices, Kubernetes and Istio - A Great Fit!Microservices, Kubernetes and Istio - A Great Fit!
Microservices, Kubernetes and Istio - A Great Fit!Animesh Singh
 
Sizing Amazon Elasticsearch Service for your workload - ADB303 - Santa Clara ...
Sizing Amazon Elasticsearch Service for your workload - ADB303 - Santa Clara ...Sizing Amazon Elasticsearch Service for your workload - ADB303 - Santa Clara ...
Sizing Amazon Elasticsearch Service for your workload - ADB303 - Santa Clara ...Amazon Web Services
 
DevOps Powerpoint Presentation Slides
DevOps Powerpoint Presentation SlidesDevOps Powerpoint Presentation Slides
DevOps Powerpoint Presentation SlidesSlideTeam
 
Case study on cloud computing
Case study on cloud computingCase study on cloud computing
Case study on cloud computingSnehal Takawale
 
Microservices, Containers, Docker and a Cloud-Native Architecture in the Midd...
Microservices, Containers, Docker and a Cloud-Native Architecture in the Midd...Microservices, Containers, Docker and a Cloud-Native Architecture in the Midd...
Microservices, Containers, Docker and a Cloud-Native Architecture in the Midd...Kai Wähner
 
Introduction to Amazon Lightsail
Introduction to Amazon LightsailIntroduction to Amazon Lightsail
Introduction to Amazon LightsailAmazon Web Services
 
Overview of AWS by Andy Jassy - SVP, AWS
Overview of AWS by Andy Jassy - SVP, AWSOverview of AWS by Andy Jassy - SVP, AWS
Overview of AWS by Andy Jassy - SVP, AWSAmazon Web Services
 
What is Cloud Computing with Amazon Web Services?
What is Cloud Computing with Amazon Web Services?What is Cloud Computing with Amazon Web Services?
What is Cloud Computing with Amazon Web Services?Amazon Web Services
 
Devops Strategy Roadmap Lifecycle Ppt Powerpoint Presentation Slides Complete...
Devops Strategy Roadmap Lifecycle Ppt Powerpoint Presentation Slides Complete...Devops Strategy Roadmap Lifecycle Ppt Powerpoint Presentation Slides Complete...
Devops Strategy Roadmap Lifecycle Ppt Powerpoint Presentation Slides Complete...SlideTeam
 
게임 산업을 위한 네이버클라우드플랫폼(정낙수 클라우드솔루션아키텍트) - 네이버클라우드플랫폼 게임인더스트리데이 Naver Cloud Plat...
게임 산업을 위한 네이버클라우드플랫폼(정낙수 클라우드솔루션아키텍트) - 네이버클라우드플랫폼 게임인더스트리데이 Naver Cloud Plat...게임 산업을 위한 네이버클라우드플랫폼(정낙수 클라우드솔루션아키텍트) - 네이버클라우드플랫폼 게임인더스트리데이 Naver Cloud Plat...
게임 산업을 위한 네이버클라우드플랫폼(정낙수 클라우드솔루션아키텍트) - 네이버클라우드플랫폼 게임인더스트리데이 Naver Cloud Plat...NAVER CLOUD PLATFORMㅣ네이버 클라우드 플랫폼
 

Was ist angesagt? (20)

Service mesh
Service meshService mesh
Service mesh
 
Introduction to Amazon Web Services
Introduction to Amazon Web ServicesIntroduction to Amazon Web Services
Introduction to Amazon Web Services
 
Introduction to Serverless
Introduction to ServerlessIntroduction to Serverless
Introduction to Serverless
 
Introduction to Amazon Web Services (AWS)
Introduction to Amazon Web Services (AWS)Introduction to Amazon Web Services (AWS)
Introduction to Amazon Web Services (AWS)
 
Microservice Architecture
Microservice ArchitectureMicroservice Architecture
Microservice Architecture
 
성공적인 클라우드 마이그레이션을 위한 디지털 트랜스포메이션 전략 - Gregor Hophe :: AWS 클라우드 마이그레이션 온라인
성공적인 클라우드 마이그레이션을 위한 디지털 트랜스포메이션 전략 - Gregor Hophe :: AWS 클라우드 마이그레이션 온라인성공적인 클라우드 마이그레이션을 위한 디지털 트랜스포메이션 전략 - Gregor Hophe :: AWS 클라우드 마이그레이션 온라인
성공적인 클라우드 마이그레이션을 위한 디지털 트랜스포메이션 전략 - Gregor Hophe :: AWS 클라우드 마이그레이션 온라인
 
Cloud Migration: A How-To Guide
Cloud Migration: A How-To GuideCloud Migration: A How-To Guide
Cloud Migration: A How-To Guide
 
Microservices, Kubernetes and Istio - A Great Fit!
Microservices, Kubernetes and Istio - A Great Fit!Microservices, Kubernetes and Istio - A Great Fit!
Microservices, Kubernetes and Istio - A Great Fit!
 
Sizing Amazon Elasticsearch Service for your workload - ADB303 - Santa Clara ...
Sizing Amazon Elasticsearch Service for your workload - ADB303 - Santa Clara ...Sizing Amazon Elasticsearch Service for your workload - ADB303 - Santa Clara ...
Sizing Amazon Elasticsearch Service for your workload - ADB303 - Santa Clara ...
 
DevOps on AWS
DevOps on AWSDevOps on AWS
DevOps on AWS
 
Fundamentals of Cloud Computing & AWS
Fundamentals of Cloud Computing & AWSFundamentals of Cloud Computing & AWS
Fundamentals of Cloud Computing & AWS
 
DevOps Powerpoint Presentation Slides
DevOps Powerpoint Presentation SlidesDevOps Powerpoint Presentation Slides
DevOps Powerpoint Presentation Slides
 
Case study on cloud computing
Case study on cloud computingCase study on cloud computing
Case study on cloud computing
 
Microservices, Containers, Docker and a Cloud-Native Architecture in the Midd...
Microservices, Containers, Docker and a Cloud-Native Architecture in the Midd...Microservices, Containers, Docker and a Cloud-Native Architecture in the Midd...
Microservices, Containers, Docker and a Cloud-Native Architecture in the Midd...
 
Introduction to Amazon Lightsail
Introduction to Amazon LightsailIntroduction to Amazon Lightsail
Introduction to Amazon Lightsail
 
Overview of AWS by Andy Jassy - SVP, AWS
Overview of AWS by Andy Jassy - SVP, AWSOverview of AWS by Andy Jassy - SVP, AWS
Overview of AWS by Andy Jassy - SVP, AWS
 
DevOps and Cloud
DevOps and CloudDevOps and Cloud
DevOps and Cloud
 
What is Cloud Computing with Amazon Web Services?
What is Cloud Computing with Amazon Web Services?What is Cloud Computing with Amazon Web Services?
What is Cloud Computing with Amazon Web Services?
 
Devops Strategy Roadmap Lifecycle Ppt Powerpoint Presentation Slides Complete...
Devops Strategy Roadmap Lifecycle Ppt Powerpoint Presentation Slides Complete...Devops Strategy Roadmap Lifecycle Ppt Powerpoint Presentation Slides Complete...
Devops Strategy Roadmap Lifecycle Ppt Powerpoint Presentation Slides Complete...
 
게임 산업을 위한 네이버클라우드플랫폼(정낙수 클라우드솔루션아키텍트) - 네이버클라우드플랫폼 게임인더스트리데이 Naver Cloud Plat...
게임 산업을 위한 네이버클라우드플랫폼(정낙수 클라우드솔루션아키텍트) - 네이버클라우드플랫폼 게임인더스트리데이 Naver Cloud Plat...게임 산업을 위한 네이버클라우드플랫폼(정낙수 클라우드솔루션아키텍트) - 네이버클라우드플랫폼 게임인더스트리데이 Naver Cloud Plat...
게임 산업을 위한 네이버클라우드플랫폼(정낙수 클라우드솔루션아키텍트) - 네이버클라우드플랫폼 게임인더스트리데이 Naver Cloud Plat...
 

Ähnlich wie Netflix Massively Scalable, Highly Available, Immutable Infrastructure

Micro Services Architecture
Micro Services ArchitectureMicro Services Architecture
Micro Services ArchitectureRanjan Baisak
 
[OpenStack Day in Korea 2015] Track 2-3 - 오픈스택 클라우드에 최적화된 네트워크 가상화 '누아지(Nuage)'
[OpenStack Day in Korea 2015] Track 2-3 - 오픈스택 클라우드에 최적화된 네트워크 가상화 '누아지(Nuage)'[OpenStack Day in Korea 2015] Track 2-3 - 오픈스택 클라우드에 최적화된 네트워크 가상화 '누아지(Nuage)'
[OpenStack Day in Korea 2015] Track 2-3 - 오픈스택 클라우드에 최적화된 네트워크 가상화 '누아지(Nuage)'OpenStack Korea Community
 
Building high performance microservices in finance with Apache Thrift
Building high performance microservices in finance with Apache ThriftBuilding high performance microservices in finance with Apache Thrift
Building high performance microservices in finance with Apache ThriftRX-M Enterprises LLC
 
Move fast and make things with microservices
Move fast and make things with microservicesMove fast and make things with microservices
Move fast and make things with microservicesMithun Arunan
 
Banv meetup-contrail
Banv meetup-contrailBanv meetup-contrail
Banv meetup-contrailnvirters
 
Building Modern Digital Services on Scalable Private Government Infrastructur...
Building Modern Digital Services on Scalable Private Government Infrastructur...Building Modern Digital Services on Scalable Private Government Infrastructur...
Building Modern Digital Services on Scalable Private Government Infrastructur...Andrés Colón Pérez
 
The Show Must Go On! Using Kafka to Assure TV Signals Reach the Transmitters
The Show Must Go On! Using Kafka to Assure TV Signals Reach the TransmittersThe Show Must Go On! Using Kafka to Assure TV Signals Reach the Transmitters
The Show Must Go On! Using Kafka to Assure TV Signals Reach the TransmittersHostedbyConfluent
 
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...confluent
 
Pace of Innovation at AWS - London Summit Enteprise Track RePlay
Pace of Innovation at AWS - London Summit Enteprise Track RePlayPace of Innovation at AWS - London Summit Enteprise Track RePlay
Pace of Innovation at AWS - London Summit Enteprise Track RePlayAmazon Web Services
 
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with K...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with K...Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with K...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with K...confluent
 
HTTP_SS_ENTERPRISE_EN
HTTP_SS_ENTERPRISE_ENHTTP_SS_ENTERPRISE_EN
HTTP_SS_ENTERPRISE_ENBernd Thomsen
 
The Need for Complex Analytics from Forwarding Pipelines
The Need for Complex Analytics from Forwarding Pipelines The Need for Complex Analytics from Forwarding Pipelines
The Need for Complex Analytics from Forwarding Pipelines Netronome
 
ONS Summit 2017 SKT TINA
ONS Summit 2017 SKT TINAONS Summit 2017 SKT TINA
ONS Summit 2017 SKT TINAJunho Suh
 
The Good, the Bad and the Ugly of Migrating Hundreds of Legacy Applications ...
 The Good, the Bad and the Ugly of Migrating Hundreds of Legacy Applications ... The Good, the Bad and the Ugly of Migrating Hundreds of Legacy Applications ...
The Good, the Bad and the Ugly of Migrating Hundreds of Legacy Applications ...Josef Adersberger
 
Migrating Hundreds of Legacy Applications to Kubernetes - The Good, the Bad, ...
Migrating Hundreds of Legacy Applications to Kubernetes - The Good, the Bad, ...Migrating Hundreds of Legacy Applications to Kubernetes - The Good, the Bad, ...
Migrating Hundreds of Legacy Applications to Kubernetes - The Good, the Bad, ...QAware GmbH
 
Building a Service Mesh with NGINX Owen Garrett.pptx
Building a Service Mesh with NGINX Owen Garrett.pptxBuilding a Service Mesh with NGINX Owen Garrett.pptx
Building a Service Mesh with NGINX Owen Garrett.pptxPINGXIONG3
 
LEC_10_Week_10_Server_Configuration_in_Linux.pdf
LEC_10_Week_10_Server_Configuration_in_Linux.pdfLEC_10_Week_10_Server_Configuration_in_Linux.pdf
LEC_10_Week_10_Server_Configuration_in_Linux.pdfMahtabAhmedQureshi
 
Radisys/Wind River: The Telcom Cloud - Deployment Strategies: SDN/NFV and Vir...
Radisys/Wind River: The Telcom Cloud - Deployment Strategies: SDN/NFV and Vir...Radisys/Wind River: The Telcom Cloud - Deployment Strategies: SDN/NFV and Vir...
Radisys/Wind River: The Telcom Cloud - Deployment Strategies: SDN/NFV and Vir...Radisys Corporation
 
The Netflix API for a global service
The Netflix API for a global serviceThe Netflix API for a global service
The Netflix API for a global serviceKatharina Probst
 




Netflix Massively Scalable, Highly Available, Immutable Infrastructure

  • 2. Netflix Facts
    ❖ Leading video streaming service
    ❖ 140+ million paid subscribers globally
    ❖ 190 countries
    ❖ Millions of hours watched per month
    ❖ 13 billion spent on content per year
    ❖ 15% of the world's Internet bandwidth
    ❖ 1998 - Netflix was founded
    ❖ 1999 - DVD distribution launched
    ❖ 2007 - Video streaming launched
    ❖ 2010 - Expanded into Canada
    ❖ 2014 - Expanded into Europe
    ❖ 2016 - Launched globally
  • 3.
  • 4. Load Balancing across AWS Regions
    ❖ Multiple active AWS regions
    ❖ Traffic is load balanced across 3 AWS regions
    ❖ Takes into account the geographical location of the subscriber
    ❖ Enough capacity to handle region failures gracefully
    ❖ Region failover is handled via the Netflix gateway (Zuul) and DNS steering
    Note: Netflix avoids unbalanced regions by shifting a portion of local traffic to remote regions
    Diagram: Netflix Control Plane spanning us-west-2, eu-west-1, and us-east-1
  • 6. Zuul - Front Door to Netflix Ecosystem
    Self Service Routing
    ❖ Traffic sharding
    ❖ Gradual migration
    ❖ Canary and squeeze testing
    ❖ Authentication
    ❖ Security rules to reject traffic from bad devices
    Resiliency and LB
    ❖ Failover around server failures: slow response, GC
    ❖ Graceful traffic ramp-up to newly launched instances
    ❖ Blacklist bad instances
    ❖ Prevent overloading
    ❖ Track server utilization
    Anomaly Detection
    ❖ Aggregate error rates to detect if a service is in trouble
    ❖ Contextual alerting about anomalies to support and service teams
    ❖ Helps with root cause and correlation
  • 7. Netflix Edge Services: API and Edge PaaS
  • 8. API - Netflix Edge Services
    ❖ Tier 1 service
    ❖ Serves Netflix devices
    ❖ Composes the calls to mid-tier services required to construct a response
    ❖ Orchestrates UI requests to mid-tier services
    ❖ Fallback logic to avert customer-facing outages
    ❖ Abstracts mid-tier changes away from UI development
    ❖ Facade over the entirety of Netflix mid-tier services
    ❖ Proxies device requests to reduce network chattiness and latency
    ❖ Promotes the request/response model that best fits each device's unique requirements
  • 9. Edge PaaS - Netflix Edge Services
    ❖ Decouples device UI development from API and mid-tier service changes
    ❖ Per-device endpoints customized for device type for a richer experience
    ❖ Each endpoint is isolated in a container for better visibility and debugging
    ❖ Node Quark platform for ease of Node.js development and integration with the Netflix platform
    ❖ Titus container platform
    ❖ Cloud deployment via the Spinnaker CI/CD platform
    ❖ RSL (Remote Service Layer) for data access to the API tier via remote calls
    ❖ Device-specific endpoints instead of a traditional REST API (device code is mostly written in JavaScript)
  • 11. Load Shedding (server)
    ❖ When a service is running in steady state: concurrency = service time x service rate
    ❖ Requests in excess of this concurrency limit cannot be serviced immediately
    ❖ The service has two options: queue or reject
    ❖ Netflix services reject requests over the set limit to avoid oversaturation
    ❖ Server-side throttling is performed by capping the number of concurrent requests a service can handle
    Note: Netflix microservices use a servletFilter mechanism, part of the platform library, to intercept requests of interest and throttle them based on the current load on the server
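The reject-over-limit behavior on the load shedding slide can be sketched as follows. This is a minimal illustration in Python, not the Netflix servlet filter: the class name `ConcurrencyLimiter`, the helper `handle`, and the use of HTTP 429 as the rejection status are assumptions made for the example.

```python
import threading
from typing import Callable, Optional, Tuple


def steady_state_concurrency(service_time_s: float, rate_per_s: float) -> float:
    """Slide formula: concurrency = service time x service rate."""
    return service_time_s * rate_per_s


class ConcurrencyLimiter:
    """Hypothetical limiter: rejects instead of queueing once `limit` is reached."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self._in_flight = 0
        self._lock = threading.Lock()

    def try_acquire(self) -> bool:
        # Admit the request only if a concurrency slot is free.
        with self._lock:
            if self._in_flight >= self.limit:
                return False
            self._in_flight += 1
            return True

    def release(self) -> None:
        # Free the slot once the request has finished.
        with self._lock:
            self._in_flight -= 1


def handle(limiter: ConcurrencyLimiter, work: Callable[[], str]) -> Tuple[int, Optional[str]]:
    """Run `work` if a slot is free, otherwise shed the request (429 is an assumption)."""
    if not limiter.try_acquire():
        return 429, None
    try:
        return 200, work()
    finally:
        limiter.release()
```

For example, a service with a 0.25 s service time at 40 requests per second needs about 10 concurrent slots, so `ConcurrencyLimiter(10)` would be a starting point under these assumptions.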
  • 12. Fault Tolerance (Client)
    ❖ Services protect themselves from latency and failure conditions:
      ➢ 5xx responses, connection refused, and timeouts
    ❖ A retried request can be routed to the next server by the load balancer until max retries are reached
    ❖ Fail fast and rapid recovery
    ❖ Fallback and graceful degradation
    ❖ Fall back to failure paths to avert outages
    ❖ Stop cascading failures
    ❖ AWS zone-aware load balancing (Zone Affinity)
    Note: Netflix microservices use the Ribbon RestClient and the Hystrix library to set up latency and failure tolerance toward downstream dependency services.
    Note: Netflix microservices based on gRPC do not use the Ribbon and Hystrix libraries, as the features they offer are already provided by gRPC.
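A rough sketch of the client-side pattern on the fault tolerance slide: retry on the next server until max retries are reached, then fall back. It does not use Ribbon or Hystrix; `DownstreamError`, `call_with_retry`, and the round-robin rotation are assumptions chosen to keep the example small.

```python
from itertools import cycle
from typing import Callable, Sequence


class DownstreamError(Exception):
    """Stands in for a 5xx response, a refused connection, or a timeout."""


def call_with_retry(
    servers: Sequence[str],
    send: Callable[[str], str],
    max_retries: int,
    fallback: Callable[[], str],
) -> str:
    """Try one server, then retry on the next ones; fall back when attempts run out.

    `servers` must be non-empty. Round-robin order is an assumption.
    """
    rotation = cycle(servers)
    for _ in range(max_retries + 1):  # first attempt plus retries
        server = next(rotation)
        try:
            return send(server)
        except DownstreamError:
            continue  # fail fast on this server and move to the next one
    return fallback()  # graceful degradation instead of propagating the failure
```

Returning a fallback value rather than raising is what keeps one failing dependency from cascading into its callers.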
  • 13. Auto Discover Concurrency Limits
    ❖ Setting concurrency limits manually in a changing environment is challenging
    ❖ Manual limits require constant care and monitoring because load characteristics change
    ❖ Better approach: identify concurrency limits dynamically and throttle requests before the service degrades
    ❖ The concept is borrowed from TCP congestion control algorithms:
      ➢ A congestion window determines how many packets can be transferred without incurring timeouts
      ➢ Track the ratio of minimum latency to time-sampled latency => RTTnoload/RTTactual
      ➢ Grow the window (increase request rate) if ratio = 1
      ➢ Shrink the window (decrease request rate) if ratio < 1
    ❖ The limit is adjusted using the formula: newLimit = currentLimit x (RTTnoload/RTTactual) + queueSize
      ➢ queueSize is tunable and determines how fast the queue can grow
    Note: Netflix microservices are in the process of migrating to gRPC from the internal Ribbon IPC mechanism. Netflix has open sourced a gRPC library for dynamically auto-detecting the concurrency limits of a service.
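The limit update formula from the slide can be tried out with a few lines of Python. This is a sketch of the formula only, not the open-sourced library: the function name `next_limit`, capping the ratio at 1, and the `min_limit`/`max_limit` clamps are assumptions added so the example stays bounded.

```python
def next_limit(
    current_limit: float,
    rtt_noload: float,
    rtt_actual: float,
    queue_size: float,
    min_limit: float = 1.0,
    max_limit: float = 1000.0,
) -> float:
    """newLimit = currentLimit x (RTTnoload / RTTactual) + queueSize."""
    # The no-load RTT is the minimum observed latency, so the ratio should
    # not exceed 1; the cap guards against noisy samples (assumption).
    ratio = min(1.0, rtt_noload / rtt_actual)
    new_limit = current_limit * ratio + queue_size
    # Clamp to a sane range (assumption, not part of the slide formula).
    return max(min_limit, min(max_limit, new_limit))


# Latency equal to the no-load baseline: the limit grows by queue_size.
grown = next_limit(100, rtt_noload=10, rtt_actual=10, queue_size=4)    # 104.0
# Latency doubled: the limit is roughly halved before queue_size is added.
shrunk = next_limit(100, rtt_noload=10, rtt_actual=20, queue_size=4)   # 54.0
```

With a ratio of 1 the limit only grows by `queue_size` per update, which is why that parameter controls how quickly the window opens up.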
  • 15. Microservice Architecture
    An architecture designed to decompose one large monolithic application into a suite of small services, where each service:
    ❖ Implements a different set of business logic
    ❖ Is a software module exposed on the network via a web API
    ❖ Interacts via some form of RPC mechanism: Netflix Ribbon, gRPC
    ❖ Uses RPC as a thin layer over standard transports: HTTP/1.1, HTTP/2
    ❖ Exchanges data via JSON or Protocol Buffers
    ❖ Builds, deploys, upgrades, and scales independently
    ❖ Can be developed in a different language: Java, Python, Go, Node.js, ...
    ❖ Is free to choose its own datastore for persistence: Cassandra, memcached, Redis, Elasticsearch, MongoDB, ...
    ❖ Relies on platform libraries that support retries, timeouts, load balancing, and fallback
    ❖ Is massively scalable due to loose coupling, a stateless model, and data sharding
  • 16. Microservices Design Rules
    ❖ Services should not share data or a database
    ❖ Services expose their data and functionality only through a well-defined service interface
    ❖ A transaction should not span multiple services, as that violates their autonomy
    ❖ One service should not lock resources of another service
    ❖ API first: take into account upstream service (client) requirements and dependencies on downstream services. The API should be externalizable (open to the public) without major effort
    ❖ Split a service into multiple microservices when the functions it performs have no strong relationship with one another
    Note: Ideally, each service team should own the release cycle as well as the production operations (DevOps) of its service
  • 17. Monolithic vs Microservices
    Monolithic
    ➢ Data center architecture
    ➢ Design for predictable scalability
    ➢ Relational DB: Oracle, MySQL
    ➢ Strong consistency
    ➢ Shared database
    ➢ Serial and synchronized processing
    ➢ Design to avoid failures
    ➢ Infrequent and slower updates
    ➢ Manual management
    ➢ Failures may result in an outage
    ➢ Limited scalability due to stateful design
    Microservices
    ➢ Cloud architecture
    ➢ Decomposed and decentralized
    ➢ Design for elastic scale
    ➢ Polyglot persistence (mix of datastores)
    ➢ Eventual consistency
    ➢ Sharded datasets
    ➢ Parallel and async processing
    ➢ Design for failure
    ➢ Frequent updates (more features)
    ➢ Self-management (DevOps, CI/CD)
    ➢ Massively scalable due to stateless design goals
    ➢ Immutable infrastructure
  • 18. RESTful Service (Web API)
    ❖ A platform that exposes data as resources on which to operate
    ❖ All client actions on a resource (identified by a URI) are expressed as HTTP methods that map to CRUD operations:
      ➢ POST: Create | GET: Read | PUT/PATCH: Update | DELETE: Delete
      ➢ HTTP status codes (2xx, 3xx, 4xx, 5xx) are returned with the response
    ❖ The server response is sent in JSON
    ❖ A simple client (curl) can be used to invoke REST methods
    ❖ Each request/response is stateless and thus can be cached and massively scaled
    ❖ The client maintains state and furnishes it to the server with every request
  • 19. “REST-ish” API - Netflix Falcor
    ❖ A REST interface works well for large hypermedia resources
    ❖ A web app deals with structured data that can be a large number of small resources, e.g. video metadata
    ❖ Latency becomes a major constraint when fetching these small resources via REST calls on mobile networks
      ➢ Rendering the Netflix home page on a device may require 20-30 REST calls to the server
    ❖ A REST-ish API is for when the developer wants to do more with a single REST call
    ❖ A REST-ish API is less RESTful and more RPC
      ➢ The URL of a resource is used to invoke a procedure call
      ➢ The URL query string becomes the RPC parameters
    ❖ Falcor represents data as one giant JSON model and offers an async API that allows data to be pushed to the model via callbacks
    ❖ Same benefits as REST (cache consistency, loose coupling)
    ❖ Batching multiple requests results in a single network request
    ❖ Falcor represents data as a JSON graph by using references. Storing each value in one place avoids duplicates and stale data
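To illustrate how references in a JSON graph avoid duplicated data, here is a small Python sketch of path resolution. It is not the Falcor library (which is JavaScript); the `{"$type": "ref", "value": [...]}` shape follows the JSON Graph convention, while the function names, the sample graph, and the title data are hypothetical.

```python
from typing import Any, Dict, List, Sequence


def resolve(graph: Dict[Any, Any], path: Sequence[Any]) -> Any:
    """Walk `path` through `graph`, following references as they are met.

    Assumes the graph contains no reference cycles.
    """
    node: Any = graph
    remaining: List[Any] = list(path)
    while remaining:
        node = node[remaining.pop(0)]
        if isinstance(node, dict) and node.get("$type") == "ref":
            # Restart from the root at the referenced path, then continue
            # with whatever keys are still left to walk.
            remaining = list(node["value"]) + remaining
            node = graph
    return node


def get_many(graph: Dict[Any, Any], paths: Sequence[Sequence[Any]]) -> List[Any]:
    """Resolve several paths in one call, mimicking a single batched request."""
    return [resolve(graph, p) for p in paths]


# Hypothetical data: the title is stored once and referenced from two lists.
graph = {
    "titlesById": {"t1": {"name": "Example Title", "rating": 4}},
    "homeRow": {0: {"$type": "ref", "value": ["titlesById", "t1"]}},
    "myList": {0: {"$type": "ref", "value": ["titlesById", "t1"]}},
}
names = get_many(graph, [["homeRow", 0, "name"], ["myList", 0, "name"]])
```

Because both lists point at the same `titlesById` entry, an update to that entry is visible through either path, which is the stale-data argument made on the slide.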
  • 20. gRPC for Microservices (Benefits)
    ❖ RPC framework for building microservices that uses the HTTP/2 transport to support advanced features:
      ➢ Request multiplexing (streams), pipelining, server push, binary protocol
    ❖ Protocol Buffers are used for defining and serializing structured data into an efficient binary format (no JSON schema)
    ❖ A client can invoke a method on a different machine as if it were a local object. RPC methods become RPC endpoints
    ❖ Netty transport provides async and non-blocking I/O
    ❖ Strongly typed and versioned. Simplified API (struct in, struct out)
    ❖ Decouples the interface from any specific programming language via an IDL
    ❖ Automated code generation to implement service interfaces (API): clients, server, data models, metrics, logging, tracing, failover, retry, deadline, cancellation, etc. Supports plugins to extend features
    ❖ HttpRule in the service definition maps an RPC method to HTTP REST methods
  • 22. Netflix Global Cache (EVCache)
    ❖ Stateless microservices often maintain state in caches or a persistent tier
    ❖ Caches offer loose coupling by maintaining state for stateless services
    ❖ EVCache is a RAM+NVMe based key-value store (memcached) that offers a low-latency and scalable caching solution, optimized for the cloud and Netflix use cases
    ❖ Maintains state in-region and across regions via a global replication design to serve requests originating from any region, using Kafka-based cross-region replication
      ➢ Eventual, tunable consistency model that tolerates inconsistency for some time
      ➢ Asynchronous replication keeps local cache operations from being affected by transient failures when updating caches in other regions
      ➢ Avoids the “thundering herd” scenario that cold caches could cause after a region failover
    ❖ The caching tier is used for caching computed data and data retrieved from persistence stores such as Cassandra, S3, DB, ...
    ❖ The EVCache tier is also used for replica and instance cache warming:
      ➢ To recover data from lost EVCache instances
      ➢ To scale up the caching tier for more storage and network capacity
  • 24. What is Immutable Infrastructure
    ❖ Servers are never modified in production, merely replaced with new, updated ones
    ❖ No reboots or individual server changes in production during a server's lifespan
    ❖ Changes are made to the base image and then deployed on new server instances
      ➢ Older server instances are terminated after a successful deployment
    ❖ Changes are rolled back in case of problems
    ❖ Guarantees a known stable state if instances are frequently destroyed and redeployed. No configuration drift
    ❖ Follows the Infrastructure-as-Code methodology, which rebuilds the whole environment from scratch using easy-to-adjust manifests
    Note: Netflix microservices offer “fast properties” that allow a limited set of features to be enabled or disabled dynamically while the service is running in production.
  • 25. Immutable Infrastructure (CI/CD Platform)
    Updates, canary analysis, and deployments are fully automated and orchestrated via a Continuous Integration and Continuous Deployment or Delivery (CI/CD) platform
  • 26. Public Cloud as Immutable Infrastructure
    ❖ Disposable or throw-away cloud instances
    ❖ Decide what infrastructure and services to manage. Public cloud providers offer a number of useful managed services
    ❖ Cloud deployable entities: VM, Firecracker, containers, Fargate, Lambda
    ❖ No hardware to repair or troubleshoot. Just provision a new instance
    ❖ Failures are non-events. Health check failures result in redirected traffic
    ❖ Bad or terminated instances are replaced without human intervention
    ❖ No service downtime, thanks to massive deployment and fault tolerance
    ❖ Elastic capacity and pay-as-you-go model
    ❖ Auto scaling rules keep enough resources available to meet load demand
    ❖ Global reach and availability to execute disaster recovery plans
    Related: the author's presentation on Public Cloud Computing Workshop
  • 28. Planning for Failure
    Regional Failures (Nimble)
    ❖ A drop in the SPS (Streams Per Second) metric triggers regional failover
    ❖ Regional failover is executed in 7 minutes
    ❖ Failover efficiency is achieved by keeping dark capacity online in each region
    ❖ Dark capacity is whitelisted to take production traffic at failover time
    Limited Scope Failures (ChAP)
    ❖ ChAP tests service resilience to failures and validates that fallbacks behave as expected
    ❖ ChAP helps uncover systemic weaknesses that may appear when higher latency is induced
    ❖ The ChAP service uses the FIT (Failure Injection Testing) framework for fine-grained control over a failure and its impact
    ❖ The Zuul gateway decorates requests with FIT metadata that provides failure context to the microservices involved
    ❖ Microservices check the FIT context to determine whether a particular request should be impacted
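The last two bullets of the failure planning slide (the gateway attaches failure context, each service checks it per request) can be sketched as below. This is a hypothetical model of the idea, not the FIT API: `FitContext`, `failing_points`, and `call_dependency` are names invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class FitContext:
    """Hypothetical failure context that travels with a single request."""

    failing_points: FrozenSet[str] = frozenset()


def call_dependency(
    name: str,
    fit: FitContext,
    real_call: Callable[[], str],
    fallback: Callable[[], str],
) -> str:
    """Simulate a failure of `name` only for requests whose context asks for it."""
    if name in fit.failing_points:
        # Injected failure: skip the real call and exercise the fallback path.
        return fallback()
    return real_call()


# Only the tagged request is impacted; normal traffic takes the real path.
tagged = FitContext(failing_points=frozenset({"ratings"}))
untagged = FitContext()
```

Scoping the failure to tagged requests is what allows the fallback to be validated in production while limiting the impact to a small slice of traffic.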
  • 30. Open Connect Appliances (OCA)
    Features
    ❖ Netflix-managed CDN
    ❖ Directed caching appliances
    ❖ Deployed at IXP and ISP colocations globally
    ❖ Serve subscribers from a location closer to them for an optimal viewing experience
    ❖ Reduce ISP and Netflix costs of transporting content
    ❖ Local caching leads to reduced and responsible use of the Internet
    ❖ Network capacity of ~100 Gbps
    ❖ Stores a portion of the Netflix catalog
    ❖ 24x7 monitoring
    Periodic Fill and Allocation
    ❖ Push fill methodology
    ❖ Download popular content during off-peak hours
    ❖ Incremental and tiered filling
    ❖ ML models compute content popularity to decide which titles to cache by aggregating title/file usage and viewing history
    ❖ HCA algorithm for content distribution that offers efficient use of server resources
    ❖ Adapt algorithms to deal with the dynamics of regional member preferences, evolving network conditions, and new markets
  • 31. Adaptive Streaming
    Adaptive Bitrate (ABR)
    ❖ Maximize video quality without rebuffer events
    ❖ Adapt to network events by picking a different bitrate
    ❖ Multiple profiles (formats) encoded for every title
    ❖ Supported bitrate: 235 kbps - 5 Mb/s (4K video)
    ❖ Cache one or more files for each quadruple:
      ➢ title, profile, bitrate, language
      ➢ E.g. one episode of Crown = 1200 files
    ❖ Video codec: H.264/AVC, HEVC, VP9, AV1
    ❖ Audio codec: AAC, DD+, ATMOS
    ❖ Max resolution: 1080p, 2160p, 4K, HDR, HFR
    Per-shot encoding
    ❖ Dynamic Optimizer encoding
    ❖ Allocate bits optimally for best overall quality
    ❖ Selects the best encoding recipe per shot
    ❖ Removes redundancy in the video stream via spatial and temporal prediction/correlation
    ❖ 64% fewer bits for the same quality
    ❖ Good streaming experience at < 200 kbps
    ❖ 4 GB plan = 30 hours of quality viewing
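As an illustration of "picking a different bitrate" on network events, the following Python sketch chooses the highest encoded bitrate that fits a measured throughput. It is a simplified throughput-based rule, not the Netflix player algorithm: the function `pick_bitrate`, the 0.8 safety factor, and the intermediate ladder values are assumptions; only the 235 kbps and 5 Mb/s endpoints come from the slide.

```python
from typing import Sequence


def pick_bitrate(
    ladder_kbps: Sequence[int],
    throughput_kbps: float,
    safety: float = 0.8,
) -> int:
    """Return the highest bitrate that fits within a fraction of the throughput.

    Falls back to the lowest rung when even that exceeds the budget.
    """
    budget = throughput_kbps * safety
    candidates = [b for b in sorted(ladder_kbps) if b <= budget]
    return candidates[-1] if candidates else min(ladder_kbps)


# Hypothetical ladder; only the first and last values appear on the slide.
LADDER_KBPS = [235, 750, 1750, 3000, 5000]

on_wifi = pick_bitrate(LADDER_KBPS, throughput_kbps=10000)   # 5000
congested = pick_bitrate(LADDER_KBPS, throughput_kbps=2500)  # 1750
very_slow = pick_bitrate(LADDER_KBPS, throughput_kbps=200)   # 235
```

Leaving headroom below the measured throughput is one simple way to trade a little quality for fewer rebuffer events.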
  • 32. Netflix Data Pipeline
  • 33. Event Stream Processing at Cloud Scale
    ❖ Collects, aggregates, processes, and moves data at cloud scale
      ➢ 500 billion events or 1.3 PB of data per day
      ➢ Peak traffic: 8 million events/sec or 24 GB/s
    ❖ Types of event streams flowing into the pipeline:
      ➢ Video viewing and UI activities
      ➢ Device error logs, diagnostics, and perf events
    ❖ Kafka as a replicated persistent message queue
      ➢ Multiple copies with a 12-24 hour retention period
    ❖ Data is ingested into Kafka fronting clusters via a Java library or via a Kafka REST endpoint
    ❖ Events are routed from Kafka to various sinks:
      ➢ Elasticsearch - for near real-time analysis
      ➢ S3 bucket - imported into Hive for data warehousing and big data analytics
      ➢ Consumer Kafka tier - used by streaming services such as Mantis and Spark Streaming
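The volume figures on the slide can be cross-checked with a little arithmetic. The sketch below assumes decimal units (1 PB = 10^15 bytes, 1 GB = 10^9 bytes); the function name `pipeline_summary` is made up for the example.

```python
from typing import Dict

SECONDS_PER_DAY = 24 * 60 * 60


def pipeline_summary(
    events_per_day: float,
    bytes_per_day: float,
    peak_events_per_s: float,
    peak_bytes_per_s: float,
) -> Dict[str, float]:
    """Derive average rate and event size from the daily and peak figures."""
    avg_events_per_s = events_per_day / SECONDS_PER_DAY
    return {
        "avg_events_per_s": avg_events_per_s,
        "avg_event_bytes": bytes_per_day / events_per_day,
        "peak_event_bytes": peak_bytes_per_s / peak_events_per_s,
        "peak_to_avg_ratio": peak_events_per_s / avg_events_per_s,
    }


# Slide figures: 500 billion events and 1.3 PB per day; 8 million events/s and 24 GB/s at peak.
summary = pipeline_summary(500e9, 1.3e15, 8e6, 24e9)
```

Under these assumptions the pipeline averages about 5.8 million events per second, events are about 2.6 KB on average and 3 KB at peak, and the peak rate is roughly 1.4 times the daily average.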
  • 34. Business Insight
    Consumer Insight
    ❖ Predicts viewing habits
    ❖ Fuels recommendation engines
    ❖ Qualitative research
    ❖ A/B testing
    ❖ Adaptive row ordering
    ❖ Title placement
    ❖ ..
    Events gathered
    ❖ Time of day content is watched
    ❖ Time spent selecting content
    ❖ Playback stopped by user or network congestion
    ❖ Bookmarking
    ❖ ..
    Anomaly Detection
    ❖ Device firmware differences
    ❖ Real-time diagnostics
    ❖ Device health check
    ❖ Network throughput and congestion differences across ISP networks
    ❖ ..
  • 36. Self Service Monitoring and Debugging
    Monitoring: Build custom dashboards for better correlation and root cause analysis
    ❖ Service monitoring (Atlas/Lumen) - Telemetry system for microservice health and auto-scaling
    ❖ Device monitoring (Mantis) - Event streams from devices filtered for health checks, perf, analytics, ..
    ❖ Host monitoring (Vector) - Web UI for on-demand system monitoring
    ❖ Ad hoc monitoring (Abyss) - Low-level deep-dive performance analysis for escalated issues
    Anomaly Detection: Detect slower-performing instances, infrastructure issues, and faulty hardware
    ❖ Alerts - Set up alerts on Atlas metrics and filters on Mantis event streams
    ❖ Chronos - Tracks infrastructure changes: service or OS changes are logged to aid root cause analysis
    ❖ jvmquake - Terminates nodes with abnormal garbage collection (GC) time
    ❖ aws_io_detection - Monitors and terminates nodes with IO errors
    ❖ BaseAMI is frequently updated to detect and remedy known cloud infrastructure issues
    Tracing: Low-level analysis and distributed tracing
    ❖ FlameGraph - Aggregates CPU profiling data. Helps identify hot stacks
    ❖ FlameScope - Identifies the cause of CPU usage variation at sub-second granularity
    ❖ Zipkin | Slalom - Distributed tracing and dependency graphing for microservices
    ❖ Java Flight Recorder - A profiling and event collection framework available in OpenJDK
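The slide does not say how "abnormal GC time" is detected. One plausible approach is a "GC debt" counter: time spent paused in GC adds to the debt, time spent running application code pays it back, and a node is flagged once the debt crosses a threshold. The sketch below is a simplified illustration under that assumption, not jvmquake's actual implementation; the class name, threshold, and repay ratio are hypothetical.

```python
class GcDebtMonitor:
    """Hypothetical GC-debt tracker: flags a process whose GC pauses
    outpace its useful run time for long enough."""

    def __init__(self, threshold_s=30.0, repay_ratio=1.0):
        self.threshold_s = threshold_s  # debt at which we give up on the node
        self.repay_ratio = repay_ratio  # debt repaid per second of run time
        self.debt_s = 0.0

    def observe(self, run_s, gc_pause_s):
        """Record run_s seconds of application time followed by a GC pause.

        Returns True when the accumulated debt exceeds the threshold,
        meaning the node should be terminated."""
        # Running application code pays debt down, but never below zero.
        self.debt_s = max(0.0, self.debt_s - run_s * self.repay_ratio)
        # The GC pause adds to the debt.
        self.debt_s += gc_pause_s
        return self.debt_s > self.threshold_s


if __name__ == "__main__":
    healthy = GcDebtMonitor()
    print(any(healthy.observe(1.0, 0.05) for _ in range(1000)))  # False

    thrashing = GcDebtMonitor()
    cycles = 0
    while not thrashing.observe(0.1, 1.0):
        cycles += 1
    print(f"flagged after {cycles + 1} GC cycles")
```

A debt counter tolerates occasional long pauses but trips on sustained thrashing, which is the failure mode where terminating the node is usually cheaper than waiting for it to recover.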
  • 37. Netflix Tech Blogs
    Resources
    ❖ Netflix Edge Load Balancing
    ❖ API - Making API Resilient to Failures
    ❖ Cache Warming for Stateful Service
    ❖ Netflix Falcor and JSON Graph
    ❖ Regional Failover in 7 Minutes
    ❖ FIT: Failure Injection Testing
    ❖ ChAP: Netflix Chaos Automation Platform
    ❖ Distributing Content to Netflix CDN
    ❖ Machine Learning to Improve Streaming Quality
    ❖ Netflix Playback and Downloads
    ❖ Stream-processing with Mantis
    ❖ Netflix Stream Data Pipeline
    ❖ Vector: on-host performance monitoring
    ❖ Extending Vector with eBPF
    ❖ Atlas: Netflix Telemetry Platform
    ❖ FlameGraph: visualize CPU profiling data
    ❖ FlameScope: Trace Event, Chrome and More Profile Formats
    ❖ Titus: Running Containers at Scale at Netflix
    ❖ Spinnaker: Global Continuous Delivery
    My BIO
  • 38. Thank you Amer Ather Netflix Performance Engineering