http://www.oreilly.com/pub/e/3764
Keystone processes over 700 billion events per day (about 1 petabyte) with at-least-once processing semantics in the cloud. Monal Daxini details how Netflix used Kafka, Samza, Docker, and Linux at scale to implement a multi-tenant pipeline in the AWS cloud within a year. He also shares plans to offer Stream Processing as a Service for all of Netflix.
15. ● Support at-least-once processing
● Scale, Multi-tenancy, Ease of Operations
● Enable future value adds - Stream Processing As a Service
● Replace dormant open source software - Chukwa
Why a new pipeline?
16. Goal - Migrate 1.3 PB of event data to a new pipeline
in flight, while not losing more than 0.1% of it
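As a rough sanity check (my arithmetic, not from the slides), a 0.1% loss budget on 1.3 PB of in-flight data works out to roughly 1.3 TB:

```python
# Back-of-the-envelope check of the migration loss budget (illustrative).
PB = 1000 ** 5                     # decimal petabyte, in bytes

total_bytes = 1.3 * PB             # event data in flight during the migration
loss_budget = 0.001                # "not losing more than 0.1%"

max_loss_bytes = total_bytes * loss_budget
print(max_loss_bytes / 1000 ** 4)  # at most ~1.3 TB may be lost
```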
33. Kafka (prod) Footprint

                      Fronting Kafka   Standby Kafka   Consumer Kafka
Number of Clusters    24               24              8
Number of Instances   3000+            72              1000+
Retention Period      8 to 24 hrs      1 hr            2 to 4 hrs
34. ● Independent ZooKeeper cluster per Kafka cluster
● 5 nodes per ensemble - 160 ZooKeeper nodes
● 3 ASGs per cluster, 1 ASG per zone
Kafka (prod) Footprint
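The "3 ASGs per cluster, 1 ASG per zone" layout can be sketched as follows; the zone names and naming scheme are my illustration, not Netflix's actual convention:

```python
# Hypothetical sketch of "3 ASGs per cluster, 1 ASG per zone".
ZONES = ["us-east-1a", "us-east-1b", "us-east-1c"]  # illustrative zone names

def asgs_for_cluster(cluster: str) -> list:
    """One auto-scaling group per availability zone for a Kafka cluster."""
    return [f"{cluster}-{zone}" for zone in ZONES]

print(asgs_for_cluster("fronting-kafka-1"))  # three ASGs, one per zone
```

Pinning each ASG to a single zone keeps a zone outage from taking out more than one replica set's worth of brokers.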
35. ● Pioneer Tax
● Started with 0.7, went live with 0.8.2
● Done moving to 0.9 & VPC
● Work closely with Confluent to get patches through
○ OSS contributions
Kafka in the Cloud
36. ● No dynamic topic creation
● Two copies
● Rack / Zone aware partition assignment
● Per cluster, stay under 10k partitions & 200 brokers
● Leave approx. 40% free disk space on each broker
Fronting Kafka Topics
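The guardrails above can be condensed into a simple check; this is a hedged sketch of the stated limits, not actual Keystone tooling:

```python
# Sketch of the per-cluster guardrails from the slide:
# <=10k partitions, <=200 brokers, two copies, ~40% free disk per broker.

def cluster_within_limits(partitions: int, brokers: int,
                          replication_factor: int,
                          free_disk_fraction: float) -> bool:
    """Return True if a fronting Kafka cluster respects the stated limits."""
    return (partitions <= 10_000
            and brokers <= 200
            and replication_factor == 2      # "two copies"
            and free_disk_fraction >= 0.40)  # ~40% free disk on each broker

print(cluster_within_limits(8_000, 150, 2, 0.45))  # True
```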
37. Want to know more...
Netflix Tech Blog - Kafka in Keystone Pipeline
40. Samza Job Deployment
● Multiple Samza jobs for one Kafka source topic
● Each job processes messages for one sink
● Each job processes partitions only from one topic
● One checkpoint topic per Kafka source topic, shared by its
multiple Samza jobs
● Job starts with Immutable Config
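The deployment rules above can be sketched as a mapping from a source topic to its per-sink jobs; the sink names and topic naming are hypothetical:

```python
# Illustrative sketch of the deployment rules: multiple Samza jobs per
# source topic, one sink per job, one shared checkpoint topic per source.
SINKS = ["s3", "elasticsearch", "consumer-kafka"]  # illustrative sink names

def jobs_for_source(topic: str) -> dict:
    """One routing job per sink for a given Kafka source topic."""
    return {
        f"{topic}-to-{sink}": {
            "source_topic": topic,        # each job reads only this topic
            "sink": sink,                 # and writes to exactly one sink
            "checkpoint_topic": f"__checkpoint-{topic}",  # shared per source
        }
        for sink in SINKS
    }

jobs = jobs_for_source("playback-events")
print(len(jobs))  # 3 jobs, one per sink
```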
41. ● 13,000+ Docker containers (Samza jobs)
● 1,300+ AWS c3.4xlarge instances
Routing Service Footprint
                      S3      ElasticSearch   Consumer Kafka
Number of containers  7000+   1500+           4500+
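The per-sink container counts are consistent with the 13,000+ total on the slide; a trivial check:

```python
# The per-sink routing container counts add up to the stated total.
containers = {"s3": 7000, "elasticsearch": 1500, "consumer_kafka": 4500}
print(sum(containers.values()))  # 13000, matching "13,000+ containers"
```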
46. [Diagram: job logs live on a ZFS volume with snapshots; a custom Go
executor launches each job via ./runJob; a Go tools server lets client
tools stream logs and browse rotated logs by date]
Ksnode Tooling
48. Samza Tweaks to ver 0.9.1
● Using ThreadJobFactory - Simplifies deployment and reduces overhead
● SAMZA-41 - range-based static partition assignment
● SAMZA-775 - size-based prefetch buffer
○ Default was count based, not byte based, and could cause OOMs
○ Set per topic per job to 60 * peak bytes / sec over the past week.
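The sizing rule above is simple arithmetic: budget one minute of the topic's weekly peak throughput. A sketch of the formula (the helper name is mine):

```python
# Sketch of the size-based prefetch sizing rule (SAMZA-775):
# buffer bytes = 60 * peak bytes/sec observed over the past week.

def prefetch_buffer_bytes(peak_bytes_per_sec: float) -> int:
    """Per-topic, per-job prefetch budget: one minute of peak throughput."""
    return int(60 * peak_bytes_per_sec)

print(prefetch_buffer_bytes(5_000_000))  # 300000000: 300 MB for a 5 MB/s peak
```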
49. Samza Tweaks to ver 0.9.1
● Backported from 0.10
○ SAMZA-655 - environment variable configuration rewriter
■ Pass config from RDS to executor to Docker to Samza Job
○ SAMZA-540 - expose latency related metrics in OffsetManager
■ checkpointed offset gauge
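The idea behind the checkpointed-offset gauge can be illustrated with a toy lag calculation; this is my sketch of the concept, not Samza's actual OffsetManager code:

```python
# Hypothetical illustration of the latency metric idea behind SAMZA-540:
# lag = newest available offset minus the last checkpointed offset.

def checkpoint_lag(log_end_offset: int, checkpointed_offset: int) -> int:
    """Messages the job would have to (re)process after a restart."""
    return max(0, log_end_offset - checkpointed_offset)

print(checkpoint_lag(1_000_500, 1_000_000))  # 500 messages behind
```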
50. More Info - Monal’s Samza Meetup Slides (10/2015)
Netflix Samza ver 0.9.1 Contributions
56. A True Story
● 80% loss over a 6-hour period
● Large Kafka clusters were impacted, smaller ones were fine
57. ● At times things go wrong, and there’s no turning back
● Reduce complexity
● Minimize blast radius
● Find a way to start over fresh
Lessons Learned
59. ● Cold standby Kafka cluster with 3 instances and a different instance type
● Different ZooKeeper cluster with no state
● When failover occurs:
○ Scale up the cluster
○ Create topics
○ Create new routing jobs for the failover cluster
○ Switch producer traffic!
Failover
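The ordering of those failover steps matters: producers are repointed only after the standby is ready. A hedged sketch of the runbook; the class and method names are invented for illustration, not Netflix's actual tooling:

```python
# Illustrative failover runbook: names are hypothetical.

class StandbyCluster:
    """Records the runbook steps as they are executed."""
    def __init__(self):
        self.steps = []

    def scale_up(self):            self.steps.append("scale up cluster")
    def create_topics(self):       self.steps.append("create topics")
    def create_routing_jobs(self): self.steps.append("create routing jobs")
    def switch_producers(self):    self.steps.append("switch producer traffic")

def fail_over(standby: StandbyCluster) -> None:
    """Promote the cold standby in the order the slide lists."""
    standby.scale_up()            # grow beyond the 3-instance cold footprint
    standby.create_topics()       # topics are not pre-created on the standby
    standby.create_routing_jobs() # routing jobs for the failover cluster
    standby.switch_producers()    # only then repoint producer traffic

cluster = StandbyCluster()
fail_over(cluster)
print(cluster.steps)
```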
66. Open source and community participation is
an integral part of our strategy and culture
67. Not DevOps, but move towards NoOps
You build it! You run it!
68. My team has
● No dedicated product or project managers
● No separate devops or operations team
We build and run what you saw today!
69. ● This does not mean we are constantly overworked
○ we make wise and simple choices and
○ lean towards automation & self-healing systems
We build and run what you saw today!
70. We built a pipeline in a year with
A very small team,
Relevant new technology,
Contributed back to OSS, and
Processed over 1 Trillion messages / day
Culture Impact
● Started a proof of concept with Apache Flink
● Exploring Apache Beam
SPaaS - init( )
BETA
79. Why Apache Beam?
○ Portable API layer for building sophisticated data processing
applications
○ Unified API for processing bounded and unbounded data sources
○ Google lineage - Dataflow model, and reflects Google’s current work
SPaaS - “Beam Me Up, Scotty!”
80. Why Flink?
○ Flink implements the dataflow model
○ Correctness of results and powerful features for reasoning about time
○ Checkpoints, exactly-once processing
○ Event time, processing time, watermarks, triggers, aligned windows
(fixed, sliding), unaligned windows (dynamic or session windows)
○ Flink’s core is a streaming engine
SPaaS - Flink
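The fixed (aligned) windows mentioned above are easy to illustrate: every event lands in the window whose start is its event time rounded down to the window size. A toy sketch in Python, not Flink code:

```python
# Toy illustration (not Flink code) of event-time fixed windows:
# an event at time t falls in [t - t % size, t - t % size + size).

def fixed_window(event_time_ms: int, size_ms: int = 60_000) -> tuple:
    """Assign an event timestamp to its aligned fixed window."""
    start = event_time_ms - event_time_ms % size_ms
    return (start, start + size_ms)

print(fixed_window(125_000))  # (120000, 180000): the third one-minute window
```

Watermarks and triggers then decide *when* a window's result may be emitted; the assignment itself depends only on event time, which is what makes results reprocessable and correct.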