Going from three nines to four nines using Kafka | Tejas Chopra, Netflix

The document discusses using Apache Kafka to improve data upload availability from 99.9% to 99.99% when moving data between on-premise and cloud storage. In the 99.9% design, data is uploaded to on-premise storage and Kafka events trigger the uploads to the cloud. In the 99.99% design, incoming uploads are split between cloud and on-premise storage, and failed on-premise uploads are queued in Kafka and rehydrated from the cloud. The presentation concludes that Kafka provides the high throughput and persistence needed to design effective data rehydration strategies across cloud and on-premise storage for very high availability.

Moving from 99.9 to 99.99 availability using Kafka
Tejas Chopra (Netflix, Inc.)

Agenda
- Introduction
- Problems with Cloud Storage, and ways around it
- What is availability?
- Uploads with 99.9% availability
- Uploads with 99.99% availability
- Takeaways & Lessons

Introduction
- Senior Software Engineer, Netflix
- Keynote Speaker: Cloud, Distributed Systems, Blockchain
- Senior Software Engineer, Box.
- Datrium, Samsung, Tensilica, Apple, Inc.

Cloud Conundrums
- Cheap to put data into the cloud
- Pay to store it, pay even more to read it
- Solution:
- What if we store a copy of the data on-premise?
- Saves on reads
- Hot data can stay on-premise, archival data in the cloud
- Security, latency
- Saves millions of dollars per year
- Box: petabytes, Netflix: exabytes

Availability
- What is it?
- For on-premise
- For Cloud
- Gartner: average cost of downtime: $5,600/min
- 99.9%: $2.8M
- 99.99%: $291K
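
The dollar figures above appear to be yearly downtime costs at each availability level. As a rough check, the sketch below multiplies the downtime allowed per year by the quoted cost per minute. The 365-day year is an assumption; the slide does not say how its figures were derived, and this calculation lands slightly higher than the slide's rounded numbers (about $2.9M and $294K).

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; assumes a non-leap year
COST_PER_MINUTE = 5600            # USD per minute, the Gartner figure quoted above


def annual_downtime_minutes(availability: float) -> float:
    """Minutes per year a service may be down at the given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def annual_downtime_cost(availability: float) -> float:
    """Yearly downtime cost in USD at the quoted per-minute rate."""
    return annual_downtime_minutes(availability) * COST_PER_MINUTE


for availability in (0.999, 0.9999):
    minutes = annual_downtime_minutes(availability)
    cost = annual_downtime_cost(availability)
    print(f"{availability:.2%}: {minutes:.1f} min/year, ${cost:,.0f}/year")
```

Each extra nine cuts the allowed downtime, and therefore the cost, by a factor of ten.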

99.9% solution
- Upload to on-premise (availability = 99.9%)
- Use Kafka events to trigger uploads to cloud
- Reads served from on-premise if present, else fetched from cloud
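
The flow above can be sketched in a few lines. This is a minimal illustration and not the speaker's implementation: `Store`, `upload_events`, `upload`, `replicate_to_cloud`, and `read` are hypothetical names, the stores are in-memory dictionaries, and a `deque` stands in for the Kafka topic so the example runs without a broker.

```python
from collections import deque


class Store:
    """Hypothetical in-memory stand-in for an object store."""

    def __init__(self, name):
        self.name = name
        self.objects = {}
        self.available = True

    def put(self, key, data):
        if not self.available:
            raise ConnectionError(f"{self.name} is unavailable")
        self.objects[key] = data

    def get(self, key):
        if not self.available:
            raise ConnectionError(f"{self.name} is unavailable")
        return self.objects.get(key)


on_prem = Store("on-premise")
cloud = Store("cloud")
upload_events = deque()  # stand-in for a Kafka topic of "object uploaded" events


def upload(key, data):
    """The request succeeds only if on-premise accepts the write."""
    on_prem.put(key, data)
    upload_events.append(key)  # a real system would produce an event to Kafka here


def replicate_to_cloud():
    """Consumer side: copy every announced object from on-premise to cloud."""
    while upload_events:
        key = upload_events.popleft()
        cloud.put(key, on_prem.get(key))


def read(key):
    """Serve from on-premise if present, else fall back to the cloud copy."""
    data = on_prem.get(key)
    if data is not None:
        return data
    return cloud.get(key)
```

Because every upload has to land on-premise first, the upload path in this sketch is only as available as the on-premise store, which is where the 99.9% figure comes from.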

99.99% solution
- Split the incoming stream to both cloud and on-premise
- Queue failed on-premise requests using Kafka
- Use cloud to hydrate failed uploads on-premise
- 99.99% availability
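
A minimal sketch of this split-and-rehydrate flow follows. It is an illustration under stated assumptions, not the speaker's implementation: `Store`, `rehydrate_queue`, `upload`, and `rehydrate` are hypothetical names, a `deque` stands in for the Kafka topic, and the sketch assumes the request is considered successful once the cloud write lands, which the slide does not spell out.

```python
from collections import deque


class Store:
    """Hypothetical in-memory stand-in for an object store."""

    def __init__(self, name):
        self.name = name
        self.objects = {}
        self.available = True

    def put(self, key, data):
        if not self.available:
            raise ConnectionError(f"{self.name} is unavailable")
        self.objects[key] = data

    def get(self, key):
        if not self.available:
            raise ConnectionError(f"{self.name} is unavailable")
        return self.objects.get(key)


on_prem = Store("on-premise")
cloud = Store("cloud")
rehydrate_queue = deque()  # stand-in for a Kafka topic of failed on-premise uploads


def upload(key, data):
    """Write to both stores; an on-premise failure is queued, not surfaced."""
    cloud.put(key, data)  # assumption: the request fails if the cloud write fails
    try:
        on_prem.put(key, data)
    except ConnectionError:
        rehydrate_queue.append(key)  # a real system would produce to Kafka here


def rehydrate():
    """Retry each queued key once by copying the object from cloud to on-premise."""
    for _ in range(len(rehydrate_queue)):
        key = rehydrate_queue.popleft()
        try:
            on_prem.put(key, cloud.get(key))
        except ConnectionError:
            rehydrate_queue.append(key)  # still failing: keep it for a later pass
```

The queue has to outlive an on-premise outage of unknown length, which is why durable retention of the queued events matters for this design.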

Takeaways and Lessons
- Millions of customers, and billions of files uploaded: Kafka scales without downtime. Kafka throughput: thousands of messages per second
- Kafka persistence compared to Kinesis - very critical in designing rehydration strategies
- Batch handling abilities of Kafka - very useful for non-critical data - thumbnails.
- Tracking of offsets in a partition: left to the consumers.
- Kafka cluster management
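
The point about consumer-tracked offsets can be illustrated with a small sketch. `PartitionLog` and `Consumer` are hypothetical in-memory stand-ins, not a Kafka client API: the log is an append-only list standing in for one partition, and the consumer advances its own committed offset only after a whole batch has been handled.

```python
class PartitionLog:
    """Hypothetical stand-in for one partition: an append-only list of records."""

    def __init__(self):
        self.records = []

    def append(self, value):
        self.records.append(value)
        return len(self.records) - 1  # offset of the new record

    def read_from(self, offset, max_records):
        return self.records[offset:offset + max_records]


class Consumer:
    """Tracks its own position; the log does not remember who has read what."""

    def __init__(self, log, committed_offset=0):
        self.log = log
        self.committed_offset = committed_offset

    def poll_and_process(self, handler, max_records=2):
        batch = self.log.read_from(self.committed_offset, max_records)
        for record in batch:
            handler(record)  # if this raises, the offset below is not advanced
        self.committed_offset += len(batch)  # commit after the whole batch succeeds
        return len(batch)


log = PartitionLog()
for name in ("thumb-1", "thumb-2", "thumb-3"):
    log.append(name)

seen = []
first = Consumer(log)
first.poll_and_process(seen.append)

# A restarted consumer resumes from the offset the previous one committed.
second = Consumer(log, committed_offset=first.committed_offset)
second.poll_and_process(seen.append)
```

Committing only after the batch is handled means a crash mid-batch re-delivers the batch, so handlers in this style need to tolerate duplicates.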

Thank you
- Email: chopratejas@gmail.com
- LinkedIn: https://www.linkedin.com/in/chopratejas
- Twitter: @chopra_tejas
