Apache Kafka, Apache Cassandra and Kubernetes are open source big data technologies that enable applications and business operations to scale massively and rapidly. Kafka and Cassandra underpin the data layer of the stack, providing the capability to stream, disseminate, store and retrieve data at very low latency, while Kubernetes is a container orchestration technology that automates application deployment and the scaling of application clusters. In this presentation, we reveal how we architected a massive-scale deployment of a streaming data pipeline with Kafka and Cassandra to support an example anomaly detection application running on a Kubernetes cluster and generating and processing a massive number of events. Anomaly detection is a method for detecting unusual events in an event stream. It is widely used in a range of applications such as financial fraud detection, security and threat detection, website user analytics, sensors and IoT, and system health monitoring. When such applications operate at massive scale, generating millions or billions of events, they impose significant computational, performance and scalability challenges on anomaly detection algorithms and data layer technologies. We demonstrate the scalability, performance and cost effectiveness of Apache Kafka, Cassandra and Kubernetes, with results from our experiments scaling the anomaly detection application to 19 Billion anomaly checks per day.
ApacheCon 2019 Talk: Kafka, Cassandra and Kubernetes at Scale – Real-time Anomaly Detection on 19 Billion events a day
1. Kafka, Cassandra and Kubernetes
at Scale –
Real-time Anomaly Detection
on 19 Billion events a day
Paul Brebner
instaclustr.com Technology Evangelist
Cassandra Track, ApacheCon 2019, Thursday September 12th 2019, Las Vegas, USA
https://www.apachecon.com/acna19/s/#/scheduledEvent/1187
2. Overview
1. Wow! (headlines)
2. Why? (did we do it)
3. What? (does it do)
4. How? (does it work)
5. Well? (how well did it work)
6. So What?
11. • 500x better than
previously
published results
for similar system
• 2018, Kafka,
Cassandra, Spark
• Bigger numbers?
[Chart, per second: previous published result 440 anomaly checks/s vs this system's 220,000 checks/s: x500]
Headline
Numbers
Per Second
12. • Peak 2.3 Million
Kafka writes/s
• x10 rest of
pipeline
• Kafka as a buffer,
absorbs load
spike
[Chart, millions per second: anomaly checks/s 0.2M vs peak Kafka writes/s 2.3M]
Headline
Numbers
Millions
Per Second
13. Headline
Numbers
Daily
• Planetary scale
(population 7.7B)
• 19 Billion checks/day (1 Billion = 1,000 Million)
• 2.5 events per
person per day
• Had to stop
somewhere, but
no upper limit
[Chart, Daily Big Numbers (Billions/day): World Population 7.7B vs Anomaly Checks 19B]
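The daily and per-second headline numbers are consistent; a quick sanity check (my arithmetic, not from the slides):

```python
# 220,000 anomaly checks/s sustained for 24 hours is ~19 Billion checks/day,
# about 2.5 checks per person per day for a world population of 7.7 Billion.
checks_per_second = 220_000
seconds_per_day = 24 * 60 * 60                      # 86,400
checks_per_day = checks_per_second * seconds_per_day
print(checks_per_day)                                # 19,008,000,000
print(round(checks_per_day / 7_700_000_000, 1))      # ~2.5 per person per day
```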
19. Apache
Kafka and
Cassandra
• Technology -
Kafka+Cassandra
use case
• Platform -
Instaclustr’s
Managed
Platform
• Features -
Provisioning,
monitoring,
scaling, and more
20. Kafka as a
Buffer
• Cost effective for
short load spikes
• E.g. Influx of
unexpected
festival goers
• Prevent
overloading of
rest of pipeline
• All events
(eventually)
processed
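A minimal stdlib sketch of the buffer idea (a toy queue standing in for a Kafka topic; none of this is the real Kafka API): the producer bursts at 10x the pipeline's drain rate, the topic absorbs the spike, and every event is eventually processed.

```python
from collections import deque

topic = deque()        # toy stand-in for a Kafka topic
processed = []

# Load spike: producer writes 10 events per tick for 5 ticks.
for tick in range(5):
    for i in range(10):
        topic.append((tick, i))

backlog = len(topic)   # 50 events buffered, none rejected

# The slower pipeline then drains the backlog; all events are
# (eventually) processed.
while topic:
    processed.append(topic.popleft())
```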
24. “Man on Moon”
headlines
• 400,000 people
got them there
• JoAnn Morgan,
Saturn 5
monitoring
engineer
• Only woman in
the control room
for Apollo 11
36. 4 How does
it work?
• Anomaly
Detection
• Architecture
• Technologies
37. Is this our
machine?
• The Audio-Telly-o-
Tally-o Count
• Streams
processing
machine for
counting sleepers
• We’ve advanced
from this 1960’s
technology
38. How does it
work?
• CUSUM
(Cumulative Sum
Control Chart)
• Statistical
analysis of
historical data
39. Logical
steps
(1) Events arrive in a
stream
(2) Get the next event from
the stream
(3) Write the event to the
database
(4) Query the historic data
from the database
(5) If there are sufficient
observations, run the
anomaly detector
(6) Was a potential
anomaly detected? Take
appropriate action.
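A minimal CUSUM-style detector sketch covering the write/query/detect steps (Python for brevity; the real Anomalia Machina pipeline is Java, and this simplified single-variable version is illustrative, not the project's actual code):

```python
import statistics

def cusum_anomaly(history, value, threshold=3.0):
    """Is `value` anomalous relative to the historical window?
    Tracks cumulative sums of standardized deviations (CUSUM-style)
    and flags when either sum drifts past `threshold`."""
    if len(history) < 2:
        return False                           # insufficient observations
    mean = statistics.mean(history)
    stdev = statistics.pstdev(history) or 1.0  # avoid divide-by-zero
    hi = lo = 0.0
    for x in list(history) + [value]:
        hi = max(0.0, hi + (x - mean) / stdev)
        lo = min(0.0, lo + (x - mean) / stdev)
    return hi > threshold or abs(lo) > threshold

window = [10.0, 10.5, 9.5] * 16                # last ~50 observations
print(cusum_anomaly(window, 10.2))             # False: normal value
print(cusum_anomaly(window, 25.0))             # True: spike flagged
```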
40. Pipeline
Design
• Design, showing
interaction with
Kafka and
Cassandra
Clusters
• Load generator,
detector pipeline
• 2 thread pools
• To constrain the number of Kafka consumers (and thereby Kafka partitions)
Limits number of Kafka Consumers
2 thread pools to decouple Kafka Consumers from rest of pipeline
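A stdlib sketch of the two-pool design (Python stand-ins; the real pipeline is Java, and all names here are illustrative): a deliberately small pool 1 bounds the number of Kafka-consumer-like threads, while a larger pool 2 runs the Cassandra write/read and detector work.

```python
from concurrent.futures import ThreadPoolExecutor
import queue

events = queue.Queue()
for i in range(100):                 # pretend these arrived via Kafka
    events.put(i)

def detect(event):
    # Pool 2 work: write to Cassandra, read history, run the detector
    # (all stubbed out in this sketch).
    return ("checked", event)

def consumer_loop(n, detector_pool, results):
    # Pool 1 work: poll events and hand each one to the detector pool,
    # decoupling consumption from the rest of the pipeline.
    futures = [detector_pool.submit(detect, events.get()) for _ in range(n)]
    results.extend(f.result() for f in futures)

results = []
with ThreadPoolExecutor(max_workers=16) as detector_pool:      # pool 2
    with ThreadPoolExecutor(max_workers=2) as consumer_pool:   # pool 1 (bounded)
        done = [consumer_pool.submit(consumer_loop, 50, detector_pool, results)
                for _ in range(2)]
        for f in done:
            f.result()
```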
42. Cassandra
• Open Source
• NoSQL Database
• Masterless ring
architecture &
partitioned data
for
• Linear scalability
• High availability
• Fast writes
• Powerful queries
with indexes
43. Instaclustr
Managed
Apache
Cassandra
Benefits
■ Optimised for low latency/high throughput
■ Automated Provisioning, Monitoring, Management
■ SOC2 certified
■ Multiple cloud providers
■ 24/7 Technical support
■ Automated Health Checks
■ Dynamic scaling
■ Zero downtime migrations
■ New! Certified Apache Cassandra
● Key highlights of the Certification Report include:
ᐨ Performance testing (latency and throughput) comparing the
current version to previous versions
ᐨ 24-hour soak testing (including repairs and replaces)
ᐨ Testing against popular drivers
44. What is Kafka?
Message flow
Distributed streams
processing
1 Distributed Producers…
2 Send Messages
3 To Distributed Consumers
4 Via Kafka Cluster
45. Kafka
Key Benefits
■ Fast – high throughput and low latency
■ Scalable – horizontally scalable, just add nodes and
partitions
■ Reliable – distributed and fault tolerant
■ Zero data loss
■ Open Source
■ Heterogeneous data sources and sinks
■ Available as an Instaclustr Managed service
47. Kubernetes
• An automation
system for the
management,
scaling and
deployment of
containerized
applications
• Master/worker
Nodes architecture
• Pods are units of
concurrency
48. Kubernetes
Benefits
• Open Source
• Cloud provider and programming language agnostic
• Develop and test code locally, then deploy at scale
• Helps with resource management – deploy application
to Kubernetes and it manages scaling up/down and
keeping application alive
• More powerful frameworks built on Kubernetes APIs
are becoming available
49. Observability 1
Prometheus
Monitoring
• Ran using
Kubernetes
Prometheus
Operator
• Grafana for
graphing
• Used to debug,
tune, and observe
business metrics
(TPS, RT) from
100 Pods
53. OpenTracing
Standard API for
distributed tracing
■ Specification, not implementation
■ Need
● Application instrumentation
● OpenTracing tracer
Traced Applications API Tracer implementations
Open Source, Datadog
56. 5 How well
did it work?
Scaling Out
From 3 to ???
Cassandra nodes
57. How well did
it work?
Scaling Out
From 3 to ???
Cassandra nodes
Due to 1:1 read/write
ratio, decreased
compression chunk
size to 1KB
“La Jamais Contente”, first car to reach 100 km/h in 1899 (electric, 68hp)
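The chunk-size change is a table-level compression option; a hedged CQL sketch (keyspace and table names here are made up, not from the project):

```sql
-- Shrink the compression chunk size to 1 KB for a 1:1 read/write workload
-- (Cassandra 3.x option names; table name is illustrative).
ALTER TABLE anomalia.events
  WITH compression = {'class': 'LZ4Compressor', 'chunk_length_in_kb': 1};
```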
59. • Kubernetes easy
to scale application,
just increase Pods
• First attempt, tuned
for 3 node Cassandra
cluster then scaled
out to 24 nodes
• Whoops (blue line)
Cassandra
scalability
60. Cassandra
scalability -
better
• Then tuned knobs
(thread pools, Pods
and Cassandra
connections) to
maximize throughput
for each configuration
(orange line)
• Also tuned Kafka…
Minimize Cassandra Connections but maximize detector thread pool (pool 2) concurrency
61. Kafka
Scaling
Kubernetes Pods x
Kafka Consumer
threads
More Kafka Consumers
More Kafka Partitions
Lower Throughput!
[Chart: Partitions (0-700) vs Throughput (Writes/s, up to 2.5M), 6 node x 4 cores/node Kafka Cluster]
62. Kafka
Scaling -
better
Solutions?
Bigger Kafka cluster
Kafka tuning?
num.replica.fetchers = 1
by default, may help to
increase
[Chart: Partitions (0-700) vs Throughput (Writes/s), comparing 6 nodes x 4 cores/node with 9 nodes x 8 cores/node; increased throughput at 200 partitions]
63. Final system
resources
Cluster Details (all
running in AWS, US
East North Virginia)
■ Instaclustr managed Kafka – EBS: high
throughput 1500, 9 x r4.2xlarge-1500 (1,500 GB
Disk, 61 GB RAM, 8 cores), Apache Kafka
2.1.0, Replication Factor=3
■ Instaclustr managed Cassandra – Extra Large,
48 x i3.2xlarge (1769 GB SSD, 61 GB RAM, 8
cores), Apache Cassandra 3.11.3, Replication
Factor=3
■ AWS EKS Kubernetes Worker Nodes – 2 x
c5.18xlarge (72 cores, 144 GB RAM, 25 Gbps
network), Kubernetes Version 1.10, Platform
Version eks.3
64. Scaling Out
From 3 to ??
Cassandra nodes
“Pininfarina Battista” the fastest car in the world (2019)
0-100 kph in 2 seconds, top speed 350 kph (electric, 1,900hp).
65. Scaling Out
• From 3 to 48
Cassandra Nodes
• 1.9 to 19 Billion
checks/day
• No upper limit
66. Resources
• Throughput (checks per second) vs cores for each subsystem:
• Cassandra > Workers > Kafka
• Maximum throughput at 574 cores
[Chart: Throughput vs CPU cores (total, Cassandra, Kubernetes and Kafka cores); 574 cores @ 220,000 TPS]
68. Maximum
cores used
Cassandra 384 +
Workers 118 +
Kafka 72 =
574 Cores Total
[Chart, Cores Used (574 total): Cassandra 384, Workers 118, Kafka 72]
69. Cost –
Affordability
at scale
• Operational $
(AWS instances)
only
• Total $1,000/day
• Can be scaled
with incremental
cost change
[Chart: 3 Cassandra nodes ≈ $100/day; 48 Cassandra nodes ≈ $1,000/day]
70. Kafka as a
Buffer
• Kafka acts as a
buffer, can
process 10x the
Cassandra
capacity
• 2.3M/s vs
220,000/s
• Cheaper than
increasing
Cassandra
capacity x10
73. Takeaways
Technical
■ Kubernetes (+AWS EKS) enabled automation
(deployment, scaling, monitoring) of the application
● Some effort to understand and setup
● But once working it makes application deployment fast, scalable,
repeatable and low cost
■ Prometheus and OpenTracing+Jaeger critical for
debugging, tuning and reporting application
performance and scalability
● Tricky to monitor applications in Kubernetes, but using the
Kubernetes Operators automates the monitoring configuration
■ To achieve near linear scalability and maximize
throughput need to optimize pipeline, by tuning
thread pools and number of Kubernetes Pods to:
● Minimize: Cassandra Connections
● Minimize: Kafka Consumers and Kafka Partitions
● Maximize: Detector thread pool concurrency
74. Takeaways
Business
■ Kafka+Cassandra enable Fast Streaming+Storage
at Scale
■ Instaclustr Managed Kafka+Cassandra service
● Makes it easy to automate cluster provisioning
(creation/deletion/scaling), and monitoring
● Highly available SLAs
● Proactive cluster monitoring, alerting and maintenance
■ Affordability at Scale
● Low cost Open Source and Commodity Cloud infrastructure
● only pay for what you use, application and Kafka+Cassandra
clusters scale linearly with load so cost only increases
incrementally
■ Application can be easily resized (scaled up and
down) for any workload, no upper limit
■ Lots more use cases using Kafka+Cassandra
76. Newsflash!
Geospatial Anomaly
Detection
Compared
performance of
multiple Spatial
representations and
Cassandra
implementations
■ Extensions to detect anomalies over time and space
● E.g. is an event unusual relative to nearest 50 neighbours?
■ How to find neighbours using
● Distance between Latitude/longitude points
● Bounding Box
● Geohashes
● 3D (including 3D Geohashes)
■ Using different Cassandra implementations
● Clustering columns
● Secondary indexes
● Denormalized multiple tables
● Cassandra Lucene Index Plugin
77. Further
information
■ The complete Anomalia Machina Blog Series (10 Parts):
● Massive scale Kafka and Cassandra deployment for real-time anomaly detection: 19 Billion events per day https://www.instaclustr.com/massive-scale-kafka-cassandra-real-time-anomaly-detection/
■ Latest 4-part Geospatial Anomaly Detection blogs:
● https://www.instaclustr.com/geospatial-anomaly-detection-with-kafka-cassandra/
■ The Open Source Anomalia Machina Code
● https://github.com/instaclustr/AnomaliaMachina
■ All of Paul’s Blogs
● https://www.instaclustr.com/paul-brebner/
1969 noteworthy year, lots of 50th anniversary events recently celebrated
I don’t think Elvis is returning home again
But moon and woodstock
50,000 expected, 1 Million descended on the site, 500,000 reached it
Is this big? More realistic is per second
1 Billion = 1000 Million = 10^9 events/day
Actually 220,000 events per second
2.3M/s Kafka write/s
Per Day, Yes Big.
Planetary scale! More than double world population (7.7 Billion)
Could process 2.5 events per person per day
Bigger than most (any?) single company’s daily financial transactions
Better (500x throughput and much faster) than published results for similar problem (from 2018, using Kafka, Cassandra and Spark, 200 events/s, RT >> 1s)
Bigger numbers only limited by imagination
We could have kept going, but had to stop somewhere
US FINRA (Financial Industry Regulatory Authority) processes up to 78 Billion events a day (also using public cloud)
Computer Systems generate massive amounts of metrics
E.g. Netflix uses Kafka to process > 1 Trillion (10^12) events/day (2018)
And the system will scale arbitrarily high to match business requirements
Project Goals - multiple
Fast (RT), Big (Scalable, no upper limit), Cost effective (Open Source, Automatic cluster creation/delete, scaling)
Kafka + Cassandra demo use case
Kafka as a buffer use case (cost effective for coping with short load spikes)
Demonstrate Instaclustr managed service for Kafka and Cassandra (provisioning, management, monitoring)
Try complementary tech for application management and scale (K8, Prometheus, OpenTracing, Jaeger)
Anomaly detection needs to be fast, under 1s
The headlines 50 years ago may have been about men on the moon, but the success of the program depended on many women
Anomaly detection needs to be fast, under 1s, streams processing
Anomaly detection needs to be scalable, increasing key requires more storage, size and processing capacity. Need scalable database
Anomaly detection needs to be scalable, for high throughputs, linearly scalable for more processing capacity, ability to handle load spikes (buffer use case), and no upper limit
And affordable, i.e. linear, elastic, scale up and down on demand, have just sufficient resources based on actual load (not too many or too few)
For experiments, want to spin resources up and down (provision, scale, delete)
Anomaly detection is used in a wide variety of domains including:
Infrastructure monitoring
A simple type of anomaly detection is called Break or Changepoint analysis.
This takes a stream of events and analyses them to see if the most recent events are “different” to previous ones.
We picked a simple version to start with (CUSUM).
It only uses data for a single variable at a time, which could be something like an account number, or an IP address.
This is the prototype application design
The Anomaly detection pipeline is written in Java and runs in a single multi-threaded process.
It consists of a Kafka consumer which gets each new event and passes it to
A Cassandra client, which writes the event to Cassandra, gets the previous 50 rows for the ID, runs the detector and decides if there’s an anomaly or not.
Thread pools? Kafka Consumer pool useful to constrain the number of Kafka Consumers, and thereby constrain the number of Kafka partitions which are expensive!
What is Kafka? Kafka is a distributed streams processing system, it allows distributed producers to send messages to distributed consumers via a Kafka cluster.
The next graph shows the Kafka producer ramping up (from 1 to 9 Kubernetes Pods), with 2 minutes load time, peaking at 2.3M events/s (this time in Grafana). Note that because each metric was being retrieved from multiple Pods I had to view them as stacked graphs to get the total metric value for all the Pods.
This graph shows the anomaly check rate reaching 220,000 events/s and continuing (until all the events are processed). Prometheus is gathering this metric from 100 Kubernetes Pods.
After also instrumenting the application with OpenTracing, here’s the Jaeger dependencies view (there are other views which show single traces in detail) which shows the topology of the system, including tracing across process boundaries (producers to consumers):
The Anomalia Machina pipeline is relatively simple, so I wondered how well OpenTracing would work for discovering and visualising more complex Kafka topologies. For example, would it be possible to visualise the topology of data flow across many Kafka topics? I wrote a simple Markov chain simulator which allows you to choose the number of source topics, intermediate topics, and sink topics, and a graph density, and then produces random traces. The code is in this gist.
Here’s the dependency graph for a run of this code. In practice you would also want to add information about the Kafka producers and consumers (either as extra nodes, or by labelling the edges). There’s also a cool Force directed graph view which allows you to select a node and highlight the dependent nodes.
Pre-tuning: “La Jamais Contente”, first automobile to reach 100 km/h in 1899 (electric, 68hp)
Knobs for scaling
Scaling from 3 to ? Cassandra nodes:
Initial method was just to increase number of Worker Pods with no tuning of application parameters.
This resulted in blue line eeek. Ended up tuning each configuration (number of Worker Pods + Cassandra Nodes), including thread pool sizes and C* connections.
Had to optimise the anomaly detection pipeline to minimize: Cassandra connections, and Kafka partitions
By tuning the number of pipeline worker Pods in Kubernetes and the application thread pools
Initially sub-linear scalability (blue line), eventually close to perfect scalability (orange line)
Post-tuning: Fast-forward 120 years… “Pininfarina Battista” the fastest car in the world, 0-100 kph in 2 seconds, top speed 350 kph (electric, 1,900hp).
The complete machine for the biggest result (48 Cassandra nodes) has 574 cores in total.
Cassandra (384) > Workers (118) > Kafka (72)
This graph shows that it only costs around $1,000 a day for the basic infrastructure using on-demand AWS instances.
This graph also shows that the system can easily be scaled up or down to match different business requirements, and the infrastructure costs will scale proportionally. For example, the smallest system we ran still checked 1.5 Billion events per day, for a cost of only $100/day for the AWS infrastructure.
https://medium.com/vizzuality-blog/the-amazon-is-on-fire-is-it-worse-than-normal-5fa430a7880e
https://www.news.com.au/technology/environment/amazon-fires-dwarfed-by-the-blazes-burning-across-africa/news-story/4ff4d1a4b2cbbc55f79f367bf5f2bc9d
Questions
C* Read/write, how did I tune reads?
Decreasing the compression chunk size to 1KB (the smallest possible value) resulted in higher CPU usage and an increase in throughput to 9,000 TPS. The Apache Cassandra documentation explains the benefits of compression as follows:
“Compression’s primary benefit is that it reduces the amount of data written to disk. Not only does the reduced size save in storage requirements, it often increases read and write throughput, as the CPU overhead of compressing data is faster than the time it would take to read or write the larger volume of uncompressed data from disk.”
Kafka monitoring and tuning?
Cost with scale looks good: from $100/day to $1,000/day going from 3 to 48 Cassandra nodes
Clarify flow to emphasize that data is read from Kafka with consumer
Could we automate the tuning? I.e. feedback loop between monitoring and k8, how to set threads?
Add Prometheus monitoring architecture/story to both talks?
Did we think about getting rid of C*? Yes, but here's why not: Kafka is streams, not random access via IDs, so we'd need to read and filter everything, or have 1 topic per ID (but there are millions of IDs), or use Kafka Streams with C* as a state store.