Who am I?
User ID: Dani
Alias: DTrapezoid
Background: Apache projects, distributed systems, real-time and event-driven systems, slaying legacy
How did we get here?
What are we doing?
1 KSQL architecture
2 KSQL use cases
3 KSQL performance considerations
4 When to use KSQL and when not to
5 Introduce KStreams
OK, let's learn more about you
What letter best describes your ego?
SELECT hope for your cluster
SELECT profit
SELECT it depends
SELECT make a wish
Why does it matter to think about a query before you execute it?
Lots of lessons
We don't always have control over stuff
Why use Kafka?
Initial use-case thinking (coming from an RDBMS) and benefits:
DB integration: buffering, back pressure
Message bus: decouples systems, pub/sub
Distributed, highly available, low latency
Data governance
But wait, there's more: Kafka as a streaming platform
Kafka Streams API
STREAM ALL THE THINGS
Event-driven systems
Transform once, use many
Stream processing
KSQL, KStreams, ice creams
Stream processing with Kafka: abstractions
KSQL, KStreams
Let's talk about what KSQL is under the hood
What happens when you KSQL?
CREATE STREAM AS SELECT
CREATE TABLE AS SELECT
Output topic:
Name: defaults to the same name as the stream or table; customize with the KAFKA_TOPIC property in the WITH clause
Partitions: customize carefully with the PARTITIONS property in the WITH clause
Replication factor: default RF 1; customize with the REPLICAS property in the WITH clause
Aggregations leverage an embedded storage engine to locally manage state
A compacted changelog topic persists aggregation state
Compacted changelog topics have the same number of partitions as the input stream. Default: 1 replica
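The WITH-clause properties listed on this slide can be sketched in a single statement. This is only an illustration: the stream name `purchases_us`, the topic name, and the chosen values are hypothetical, and the `purchases` stream and its columns are borrowed from the demo later in this deck.

```sql
-- Hypothetical names and values, for illustration only.
CREATE STREAM purchases_us
  WITH (KAFKA_TOPIC='purchases_us_v1',  -- output topic name (otherwise taken from the stream name)
        PARTITIONS=4,                   -- partition count of the output topic
        REPLICAS=1)                     -- replication factor of the output topic
  AS SELECT order_id, product, town
     FROM purchases
     WHERE country = 'United States';
```

Changing the partition count relative to the source stream is the kind of choice the slide flags as "customize carefully", since it affects how records are distributed for later joins and aggregations.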
Is KSQL like SQL?
[Diagram: KSQL operations (aggregate, join, filter) sitting between Kafka Connect source connectors and sink connectors such as Elastic, S3, and Cassandra]
"KSQL is a DBA's worst nightmare. It's a query that never stops."
Jeremy Custenborder, Principal Systems Engineer, Confluent Inc.
Why would someone want to use KSQL then?
To write CONTINUOUS stream processing programs simply
Run SQL queries against data, agnostic of programming language
Works continuously in real time: queries won't quit until the messages do
Fault tolerant
Scales horizontally and vertically
Distributed
What does the architecture of KSQL look like?
[Diagram labels: KSQL clients, KSQL statements, queries, REST API, SQL engine, KSQL server, apps, Confluent Control Center]
Add KSQL servers without restarting apps
https://www.confluent.io/blog/ksql-in-action-real-time-streaming-etl-from-oracle-transactional-data
https://www.confluent.io/blog/ksql-in-action-enriching-csv-events-with-data-from-rdbms-into-AWS/
https://docs.confluent.io/current/ksql/docs/developer-guide/transform-a-stream-with-ksql.html
https://www.confluent.io/stream-processing-cookbook/ksql-recipes/detecting-abnormal-transactions
What are some cool KSQL use cases?
Streaming ETL and real-time monitoring: orders and logins, CDC with Debezium, Kafka Connect, KSQL, Elastic
Data enrichment: Kafka Connect, Debezium, KSQL
Data transformation: change the data format of a message, the timestamp field, the timestamp format, the number of partitions, the number of replicas, the Kafka topic name
Anomaly detection: create a window of time to write to a KSQL table/topic and collect anomalous behavior (WINDOW TUMBLING)
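A minimal sketch of the anomaly-detection pattern named above, using a tumbling window. The stream `authorization_attempts`, its `card_number` column, the 5-second window, and the threshold of 3 are all assumptions made for illustration; they do not come from the slides.

```sql
-- Hypothetical stream, column, window size, and threshold.
-- Counts events per card in fixed, non-overlapping 5-second windows and
-- keeps only the cards that exceed the threshold within a window.
CREATE TABLE possible_fraud AS
  SELECT card_number, COUNT(*) AS attempts
  FROM authorization_attempts
  WINDOW TUMBLING (SIZE 5 SECONDS)
  GROUP BY card_number
  HAVING COUNT(*) > 3;
```

The result is a table backed by a Kafka topic, so downstream consumers can pick up the flagged keys like any other topic.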
{
"order_id": 1,
"customer_name": "Maryanna Andryszczak",
"date_of_birth": "1922-06-06T02:21:59Z",
"product": "Nut - Walnut, Pieces",
"order_total_usd": "1.65",
"town": "Portland",
"country": "United States"
}
Create the PURCHASES stream:
ksql> CREATE STREAM purchases
  (order_id INT, customer_name VARCHAR, date_of_birth VARCHAR,
   product VARCHAR, order_total_usd VARCHAR, town VARCHAR, country VARCHAR)
  WITH (KAFKA_TOPIC='purchases', VALUE_FORMAT='JSON');
 Message
----------------
 Stream created
----------------
Let's validate the first few messages:
SELECT * FROM PURCHASES LIMIT 5;
Now let's filter by the country Germany:
SELECT ORDER_ID, PRODUCT, TOWN, COUNTRY FROM PURCHASES WHERE COUNTRY='Germany';
CREATE STREAM PUCHASES_GERMANY AS SELECT * FROM PURCHASES WHERE COUNTRY='Germany';
ksql> LIST TOPICS;
 Kafka Topic        | Registered | Partitions | Partition Replicas | Consumers | ConsumerGroups
------------------------------------------------------------------------------------------------
 _confluent-metrics | false      | 12         | 1                  | 0         | 0
 PUCHASES_GERMANY   | true       | 4          | 1                  | 0         | 0
 purchases          | true       | 1          | 1                  | 1         | 1
------------------------------------------------------------------------------------------------
ksql>
https://www.confluent.io/stream-processing-cookbook/ksql-recipes/data-filtering
Let's just get the German orders on one topic/table.
Cool! Now we have all the purchases from Germany in a separate topic.
Try it!
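One way to check the filtered stream from the KSQL CLI is sketched below. It assumes the stream was created exactly as on the slide (including the slide's spelling `PUCHASES_GERMANY`) and that the installed KSQL version supports these statements.

```sql
-- Show the schema, backing topic, and runtime statistics of the derived stream.
DESCRIBE EXTENDED PUCHASES_GERMANY;

-- Print the raw records of the backing topic, starting from the earliest offset.
PRINT 'PUCHASES_GERMANY' FROM BEGINNING;

-- Query the derived stream; LIMIT makes the query stop after five rows.
SELECT ORDER_ID, PRODUCT, TOWN FROM PUCHASES_GERMANY LIMIT 5;
```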
https://docs.confluent.io/current/ksql/docs/capacity-planning.html
HARDWARE: performance considerations
Memory: used on heap for message processing and off heap for aggregations and joins (streams and KTables). 32 GB minimum.
CPU: used to serialize and deserialize messages, then execute queries. 4 cores minimum as a starting point.
Disk: aggregations and joins persist temporary state to disk. 100 GB SSDs minimum.
Network: always the biggest bottleneck; check for good throughput. 1 Gbit NIC.
Sizing considerations
You may need to add more brokers:
KSQL queries consume from and produce to topics
Repartitioning
Stateful queries mean changelog topics
Throughput: decreases with every query, depending on message size and complexity
Query types: project, filter, joins, aggregations
2X CPU
SELECT SUM, COUNT ... FROM ... WHERE ..., etc.
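To make the broker-side cost concrete, here is a hedged sketch of two queries that create extra topics. The derived names `purchases_by_country` and `orders_per_country` are hypothetical; `purchases` and `country` come from the demo earlier in the deck.

```sql
-- Re-keys the stream by country: every record is written again to a new
-- topic partitioned on that column (repartitioning).
CREATE STREAM purchases_by_country AS
  SELECT * FROM purchases
  PARTITION BY country;

-- Stateful aggregation: state is managed locally and persisted to a
-- compacted changelog topic, in addition to the table's output topic.
CREATE TABLE orders_per_country AS
  SELECT country, COUNT(*) AS order_count
  FROM purchases_by_country
  GROUP BY country;
```

Each statement adds topics that the brokers must store and serve, which is why stateful and repartitioning queries feed into broker sizing.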
What about headless deployment?
TIP: test in [illegible]
JVM
<path-to-confluent>/bin/ksql-node query-file path/to/myquery.sql
Create / drop streams
Start / stop queries
Start server nodes
Version-control your workflow
When to use KSQL, and when not to
Where is your data?
Are you a maintainer?
Java or .NET teams?
Are you new to Kafka?
Kafka Streams
Processor API: low level, imperative, customizable
Streams API: built-in abstractions, functional
KStream, KTable, GlobalKTable
Stateless and stateful transformations
"I don't care how, I want Kafka Streams for every language on the planet."
[Diagram: your team's languages; consume a stream, process, output]
Satisfy all yer devs
DO BIG CLUSTERS
A
gI5E
IjE
isagoge
RESOURCE
ISOLATION
PER USE CASE
TEAM RULE
JUST LIKE D BI
U SE Ca SE
Follow DTrapezoid for more about Kafka, my dog, and my losing battle against the Girl Scouts of America
Learn more and get KSQL:
https://www.confluent.io/stream-processing-cookbook/
https://docs.confluent.io/current/streams/concepts.html
https://docs.confluent.io/current/ksql/docs/index.html
https://www.confluent.io/training/ksql-apache-kafka-streams-processing
What if I want to use different data?
https://mockaroo.com/
When to KSQL & When to Live the KStream (Dani Traphagen, Confluent) Kafka Summit London 2019
