SlideShare ist ein Scribd-Unternehmen logo
1 von 79
1Confidential
KSQL
The Open Source Streaming SQL Engine for Apache Kafka
Kai Waehner
Technology Evangelist
kontakt@kai-waehner.de
LinkedIn
@KaiWaehner
www.confluent.io
www.kai-waehner.de
KSQLis the
Streaming
SQL Engine
for
Apache Kafka
3Confidential
1.0 Enterprise
Ready J
A Brief History of Apache Kafka and Confluent
0.11 Exactly-once
semantics
0.10 Data processing
(Streams API)
0.9 Data integration
(Connect API)
Intra-cluster
replication
0.8
2012 2014
Cluster mirroring0.7
2015 2016 20172013 2018
CP 4.1
KSQL GA
4Confidential
Agenda – KSQL Deep Dive
1) Apache Kafka Ecosystem
2) Kafka Streams as Foundation for KSQL
3) Motivation for KSQL
4) KSQL Concepts
5) Live Demo #1 – Intro to KSQL
6) KSQL Architecture
7) Live Demo #2 - Clickstream Analysis
8) Building a User Defined Function (Example: Machine Learning)
9) Getting Started
5KSQL- Streaming SQL for Apache Kafka
Agenda – KSQL Deep Dive
1) Apache Kafka Ecosystem
2) Kafka Streams as Foundation for KSQL
3) Motivation for KSQL
4) KSQL Concepts
5) Live Demo #1 – Intro to KSQL
6) KSQL Architecture
7) Live Demo #2 - Clickstream Analysis
8) Building a User Defined Function (Example: Machine Learning)
9) Getting Started
6KSQL- Streaming SQL for Apache Kafka
Apache Kafka - A Distributed, Scalable Commit Log
7KSQL- Streaming SQL for Apache Kafka
Anatomy of a Topic
8KSQL- Streaming SQL for Apache Kafka
Apache Kafka at Large Scale à No need to do a POC
https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/63921
https://qconlondon.com/london2018/presentation/cloud-native-and-scalable-kafka-architecture
(2018) (2018)
9KSQL- Streaming SQL for Apache Kafka
The Log ConnectorsConnectors
Producer Consumer
Streaming Engine
Apache Kafka – The Rise of a Streaming Platform
10KSQL- Streaming SQL for Apache Kafka
Apache Kafka – The Rise of a Streaming Platform
11KSQL- Streaming SQL for Apache Kafka
KSQL – The Streaming SQL Engine for Apache Kafka
12KSQL- Streaming SQL for Apache Kafka
Agenda – KSQL Deep Dive
1) Apache Kafka Ecosystem
2) Kafka Streams as Foundation for KSQL
3) Motivation for KSQL
4) KSQL Concepts
5) Live Demo #1 – Intro to KSQL
6) KSQL Architecture
7) Live Demo #2 - Clickstream Analysis
8) Building a User Defined Function (Example: Machine Learning)
9) Getting Started
13KSQL- Streaming SQL for Apache Kafka
Kafka Streams - Part of Apache Kafka
14KSQL- Streaming SQL for Apache Kafka
Stream Processing
Data at Rest Data in Motion
15KSQL- Streaming SQL for Apache Kafka
Key concepts
16KSQL- Streaming SQL for Apache Kafka
Independent Dev / Test / Prod of independent Apps
17KSQL- Streaming SQL for Apache Kafka
No Matter Where it Runs
18KSQL- Streaming SQL for Apache Kafka
Kafka Streams - Processor Topology
Read input from Kafka
Operator DAG:
• Filter / map / aggregation / joins
• Operators can be stateful
Write result back to Kafka
19KSQL- Streaming SQL for Apache Kafka
Kafka Streams - Processor Topology
20KSQL- Streaming SQL for Apache Kafka
Kafka Streams - Runtime
21KSQL- Streaming SQL for Apache Kafka
Kafka Streams - Distributed State
22KSQL- Streaming SQL for Apache Kafka
Kafka Streams - Scaling
23KSQL- Streaming SQL for Apache Kafka
Pluggable State Store.
Default Strategy:
- In-memory (fast access)
- Local disc (for fast recovery)
- Replicated to Kafka (for resilience)
https://www.infoq.com/presentations/kafka-streams-spring-cloud
State Management
24KSQL- Streaming SQL for Apache Kafka
Kafka Streams - Streams and Tables
25KSQL- Streaming SQL for Apache Kafka
Kafka Streams - Streams and Tables
26KSQL- Streaming SQL for Apache Kafka
// Example: reading data from Kafka
KStream<byte[], String> textLines = builder.stream("textlines-topic", Consumed.with(
Serdes.ByteArray(), Serdes.String()));
// Example: transforming data
KStream<byte[], String> upperCasedLines= rawRatings.mapValues(String::toUpperCase));
KStream
27KSQL- Streaming SQL for Apache Kafka
// Example: aggregating data
KTable<String, Long> wordCounts = textLines
.flatMapValues(textLine -> Arrays.asList(textLine.toLowerCase().split("W+")))
.groupBy((key, word) -> word)
.count();
KTable
28KSQL- Streaming SQL for Apache Kafka
Kafka
Streams
A complete streaming microservices, ready for production at large-scale
App configuration
Define processing
(here: WordCount)
Start processing
29KSQL- Streaming SQL for Apache Kafka
Agenda – KSQL Deep Dive
1) Apache Kafka Ecosystem
2) Kafka Streams as Foundation for KSQL
3) Motivation for KSQL
4) KSQL Concepts
5) Live Demo #1 – Intro to KSQL
6) KSQL Architecture
7) Live Demo #2 - Clickstream Analysis
8) Building a User Defined Function (Example: Machine Learning)
9) Getting Started
30KSQL- Streaming SQL for Apache Kafka
Why KSQL?
Population
CodingSophistication
Realm of Stream Processing
New, Expanded Realm
BI
Analysts
Core
Developers
Data
Engineers
Core Developers
who don’t like
Java
Kafka
Streams
KSQL
31KSQL- Streaming SQL for Apache Kafka
Trade-Offs
• subscribe()
• poll()
• send()
• flush()
• mapValues()
• filter()
• punctuate()
• Select…from…
• Join…where…
• Group by..
Flexibility Simplicity
Kafka Streams KSQL
Consumer
Producer
32KSQL- Streaming SQL for Apache Kafka
What is it for ?
Streaming ETL
• Kafka is popular for data pipelines
• KSQL enables easy transformations of data within the pipe
CREATE STREAM vip_actions AS
SELECT userid, page, action FROM clickstream c
LEFT JOIN users u ON c.userid = u.user_id
WHERE u.level = 'Platinum';
33KSQL- Streaming SQL for Apache Kafka
What is it for ?
Analytics, e.g. Anomaly Detection
• Identifying patterns or anomalies in real-time data, surfaced in milliseconds
CREATE TABLE possible_fraud AS
SELECT card_number, count(*)
FROM authorization_attempts
WINDOW TUMBLING (SIZE 5 MINUTES)
GROUP BY card_number
HAVING count(*) > 3;
34KSQL- Streaming SQL for Apache Kafka
What is it for ?
Real Time Monitoring
• Log data monitoring, tracking and alerting
• Sensor / IoT data
CREATE TABLE error_counts AS
SELECT error_code, count(*)
FROM monitoring_stream
WINDOW TUMBLING (SIZE 1 MINUTE)
WHERE type = 'ERROR'
GROUP BY error_code;
35KSQL- Streaming SQL for Apache Kafka
What is it for ?
Simple Derivations of Existing Topics
• One-liner to re-partition and / or re-key a topic for new uses
CREATE STREAM views_by_userid
WITH (PARTITIONS=6,
VALUE_FORMAT=‘JSON’,
TIMESTAMP=‘view_time’) AS
SELECT *
FROM clickstream
PARTITION BY user_id;
36KSQL- Streaming SQL for Apache Kafka
What is it for ?
Easily Convert between Data Formats in Real Time
• JSON-to-Avro conversion
CREATE STREAM sensor_events_json (sensor_id VARCHAR, temperature
INTEGER, ...) WITH (KAFKA_TOPIC='events-topic', VALUE_FORMAT='JSON');
CREATE STREAM sensor_events_avro WITH (VALUE_FORMAT='AVRO') AS SELECT *
FROM sensor_events_json;
37KSQL- Streaming SQL for Apache Kafka
Where is KSQL not such a great fit (at least not yet)?
Powerful ad-hoc query
○ Limited span of time usually
retained in Kafka
○ No indexes
BI reports (Tableau etc.)
○ No indexes
○ No JDBC (most Bi tools are not
good with continuous results!)
Keep in mind:
Kafka is a streaming platform!
38KSQL- Streaming SQL for Apache Kafka
Agenda – KSQL Deep Dive
1) Apache Kafka Ecosystem
2) Kafka Streams as Foundation for KSQL
3) Motivation for KSQL
4) KSQL Concepts
5) Live Demo #1 – Intro to KSQL
6) KSQL Architecture
7) Live Demo #2 - Clickstream Analysis
8) Building a User Defined Function (Example: Machine Learning)
9) Getting Started
39KSQL- Streaming SQL for Apache Kafka
KSQL – A Streaming SQL Engine for Apache Kafka
40KSQL- Streaming SQL for Apache Kafka
KSQL is Equally viable for S / M / L / XL / XXL use cases
Ok. Ok. Ok.
… and KSQL is ready for production, including 24/7 support!
41KSQL- Streaming SQL for Apache Kafka
KSQL is Equally viable for S / M / L / XL / XXL use cases
42KSQL- Streaming SQL for Apache Kafka
How do you deploy applications?
43KSQL- Streaming SQL for Apache Kafka
Where to develop and operate your applications?
44KSQL- Streaming SQL for Apache Kafka
KSQL Concepts
● No need for source code
• Zero, none at all, not even one line.
• No SerDes, no generics, no lambdas, ...
● All the Kafka Streams “magic” out-of-the-box
• Exactly Once Semantics
• Windowing
• Event-time aggregation
• Late-arriving data
• Distributed, fault-tolerant, scalable, ...
45KSQL- Streaming SQL for Apache Kafka
STREAM and TABLE as first-class citizens
46KSQL- Streaming SQL for Apache Kafka
SELECT statement syntax
SELECT `select_expr` [, ...]
FROM `from_item` [, ...]
[ WINDOW `window_expression` ]
[ WHERE `condition` ]
[ GROUP BY `grouping expression` ]
[ HAVING `having_expression` ]
[ LIMIT n ]
where from_item is one of the following:
stream_or_table_name [ [ AS ] alias]
from_item LEFT JOIN from_item ON join_condition
47KSQL- Streaming SQL for Apache Kafka
CREATE STREAM AS syntax
CREATE STREAM `stream_name`
[WITH (`property = expression` [, …] ) ]
AS SELECT `select_expr` [, ...]
FROM `from_item` [, ...]
[ WHERE `condition` ]
[ PARTITION BY `column_name` ]
● where property can be any of the following:
KAFKA_TOPIC = name - what to call the sink topic
FORMAT = DELIMITED | JSON | AVRO - defaults to format of input stream
AVROSCHEMAFILE = path/to/file - if FORMAT=AVRO, where the output schema file will be written to
PARTITIONS = # - number of partitions in sink topic
TIMESTAMP = column - The name of the column to use as the timestamp. This can be used to define the
event time.
48KSQL- Streaming SQL for Apache Kafka
CREATE TABLE AS syntax
CREATE TABLE `stream_name`
[WITH ( `property_name = expression` [, ...] )]
AS SELECT `select_expr` [, ...]
FROM `from_item` [, ...]
[ WINDOW `window_expression` ]
[ WHERE `condition` ]
[ GROUP BY `grouping expression` ]
[ HAVING `having_expression` ]
● where property values are same as for ‚Create Streams as Select‘
49KSQL- Streaming SQL for Apache Kafka
Automatic Inference of Topic Schema (leveraging Confluent Schema Registry)
https://www.matthowlett.com/2017-12-23-exploring-wikipedia-ksql.html
50KSQL- Streaming SQL for Apache Kafka
WINDOWing
● Not ANSI SQL ! à Continuous Queries
● Three types supported (same as Kafka Streams):
• TUMBLING (repeats at a non-overlapping interval)
• SELECT appname, ip, COUNT(appname) AS problem_count FROM
logstream WINDOW TUMBLING (size 1 minute) WHERE loglevel='ERROR'
GROUP BY appname, ip;
• HOPPING (similar to tumbling, but hopping generally has an overlapping interval)
• SELECT itemid, SUM(arraycol[0]) FROM orders WINDOW HOPPING (
size 20 second, advance by 5 second) GROUP BY itemid;
• SESSION (groups elements by sessions of activity, do not overlap and do not have a fixed start and end time,
closes when it does not receive elements for a certain period of time, i.e., when a gap of inactivity occurred)
• SELECT itemid, SUM(sales_price) FROM orders WINDOW SESSION (20
second) GROUP BY itemid;
51KSQL- Streaming SQL for Apache Kafka
Agenda – KSQL Deep Dive
1) Apache Kafka Ecosystem
2) Kafka Streams as Foundation for KSQL
3) Motivation for KSQL
4) KSQL Concepts
5) Live Demo #1 – Intro to KSQL
6) KSQL Architecture
7) Live Demo #2 - Clickstream Analysis
8) Building a User Defined Function (Example: Machine Learning)
9) Getting Started
52KSQL- Streaming SQL for Apache Kafka
KSQL CLI
ksql> CREATE STREAM pageviews_original (viewtime bigint, userid varchar, pageid varchar) WITH (kafka_topic='pageviews',
value_format='DELIMITED');
ksql> CREATE TABLE users_original (registertime bigint, gender varchar, regionid varchar, userid varchar) WITH (kafka_topic='users',
value_format='JSON');
ksql> SELECT pageid FROM pageviews_original LIMIT 10;
ksql> CREATE STREAM pageviews_female AS SELECT users_original.userid AS userid, pageid, regionid, gender FROM
pageviews_original LEFT JOIN users_original ON pageviews_original.userid = users_original.userid WHERE gender = 'FEMALE';
ksql/bin/ksql-server-start
ksql/bin/ksql
53KSQL- Streaming SQL for Apache Kafka
KSQL Web UI
54KSQL- Streaming SQL for Apache Kafka
Live Demo – KSQL Getting Started
55KSQL- Streaming SQL for Apache Kafka
Break
56KSQL- Streaming SQL for Apache Kafka
Agenda – KSQL Deep Dive
1) Apache Kafka Ecosystem
2) Kafka Streams as Foundation for KSQL
3) Motivation for KSQL
4) KSQL Concepts
5) Live Demo #1 – Intro to KSQL
6) KSQL Architecture
7) Live Demo #2 - Clickstream Analysis
8) Building a User Defined Function (Example: Machine Learning)
9) Getting Started
57KSQL- Streaming SQL for Apache Kafka
KSQL - Components
KSQL has 3 main components:
1. The Engine which actually runs the Kafka Streams topologies
2. The REST server interface enables an Engine to receive instructions from the CLI
or any other client
3. The CLI, designed to be familiar to users of MySQL, Postgres etc.
(Note that you also need a Kafka Cluster… KSQL is deployed independently)
58KSQL- Streaming SQL for Apache Kafka
How to run KSQL è #1 Client – Server (Interactive Mode)
JVM
KSQL Server
KSQL CLI / any REST Client
JVM
KSQL Server
JVM
KSQL Server
Kafka Cluster
59KSQL- Streaming SQL for Apache Kafka
How to run KSQL è #1 Client – Server (Interactive Mode)
• Start any number of server nodes
bin/ksql-server-start
• Start one or more CLIs or REST Clients and point them to a server
bin/ksql https://myksqlserver:8090
• All servers share the processing load
Technically, instances of the same Kafka Streams Applications
scale up / down without restart
60KSQL- Streaming SQL for Apache Kafka
How to run KSQL è #2 as Standalone Application (Headless Mode)
JVM
KSQL Server
JVM
KSQL Server
JVM
KSQL Server
Kafka Cluster
61KSQL- Streaming SQL for Apache Kafka
How to run KSQL è #2 as Standalone Application (Headless Mode)
• Start any number of server nodes
Pass a file of KSQL statement to execute
bin/ksql-node query-file=foo/bar.sql
• Ideal for streaming ETL application deployment
Version-control your queries and transformations as code
• All running engines share the processing load
Technically, instances of the same Kafka Streams Applications
scale up / down without restart
62KSQL- Streaming SQL for Apache Kafka
How to run KSQL è #3 Embedded in an Application (JVM Mode)
JVM App Instance
KSQL Engine
Application Code
JVM App Instance
KSQL Engine
Application Code
JVM App Instance
KSQL Engine
Application Code
Kafka Cluster
63KSQL- Streaming SQL for Apache Kafka
How to run KSQL è #3 Embedded in an Application (JVM Mode)
• Embed directly in your Java application
• Generate and execute KSQL queries through the Java API
Version-control your queries and transformations as code
• All running application instances share the processing load
Technically, instances of the same Kafka Streams Applications
scale up / down without restart
64KSQL- Streaming SQL for Apache Kafka
Dedicating resources
Join Engines to the same
‘service pool’ by means of the
ksql.service.id property
65KSQL- Streaming SQL for Apache Kafka
Sizing of KSQL Servers
3 Categories of KSQL Queries
• Project / Filter, e.g. SELECT <columns> FROM <table/stream> WHERE <condition>
• Joins, e.g. Stream-Table Joins
• Aggregations, e.g. SUM, COUNT, TOPK, TOPKDISTINCT
Sizing Questions
• How much Memory, CPU, Disk, Network?
• When and how to scale up / down?
• How to tune performance?
• Etc…
Guide for Capacity Planning
https://docs.confluent.io/current/ksql/docs/
capacity-planning.html
https://docs.confluent.io/current/streams/
sizing.html#streams-sizing
66KSQL- Streaming SQL for Apache Kafka
Agenda – KSQL Deep Dive
1) Apache Kafka Ecosystem
2) Kafka Streams as Foundation for KSQL
3) Motivation for KSQL
4) KSQL Concepts
5) Live Demo #1 – Intro to KSQL
6) KSQL Architecture
7) Live Demo #2 - Clickstream Analysis
8) Building a User Defined Function (Example: Machine Learning)
9) Getting Started
67KSQL- Streaming SQL for Apache Kafka
Demo: Clickstream Analysis
Kafka
Producer
Elastic
search
Grafana
Kafka
Cluster
Kafka
Connect
KSQL
Stream of
Log Events
Kafka Ecosystem
Other Components
68KSQL- Streaming SQL for Apache Kafka
Demo - Clickstream Analysis
• https://docs.confluent.io/current/ksql/docs/tutorials/clickstream-docker.html#ksql-clickstream-
docker
• Leverages Apache Kafka, Kafka Connect, KSQL, Elasticsearch and Grafana
• 5min screencast: https://www.youtube.com/watch?v=A45uRzJiv7I
• Setup in 5 minutes (with or without Docker)
SELECT STREAM
CEIL(timestamp TO HOUR) AS timeWindow, productId,
COUNT(*) AS hourlyOrders, SUM(units) AS units
FROM Orders GROUP BY CEIL(timestamp TO HOUR),
productId;
timeWindow | productId | hourlyOrders | units
------------+-----------+--------------+-------
08:00:00 | 10 | 2 | 5
08:00:00 | 20 | 1 | 8
09:00:00 | 10 | 4 | 22
09:00:00 | 40 | 1 | 45
... | ... | ... | ...
69KSQL- Streaming SQL for Apache Kafka
Live Demo – KSQL Clickstream Analysis
70KSQL- Streaming SQL for Apache Kafka
Agenda – KSQL Deep Dive
1) Apache Kafka Ecosystem
2) Kafka Streams as Foundation for KSQL
3) Motivation for KSQL
4) KSQL Concepts
5) Live Demo #1 – Intro to KSQL
6) KSQL Architecture
7) Live Demo #2 - Clickstream Analysis
8) Building a User Defined Function (Example: Machine Learning)
9) Getting Started
71KSQL- Streaming SQL for Apache Kafka
Advanced Use Case - Machine Learning UDF for Real Time Sensor Analytics
Kafka
Producer
Elastic
search
Grafana
Kafka
Cluster
Kafka
Connect
KSQL
Heath
Check
Sensor
Kafka Ecosystem
Other Components
Emergency
System
All Data
Apply
Analytic
Model
Filter
Predictions
72KSQL- Streaming SQL for Apache Kafka
KSQL and Deep Learning (Autoencoder) for IoT Sensor Analytics
https://www.confluent.io/blog/write-user-defined-function-udf-ksql/
https://github.com/kaiwaehner/ksql-machine-learning-udf
“SELECT event_id, anomaly(SENSORINPUT) FROM health_sensor;“
KSQL UDF using an analytic model under the hood
à Write once, use in any KSQL statement
73KSQL- Streaming SQL for Apache Kafka
KSQL UDF
https://www.confluent.io/blog/write-user-defined-function-udf-ksql/
https://github.com/kaiwaehner/ksql-machine-learning-udf
74KSQL- Streaming SQL for Apache Kafka
Use Case:
Anomaly Detection
(Sensor Healthcheck)
Machine Learning Algorithm:
Autoencoder built with H2O
Streaming Platform:
Apache Kafka and KSQL
Live Demo – Prebuilt Model Embedded in KSQL Function
75KSQL- Streaming SQL for Apache Kafka
Agenda – KSQL Deep Dive
1) Apache Kafka Ecosystem
2) Kafka Streams as Foundation for KSQL
3) Motivation for KSQL
4) KSQL Concepts
5) Live Demo #1 – Intro to KSQL
6) KSQL Architecture
7) Live Demo #2 - Clickstream Analysis
8) Building a User Defined Function (Example: Machine Learning)
9) Getting Started
76KSQL- Streaming SQL for Apache Kafka
KSQL Quick Start
github.com/confluentinc/ksql
Local runtime
or
Docker container
77KSQL- Streaming SQL for Apache Kafka
Resources and Next Steps
Get Involved
• Try the Quickstart on GitHub
• Check out the code
• Play with the examples
KSQL is GA… You can already use it for production deployments!
https://github.com/confluentinc/ksql
http://confluent.io/ksql
https://slackpass.io/confluentcommunity #ksql
KSQLis the
Streaming
SQL Engine
for
Apache Kafka
79KSQL- Streaming SQL for Apache Kafka
Kai Waehner
Technology Evangelist
kontakt@kai-waehner.de
@KaiWaehner
www.confluent.io
www.kai-waehner.de
LinkedIn
Questions? Feedback?
Please contact me…

Weitere ähnliche Inhalte

Was ist angesagt?

ksqlDB: Building Consciousness on Real Time Events
ksqlDB: Building Consciousness on Real Time EventsksqlDB: Building Consciousness on Real Time Events
ksqlDB: Building Consciousness on Real Time Eventsconfluent
 
A visual introduction to Apache Kafka
A visual introduction to Apache KafkaA visual introduction to Apache Kafka
A visual introduction to Apache KafkaPaul Brebner
 
KSQL: Open Source Streaming for Apache Kafka
KSQL: Open Source Streaming for Apache KafkaKSQL: Open Source Streaming for Apache Kafka
KSQL: Open Source Streaming for Apache Kafkaconfluent
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeFlink Forward
 
Deep Dive into Building Streaming Applications with Apache Pulsar
Deep Dive into Building Streaming Applications with Apache Pulsar Deep Dive into Building Streaming Applications with Apache Pulsar
Deep Dive into Building Streaming Applications with Apache Pulsar Timothy Spann
 
Kafka Connect and Streams (Concepts, Architecture, Features)
Kafka Connect and Streams (Concepts, Architecture, Features)Kafka Connect and Streams (Concepts, Architecture, Features)
Kafka Connect and Streams (Concepts, Architecture, Features)Kai Wähner
 
Kafka connect 101
Kafka connect 101Kafka connect 101
Kafka connect 101Whiteklay
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka StreamsGuozhang Wang
 
From Zero to Hero with Kafka Connect
From Zero to Hero with Kafka ConnectFrom Zero to Hero with Kafka Connect
From Zero to Hero with Kafka Connectconfluent
 
Serverless integration with Knative and Apache Camel on Kubernetes
Serverless integration with Knative and Apache Camel on KubernetesServerless integration with Knative and Apache Camel on Kubernetes
Serverless integration with Knative and Apache Camel on KubernetesClaus Ibsen
 
KSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for KafkaKSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for Kafkaconfluent
 
Kafka Tutorial - introduction to the Kafka streaming platform
Kafka Tutorial - introduction to the Kafka streaming platformKafka Tutorial - introduction to the Kafka streaming platform
Kafka Tutorial - introduction to the Kafka streaming platformJean-Paul Azar
 
Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?confluent
 
Kafka Connect & Streams - the ecosystem around Kafka
Kafka Connect & Streams - the ecosystem around KafkaKafka Connect & Streams - the ecosystem around Kafka
Kafka Connect & Streams - the ecosystem around KafkaGuido Schmutz
 
[오픈테크넷서밋2022] 국내 PaaS(Kubernetes) Best Practice 및 DevOps 환경 구축 사례.pdf
[오픈테크넷서밋2022] 국내 PaaS(Kubernetes) Best Practice 및 DevOps 환경 구축 사례.pdf[오픈테크넷서밋2022] 국내 PaaS(Kubernetes) Best Practice 및 DevOps 환경 구축 사례.pdf
[오픈테크넷서밋2022] 국내 PaaS(Kubernetes) Best Practice 및 DevOps 환경 구축 사례.pdfOpen Source Consulting
 
An Introduction to Apache Kafka
An Introduction to Apache KafkaAn Introduction to Apache Kafka
An Introduction to Apache KafkaAmir Sedighi
 

Was ist angesagt? (20)

kafka
kafkakafka
kafka
 
ksqlDB: Building Consciousness on Real Time Events
ksqlDB: Building Consciousness on Real Time EventsksqlDB: Building Consciousness on Real Time Events
ksqlDB: Building Consciousness on Real Time Events
 
A visual introduction to Apache Kafka
A visual introduction to Apache KafkaA visual introduction to Apache Kafka
A visual introduction to Apache Kafka
 
KSQL: Open Source Streaming for Apache Kafka
KSQL: Open Source Streaming for Apache KafkaKSQL: Open Source Streaming for Apache Kafka
KSQL: Open Source Streaming for Apache Kafka
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive Mode
 
Deep Dive into Building Streaming Applications with Apache Pulsar
Deep Dive into Building Streaming Applications with Apache Pulsar Deep Dive into Building Streaming Applications with Apache Pulsar
Deep Dive into Building Streaming Applications with Apache Pulsar
 
Kafka Connect and Streams (Concepts, Architecture, Features)
Kafka Connect and Streams (Concepts, Architecture, Features)Kafka Connect and Streams (Concepts, Architecture, Features)
Kafka Connect and Streams (Concepts, Architecture, Features)
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
 
Kafka 101
Kafka 101Kafka 101
Kafka 101
 
Kafka connect 101
Kafka connect 101Kafka connect 101
Kafka connect 101
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka Streams
 
From Zero to Hero with Kafka Connect
From Zero to Hero with Kafka ConnectFrom Zero to Hero with Kafka Connect
From Zero to Hero with Kafka Connect
 
Serverless integration with Knative and Apache Camel on Kubernetes
Serverless integration with Knative and Apache Camel on KubernetesServerless integration with Knative and Apache Camel on Kubernetes
Serverless integration with Knative and Apache Camel on Kubernetes
 
Kafka basics
Kafka basicsKafka basics
Kafka basics
 
KSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for KafkaKSQL: Streaming SQL for Kafka
KSQL: Streaming SQL for Kafka
 
Kafka Tutorial - introduction to the Kafka streaming platform
Kafka Tutorial - introduction to the Kafka streaming platformKafka Tutorial - introduction to the Kafka streaming platform
Kafka Tutorial - introduction to the Kafka streaming platform
 
Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?
 
Kafka Connect & Streams - the ecosystem around Kafka
Kafka Connect & Streams - the ecosystem around KafkaKafka Connect & Streams - the ecosystem around Kafka
Kafka Connect & Streams - the ecosystem around Kafka
 
[오픈테크넷서밋2022] 국내 PaaS(Kubernetes) Best Practice 및 DevOps 환경 구축 사례.pdf
[오픈테크넷서밋2022] 국내 PaaS(Kubernetes) Best Practice 및 DevOps 환경 구축 사례.pdf[오픈테크넷서밋2022] 국내 PaaS(Kubernetes) Best Practice 및 DevOps 환경 구축 사례.pdf
[오픈테크넷서밋2022] 국내 PaaS(Kubernetes) Best Practice 및 DevOps 환경 구축 사례.pdf
 
An Introduction to Apache Kafka
An Introduction to Apache KafkaAn Introduction to Apache Kafka
An Introduction to Apache Kafka
 

Ähnlich wie KSQL Deep Dive - The Open Source Streaming Engine for Apache Kafka

Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...Codemotion
 
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...Codemotion
 
KSQL – An Open Source Streaming Engine for Apache Kafka
KSQL – An Open Source Streaming Engine for Apache KafkaKSQL – An Open Source Streaming Engine for Apache Kafka
KSQL – An Open Source Streaming Engine for Apache KafkaKai Wähner
 
KSQL – The Open Source SQL Streaming Engine for Apache Kafka (Big Data Spain ...
KSQL – The Open Source SQL Streaming Engine for Apache Kafka (Big Data Spain ...KSQL – The Open Source SQL Streaming Engine for Apache Kafka (Big Data Spain ...
KSQL – The Open Source SQL Streaming Engine for Apache Kafka (Big Data Spain ...Kai Wähner
 
KSQL---Streaming SQL for Apache Kafka
KSQL---Streaming SQL for Apache KafkaKSQL---Streaming SQL for Apache Kafka
KSQL---Streaming SQL for Apache KafkaMatthias J. Sax
 
Riviera Jug - 20/03/2018 - KSQL
Riviera Jug - 20/03/2018 - KSQLRiviera Jug - 20/03/2018 - KSQL
Riviera Jug - 20/03/2018 - KSQLFlorent Ramiere
 
Un'introduzione a Kafka Streams e KSQL... and why they matter!
Un'introduzione a Kafka Streams e KSQL... and why they matter!Un'introduzione a Kafka Streams e KSQL... and why they matter!
Un'introduzione a Kafka Streams e KSQL... and why they matter!Paolo Castagna
 
Real Time Stream Processing with KSQL and Kafka
Real Time Stream Processing with KSQL and KafkaReal Time Stream Processing with KSQL and Kafka
Real Time Stream Processing with KSQL and KafkaDavid Peterson
 
Rethinking Stream Processing with Apache Kafka, Kafka Streams and KSQL
Rethinking Stream Processing with Apache Kafka, Kafka Streams and KSQLRethinking Stream Processing with Apache Kafka, Kafka Streams and KSQL
Rethinking Stream Processing with Apache Kafka, Kafka Streams and KSQLKai Wähner
 
Introduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matterIntroduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matterPaolo Castagna
 
Steps to Building a Streaming ETL Pipeline with Apache Kafka® and KSQL
Steps to Building a Streaming ETL Pipeline with Apache Kafka® and KSQLSteps to Building a Streaming ETL Pipeline with Apache Kafka® and KSQL
Steps to Building a Streaming ETL Pipeline with Apache Kafka® and KSQLconfluent
 
Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!
Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!
Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!confluent
 
KSQL - Stream Processing simplified!
KSQL - Stream Processing simplified!KSQL - Stream Processing simplified!
KSQL - Stream Processing simplified!Guido Schmutz
 
Chicago Kafka Meetup
Chicago Kafka MeetupChicago Kafka Meetup
Chicago Kafka MeetupCliff Gilmore
 
KSQL- Streaming Sql for Kafka
KSQL- Streaming Sql for KafkaKSQL- Streaming Sql for Kafka
KSQL- Streaming Sql for KafkaKnoldus Inc.
 
Streaming ETL with Apache Kafka and KSQL
Streaming ETL with Apache Kafka and KSQLStreaming ETL with Apache Kafka and KSQL
Streaming ETL with Apache Kafka and KSQLNick Dearden
 
Streaming Microservices With Akka Streams And Kafka Streams
Streaming Microservices With Akka Streams And Kafka StreamsStreaming Microservices With Akka Streams And Kafka Streams
Streaming Microservices With Akka Streams And Kafka StreamsLightbend
 
BBL KAPPA Lesfurets.com
BBL KAPPA Lesfurets.comBBL KAPPA Lesfurets.com
BBL KAPPA Lesfurets.comCedric Vidal
 
Sparkstreaming with kafka and h base at scale (1)
Sparkstreaming with kafka and h base at scale (1)Sparkstreaming with kafka and h base at scale (1)
Sparkstreaming with kafka and h base at scale (1)Sigmoid
 

Ähnlich wie KSQL Deep Dive - The Open Source Streaming Engine for Apache Kafka (20)

Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
 
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
 
KSQL – An Open Source Streaming Engine for Apache Kafka
KSQL – An Open Source Streaming Engine for Apache KafkaKSQL – An Open Source Streaming Engine for Apache Kafka
KSQL – An Open Source Streaming Engine for Apache Kafka
 
KSQL – The Open Source SQL Streaming Engine for Apache Kafka (Big Data Spain ...
KSQL – The Open Source SQL Streaming Engine for Apache Kafka (Big Data Spain ...KSQL – The Open Source SQL Streaming Engine for Apache Kafka (Big Data Spain ...
KSQL – The Open Source SQL Streaming Engine for Apache Kafka (Big Data Spain ...
 
KSQL---Streaming SQL for Apache Kafka
KSQL---Streaming SQL for Apache KafkaKSQL---Streaming SQL for Apache Kafka
KSQL---Streaming SQL for Apache Kafka
 
Riviera Jug - 20/03/2018 - KSQL
Riviera Jug - 20/03/2018 - KSQLRiviera Jug - 20/03/2018 - KSQL
Riviera Jug - 20/03/2018 - KSQL
 
Un'introduzione a Kafka Streams e KSQL... and why they matter!
Un'introduzione a Kafka Streams e KSQL... and why they matter!Un'introduzione a Kafka Streams e KSQL... and why they matter!
Un'introduzione a Kafka Streams e KSQL... and why they matter!
 
Real Time Stream Processing with KSQL and Kafka
Real Time Stream Processing with KSQL and KafkaReal Time Stream Processing with KSQL and Kafka
Real Time Stream Processing with KSQL and Kafka
 
Rethinking Stream Processing with Apache Kafka, Kafka Streams and KSQL
Rethinking Stream Processing with Apache Kafka, Kafka Streams and KSQLRethinking Stream Processing with Apache Kafka, Kafka Streams and KSQL
Rethinking Stream Processing with Apache Kafka, Kafka Streams and KSQL
 
Introduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matterIntroduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matter
 
KSQL Intro
KSQL IntroKSQL Intro
KSQL Intro
 
Steps to Building a Streaming ETL Pipeline with Apache Kafka® and KSQL
Steps to Building a Streaming ETL Pipeline with Apache Kafka® and KSQLSteps to Building a Streaming ETL Pipeline with Apache Kafka® and KSQL
Steps to Building a Streaming ETL Pipeline with Apache Kafka® and KSQL
 
Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!
Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!
Apache Kafka and KSQL in Action: Let's Build a Streaming Data Pipeline!
 
KSQL - Stream Processing simplified!
KSQL - Stream Processing simplified!KSQL - Stream Processing simplified!
KSQL - Stream Processing simplified!
 
Chicago Kafka Meetup
Chicago Kafka MeetupChicago Kafka Meetup
Chicago Kafka Meetup
 
KSQL- Streaming Sql for Kafka
KSQL- Streaming Sql for KafkaKSQL- Streaming Sql for Kafka
KSQL- Streaming Sql for Kafka
 
Streaming ETL with Apache Kafka and KSQL
Streaming ETL with Apache Kafka and KSQLStreaming ETL with Apache Kafka and KSQL
Streaming ETL with Apache Kafka and KSQL
 
Streaming Microservices With Akka Streams And Kafka Streams
Streaming Microservices With Akka Streams And Kafka StreamsStreaming Microservices With Akka Streams And Kafka Streams
Streaming Microservices With Akka Streams And Kafka Streams
 
BBL KAPPA Lesfurets.com
BBL KAPPA Lesfurets.comBBL KAPPA Lesfurets.com
BBL KAPPA Lesfurets.com
 
Sparkstreaming with kafka and h base at scale (1)
Sparkstreaming with kafka and h base at scale (1)Sparkstreaming with kafka and h base at scale (1)
Sparkstreaming with kafka and h base at scale (1)
 

Mehr von Kai Wähner

Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)
Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)
Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)Kai Wähner
 
When NOT to use Apache Kafka?
When NOT to use Apache Kafka?When NOT to use Apache Kafka?
When NOT to use Apache Kafka?Kai Wähner
 
Kafka for Live Commerce to Transform the Retail and Shopping Metaverse
Kafka for Live Commerce to Transform the Retail and Shopping MetaverseKafka for Live Commerce to Transform the Retail and Shopping Metaverse
Kafka for Live Commerce to Transform the Retail and Shopping MetaverseKai Wähner
 
The Heart of the Data Mesh Beats in Real-Time with Apache Kafka
The Heart of the Data Mesh Beats in Real-Time with Apache KafkaThe Heart of the Data Mesh Beats in Real-Time with Apache Kafka
The Heart of the Data Mesh Beats in Real-Time with Apache KafkaKai Wähner
 
Apache Kafka vs. Cloud-native iPaaS Integration Platform Middleware
Apache Kafka vs. Cloud-native iPaaS Integration Platform MiddlewareApache Kafka vs. Cloud-native iPaaS Integration Platform Middleware
Apache Kafka vs. Cloud-native iPaaS Integration Platform MiddlewareKai Wähner
 
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?Kai Wähner
 
Serverless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse ArchitectureServerless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse ArchitectureKai Wähner
 
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...Kai Wähner
 
Data Streaming with Apache Kafka in the Defence and Cybersecurity Industry
Data Streaming with Apache Kafka in the Defence and Cybersecurity IndustryData Streaming with Apache Kafka in the Defence and Cybersecurity Industry
Data Streaming with Apache Kafka in the Defence and Cybersecurity IndustryKai Wähner
 
Apache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare IndustryApache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare IndustryKai Wähner
 
Apache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare IndustryApache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare IndustryKai Wähner
 
Apache Kafka for Real-time Supply Chain in the Food and Retail Industry
Apache Kafka for Real-time Supply Chainin the Food and Retail IndustryApache Kafka for Real-time Supply Chainin the Food and Retail Industry
Apache Kafka for Real-time Supply Chain in the Food and Retail IndustryKai Wähner
 
Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0
Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0
Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0Kai Wähner
 
Apache Kafka Landscape for Automotive and Manufacturing
Apache Kafka Landscape for Automotive and ManufacturingApache Kafka Landscape for Automotive and Manufacturing
Apache Kafka Landscape for Automotive and ManufacturingKai Wähner
 
Kappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology ComparisonKappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology ComparisonKai Wähner
 
The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022Kai Wähner
 
Event Streaming CTO Roundtable for Cloud-native Kafka Architectures
Event Streaming CTO Roundtable for Cloud-native Kafka ArchitecturesEvent Streaming CTO Roundtable for Cloud-native Kafka Architectures
Event Streaming CTO Roundtable for Cloud-native Kafka ArchitecturesKai Wähner
 
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...Kai Wähner
 
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...Kai Wähner
 
Apache Kafka in the Transportation and Logistics
Apache Kafka in the Transportation and LogisticsApache Kafka in the Transportation and Logistics
Apache Kafka in the Transportation and LogisticsKai Wähner
 

Mehr von Kai Wähner (20)

Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)
Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)
Apache Kafka as Data Hub for Crypto, NFT, Metaverse (Beyond the Buzz!)
 
When NOT to use Apache Kafka?
When NOT to use Apache Kafka?When NOT to use Apache Kafka?
When NOT to use Apache Kafka?
 
Kafka for Live Commerce to Transform the Retail and Shopping Metaverse
Kafka for Live Commerce to Transform the Retail and Shopping MetaverseKafka for Live Commerce to Transform the Retail and Shopping Metaverse
Kafka for Live Commerce to Transform the Retail and Shopping Metaverse
 
The Heart of the Data Mesh Beats in Real-Time with Apache Kafka
The Heart of the Data Mesh Beats in Real-Time with Apache KafkaThe Heart of the Data Mesh Beats in Real-Time with Apache Kafka
The Heart of the Data Mesh Beats in Real-Time with Apache Kafka
 
Apache Kafka vs. Cloud-native iPaaS Integration Platform Middleware
Apache Kafka vs. Cloud-native iPaaS Integration Platform MiddlewareApache Kafka vs. Cloud-native iPaaS Integration Platform Middleware
Apache Kafka vs. Cloud-native iPaaS Integration Platform Middleware
 
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
 
Serverless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse ArchitectureServerless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
 
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...
Resilient Real-time Data Streaming across the Edge and Hybrid Cloud with Apac...
 
Data Streaming with Apache Kafka in the Defence and Cybersecurity Industry
Data Streaming with Apache Kafka in the Defence and Cybersecurity IndustryData Streaming with Apache Kafka in the Defence and Cybersecurity Industry
Data Streaming with Apache Kafka in the Defence and Cybersecurity Industry
 
Apache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare IndustryApache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare Industry
 
Apache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare IndustryApache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare Industry
 
Apache Kafka for Real-time Supply Chain in the Food and Retail Industry
Apache Kafka for Real-time Supply Chainin the Food and Retail IndustryApache Kafka for Real-time Supply Chainin the Food and Retail Industry
Apache Kafka for Real-time Supply Chain in the Food and Retail Industry
 
Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0
Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0
Apache Kafka for Predictive Maintenance in Industrial IoT / Industry 4.0
 
Apache Kafka Landscape for Automotive and Manufacturing
Apache Kafka Landscape for Automotive and ManufacturingApache Kafka Landscape for Automotive and Manufacturing
Apache Kafka Landscape for Automotive and Manufacturing
 
Kappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology ComparisonKappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology Comparison
 
The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022
 
Event Streaming CTO Roundtable for Cloud-native Kafka Architectures
Event Streaming CTO Roundtable for Cloud-native Kafka ArchitecturesEvent Streaming CTO Roundtable for Cloud-native Kafka Architectures
Event Streaming CTO Roundtable for Cloud-native Kafka Architectures
 
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...
 
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...
Telco 4.0 - Payment and FinServ Integration for Data in Motion with 5G and Ap...
 
Apache Kafka in the Transportation and Logistics
Apache Kafka in the Transportation and LogisticsApache Kafka in the Transportation and Logistics
Apache Kafka in the Transportation and Logistics
 

Kürzlich hochgeladen

%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfonteinmasabamasaba
 
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdfThe Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdfayushiqss
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfonteinmasabamasaba
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
10 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 202410 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 2024Mind IT Systems
 
%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Harare%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Hararemasabamasaba
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdfPearlKirahMaeRagusta1
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfproinshot.com
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park masabamasaba
 
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park %in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park masabamasaba
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
%+27788225528 love spells in Vancouver Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Vancouver Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Vancouver Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Vancouver Psychic Readings, Attraction spells,Br...masabamasaba
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...Shane Coughlan
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
Chinsurah Escorts ☎️8617697112 Starting From 5K to 15K High Profile Escorts ...
Chinsurah Escorts ☎️8617697112  Starting From 5K to 15K High Profile Escorts ...Chinsurah Escorts ☎️8617697112  Starting From 5K to 15K High Profile Escorts ...
Chinsurah Escorts ☎️8617697112 Starting From 5K to 15K High Profile Escorts ...Nitya salvi
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrandmasabamasaba
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareJim McKeeth
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesVictorSzoltysek
 

Kürzlich hochgeladen (20)

%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
 
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdfThe Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
10 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 202410 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 2024
 
%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Harare%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Harare
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park %in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
%+27788225528 love spells in Vancouver Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Vancouver Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Vancouver Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Vancouver Psychic Readings, Attraction spells,Br...
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
Chinsurah Escorts ☎️8617697112 Starting From 5K to 15K High Profile Escorts ...
Chinsurah Escorts ☎️8617697112  Starting From 5K to 15K High Profile Escorts ...Chinsurah Escorts ☎️8617697112  Starting From 5K to 15K High Profile Escorts ...
Chinsurah Escorts ☎️8617697112 Starting From 5K to 15K High Profile Escorts ...
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK Software
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 

KSQL Deep Dive - The Open Source Streaming Engine for Apache Kafka

  • 1. 1Confidential KSQL The Open Source Streaming SQL Engine for Apache Kafka Kai Waehner Technology Evangelist kontakt@kai-waehner.de LinkedIn @KaiWaehner www.confluent.io www.kai-waehner.de
  • 3. 3Confidential 1.0 Enterprise Ready J A Brief History of Apache Kafka and Confluent 0.11 Exactly-once semantics 0.10 Data processing (Streams API) 0.9 Data integration (Connect API) Intra-cluster replication 0.8 2012 2014 Cluster mirroring0.7 2015 2016 20172013 2018 CP 4.1 KSQL GA
  • 4. 4Confidential Agenda – KSQL Deep Dive 1) Apache Kafka Ecosystem 2) Kafka Streams as Foundation for KSQL 3) Motivation for KSQL 4) KSQL Concepts 5) Live Demo #1 – Intro to KSQL 6) KSQL Architecture 7) Live Demo #2 - Clickstream Analysis 8) Building a User Defined Function (Example: Machine Learning) 9) Getting Started
  • 5. 5KSQL- Streaming SQL for Apache Kafka Agenda – KSQL Deep Dive 1) Apache Kafka Ecosystem 2) Kafka Streams as Foundation for KSQL 3) Motivation for KSQL 4) KSQL Concepts 5) Live Demo #1 – Intro to KSQL 6) KSQL Architecture 7) Live Demo #2 - Clickstream Analysis 8) Building a User Defined Function (Example: Machine Learning) 9) Getting Started
  • 6. 6KSQL- Streaming SQL for Apache Kafka Apache Kafka - A Distributed, Scalable Commit Log
  • 7. 7KSQL- Streaming SQL for Apache Kafka Anatomy of a Topic
  • 8. 8KSQL- Streaming SQL for Apache Kafka Apache Kafka at Large Scale à No need to do a POC https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/63921 https://qconlondon.com/london2018/presentation/cloud-native-and-scalable-kafka-architecture (2018) (2018)
  • 9. 9KSQL- Streaming SQL for Apache Kafka The Log ConnectorsConnectors Producer Consumer Streaming Engine Apache Kafka – The Rise of a Streaming Platform
  • 10. 10KSQL- Streaming SQL for Apache Kafka Apache Kafka – The Rise of a Streaming Platform
  • 11. 11KSQL- Streaming SQL for Apache Kafka KSQL – The Streaming SQL Engine for Apache Kafka
  • 12. 12KSQL- Streaming SQL for Apache Kafka Agenda – KSQL Deep Dive 1) Apache Kafka Ecosystem 2) Kafka Streams as Foundation for KSQL 3) Motivation for KSQL 4) KSQL Concepts 5) Live Demo #1 – Intro to KSQL 6) KSQL Architecture 7) Live Demo #2 - Clickstream Analysis 8) Building a User Defined Function (Example: Machine Learning) 9) Getting Started
  • 13. 13KSQL- Streaming SQL for Apache Kafka Kafka Streams - Part of Apache Kafka
  • 14. 14KSQL- Streaming SQL for Apache Kafka Stream Processing Data at Rest Data in Motion
  • 15. 15KSQL- Streaming SQL for Apache Kafka Key concepts
  • 16. 16KSQL- Streaming SQL for Apache Kafka Independent Dev / Test / Prod of independent Apps
  • 17. 17KSQL- Streaming SQL for Apache Kafka No Matter Where it Runs
  • 18. 18KSQL- Streaming SQL for Apache Kafka Kafka Streams - Processor Topology Read input from Kafka Operator DAG: • Filter / map / aggregation / joins • Operators can be stateful Write result back to Kafka
  • 19. 19KSQL- Streaming SQL for Apache Kafka Kafka Streams - Processor Topology
  • 20. 20KSQL- Streaming SQL for Apache Kafka Kafka Streams - Runtime
  • 21. 21KSQL- Streaming SQL for Apache Kafka Kafka Streams - Distributed State
  • 22. 22KSQL- Streaming SQL for Apache Kafka Kafka Streams - Scaling
  • 23. 23KSQL- Streaming SQL for Apache Kafka Pluggable State Store. Default Strategy: - In-memory (fast access) - Local disc (for fast recovery) - Replicated to Kafka (for resilience) https://www.infoq.com/presentations/kafka-streams-spring-cloud State Management
  • 24. 24KSQL- Streaming SQL for Apache Kafka Kafka Streams - Streams and Tables
  • 25. 25KSQL- Streaming SQL for Apache Kafka Kafka Streams - Streams and Tables
  • 26. 26KSQL- Streaming SQL for Apache Kafka // Example: reading data from Kafka KStream<byte[], String> textLines = builder.stream("textlines-topic", Consumed.with( Serdes.ByteArray(), Serdes.String())); // Example: transforming data KStream<byte[], String> upperCasedLines= rawRatings.mapValues(String::toUpperCase)); KStream
  • 27. 27KSQL- Streaming SQL for Apache Kafka // Example: aggregating data KTable<String, Long> wordCounts = textLines .flatMapValues(textLine -> Arrays.asList(textLine.toLowerCase().split("W+"))) .groupBy((key, word) -> word) .count(); KTable
  • 28. 28KSQL- Streaming SQL for Apache Kafka Kafka Streams A complete streaming microservices, ready for production at large-scale App configuration Define processing (here: WordCount) Start processing
  • 29. 29KSQL- Streaming SQL for Apache Kafka Agenda – KSQL Deep Dive 1) Apache Kafka Ecosystem 2) Kafka Streams as Foundation for KSQL 3) Motivation for KSQL 4) KSQL Concepts 5) Live Demo #1 – Intro to KSQL 6) KSQL Architecture 7) Live Demo #2 - Clickstream Analysis 8) Building a User Defined Function (Example: Machine Learning) 9) Getting Started
  • 30. 30KSQL- Streaming SQL for Apache Kafka Why KSQL? Population CodingSophistication Realm of Stream Processing New, Expanded Realm BI Analysts Core Developers Data Engineers Core Developers who don’t like Java Kafka Streams KSQL
  • 31. 31KSQL- Streaming SQL for Apache Kafka Trade-Offs • subscribe() • poll() • send() • flush() • mapValues() • filter() • punctuate() • Select…from… • Join…where… • Group by.. Flexibility Simplicity Kafka Streams KSQL Consumer Producer
  • 32. 32KSQL- Streaming SQL for Apache Kafka What is it for ? Streaming ETL • Kafka is popular for data pipelines • KSQL enables easy transformations of data within the pipe CREATE STREAM vip_actions AS SELECT userid, page, action FROM clickstream c LEFT JOIN users u ON c.userid = u.user_id WHERE u.level = 'Platinum';
  • 33. 33KSQL- Streaming SQL for Apache Kafka What is it for ? Analytics, e.g. Anomaly Detection • Identifying patterns or anomalies in real-time data, surfaced in milliseconds CREATE TABLE possible_fraud AS SELECT card_number, count(*) FROM authorization_attempts WINDOW TUMBLING (SIZE 5 MINUTES) GROUP BY card_number HAVING count(*) > 3;
  • 34. 34KSQL- Streaming SQL for Apache Kafka What is it for ? Real Time Monitoring • Log data monitoring, tracking and alerting • Sensor / IoT data CREATE TABLE error_counts AS SELECT error_code, count(*) FROM monitoring_stream WINDOW TUMBLING (SIZE 1 MINUTE) WHERE type = 'ERROR' GROUP BY error_code;
  • 35. 35KSQL- Streaming SQL for Apache Kafka What is it for ? Simple Derivations of Existing Topics • One-liner to re-partition and / or re-key a topic for new uses CREATE STREAM views_by_userid WITH (PARTITIONS=6, VALUE_FORMAT=‘JSON’, TIMESTAMP=‘view_time’) AS SELECT * FROM clickstream PARTITION BY user_id;
  • 36. 36KSQL- Streaming SQL for Apache Kafka What is it for ? Easily Convert between Data Formats in Real Time • JSON-to-Avro conversion CREATE STREAM sensor_events_json (sensor_id VARCHAR, temperature INTEGER, ...) WITH (KAFKA_TOPIC='events-topic', VALUE_FORMAT='JSON'); CREATE STREAM sensor_events_avro WITH (VALUE_FORMAT='AVRO') AS SELECT * FROM sensor_events_json;
  • 37. 37KSQL- Streaming SQL for Apache Kafka Where is KSQL not such a great fit (at least not yet)? Powerful ad-hoc query ○ Limited span of time usually retained in Kafka ○ No indexes BI reports (Tableau etc.) ○ No indexes ○ No JDBC (most Bi tools are not good with continuous results!) Keep in mind: Kafka is a streaming platform!
  • 38. 38KSQL- Streaming SQL for Apache Kafka Agenda – KSQL Deep Dive 1) Apache Kafka Ecosystem 2) Kafka Streams as Foundation for KSQL 3) Motivation for KSQL 4) KSQL Concepts 5) Live Demo #1 – Intro to KSQL 6) KSQL Architecture 7) Live Demo #2 - Clickstream Analysis 8) Building a User Defined Function (Example: Machine Learning) 9) Getting Started
  • 39. 39KSQL- Streaming SQL for Apache Kafka KSQL – A Streaming SQL Engine for Apache Kafka
  • 40. 40KSQL- Streaming SQL for Apache Kafka KSQL is Equally viable for S / M / L / XL / XXL use cases Ok. Ok. Ok. … and KSQL is ready for production, including 24/7 support!
  • 41. 41KSQL- Streaming SQL for Apache Kafka KSQL is Equally viable for S / M / L / XL / XXL use cases
  • 42. 42KSQL- Streaming SQL for Apache Kafka How do you deploy applications?
  • 43. 43KSQL- Streaming SQL for Apache Kafka Where to develop and operate your applications?
  • 44. 44KSQL- Streaming SQL for Apache Kafka KSQL Concepts ● No need for source code • Zero, none at all, not even one line. • No SerDes, no generics, no lambdas, ... ● All the Kafka Streams “magic” out-of-the-box • Exactly Once Semantics • Windowing • Event-time aggregation • Late-arriving data • Distributed, fault-tolerant, scalable, ...
  • 45. 45KSQL- Streaming SQL for Apache Kafka STREAM and TABLE as first-class citizens
  • 46. 46KSQL- Streaming SQL for Apache Kafka SELECT statement syntax SELECT `select_expr` [, ...] FROM `from_item` [, ...] [ WINDOW `window_expression` ] [ WHERE `condition` ] [ GROUP BY `grouping expression` ] [ HAVING `having_expression` ] [ LIMIT n ] where from_item is one of the following: stream_or_table_name [ [ AS ] alias] from_item LEFT JOIN from_item ON join_condition
  • 47. 47KSQL- Streaming SQL for Apache Kafka CREATE STREAM AS syntax CREATE STREAM `stream_name` [WITH (`property = expression` [, …] ) ] AS SELECT `select_expr` [, ...] FROM `from_item` [, ...] [ WHERE `condition` ] [ PARTITION BY `column_name` ] ● where property can be any of the following: KAFKA_TOPIC = name - what to call the sink topic FORMAT = DELIMITED | JSON | AVRO - defaults to format of input stream AVROSCHEMAFILE = path/to/file - if FORMAT=AVRO, where the output schema file will be written to PARTITIONS = # - number of partitions in sink topic TIMESTAMP = column - The name of the column to use as the timestamp. This can be used to define the event time.
  • 48. 48KSQL- Streaming SQL for Apache Kafka CREATE TABLE AS syntax CREATE TABLE `stream_name` [WITH ( `property_name = expression` [, ...] )] AS SELECT `select_expr` [, ...] FROM `from_item` [, ...] [ WINDOW `window_expression` ] [ WHERE `condition` ] [ GROUP BY `grouping expression` ] [ HAVING `having_expression` ] ● where property values are same as for ‚Create Streams as Select‘
  • 49. 49KSQL- Streaming SQL for Apache Kafka Automatic Inference of Topic Schema (leveraging Confluent Schema Registry) https://www.matthowlett.com/2017-12-23-exploring-wikipedia-ksql.html
  • 50. 50KSQL- Streaming SQL for Apache Kafka WINDOWing ● Not ANSI SQL ! à Continuous Queries ● Three types supported (same as Kafka Streams): • TUMBLING (repeats at a non-overlapping interval) • SELECT appname, ip, COUNT(appname) AS problem_count FROM logstream WINDOW TUMBLING (size 1 minute) WHERE loglevel='ERROR' GROUP BY appname, ip; • HOPPING (similar to tumbling, but hopping generally has an overlapping interval) • SELECT itemid, SUM(arraycol[0]) FROM orders WINDOW HOPPING ( size 20 second, advance by 5 second) GROUP BY itemid; • SESSION (groups elements by sessions of activity, do not overlap and do not have a fixed start and end time, closes when it does not receive elements for a certain period of time, i.e., when a gap of inactivity occurred) • SELECT itemid, SUM(sales_price) FROM orders WINDOW SESSION (20 second) GROUP BY itemid;
  • 51. 51KSQL- Streaming SQL for Apache Kafka Agenda – KSQL Deep Dive 1) Apache Kafka Ecosystem 2) Kafka Streams as Foundation for KSQL 3) Motivation for KSQL 4) KSQL Concepts 5) Live Demo #1 – Intro to KSQL 6) KSQL Architecture 7) Live Demo #2 - Clickstream Analysis 8) Building a User Defined Function (Example: Machine Learning) 9) Getting Started
  • 52. 52KSQL- Streaming SQL for Apache Kafka KSQL CLI ksql> CREATE STREAM pageviews_original (viewtime bigint, userid varchar, pageid varchar) WITH (kafka_topic='pageviews', value_format='DELIMITED'); ksql> CREATE TABLE users_original (registertime bigint, gender varchar, regionid varchar, userid varchar) WITH (kafka_topic='users', value_format='JSON'); ksql> SELECT pageid FROM pageviews_original LIMIT 10; ksql> CREATE STREAM pageviews_female AS SELECT users_original.userid AS userid, pageid, regionid, gender FROM pageviews_original LEFT JOIN users_original ON pageviews_original.userid = users_original.userid WHERE gender = 'FEMALE'; ksql/bin/ksql-server-start ksql/bin/ksql
  • 53. 53KSQL- Streaming SQL for Apache Kafka KSQL Web UI
  • 54. 54KSQL- Streaming SQL for Apache Kafka Live Demo – KSQL Getting Started
  • 55. 55KSQL- Streaming SQL for Apache Kafka Break
  • 56. 56KSQL- Streaming SQL for Apache Kafka Agenda – KSQL Deep Dive 1) Apache Kafka Ecosystem 2) Kafka Streams as Foundation for KSQL 3) Motivation for KSQL 4) KSQL Concepts 5) Live Demo #1 – Intro to KSQL 6) KSQL Architecture 7) Live Demo #2 - Clickstream Analysis 8) Building a User Defined Function (Example: Machine Learning) 9) Getting Started
  • 57. 57KSQL- Streaming SQL for Apache Kafka KSQL - Components KSQL has 3 main components: 1. The Engine which actually runs the Kafka Streams topologies 2. The REST server interface enables an Engine to receive instructions from the CLI or any other client 3. The CLI, designed to be familiar to users of MySQL, Postgres etc. (Note that you also need a Kafka Cluster… KSQL is deployed independently)
  • 58. 58KSQL- Streaming SQL for Apache Kafka How to run KSQL è #1 Client – Server (Interactive Mode) JVM KSQL Server KSQL CLI / any REST Client JVM KSQL Server JVM KSQL Server Kafka Cluster
  • 59. 59KSQL- Streaming SQL for Apache Kafka How to run KSQL è #1 Client – Server (Interactive Mode) • Start any number of server nodes bin/ksql-server-start • Start one or more CLIs or REST Clients and point them to a server bin/ksql https://myksqlserver:8090 • All servers share the processing load Technically, instances of the same Kafka Streams Applications scale up / down without restart
  • 60. 60KSQL- Streaming SQL for Apache Kafka How to run KSQL è #2 as Standalone Application (Headless Mode) JVM KSQL Server JVM KSQL Server JVM KSQL Server Kafka Cluster
  • 61. 61KSQL- Streaming SQL for Apache Kafka How to run KSQL è #2 as Standalone Application (Headless Mode) • Start any number of server nodes Pass a file of KSQL statement to execute bin/ksql-node query-file=foo/bar.sql • Ideal for streaming ETL application deployment Version-control your queries and transformations as code • All running engines share the processing load Technically, instances of the same Kafka Streams Applications scale up / down without restart
  • 62. 62KSQL- Streaming SQL for Apache Kafka How to run KSQL è #3 Embedded in an Application (JVM Mode) JVM App Instance KSQL Engine Application Code JVM App Instance KSQL Engine Application Code JVM App Instance KSQL Engine Application Code Kafka Cluster
  • 63. 63KSQL- Streaming SQL for Apache Kafka How to run KSQL è #3 Embedded in an Application (JVM Mode) • Embed directly in your Java application • Generate and execute KSQL queries through the Java API Version-control your queries and transformations as code • All running application instances share the processing load Technically, instances of the same Kafka Streams Applications scale up / down without restart
  • 64. 64KSQL- Streaming SQL for Apache Kafka Dedicating resources Join Engines to the same ‘service pool’ by means of the ksql.service.id property
  • 65. 65KSQL- Streaming SQL for Apache Kafka Sizing of KSQL Servers 3 Categories of KSQL Queries • Project / Filter, e.g. SELECT <columns> FROM <table/stream> WHERE <condition> • Joins, e.g. Stream-Table Joins • Aggregations, e.g. SUM, COUNT, TOPK, TOPKDISTINCT Sizing Questions • How much Memory, CPU, Disk, Network? • When and how to scale up / down? • How to tune performance? • Etc… Guide for Capacity Planning https://docs.confluent.io/current/ksql/docs/ capacity-planning.html https://docs.confluent.io/current/streams/ sizing.html#streams-sizing
  • 66. 66KSQL- Streaming SQL for Apache Kafka Agenda – KSQL Deep Dive 1) Apache Kafka Ecosystem 2) Kafka Streams as Foundation for KSQL 3) Motivation for KSQL 4) KSQL Concepts 5) Live Demo #1 – Intro to KSQL 6) KSQL Architecture 7) Live Demo #2 - Clickstream Analysis 8) Building a User Defined Function (Example: Machine Learning) 9) Getting Started
  • 67. 67KSQL- Streaming SQL for Apache Kafka Demo: Clickstream Analysis Kafka Producer Elastic search Grafana Kafka Cluster Kafka Connect KSQL Stream of Log Events Kafka Ecosystem Other Components
  • 68. 68KSQL- Streaming SQL for Apache Kafka Demo - Clickstream Analysis • https://docs.confluent.io/current/ksql/docs/tutorials/clickstream-docker.html#ksql-clickstream- docker • Leverages Apache Kafka, Kafka Connect, KSQL, Elasticsearch and Grafana • 5min screencast: https://www.youtube.com/watch?v=A45uRzJiv7I • Setup in 5 minutes (with or without Docker) SELECT STREAM CEIL(timestamp TO HOUR) AS timeWindow, productId, COUNT(*) AS hourlyOrders, SUM(units) AS units FROM Orders GROUP BY CEIL(timestamp TO HOUR), productId; timeWindow | productId | hourlyOrders | units ------------+-----------+--------------+------- 08:00:00 | 10 | 2 | 5 08:00:00 | 20 | 1 | 8 09:00:00 | 10 | 4 | 22 09:00:00 | 40 | 1 | 45 ... | ... | ... | ...
  • 69. 69KSQL- Streaming SQL for Apache Kafka Live Demo – KSQL Clickstream Analysis
  • 70. 70KSQL- Streaming SQL for Apache Kafka Agenda – KSQL Deep Dive 1) Apache Kafka Ecosystem 2) Kafka Streams as Foundation for KSQL 3) Motivation for KSQL 4) KSQL Concepts 5) Live Demo #1 – Intro to KSQL 6) KSQL Architecture 7) Live Demo #2 - Clickstream Analysis 8) Building a User Defined Function (Example: Machine Learning) 9) Getting Started
  • 71. 71KSQL- Streaming SQL for Apache Kafka Advanced Use Case - Machine Learning UDF for Real Time Sensor Analytics Kafka Producer Elastic search Grafana Kafka Cluster Kafka Connect KSQL Heath Check Sensor Kafka Ecosystem Other Components Emergency System All Data Apply Analytic Model Filter Predictions
  • 72. 72KSQL- Streaming SQL for Apache Kafka KSQL and Deep Learning (Autoencoder) for IoT Sensor Analytics https://www.confluent.io/blog/write-user-defined-function-udf-ksql/ https://github.com/kaiwaehner/ksql-machine-learning-udf “SELECT event_id, anomaly(SENSORINPUT) FROM health_sensor;“ KSQL UDF using an analytic model under the hood à Write once, use in any KSQL statement
  • 73. 73KSQL- Streaming SQL for Apache Kafka KSQL UDF https://www.confluent.io/blog/write-user-defined-function-udf-ksql/ https://github.com/kaiwaehner/ksql-machine-learning-udf
  • 74. 74KSQL- Streaming SQL for Apache Kafka Use Case: Anomaly Detection (Sensor Healthcheck) Machine Learning Algorithm: Autoencoder built with H2O Streaming Platform: Apache Kafka and KSQL Live Demo – Prebuilt Model Embedded in KSQL Function
  • 75. 75KSQL- Streaming SQL for Apache Kafka Agenda – KSQL Deep Dive 1) Apache Kafka Ecosystem 2) Kafka Streams as Foundation for KSQL 3) Motivation for KSQL 4) KSQL Concepts 5) Live Demo #1 – Intro to KSQL 6) KSQL Architecture 7) Live Demo #2 - Clickstream Analysis 8) Building a User Defined Function (Example: Machine Learning) 9) Getting Started
  • 76. 76KSQL- Streaming SQL for Apache Kafka KSQL Quick Start github.com/confluentinc/ksql Local runtime or Docker container
  • 77. 77KSQL- Streaming SQL for Apache Kafka Resources and Next Steps Get Involved • Try the Quickstart on GitHub • Check out the code • Play with the examples KSQL is GA… You can already use it for production deployments! https://github.com/confluentinc/ksql http://confluent.io/ksql https://slackpass.io/confluentcommunity #ksql
  • 79. 79KSQL- Streaming SQL for Apache Kafka Kai Waehner Technology Evangelist kontakt@kai-waehner.de @KaiWaehner www.confluent.io www.kai-waehner.de LinkedIn Questions? Feedback? Please contact me…