Diese Präsentation wurde erfolgreich gemeldet.
Wir verwenden Ihre LinkedIn Profilangaben und Informationen zu Ihren Aktivitäten, um Anzeigen zu personalisieren und Ihnen relevantere Inhalte anzuzeigen. Sie können Ihre Anzeigeneinstellungen jederzeit ändern.

Introduction to KSQL: Streaming SQL for Apache Kafka®

265 Aufrufe

Veröffentlicht am

Join Tom Green, Solution Engineer at Confluent for this Lunch and Learn talk covering KSQL. Confluent KSQL is the streaming SQL engine that enables real-time data processing against Apache Kafka®. It provides an easy-to-use, yet powerful interactive SQL interface for stream processing on Kafka, without the need to write code in a programming language such as Java or Python. KSQL is scalable, elastic, fault-tolerant, and it supports a wide range of streaming operations, including data filtering, transformations, aggregations, joins, windowing, and sessionization.

By attending one of these sessions, you will learn:
-How to query streams, using SQL, without writing code.
-How KSQL provides automated scalability and out-of-the-box high availability for streaming queries
-How KSQL can be used to join streams of data from different sources
-The differences between Streams and Tables in Apache Kafka

Veröffentlicht in: Technologie
  • Als Erste(r) kommentieren

Introduction to KSQL: Streaming SQL for Apache Kafka®

  1. 1. 1C O N F I D E N T I A L KSQL The streaming SQL engine for Apache Kafka® tom.green@confluent.io
  2. 2. Confluent Partner Briefing Agenda & Housekeeping ● Introduction to KSQL Presentation ● KSQL Demonstration ● Q & A - please ask your questions via that chat box Follow up materials ● The slides and recording will be emailed
  3. 3. Confluent Partner Briefing Process streams of data in real time, as they occur. 11010 1 01011 1 00110 1 10001 0 3 Apache Kafka is a Distributed Streaming Platform Publish and subscribe to streams of data similar to a message queue or enterprise messaging system. 11010 1 01011 1 00110 1 10001 0 Store streams of data in a fault tolerant way. 11010 1 01011 1 00110 1 10001 0
  4. 4. 5C O N F I D E N T I A L KSQL The streaming SQL engine for Apache Kafka® to write real-time applications in SQL
  5. 5. 7C O N F I D E N T I A L Lower the bar to enter the world of streaming User Population CodingSophistication Core developers who use Java/Scala Core developers who don’t use Java/Scala Data engineers, architects, DevOps/SRE BI analysts streams
  6. 6. 8C O N F I D E N T I A L KSQL CREATE STREAM fraudulent_payments AS SELECT * FROM payments WHERE fraudProbability > 0.8; streams Lowering the bar: KSQL vs. Kafka Streams Lower the bar to enter the world of streaming vs.
  7. 7. 9C O N F I D E N T I A L KSQL ● You write only SQL. No Java, Python, or other boilerplate to wrap around it! ● Create KSQL user defined functions in Java when needed. CREATE STREAM fraudulent_payments AS SELECT * FROM payments WHERE fraudProbability > 0.8;
  8. 8. 10C O N F I D E N T I A L New user experience: interactive stream processing
  9. 9. 11C O N F I D E N T I A L KSQL can be used interactively + programmatically ksql> 1 UI POST /query 2 CLI 3 REST 4 Headless
  10. 10. 12C O N F I D E N T I A L All you need is Kafka and KSQL 1.Build & package 2. Submit job required for fault-tolerance ksql> SELECT * FROM myStream Without KSQL With KSQL processing storage
  11. 11. 13 ● SQL ● Interactive service ● ... KSQL Kafka Streams API Stream Processing in Kafka Producer/Consumer ● Stateful stream processing ● Window operations ● DSL ○ map(), filter(), join(), aggregate(), … ... Kafka Consumer Group Protocol ● Elasticity ● Fault Tolerance ...
  12. 12. 14C O N F I D E N T I A L Stream/Table Duality
  13. 13. 15C O N F I D E N T I A L Alice + €50 The Stream-Table Duality Stream (payments) Table (balance) time Alice €50 Bob + €18 Alice €50 Alice €50 Bob €18 Alice + €25 Alice €50 Bob €18 Alice €75 Bob €18 Alice – €60 Alice €75 Bob €18 Alice €15 Bob €18
  14. 14. 16C O N F I D E N T I A L KSQL Examples
  15. 15. 17C O N F I D E N T I A L Data exploration KSQL example use cases Data enrichment Streaming ETL Filter, cleanse, mask Real-time monitoring Anomaly detection
  16. 16. 18C O N F I D E N T I A L Example: Retail KSQL joins the two streams in real-time Stream of shipments that arrive Stream of purchases from online and physical stores
  17. 17. 19C O N F I D E N T I A L Example: IoT, Automotive, Connected Cars KSQL joins the two streams in real-time Kafka Connect streams data in Cars send telemetry data via Kafka API Kafka Streams application to notify customers
  18. 18. 20C O N F I D E N T I A L KSQL for Real-Time Monitoring ● Log data monitoring ● Tracking and alerting ● Syslog data ● Sensor / IoT data ● Application metrics CREATE STREAM syslog_invalid_users AS SELECT host, message FROM syslog WHERE message LIKE '%Invalid user%'; http://cnfl.io/syslogs-filtering / http://cnfl.io/syslog-alerting
  19. 19. 21C O N F I D E N T I A L KSQL for Anomaly Detection ● Identify patterns or anomalies in real- time data, surfaced in milliseconds CREATE TABLE possible_fraud AS SELECT card_number, COUNT(*) FROM authorization_attempts WINDOW TUMBLING (SIZE 5 SECONDS) GROUP BY card_number HAVING COUNT(*) > 3;
  20. 20. 22C O N F I D E N T I A L KSQL for Streaming ETL ● Joining, filtering, and aggregating streams of event data CREATE STREAM vip_actions AS SELECT user_id, page, action FROM clickstream c LEFT JOIN users u ON c.user_id = u.user_id WHERE u.level = 'Platinum';
  21. 21. 23C O N F I D E N T I A L KSQL for Data Transformation ● Easily make derivations of existing topics CREATE STREAM pageviews_avro WITH (PARTITIONS=6, VALUE_FORMAT='AVRO') AS SELECT * FROM pageviews_json PARTITION BY user_id;
  22. 22. 26C O N F I D E N T I A L Demo
  23. 23. 27C O N F I D E N T I A L Example: CDC from DB via Kafka to Elastic KSQL processes table changes in real-time Kafka Connect streams data in Kafka Connect streams data out ratings
  24. 24. 28C O N F I D E N T I A L Deployment Scalability and HA
  25. 25. 29 ● SQL ● Interactive service ● ... KSQL Kafka Streams API Stream Processing in Kafka Producer/Consumer ● Stateful stream processing ● Window operations ● DSL ○ map(), filter(), join(), aggregate(), … ... Kafka Consumer Group Protocol ● Elasticity ● Fault Tolerance ...
  26. 26. 30C O N F I D E N T I A L How to run KSQL #1 Interactive KSQL, for development & testing ksql> POST /query Kafka Cluster (data) KSQL Cluster (processing) KSQL does not run on Kafka brokers! ...
  27. 27. 31C O N F I D E N T I A L How to run KSQL #2 Headless KSQL, for production Kafka Cluster (data) servers started with same .sql file KSQL Cluster (processing) ... interaction for UI, CLI, REST is disabled
  28. 28. 32 KSQL Deployment KSQL Server KSQL Server KSQL Server Command Topic ● Interactive Service ○ REST API ○ Command Topic
  29. 29. 33 KSQL Deployment KSQL Server KSQL Server KSQL Server Command Topic
  30. 30. 34 KSQL Deployment KSQL Server KSQL Server KSQL Server Command Topic
  31. 31. 35 KSQL Deployment KSQL Server KSQL Server KSQL Server Command Topic
  32. 32. 36 KSQL Deployment KSQL Server KSQL Server KSQL Server Command Topic
  33. 33. 37 KSQL Deployment KSQL Server KSQL Server KSQL Server Command Topic Fault Tolerance:
  34. 34. 38 KSQL Deployment KSQL Server KSQL Server Command Topic Fault Tolerance:
  35. 35. 39 KSQL Deployment KSQL Server KSQL Server KSQL Server Command Topic Scale Out:
  36. 36. 40 KSQL Deployment KSQL Server KSQL Server KSQL Server Command Topic KSQL Server Scale Out:
  37. 37. 41C O N F I D E N T I A L How to run KSQL KSQL Server (JVM process) …and many more… DEB, RPM, ZIP, TAR downloads http://confluent.io/ksql Docker images confluentinc/cp-ksql-server confluentinc/cp-ksql-cli
  38. 38. 43C O N F I D E N T I A L Resources and Next Steps confluentinc/ksql http://confluent.io/ksql http://cnfl.io/slack
  39. 39. 44C O N F I D E N T I A L

×