Apache Kafka and Couchbase => Event Streaming Platform + NoSQL

337 views

Confluent and Couchbase: Event Streaming Platform + NoSQL combined. This slide deck introduces Apache Kafka as an event streaming platform and shows how to use Kafka Connect to integrate it with Couchbase.

Sample Best Fit Use Cases

Services that require a low-latency, highly available, and scalable data ingestion or presentation tier with onward transport of data.
Serving data with high availability to a very large number of readers (up to millions) with deterministic low latency.
Services that transform streaming data and quickly store intermediate state for further processing.
Services that store or process a high cardinality of entities, or whose schemas evolve rapidly.
Services with operational data storage requirements of up to tens of terabytes.

Examples of typical applications requiring these functionalities:

Recommendation engines, predictive analytics engines, fraud detection frameworks, risk analytics engines, trader toolkits, real-time trade blotters.

Kafka Connect Couchbase Connector


Stream, filter, and transform events to and from Couchbase with Source and Sink connectors.

Fast, reliable and fault tolerant: Based on DCP (Couchbase replication protocol).

Efficient: Only load new or modified documents.

Real-time: Every mutation to Couchbase generates an event which is published to a Kafka topic.

End-to-end monitoring: integrated with Confluent Control Center:
  • Kafka is the de facto standard for data movement
  • Unified control, monitoring, and metrics
  • “Config-only”
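
The source-connector behaviour described above (only new or modified documents are loaded, and every mutation becomes an event on a topic) can be illustrated with a small in-memory sketch. This is a toy model under stated assumptions, not the connector's implementation and not DCP: `ToyBucket`, `ToySourceConnector`, the `seqno` counter, and the document keys are all hypothetical names invented for illustration, and a plain Python list stands in for the Kafka topic.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBucket:
    """Hypothetical stand-in for a document store; not the Couchbase SDK."""
    docs: dict = field(default_factory=dict)     # doc id -> latest body
    changes: list = field(default_factory=list)  # ordered feed of mutations

    def upsert(self, doc_id, body):
        seqno = len(self.changes) + 1            # toy sequence number
        self.docs[doc_id] = body
        self.changes.append((seqno, doc_id, body))
        return seqno


@dataclass
class ToySourceConnector:
    """Publishes one event per mutation made since the last checkpoint."""
    bucket: ToyBucket
    topic: list = field(default_factory=list)    # stand-in for a Kafka topic
    checkpoint: int = 0                          # number of mutations already published

    def poll(self):
        new_changes = self.bucket.changes[self.checkpoint:]
        for seqno, doc_id, body in new_changes:
            self.topic.append({"key": doc_id, "seqno": seqno, "value": body})
        self.checkpoint += len(new_changes)
        return len(new_changes)


bucket = ToyBucket()
connector = ToySourceConnector(bucket)
bucket.upsert("rental::4124", {"rentalPrice": 58})
bucket.upsert("rental::4124", {"rentalPrice": 60})
first = connector.poll()    # both mutations become events
second = connector.poll()   # nothing changed, so nothing is re-sent
```

The checkpoint is what makes the load incremental: a restart that remembers it resumes from the first unpublished mutation instead of re-reading every document.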

Published in: Software

Apache Kafka and Couchbase => Event Streaming Platform + NoSQL

  1. Introduction to Apache Kafka as Event-Driven Open Source Streaming Platform … and its integration with Couchbase. Kai Waehner, Technology Evangelist. kontakt@kai-waehner.de, LinkedIn, @KaiWaehner, www.confluent.io, www.kai-waehner.de
  2. Vision of an event streaming enterprise. Diagram labels: Search, Sensors / IoT, RDBMS, Monitoring, NoSQL, Real-time Analytics, Data Warehouse, Apps, Microservices, Big Data, Streaming Platform
  3. Business Digitalization Trends are Driving the Need to Process Events at a whole new Scale, Speed and Efficiency. The World has Changed: Mobile, Cloud, Microservices, Internet of Things, Machine Learning
  4. Before: many ad hoc pipelines. Diagram labels: Search Security Fraud Detection Application User Tracking Operational Logs Operational Metrics Big Data App Data Warehouse Mainframes NoSQL Relational DB Databases Storage Interfaces Monitoring App Databases Storage Interfaces
  5. After: streaming platform with Kafka. Diagram labels: Search Security Fraud Detection Application User Tracking Operational Logs Operational Metrics Mainframes Relational DB Big Data App Monitoring App Data Warehouse Streaming Platform NoSQL
  6. Events. What is an event?
  7. Events
  8. Events: A Sale, An Invoice, A Trade, A Customer Experience
  9. Where are they? Events haven’t had a proper home in infrastructure or in code. They are implicit. Here!
  10. Haven’t we seen all this before?
  11. What’s different this time around? (Published in 2009) (Published in 2004)
  12. A Streaming Platform is the Underpinning of an Event-driven Architecture.
      • Ubiquitous connectivity: Globally scalable platform for all event producers and consumers
      • Immediate data access: Data accessible to all consumers in real time
      • Single system of record: Persistent storage to enable reprocessing of past events
      • Continuous queries: Stream processing capabilities for in-line data transformation
      Diagram labels: Microservices, DBs, SaaS apps, Mobile, Customer 360, Real-time fraud detection, Data warehouse, Producers, Consumers, Database change, Microservices events, SaaS data, Customer experiences, Streams of real time events, Stream processing apps
  13. Apache Kafka: The De-facto Standard for Real-Time Event Streaming. ● Global-scale ● Real-time ● Persistent Storage ● Stream Processing. Diagram labels: Edge, Cloud, Data Lake, Databases, Datacenter, IoT, SaaS Apps, Mobile, Microservices, Machine Learning, Apache Kafka
  14. Apache Kafka at Scale at Tech Giants: > 4.5 trillion messages / day, > 6 Petabytes / day, “You name it”. * Kafka is not just used by tech giants. ** Kafka is not just used for big data.
  15. Confluent's Business Value per Use Case.
      • Business Value: Improve Customer Experience (CX); Increase Revenue (make money); Decrease Costs (save money); Mitigate Risk (protect money)
      • Key Drivers / Strategic Objectives (sample): Core Business Platform; Increase Operational Efficiency; Migrate to Cloud; Fraud Detection; IoT sensor ingestion; Digital replatforming / Mainframe Offload; Customer 360; Faster transactional processing / analysis incl. Machine Learning / AI; Microservices Architecture; Online Fraud Detection; Online Security (syslog, log aggregation, Splunk replacement); Middleware replacement; Regulatory Digital Transformation; Website / Core Operations (Central Nervous System); Real-time app updates
      • Example Use Cases: Connected Car: Navigation & improved in-car experience: Audi; Simplifying Omni-channel Retail at Scale: Target; Mainframe Offload: RBC; Application Modernization: Multiple Examples; The [Silicon Valley] Digital Natives: LinkedIn, Netflix, Uber, Yelp...; Predictive Maintenance: Audi; Streaming Platform in a regulated environment (e.g. Electronic Medical Records): Celmatix; Real Time Streaming Platform for Communications and Beyond: Capital One; Developer Velocity - Building Stateful Financial Applications with Kafka Streams: Funding Circle; Detect Fraud & Prevent Fraud in Real Time: PayPal; Kafka as a Service - A Tale of Security and Multi-Tenancy: Apple
      Legend: $↑ $↓ $↔
  16. Confluent Partner Briefing. Example: An Airbnb Booking Event. Booked event happens: { rentalId: 4124, rentalPrice: 58, userId: 5893381 …. } Topic: rentalOrders. Diagram labels: Rental availability, Rental pricing, Recommended experiences, Account history, Account Updates, Store Updates, Report Updates, User engagement, Localized supply
  17. A Modern, Distributed Platform for Data Streams. Messaging + Storage + Processing!
  18. Apache Kafka is made up of distributed, immutable, append-only commit logs
  19. Writers, Kafka cluster, Readers
  20. Scalability of a filesystem • hundreds of MB/s throughput • many TB per server • commodity hardware
  21. Guarantees of a Database • Strict ordering • Persistence
  22. Distributed by design • Replication • Fault Tolerance • Partitioning • Elastic Scaling
  23. Kafka Topics. Diagram labels: my-topic; my-topic-partition-0, my-topic-partition-1, my-topic-partition-2; broker-1, broker-2, broker-3
  24. Producing to Kafka. Diagram labels: P, Time
  25. Producing to Kafka. Diagram labels: P, Time, C1, C2, C3
  26. Partition Leadership and Replication. Diagram: Broker 1 to Broker 4 hold the replicas of Topic1 partition1 to partition4 (each partition appears three times), marked Leader or Follower
  27. Apache Kafka (kafka.apache.org) includes Kafka Connect and Kafka Streams
  28. Kafka Connect is an integration framework on top of Kafka's core
  29. Kafka’s Streams API: Build real-time applications for your core business • To build real-time applications for your core business • Easiest way to process data in Apache Kafka • Apps are standard Java applications that run on client machines • Powerful yet easy-to-use library, part of Apache Kafka • https://github.com/apache/kafka/tree/trunk/streams Diagram labels: Streams API, Your App, Kafka Cluster
  30. Example: complete app, ready for production at large-scale. Word Count: App configuration; Define processing (here: WordCount); Start processing
  31. Confluent Delivers a Mission-Critical Event Streaming Platform.
      • Apache Kafka®: Core | Connect API | Streams API
      • Data Compatibility: Schema Registry
      • Enterprise Operations: Replicator | Auto Data Balancer | Connectors | MQTT Proxy | Kubernetes Operator
      • Management & Monitoring: Control Center | Security
      • Development & Connectivity: Clients | Connectors | REST Proxy | KSQL
      Diagram labels: Database Changes, Log Events, IoT Data, Web Events, other events; DATA INTEGRATION: Hadoop, Database, Data Warehouse, CRM, other; REAL-TIME APPLICATIONS: Transformations, Custom Apps, Analytics, Monitoring, other; COMMUNITY FEATURES, COMMERCIAL FEATURES; Datacenter, Public Cloud; Confluent Cloud (CONFLUENT FULLY-MANAGED), Confluent Platform (CUSTOMER SELF-MANAGED)
  32. KSQL – A Streaming SQL Engine for Apache Kafka
  33. Confluent Control Center (C3) monitors all pipelines end-to-end • Lost Messages? • Duplicates? • Latency Issues? • What is the problem? • Where is the problem? • Etc.
  34. Best-of-breed Platforms, Partners and Services for Multi-cloud Streams.
      • Private Cloud: Deploy on bare-metal, VMs, containers or Kubernetes in your datacenter with Confluent Platform and Confluent Operator
      • Public Cloud: Implement self-managed in the public cloud or adopt a fully managed service with Confluent Cloud
      • Hybrid Cloud: Build a persistent bridge between datacenter and cloud with Confluent Replicator
      Diagram labels: Confluent Replicator, VM, SELF MANAGED, FULLY MANAGED
  35. Kafka Connect Couchbase Connector. https://github.com/couchbase/kafka-connect-couchbase https://www.confluent.io/connector/couchbase-db-connector/ Open Source, Developed by Couchbase, Certified by Confluent
  36. Kafka Connect Couchbase Connector. Diagram labels: Couchbase cluster, …, Kafka cluster, Kafka Connect (Connectors to Extract and Load data)
      • Stream, filter, and transform events to and from Couchbase with Source and Sink connectors.
      • Fast, reliable and fault tolerant: Based on DCP (Couchbase replication protocol).
      • Efficient: Only load new or modified documents.
      • Real-time: Every mutation to Couchbase generates an event which is published to a Kafka topic.
      • End-to-end monitoring: Integrated with Confluent Control Center: Kafka is the de facto standard for data movement; unified control, monitoring, and metrics; “Config-only”
  37. Confluent and Couchbase - Synergies • Distributed and fault tolerant • Horizontally scalable • Geographically replicated • Low latency • Open source
  38. Event Streaming with Apache Kafka and Couchbase. Diagram labels: KSQL, Kafka Streams; Splunk Security Fraud Detection Application User Tracking Operational Logs Operational Metrics Mainframes Oracle DB Hadoop Business App Monitoring App AWS Redshift; Kafka, Couchbase, Kafka Connect
  39. Confluent’s Streaming Maturity Model - where are you? Axes: Value; Maturity (Investment & time). Stages: 1 Developer Interest, Pre-Streaming; 2 Enterprise Streaming Pilot / Early Production, Pub + Sub, Store, Process; 3 SLA Ready, Integrated Streaming, Projects, Platform; 4 Global Streaming; 5 Central Nervous System
  40. This is just the beginning of a new era… Confluent’s Vision:
      • Global: Automated disaster recovery; Global applications with geo-awareness
      • Infinite: Efficient and infinite data with tiered storage; Unlimited horizontal scalability for single clusters
      • Elastic: Faster elastic scaling when adding brokers and partitions
      • Easy: Container-based orchestration and management
      • Cloud-native Apache Kafka for on-premises, hybrid, multi-cloud
  41. Highly Scalable Microservices with Apache Kafka + Mesos. Kai Waehner, Technology Evangelist. kontakt@kai-waehner.de, @KaiWaehner, www.confluent.io, www.kai-waehner.de, LinkedIn. Questions? Feedback? Please contact me!
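
The log model in slides 18 to 26 (append-only partitions, writers appending, readers tracking their own position) can be made concrete with a short in-memory sketch. It is an illustration under stated assumptions rather than how Kafka is implemented: `ToyTopic` and `ToyConsumer` are hypothetical classes, Python lists stand in for partition logs, and CRC32 is used as a stand-in for the key hash (the Java client's default partitioner uses murmur2). The topic name and event fields are borrowed from the Airbnb booking slide; the second price is made up.

```python
import zlib
from collections import defaultdict


class ToyTopic:
    """Toy topic: a fixed number of append-only partition logs."""

    def __init__(self, name, num_partitions=3):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # A stable hash sends the same key to the same partition every time,
        # which is what preserves per-key ordering.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def produce(self, key, value):
        p = self.partition_for(key)
        self.partitions[p].append((key, value))   # append only, never update
        return p, len(self.partitions[p]) - 1     # (partition, offset)


class ToyConsumer:
    """Each consumer keeps its own offsets, so readers do not affect each other."""

    def __init__(self, topic):
        self.topic = topic
        self.offsets = defaultdict(int)           # partition -> next offset to read

    def poll(self):
        records = []
        for p, log in enumerate(self.topic.partitions):
            records.extend(log[self.offsets[p]:])
            self.offsets[p] = len(log)
        return records


topic = ToyTopic("rentalOrders")
pricing = ToyConsumer(topic)
availability = ToyConsumer(topic)
topic.produce("4124", {"rentalId": 4124, "rentalPrice": 58})
topic.produce("4124", {"rentalId": 4124, "rentalPrice": 60})
seen_by_pricing = pricing.poll()
seen_by_availability = availability.poll()   # same events, read independently
```

Because reading only advances a consumer's own offsets, the log itself is untouched, which is why one booking event can feed pricing, availability, and any later consumer without extra pipelines.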
