52. Yet, at company scale, the majority of processes run in the background anyway.
They are asynchronous to one another.
[graphic: Billing, Inventory, Fulfillment, and Fraud services, split into Online and Offline]
53. So it makes sense to DECOUPLE services from one another.
[graphic: Billing, Inventory, Fulfillment, and Fraud services, Online and Offline, with a Decouple step between them]
54. Apache Kafka™ helps with this as it provides a data backbone for your services.
[graphic: Billing, Inventory, Fulfillment, Finance, and Fraud services; labels: HTTP etc., Offline, Online]
74. Wide Spectrum of Messaging Offerings
Ultra-low Latency (often no broker in the middle)
High Volume (Persistent or Non-Persistent)
Highly Available (Clustered and Fault Tolerant)
Embedded Messaging (inside apps)
Cross Datacenter / Organizational / B2B
Enterprise Message Bus
Messaging-as-a-Service
Web / IoT Messaging
Instant Messaging
76. Kafka is a Mashup
A mashup of some well-proven concepts into something even greater and easier to use:
EAI + ETL
Messaging Middleware + Big Data
Batch + Real-time
Data Movement + Data Processing
Log Data Streams + Structured Database Tables
77. Kafka is much more than messaging
Kafka is a blend of messaging, stream processing, ETL and modern database designs built around a distributed log
Pub/Sub Messaging: IBM MQ, TIBCO, RabbitMQ
ETL / Connectors: Mulesoft, Talend, Informatica
Stream Processing: Spark, Flink, Beam
+ Distributed clustered storage
+ Streaming platform
+ Exactly Once
+ Designed for the Cloud
+ Inter DC replication
+ Schema evolution
Confluent Confidential
78. What’s different about Kafka? Topics are also Queues
Consumers can share one copy of the data
• Independent consumers share the same log
• Inter-dependent consumers share the same log
• No need for Topic/Queue bridging or multiple copies of the data
Message processing is greatly simplified
- There is no “head” of the queue
- Writes are sequential, distributed, and parallel
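The idea that independent consumers share one copy of the data can be sketched with a toy in-memory log. This is not the Kafka client API: the `Log` class, its `poll` method, and the group names are hypothetical, and a single Python list stands in for one topic partition. The only point is that each consumer group keeps its own offset, so reading never removes a record.

```python
from collections import defaultdict


class Log:
    """Toy append-only log standing in for a single topic partition."""

    def __init__(self):
        self.records = []
        # One read position per consumer group; every group starts at 0.
        self.offsets = defaultdict(int)

    def append(self, value):
        self.records.append(value)
        return len(self.records) - 1  # offset of the record just written

    def poll(self, group, max_records=10):
        """Return the next records for `group` and advance only its offset."""
        start = self.offsets[group]
        batch = self.records[start:start + max_records]
        self.offsets[group] = start + len(batch)
        return batch


log = Log()
for order in ["order-1", "order-2", "order-3"]:
    log.append(order)

# Two independent groups read the same single copy of the data.
billing = log.poll("billing")                  # reads everything at once
fraud_first = log.poll("fraud", max_records=1)  # reads at its own pace
fraud_rest = log.poll("fraud")
```

Because there is no shared "head" to pop, the fast `billing` group and the slower `fraud` group do not interfere with each other, and no second copy of the data is needed.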
79. What’s different about Kafka? Messages are not deleted when consumed
Messages in the commit logs are persistent and immutable
Slow Consumers are (very) decoupled from Fast Producers
Batch and real-time are unified
Message Replay, Replication, and Auditing are built-in (for free)
All production messaging deployments need some form of these
Message Retention is not a waste of disk space
You need to size for offline/disconnected consumers anyway
Distributed State can always be recreated from a common commit log
Makes distributed HA apps much easier to build
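A minimal sketch of why retention gives replay and slow-consumer decoupling for free, assuming a plain Python list as the retained log. The function names (`produce`, `consume`, `rebuild_balance`) and the account events are hypothetical and are not part of any Kafka API.

```python
# Toy model: the retained log is a list; a consumer is just an offset into it.
log = []


def produce(event):
    log.append(event)


def consume(offset, max_records):
    """Read up to max_records starting at offset; nothing is deleted."""
    batch = log[offset:offset + max_records]
    return batch, offset + len(batch)


# A fast producer writes five events.
for i in range(5):
    produce({"account": "a1", "delta": 10 * (i + 1)})

# A slow consumer has only read two records so far. The producer was never
# blocked, and the remaining records are still waiting in the log.
seen, slow_offset = consume(0, 2)


def rebuild_balance():
    """Replay the log from offset 0 to recreate state from scratch."""
    balance, offset = 0, 0
    while True:
        batch, offset = consume(offset, 2)
        if not batch:
            return balance
        balance += sum(event["delta"] for event in batch)


balance = rebuild_balance()
```

The same replay loop serves a batch job reading from the start and a real-time consumer reading from the tail, which is the sense in which batch and real-time are unified here.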
80. What’s different about Kafka? Topic Partitions and Keyed Messages
- Topics/Queues are not the smallest unit of scalability
- Topic partitions are distributed across brokers for parallel in-order consumption
- This is very different from a cluster of traditional message brokers
- [graphic of topic partitions with parallel Producers, Brokers, and Consumers]
- Sometimes you can just use more keys instead of more topics
- E.g. don’t create a new topic for every user or IoT device; create unique keys
- This is proven to scale to many millions of connected users, cars and IoT devices
- [graphic to show Keyed messages get distributed across topic partitions]
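How keyed messages spread across partitions while staying ordered per key can be sketched as follows. This is an illustration under stated assumptions: CRC32 is a stand-in hash chosen for determinism (the real Kafka client uses a different hash function), and `NUM_PARTITIONS`, `partition_for`, and the `car-*` keys are hypothetical.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical partition count for one topic


def partition_for(key, num_partitions=NUM_PARTITIONS):
    # Stand-in hash (CRC32), NOT the hash the Kafka client uses.
    # The property that matters: the same key always maps to the same partition.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# One topic, many keys: no topic per car is needed.
events = [
    ("car-1", "start"),
    ("car-2", "start"),
    ("car-1", "move"),
    ("car-2", "move"),
    ("car-1", "stop"),
]

partitions = defaultdict(list)
for key, value in events:
    partitions[partition_for(key)].append((key, value))


def events_for(key):
    """All events for a key live in one partition, in the order produced."""
    return [v for k, v in partitions[partition_for(key)] if k == key]
```

Different keys may land on the same partition, so ordering is guaranteed per key, not across the whole topic.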
81. What’s different about Kafka? Duality of streams and databases
Duality of message streams and database tables is a key design point
From an event stream / transaction log we can derive all of the following database-centric features:
- Replication
- Secondary Indexing
- Caching
- Materialized Views
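The stream/table duality can be sketched by folding a changelog into a table and then deriving the other features from it. This is a toy example, not Kafka Streams or KSQL: the `changelog` contents and the `materialize` and `index_by_city` helpers are hypothetical, and a `None` value is assumed to mean "delete this key".

```python
from collections import defaultdict

# Hypothetical changelog of (key, value) upserts; value None marks a delete.
changelog = [
    ("u1", {"name": "Ada", "city": "London"}),
    ("u2", {"name": "Linus", "city": "Helsinki"}),
    ("u1", {"name": "Ada", "city": "Paris"}),
    ("u2", None),
    ("u3", {"name": "Grace", "city": "Paris"}),
]


def materialize(events):
    """Fold the stream into a table: the latest value per key wins."""
    table = {}
    for key, value in events:
        if value is None:
            table.pop(key, None)
        else:
            table[key] = value
    return table


def index_by_city(table):
    """Secondary index derived from the table: city -> set of keys."""
    index = defaultdict(set)
    for key, row in table.items():
        index[row["city"]].add(key)
    return dict(index)


table = materialize(changelog)    # materialized view (also usable as a cache)
by_city = index_by_city(table)    # secondary index
replica = materialize(changelog)  # replication: replay the same log elsewhere
```

Every derived structure is disposable: if it is lost, replaying the same log produces an identical copy.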
83. Old World: REST Based Microservices Interconnect
[graphic: GUI, UI Service, Orders, Returns, Pay, Fulfilment, and Stock services interconnected over REST]
Each microservice has to maintain its own state in its own database
1. Difficult to enforce the same REST API standards across many languages and microservices.
2. REST APIs are inherently slow: limited to thousands of calls/sec.
3. Inter Service Dependencies are Messy.
4. Each Service Needs to Maintain State.
5. Difficult to enforce consistent security standards.
6. Logging is distributed between services.
7. Version compatibility between services is difficult.
84. Streaming Microservices with Kafka
[graphic: GUI, UI Service, Orders Service, Returns Service, Fulfilment Service, Payment Service, and Stock Service connected through Kafka]
Database Sources Now Centralized on the Kafka Bus for all microservices
1. Service inter-communication standard enforced by Kafka Schema Registry.
2. Millions of messages per second on cheap hardware.
3. No Inter-Service Dependency: just depend on Kafka.
4. Each service can be stateless: Kafka maintains state.
5. Security can be enforced by ACLs from Kafka.
6. Logs can be aggregated into Kafka.
7. Version compatibility can be enforced by Schema Registry.
8. Kafka is inherently HA and horizontally scalable: still no central point of failure.
85. What’s different about Kafka? Ecosystem and Adoption
The Kafka ecosystem is flourishing and developer adoption continues to grow
• Confluent Platform additions (REST Proxy, Schema Registry, KSQL etc.)
• Third Party Connectors (Confluent Hub)
• Open Source contributions from individuals, corporations, vendors, consulting organizations
• Inside and outside of Big Data/Stream Processing
86. Adoption of Event Streaming
60% of Fortune 100 Companies Using Apache Kafka