Diese Präsentation wurde erfolgreich gemeldet.
Wir verwenden Ihre LinkedIn Profilangaben und Informationen zu Ihren Aktivitäten, um Anzeigen zu personalisieren und Ihnen relevantere Inhalte anzuzeigen. Sie können Ihre Anzeigeneinstellungen jederzeit ändern.
Keep your Data Close and your
Caches Hotter using Apache
Kafka, Connect and KSQL
Ricardo Ferreira, Developer Advocate
@rif...
About Me:
● Hi, my name is Ricardo Ferreira
● Developer Advocate @ Confluent
● Currently into Cloud & DevOps
● Ex-Oracle, ...
Data is only useful
if it is Fresh and
Contextual
There are three parts
in a airbag system:
● The bag itself.
● The sensors which tell the bag to
inflate when there is a co...
December 6th, 2010:
Commuter rail train
hits elderly driver
● 70-year old lady hear on the news
that there will be no comm...
Caches can be a
Solution for Data
that is Fresh
APIs need to access
data freely and easily
● Data should never be treated as a
scarce resource in applications
● Latency s...
Caches can be either
built-in or distributed
● If data can fit into the API memory,
then you should use built-in caches
● ...
Demo
Let’s Tweet the Song!
1. Access your Twitter account.
2. Use #KafkaSummit in your tweet.
3. The name of the song must be
w...
Application X-Ray:
● Confluent Cloud Cluster
● AWS and Terraform
● Spring Boot Application
● Apache Kafka Connect
● Conflu...
Application X-Ray:
● Confluent Cloud Cluster
● AWS and Terraform
● Spring Boot Application
● Apache Kafka Connect
● Conflu...
Caching
Patterns
Caching Pattern:
Refresh Ahead
● Proactively updates the cache
● Keep the entries always in-sync
● Ideal for latency sensi...
Caching Pattern:
Refresh Ahead / Adapt
● Proactively updates the cache
● Keep the entries always in-sync
● Ideal for laten...
Caching Pattern:
Write Behind
● Removes I/O pressure from app
● Allows true horizontal scalability
● Ensures ordering and ...
Caching Pattern:
Write Behind / Adapt
● Removes I/O pressure from app
● Allows true horizontal scalability
● Ensures order...
Caching Pattern:
Event Federation
● Replicates data across regions
● Keep multiple regions in-sync
● Great to improve RPO ...
Kafka Connect
Implementation
Strategies
Kafka Connect support
for In-Memory Caches
● Connector for Redis is open and it
is available in Confluent Hub
● Connector ...
Frameworks for other
In-Memory Caches
● Oracle provides HotCache from
GoldenGate for Oracle Coherence
● Hazelcast has the ...
Interested on DB CDC?
Then meet Debezium!
● Amazing CDC technology to pull
data out from databases to Kafka
● Works in a l...
Support for Running
Kafka Connect Servers
● Run by yourself on BareMetal:
https://kafka.apache.org/downloads
https://www.c...
25
Please Stay in Touch:
@riferrei
riferrei
riferrei
ricardo@confluent.io
https://riferrei.net
https://cnfl.io/slack
Keeping Your Data Close and Your Caches Hotter (Ricardo Ferreira, Confluent) Kafka Summit NYC 2019
Nächste SlideShare
Wird geladen in …5
×

Keeping Your Data Close and Your Caches Hotter (Ricardo Ferreira, Confluent) Kafka Summit NYC 2019

116 Aufrufe

Veröffentlicht am

The distributed cache is becoming a popular technique to improve performance and simplify the data access layer when dealing with databases. Bringing the data as close as possible to the CPU allows unparalleled execution speed as well as horizontal scalability. This approach is often successful when used in a microservices design in which the cache is accessed only by a single API. However, it becomes more challenging if multiple applications are involved and changes are made to the database directly by other applications. The data held in the cache eventually becomes stale and no longer consistent with its underlying database. When consistency problems arise, the Engineering team must address that through additional coding — which directly jeopardizes the team’s ability to be agile between releases. This talk presents a set of patterns for cache-based architectures that aim to keep the caches always hot; by using Apache Kafka and its connectors to accomplish that goal. It will be shown how to set up these patterns across different IMDGs such as Hazelcast, Apache Ignite or Coherence. These patterns can be used in conjunction with different cache topologies such as cache-aside, read-through, write-behind, and refresh-ahead, making it reusable enough to be used as a framework to achieve data consistency in any architecture that relies on distributed caches.

Veröffentlicht in: Technologie
  • Als Erste(r) kommentieren

Keeping Your Data Close and Your Caches Hotter (Ricardo Ferreira, Confluent) Kafka Summit NYC 2019

  1. 1. Keep your Data Close and your Caches Hotter using Apache Kafka, Connect and KSQL Ricardo Ferreira, Developer Advocate @riferrei #KafkaSummit
  2. 2. About Me: ● Hi, my name is Ricardo Ferreira ● Developer Advocate @ Confluent ● Currently into Cloud & DevOps ● Ex-Oracle, Red Hat, IONA Tech ● https://riferrei.net @riferrei #KafkaSummit
  3. 3. Data is only useful if it is Fresh and Contextual
  4. 4. There are three parts in a airbag system: ● The bag itself. ● The sensors which tell the bag to inflate when there is a collision probability based on speediness. ● The inflation system, which does combine two compounds [Sodium Azide (NaN3) and Potassium Nitrate (KNO3)] used to produce Nitrogen gas and inflate the bag. @riferrei #KafkaSummit What if the airbag deploys 30 seconds after the collision?
  5. 5. December 6th, 2010: Commuter rail train hits elderly driver ● 70-year old lady hear on the news that there will be no commuter rail train on that day. ● She tries to beat the train as its speed through the Groove Street, but there was no enough time to break. ● Luckily she is still alive. @riferrei #KafkaSummit What if the information about the commuter rail train is outdated?
  6. 6. Caches can be a Solution for Data that is Fresh
  7. 7. APIs need to access data freely and easily ● Data should never be treated as a scarce resource in applications ● Latency should be kept as minimal to ensure a better user experience ● Data should be not be static: keep the data fresh continuously ● Find ways to handle large amounts of data without breaking the APIs @riferrei #KafkaSummit CacheAPI Read Write Read Write
  8. 8. Caches can be either built-in or distributed ● If data can fit into the API memory, then you should use built-in caches ● Otherwise, you may need to use distributed caches for large sizes ● Some cache implementations provides the best of both cases ● For distributed caches, make sure to always find a good way to O(1) @riferrei #KafkaSummit CacheAPI Read Write Built-in Caches Cache API Distributed Caches Cache Cache Read Write
  9. 9. Demo
  10. 10. Let’s Tweet the Song! 1. Access your Twitter account. 2. Use #KafkaSummit in your tweet. 3. The name of the song must be within brackets as shown below. @riferrei #KafkaSummit
  11. 11. Application X-Ray: ● Confluent Cloud Cluster ● AWS and Terraform ● Spring Boot Application ● Apache Kafka Connect ● Confluent KSQL ● Redis Cache ● AWS Lambda ● Amazon Alexa @riferrei #KafkaSummit
  12. 12. Application X-Ray: ● Confluent Cloud Cluster ● AWS and Terraform ● Spring Boot Application ● Apache Kafka Connect ● Confluent KSQL ● Redis Cache ● AWS Lambda ● Amazon Alexa @riferrei #KafkaSummit You can find the source-code of this application here:
  13. 13. Caching Patterns
  14. 14. Caching Pattern: Refresh Ahead ● Proactively updates the cache ● Keep the entries always in-sync ● Ideal for latency sensitive cases ● Ideal when data read is costly ● It may need initial data loading @riferrei #KafkaSummit Kafka Connect Cache Kafka Connect API
  15. 15. Caching Pattern: Refresh Ahead / Adapt ● Proactively updates the cache ● Keep the entries always in-sync ● Ideal for latency sensitive cases ● Ideal when data read is costly ● It may need initial data loading @riferrei #KafkaSummit Kafka Connect Application Cache Kafka Connect Transform and adapt records before delivery Schema Registry for canonical models API
  16. 16. Caching Pattern: Write Behind ● Removes I/O pressure from app ● Allows true horizontal scalability ● Ensures ordering and persistence ● Minimizes DB code complexity ● Totally handles DB unavailability @riferrei #KafkaSummit Kafka Connect Application Cache Kafka Connect API
  17. 17. Caching Pattern: Write Behind / Adapt ● Removes I/O pressure from app ● Allows true horizontal scalability ● Ensures ordering and persistence ● Minimizes DB code complexity ● Totally handles DB unavailability @riferrei #KafkaSummit Kafka Connect Application Cache Kafka Connect Transform and adapt records before delivery Schema Registry for canonical models API
  18. 18. Caching Pattern: Event Federation ● Replicates data across regions ● Keep multiple regions in-sync ● Great to improve RPO and RTO ● Handles lazy/slow networks well ● Works well if its used along with Read-Through and Write-Through patterns. @riferrei #KafkaSummit Confluent Replicator <<MirrorMaker>>
  19. 19. Kafka Connect Implementation Strategies
  20. 20. Kafka Connect support for In-Memory Caches ● Connector for Redis is open and it is available in Confluent Hub ● Connector for Memcached is open and it is available in Confluent Hub ● Connectors for both GridGain and Apache Ignite implementations. ● Connector for InfiniSpan is open and is maintained by Red Hat @riferrei #KafkaSummit Kafka Connect Kafka Connect Kafka Connect Kafka Connect
  21. 21. Frameworks for other In-Memory Caches ● Oracle provides HotCache from GoldenGate for Oracle Coherence ● Hazelcast has the Jet framework, which provides support for Kafka ● Pivotal GemFire (Apache Geode) has good support from Spring ● Good news: you can always write your own sink using Connect API @riferrei #KafkaSummit Oracle GoldenGate Hazelcast Jet Spring Data Spring Kafka Connect Framework Any Cache
  22. 22. Interested on DB CDC? Then meet Debezium! ● Amazing CDC technology to pull data out from databases to Kafka ● Works in a log level, which means true CDC implementation for your projects instead of record polling ● Open-source maintained by Red Hat. Have broad support for many popular databases. ● It is built on top of Kafka Connect @riferrei #KafkaSummit
  23. 23. Support for Running Kafka Connect Servers ● Run by yourself on BareMetal: https://kafka.apache.org/downloads https://www.confluent.io/download ● IaaS on AWS or Google Cloud: https://github.com/confluentinc/ccloud-tools ● Running using Docker Containers: https://hub.docker.com/r/confluentinc/cp-kafka- connect/ ● Running using Kubernetes: https://github.com/confluentinc/cp-helm-chart https://www.confluent.io/confluent-operator/ @riferrei #KafkaSummit Kafka Connect
  24. 24. 25 Please Stay in Touch: @riferrei riferrei riferrei ricardo@confluent.io https://riferrei.net https://cnfl.io/slack

×