
Hard Truths About Streaming and Eventing (Dan Rosanova, Microsoft) Kafka Summit NYC 2019

Eventing and streaming open a world of compelling new possibilities to our software and platform designs. They can reduce time to decision and action while lowering total platform cost. But they are not a panacea. Understanding the edges and limits of these architectures can help you avoid painful missteps. This talk will focus on event-driven and streaming architectures and how Apache Kafka can help you implement these. It will also discuss key tradeoffs you will face along the way from partitioning schemes to the impact of availability vs. consistency (CAP Theorem). Finally, we’ll discuss some challenges of scale for patterns like Event Sourcing and how you can use other tools and even features of Kafka to work around them. This talk assumes a basic understanding of Kafka and distributed computing but will include brief refresher sections.

Published in: Technology

  1. 1. Hard Truths About Eventing and Streaming. Dan Rosanova, Group Principal Program Manager, Microsoft Azure Messaging
  2. 2. A brief history of messaging • Old school messaging (System/360 QTAM and TCAM) • IBM MQ • RabbitMQ • Service Bus • ActiveMQ™ • ZeroMQ • Apache Kafka® • NATS
  3. 3. What exactly is a messaging queue
  4. 4. A simple queue: (1) Sender sends message to queue; (2) Queue ACKs receipt; (3) Receiver connects to queue and retrieves message; (4) Receiver ACKs complete (or other action)
  5. 5. What queues are good at and why
  6. 6. The queue is the arbiter of truth, which simplifies many other aspects • Each reader can just say "give me the next" • Messages are ACKed / completed individually • The queue is a buffer to improve scale and performance
  7. 7. Messaging and Queues are about applications more than they are about data
  8. 8. Competing Consumer: The Server-Side Cursor
  9. 9. The message is the Unit of Work
  10. 10. But there are some things queues aren’t so good at
  11. 11. Replay
  12. 12. Strict ordering
  13. 13. Strict ordering
  14. 14. Strict ordering
  15. 15. Very high scale Eventually competing consumer models break down
  16. 16. Competing Consumer: not all competition is healthy
  17. 17. Enter the Partitioned Consumer model
  18. 18. How is a partitioned consumer different from a queue? Apache Kafka® implements a partitioned consumer model
  19. 19. There’s something else that resembles this
  20. 20. There’s something else that resembles this • Records a stream • Recording moves forward only • You can play the tape over and over again • A cassette tape actually has left and right channels • When you press record, they both record, but the data on each channel is different • In Kafka these channels are called partitions
  21. 21. A bit more on the partition concept • Partition is essentially append only • Reads are performed using a client-side cursor • Reads are nondestructive
  22. 22. In a stream the partition is the Unit of Work. Streams are processed differently from batch data: normal functions cannot operate on streams as a whole, as they have potentially unlimited data. Formally, streams are codata (potentially unlimited), not data (which is finite).
  23. 23. Ultimate example for streams By Danielpr85 based on Graphviz source of TuukkaH - Own work, Public Domain, https://commons.wikimedia.org/w/index.php?curid=687268
  24. 24. What streams like Kafka are very good at Scale Low cost Replay Order
  25. 25. Why partitioned consumer scales so well
  26. 26. Why partitioned consumer scales so well Maximum Degree of Parallelism
  27. 27. Low cost • There are no expensive indexes to maintain • Because each partition is independent there is no cross broker coordination necessary (other than optional replication) • Client-side cursor avoids the overhead of traditional message brokers • Data replication and ACK level is a choice of the sender
  28. 28. Replay
  29. 29. This may lead you to believe you have found Zen
  30. 30. But there are clouds on the horizon
  31. 31. Fan out and routing • Partitioned streams (like Kafka) don’t offer server-side filtering • Every reader must read all the data • As more readers want the data a network imbalance develops • Parse.ly Kafkapocalypse [Diagram labels: 10 MBps (x4), N MBps]
  32. 32. Streams are not queues • The Unit of Work is not an individual message • This means processing individual messages gets complicated • Cursor management becomes a big challenge • There is no inherent dead letter capability • People start adding these ‘features’ in and end up recreating a queue
  33. 33. CAP Theorem In theoretical computer science the CAP theorem states that it is impossible for a distributed computer system to simultaneously provide all three of the following guarantees: Consistency, Availability, Partition tolerance
  34. 34. What does CAP mean for streams? Consistency: Data should produce the same results when read multiple times – i.e. it should be stable and durable Availability: The place data is written to should always be available to write to Partition tolerance: the ability to continue functioning when one part of the system becomes separated from another
  35. 35. Or put another way when a network partition happens, which over time is inevitable, then you must make a choice...
  36. 36. This is your last chance. After this, there is no turning back... Consistency
  37. 37. You must decide which of these two is most significant: Consistency or Availability
  38. 38. Partitioning schemes Not all keys are created equal You need to be careful to avoid hot keys It’s not always something you can avoid
  39. 39. To key or not to key… that is the question
  40. 40. Adding partitions • You’ve identified a hot partition • You add more partitions to handle the scale • The result is a data split [Diagram: Partition 1, Partition 2, records 1 to 7]
  41. 41. Pause
  42. 42. Failure to plan is planning to fail
  43. 43. Strategies for dealing with failures in messaging and streaming Stop Drop Retry Deadletter
  44. 44. Stop • Simply stop reading from, or writing to, the stream • Wait until someone elsewhere has fixed the problem and then resume • Appropriate for some scenarios, but not all • Probably a good idea to include a notification
  45. 45. Drop • If the messages aren’t that important, just drop them • Up to a certain point they may not matter • This is a good strategy for non-mission critical streams • But not so good for scenarios requiring strong consistency guarantees • Definitely a good idea to include a notification
  46. 46. Retry • Try again and see if it works • Perhaps the error is transient • Be aware of impact on downstream systems - idempotence
  47. 47. Deadletter • Put the data somewhere off your hot path so that you can go back and handle it later • Does not interrupt your flow • Works for poisoned messages
  48. 48. Combining strategies • Often no one strategy will exactly match your needs • You can combine these to achieve the policy that is right for you • e.g. retry three times, then deadletter
  49. 49. Another Pause
  50. 50. What are event driven architectures • Events are notifications that something happened • This is different than traditional messages, which are the thing (the command) • Event Driven Architectures are reactive in nature • State is derived from an event log or stream
  51. 51. Event Sourcing • Add head • Add body • Add left arm • Add right arm • Add left leg • Add right leg
  52. 52. Event Sourcing • Add head • Add body • Add left arm • Add right arm • Add left leg • Add right leg
  53. 53. Capabilities we’ve gained from Event Sourcing • Complete rebuild • Temporal query • Event replay
  54. 54. What cool things can you do now? • Add head • Add body • Add left arm • Add right arm • Add left leg • Add right leg
  55. 55. This sounds interesting…
  56. 56. Obvious shortcomings of Event Sourcing and how to overcome them • Time to process the log: checkpointing on a regular basis • How to query the state: building a materialized view
  57. 57. Event Sourcing leads to divergent models for read and write. This is often addressed with Command Query Responsibility Segregation (CQRS). "Despite these benefits, you should be very cautious about using CQRS. Many information systems fit well with the notion of an information base that is updated in the same way that it's read, adding CQRS to such a system can add significant complexity. I've certainly seen cases where it's made a significant drag on productivity, adding an unwarranted amount of risk to the project, even in the hands of a capable team." -Martin Fowler
  58. 58. KStreams can help you do Event Sourcing • Basically a way to do Event Sourcing without being an architectural astronaut • Provides materialized view (uses RocksDB internally to hold the table) • Each application can now have its own view of the stream
  59. 59. Cloud Events Purpose Definition https://cloudevents.io/
  60. 60. A specification for describing event data in common formats to provide interoperability across services, platforms and systems.
  61. 61. Why Cloud Events? The lack of a common way of describing events means developers must constantly re-learn how to receive events. This also limits the potential for libraries, tooling and infrastructure to aid the delivery of event data across environments. The portability and productivity we can achieve from event data is hindered overall. • Consistency • Accessibility • Portability
  62. 62. Sample Cloud Event • These are the rules for the envelope • The data section is opaque
  63. 63. Combining Events and Streams •Events can be fed into a stream •Stream processors can produce their own events Stream f(x)
  64. 64. Key differences between events and streams Events as the records, streams as the communication mechanism
  65. 65. Key differences between events and streams • Dispatch and how you can do this in Kafka • Push and other ways to accomplish it Stream Push Based Dispatch Fan In Fan Out
  66. 66. In closing Pick the right tool for the job You may need multiple tools Be realistic about your expectations Experiment and learn - continuously Share your learnings in contributions, blogs, etc. Be an active member of the Apache Kafka community!
  67. 67. Thank You!
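
The combined policy from slide 48 ("retry three times, then deadletter") can be sketched in a few lines. This is a minimal, in-memory illustration and not Kafka client code: `process_with_policy`, `FlakyHandler`, and the list used as a dead-letter store are hypothetical names introduced here, and a real system would write dead letters to a separate topic or store.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def process_with_policy(
    messages: Iterable[str],
    handler: Callable[[str], None],
    max_attempts: int = 3,
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Try each message up to max_attempts times, then dead-letter it."""
    processed: List[str] = []
    dead_letters: List[Tuple[str, str]] = []  # (message, last error text)
    for message in messages:
        last_error = ""
        for _attempt in range(max_attempts):
            try:
                handler(message)
                processed.append(message)
                break
            except Exception as exc:  # a real system would catch narrower errors
                last_error = str(exc)
        else:
            # Every attempt failed: park the message off the hot path and move on.
            dead_letters.append((message, last_error))
    return processed, dead_letters


class FlakyHandler:
    """Hypothetical handler: fails once per message, and always on 'poison'."""

    def __init__(self) -> None:
        self.calls: Dict[str, int] = {}

    def __call__(self, message: str) -> None:
        self.calls[message] = self.calls.get(message, 0) + 1
        if message == "poison":
            raise ValueError("cannot parse message")
        if self.calls[message] == 1:
            raise TimeoutError("transient failure")


if __name__ == "__main__":
    ok, dlq = process_with_policy(["a", "poison", "b"], FlakyHandler())
    print(ok)   # messages that eventually succeeded
    print(dlq)  # poisoned messages with their last error
```

Because a retry calls the handler again, any side effect the handler has on downstream systems may happen more than once, which is why slide 46 points at idempotence.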

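The Event Sourcing slides (51 to 56) describe deriving state from a log, with complete rebuild, temporal query, and checkpointing to shorten replay time. A rough sketch of those three ideas, using the "Add head, Add body, ..." events from the slides, might look like the following. The names `Snapshot`, `apply_event`, and `rebuild` are assumptions made for illustration; the log is a plain Python list rather than a Kafka topic, and the resulting set stands in for a materialized view.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Snapshot:
    """Checkpoint: the state after applying the first `offset` events."""
    offset: int
    parts: FrozenSet[str]


def apply_event(parts: FrozenSet[str], event: str) -> FrozenSet[str]:
    """Fold one 'Add <part>' event into the current state."""
    verb, _, part = event.partition(" ")
    if verb != "Add" or not part:
        raise ValueError(f"unknown event: {event!r}")
    return parts | {part}


def rebuild(
    log: List[str],
    upto: Optional[int] = None,
    snapshot: Optional[Snapshot] = None,
) -> FrozenSet[str]:
    """Replay the log up to offset `upto`, starting from a snapshot if usable."""
    end = len(log) if upto is None else upto
    if snapshot is not None and snapshot.offset <= end:
        start, parts = snapshot.offset, snapshot.parts
    else:
        # No snapshot, or the snapshot is newer than the requested point in time.
        start, parts = 0, frozenset()
    for event in log[start:end]:
        parts = apply_event(parts, event)
    return parts


if __name__ == "__main__":
    log = ["Add head", "Add body", "Add left arm",
           "Add right arm", "Add left leg", "Add right leg"]
    print(sorted(rebuild(log)))          # complete rebuild
    print(sorted(rebuild(log, upto=2)))  # temporal query: state after two events
    checkpoint = Snapshot(4, rebuild(log, upto=4))
    print(rebuild(log, snapshot=checkpoint) == rebuild(log))
```

Replaying from the checkpoint touches only the events after offset 4, which is the point of checkpointing on a regular basis: replay cost is bounded by the distance to the last snapshot instead of the length of the whole log.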