14. What’s the Difference?

Push model:
● Server pushes/delivers the message to the subscriber.
● Server does a lot of work in memory:
○ Stores each message and its state (delivered, etc.)
○ Maintains message order
● Hence mostly an ‘Online’ processing model.
● Server can do complex routing logic.

Pull model:
● Subscriber pulls/picks up the message from the server.
● Not much in-memory work for the server; it just stores the message:
○ Doesn’t care whether the message was picked up or not.
○ Ordering logic is dictated by the client and the storage format.
● Hence mostly an ‘Offline’ processing model.
● Client maintains the routing logic; the server is blind to it.
● Also, the subscriber stores state, i.e. which messages it has picked up.
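The pull-model division of labour above can be sketched in a few lines of plain Python (no real broker or Kafka client; `Broker` and `Consumer` are illustrative names, not library classes): the server only appends and serves by offset, while the subscriber keeps its own read position.

```python
class Broker:
    """Append-only message store; keeps NO per-consumer delivery state."""
    def __init__(self):
        self.log = []

    def append(self, msg):
        self.log.append(msg)

    def fetch(self, offset):
        # The server is "blind": it serves whatever offset the client asks for.
        return self.log[offset] if offset < len(self.log) else None


class Consumer:
    """The subscriber, not the server, remembers which messages it picked up."""
    def __init__(self, broker):
        self.broker = broker
        self.offset = 0  # client-side state

    def poll(self):
        msg = self.broker.fetch(self.offset)
        if msg is not None:
            self.offset += 1
        return msg


broker = Broker()
broker.append("m0")
broker.append("m1")
c = Consumer(broker)
print(c.poll())  # m0
print(c.poll())  # m1
print(c.poll())  # None — caught up; the broker did no delivery bookkeeping
```

Note that `Broker` holds nothing but the log itself: all the "which message is delivered" state that a push-model server would track in memory lives in `Consumer.offset` instead.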
42. Summary
● The publisher chooses a topic to publish onto.
○ It also decides the routing logic, i.e. chooses which partition to publish to (using a partitioning key).
● The broker receives the message and appends it to the end of the topic partition.
● The subscriber asks the broker for the message at a specific offset in a topic partition.
● It is up to the subscriber to remember which message offset it has processed.
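The publisher-side routing in the summary can be sketched as follows. This is a toy model, not the real client: Kafka’s Java producer hashes keys with murmur2, while this sketch substitutes a stable CRC32; `NUM_PARTITIONS`, `pick_partition`, and `publish` are hypothetical names for illustration.

```python
import zlib

NUM_PARTITIONS = 4  # assumed topic configuration

def pick_partition(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    # Producer-side routing: hash the partitioning key to pick a partition.
    return zlib.crc32(key.encode()) % num_partitions

# One append-only log per partition, standing in for the broker's storage.
topic = [[] for _ in range(NUM_PARTITIONS)]

def publish(key: str, value: str) -> int:
    p = pick_partition(key)
    topic[p].append(value)    # broker appends to the end of the partition
    return len(topic[p]) - 1  # offset of the newly appended message

# The same key always routes to the same partition, preserving per-key order.
o1 = publish("user-42", "login")
o2 = publish("user-42", "click")
print(o1, o2)  # 0 1 — consecutive offsets in the same partition
```

This shows why the broker can stay simple: partition choice happens entirely in the producer, and the broker’s only job is the append.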
47. A lovely use case - REPLAY
● Since the subscriber requests a message at an offset in a topic partition, it is free to REPLAY the processing from any point in time.
● Handy when outages occur.
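Replay falls out of the client-owned offset for free. A minimal sketch, again in plain Python with an invented `ReplayableConsumer` class (not a Kafka API): the consumer periodically commits its offset to durable storage, and recovering from an outage is just rewinding to the last commit.

```python
class ReplayableConsumer:
    def __init__(self, log):
        self.log = log
        self.offset = 0      # in-memory read position
        self.checkpoint = 0  # last durably committed position (e.g. in a DB)

    def poll_batch(self, n):
        batch = self.log[self.offset:self.offset + n]
        self.offset += len(batch)
        return batch

    def commit(self):
        self.checkpoint = self.offset

    def recover(self):
        # After an outage, rewind to the last checkpoint and REPLAY from there.
        self.offset = self.checkpoint


c = ReplayableConsumer(["evt-0", "evt-1", "evt-2", "evt-3"])
c.poll_batch(2)         # process evt-0, evt-1
c.commit()              # durably record progress
c.poll_batch(2)         # process evt-2, evt-3 ... then crash before commit
c.recover()             # outage recovery
print(c.poll_batch(2))  # ['evt-2', 'evt-3'] — safely replayed
```

Nothing on the broker side changes during replay; as long as the messages are still within the retention window, rewinding is purely a client-side decision.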
52. Things to Ponder About
● How do I achieve high read/write throughput?
○ Have more partitions per topic; the partition count determines read/write throughput.
● Can multiple publishers publish concurrently to the same topic partition?
○ Yes.
● Should multiple consumers read from the same topic partition?
○ Ideally, one consumer (or one consumer group) per partition.
● What about replication of data?
○ When creating a topic, you can set the replication factor, which applies to each partition of that topic.
● What about the data retention time policy?
○ Set it when creating the topic; you can edit it later.
● Think about the producer’s partitioning key …
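The “one consumer per partition” guideline can be illustrated with a simple round-robin assignment, loosely in the spirit of Kafka’s built-in assignors but not their actual algorithm; `assign` and the partition/consumer names are made up for the sketch.

```python
def assign(partitions, consumers):
    """Map each partition to exactly one consumer in the group,
    so no two consumers ever read the same partition concurrently."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment


parts = ["orders-0", "orders-1", "orders-2", "orders-3"]
print(assign(parts, ["c1", "c2"]))
# {'c1': ['orders-0', 'orders-2'], 'c2': ['orders-1', 'orders-3']}

# With more consumers than partitions, the extras sit idle — which is why
# the partition count caps read parallelism (and hence read throughput).
print(assign(parts[:2], ["c1", "c2", "c3"]))
```

This also ties the first and third ponder points together: adding partitions raises the ceiling on how many consumers can usefully read in parallel.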
65. Others …
● Amazon Kinesis is similar to Kafka.
● There is also Redis Pub/Sub (different guarantees, not similar to Kafka).
66. What I did not cover :)
● Kafka’s replication mechanism
○ ISR = in-sync replica set
● Tools like Kafka MirrorMaker
● ZooKeeper interaction (yes, Kafka depends on ZooKeeper)
67. What’s new in Kafka?
● Kafka Streams API
● KSQL (SQL on Kafka)
● See the release notes … :)