Does your organization struggle to update its Kafka Streams applications? Releasing a new version of a Kafka Streams application can be challenging, especially if its state has to be preserved between releases. This session presents best practices and architectural ideas that make the release process smoother.
Having experienced both the accidental removal of changelog topics and the need to expand partitions, the speaker has found that these situations are much easier to handle with some planning. With proper planning, application upgrades become easier.
Key take-aways from the session include:
* How to minimize the rebuilding of state stores.
* How to change stream topologies without affecting the existing state stores.
* What you can do when you absolutely need to increase the number of partitions within your application.
* How to leverage schemas for application releases.
* Measures to prevent data corruption, especially if Kafka is not only your system of record but also your source of truth.
* Techniques to support rolling back an application.
* The advantages of splitting apart a Kafka Streams application into multiple applications.
Developing Kafka Streams Applications with Upgradability in Mind with Neil Buesing | Kafka Summit London 2022
1. Designing your Kafka Streams Applications with Upgradability In Mind
Kafka Summit 2022 London
Neil Buesing, Rill Data
@nbuesing nbuesing
2. Background
• Principal Solution Architect, Rill Data, Inc.
• Work with clients streaming data into our platform
• 5+ years experience with Kafka Streams
• Speak on topics I'm passionate about with Apache Kafka and Kafka Streams
• Working from home with the best pair-programmer
3. Goals
1. Confidence you can upgrade your application
2. Support for Data Recovery
  • e.g., data corrupted due to bug in upgrade
3. Options
  • e.g., responsibility
4. Reduce Developer time to achieve upgrade
4. Topics
1. Name processors
2. Name state stores
3. Minimize rebuilding of state
4. Data evolution
5. Partitioning
6. Microservices
7. Backup & Restore
8. Repartitioning
9. Windowed Stores
10. Circuit Breakers
11. Switches
7. Name Your Processors
• Syntax: add a name to the existing configuration object; use Named for operations that have none
• Produced.as(), Grouped.as(), Joined.as(), Consumed.as(), Named.as()
• Gotchas: builders & static construction behavior
  • Produced.with(Serdes.String(), vSerde).as("name")
  • Produced.as("name").withKeySerde(Serdes.String()).withValueSerde(vSerde)
  • Produced.<String, PurchaseOrder>as("name")
        .withKeySerde(Serdes.String())
        .withValueSerde(vSerde)
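
As I read the first gotcha, it is plain Java behavior: `as(...)` is a static factory on these configuration classes, so chaining it after `with(...)` calls it through an instance expression and returns a fresh object that no longer carries the serdes. The sketch below reproduces the effect with a hypothetical stand-in class. `Sink` is not a Kafka class; it only mimics the shape of a configuration object with static factories and `withX` instance methods, and plain strings stand in for serdes.

```java
// Hypothetical stand-in for a Kafka Streams config object such as Produced.
// It mimics the shape only: static factories plus instance "with" methods.
final class Sink {
    final String keySerde; // a String standing in for a Serde
    final String name;     // processor name

    private Sink(String keySerde, String name) {
        this.keySerde = keySerde;
        this.name = name;
    }

    // Static factory: starts a NEW config that only has a serde.
    static Sink with(String keySerde) {
        return new Sink(keySerde, null);
    }

    // Static factory: starts a NEW config that only has a name.
    static Sink as(String name) {
        return new Sink(null, name);
    }

    // Instance methods: copy the current config and change one field.
    Sink withName(String name) {
        return new Sink(this.keySerde, name);
    }

    Sink withKeySerde(String keySerde) {
        return new Sink(keySerde, this.name);
    }
}

final class StaticFactoryGotcha {
    // Compiles, but as() is static, so the serde set by with() is silently dropped.
    @SuppressWarnings("static")
    static Sink chainedStatic() {
        return Sink.with("string-serde").as("name");
    }

    // Start from the static factory once, then use instance methods only.
    static Sink instanceMethods() {
        return Sink.as("name").withKeySerde("string-serde");
    }

    public static void main(String[] args) {
        System.out.println("chained static keySerde:   " + chainedStatic().keySerde);
        System.out.println("instance methods keySerde: " + instanceMethods().keySerde);
    }
}
```

In Kafka Streams itself, the second and third forms on the slide follow the safe pattern (one static factory, then instance methods); the explicit `<String, PurchaseOrder>` type witness helps the compiler infer the key and value types when `as(...)` comes first.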
13. Name Your State Stores
• The most important thing you can do to make upgrades easier
• Simple

KTable<String, User> users =
    builder.table(options.getUserTopic(),
        Consumed.as("ktable-users"),
        Materialized.as("user-table"));
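
Why the store name matters so much: Kafka Streams derives the changelog topic name from the `application.id` and the store name. A rough sketch of that relationship follows. The application id `orders-app` is a hypothetical value, and `generatedStoreName` is a simplified model of an auto-generated name (a fixed prefix plus a topology-wide counter), not the library's actual code.

```java
final class ChangelogNames {
    // Kafka Streams names a store's changelog topic <application.id>-<store name>-changelog.
    static String changelogTopic(String applicationId, String storeName) {
        return applicationId + "-" + storeName + "-changelog";
    }

    // Simplified model of a generated store name: a fixed prefix plus a
    // zero-padded index taken from a counter shared by the whole topology.
    static String generatedStoreName(int topologyIndex) {
        return String.format("KSTREAM-AGGREGATE-STATE-STORE-%010d", topologyIndex);
    }

    public static void main(String[] args) {
        String appId = "orders-app"; // hypothetical application.id

        // Named store: the topic name survives topology changes.
        System.out.println(changelogTopic(appId, "user-table"));

        // Unnamed store: inserting a processor upstream shifts the index,
        // so the new release looks for a different (empty) changelog topic.
        System.out.println(changelogTopic(appId, generatedStoreName(3)));
        System.out.println(changelogTopic(appId, generatedStoreName(5)));
    }
}
```

With an explicit name, a release that adds or reorders processors keeps pointing at the same changelog topic, so the existing state is reused rather than rebuilt.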
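21. Data Evolution - JSON

import com.fasterxml.jackson.annotation.JsonAnyGetter;
import com.fasterxml.jackson.annotation.JsonAnySetter;
import java.util.LinkedHashMap;
import java.util.Map;

// Holds any JSON properties this version of the application does not know about,
// so they are written back out instead of being dropped.
public class UnmappedProperties {
    private final Map<String, Object> map = new LinkedHashMap<>();

    @JsonAnyGetter
    public Map<String, Object> getUnknownProperties() {
        return map;
    }

    @JsonAnySetter
    public void setUnknownProperty(String key, Object value) {
        map.put(key, value);
    }
}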
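22. Data Evolution - JSON

import com.fasterxml.jackson.annotation.JsonInclude;
import com.fasterxml.jackson.annotation.JsonUnwrapped;

@JsonInclude(JsonInclude.Include.NON_NULL)
public class Product {
    private String sku;

    // Unknown properties are carried alongside the mapped fields.
    @JsonUnwrapped
    private UnmappedProperties unmappedProperties = new UnmappedProperties();

    public Product(String sku) {
        this.sku = sku;
    }
}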
23. Data Evolution - JSON
• Risk/Pitfall
  • Data type changes can break this approach
  • Validate 3rd-party inputs
  • Implement a clearUnknownProperties()
24. Data Evolution - Avro
• Evolution…
  • part of Avro's library
  • leveraged by Confluent's Schema Registry
25. Data Evolution - Avro
• FULL
  • ability to roll back
  • streams apps are both producers and consumers (forward-only and backward-only are harder)
  • V1 ↔ V2 and V2 ↔ V3
• FULL_TRANSITIVE
  • ability to handle aggregations of older versions indefinitely
  • V1 ↔ V2 and V2 ↔ V3 and V1 ↔ V3
26. Data Evolution - Protobuf
• Tag numbers are encoded, field names are not
• optional ↔ repeated
  • no encoding differences: writing a repeated value and reading it as an optional value has "last one wins" behavior
• Renaming fields → full evolution
• Renumbering tags → no evolution
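
To see why renaming is safe and renumbering is not, it helps to look at the bytes. The sketch below hand-encodes a single varint field following the Protocol Buffers wire format: the key is `(field number << 3) | wire type`, followed by the value. It is a minimal illustration with hypothetical helper names, not a replacement for the protobuf library.

```java
import java.io.ByteArrayOutputStream;

final class ProtoTagDemo {
    // Writes a non-negative value as a base-128 varint, low 7 bits first.
    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    // Encodes one varint-typed field: key = (fieldNumber << 3) | wire type 0, then the value.
    static byte[] encodeVarintField(int fieldNumber, long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, ((long) fieldNumber << 3) | 0);
        writeVarint(out, value);
        return out.toByteArray();
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // The field name never appears: only the number 1 and the value 150.
        System.out.println(hex(encodeVarintField(1, 150))); // 08 96 01
        // Renumbering the same field to 2 changes the first byte on the wire.
        System.out.println(hex(encodeVarintField(2, 150))); // 10 96 01
    }
}
```

A reader compiled against the renumbered schema looks for key `0x10` and treats the old `0x08` data as an unknown field, which is why renumbering breaks evolution while renaming does not.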
27. Avoid Schema Registry Serialization for Keys
• A simple addition of a default attribute → breaks partitioning
• Exceptions
  • output topics for sink connectors (e.g. JDBC Sink)
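
The reason is in the serialized bytes. Schema Registry serializers prefix every payload with a magic byte and a 4-byte schema ID, and a changed schema is registered under a new ID, so the same logical key serializes to different bytes after the change. Since the producer picks a partition by hashing the key bytes, the key can land on a different partition. A minimal sketch follows; the schema IDs 41 and 42 and the payload are hypothetical, and `partitionFor` is a stand-in hash, not Kafka's murmur2-based partitioner.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class SchemaKeyBytes {
    // Confluent wire format: magic byte 0, 4-byte big-endian schema ID, then the payload.
    static byte[] frame(int schemaId, byte[] payload) {
        return ByteBuffer.allocate(1 + 4 + payload.length)
                .put((byte) 0)
                .putInt(schemaId)
                .put(payload)
                .array();
    }

    // Stand-in for the producer's partitioner: NOT murmur2, only
    // "hash of the serialized key bytes modulo the partition count".
    static int partitionFor(byte[] keyBytes, int numPartitions) {
        return Math.floorMod(Arrays.hashCode(keyBytes), numPartitions);
    }

    public static void main(String[] args) {
        byte[] payload = "customer-123".getBytes(StandardCharsets.UTF_8);

        byte[] before = frame(41, payload); // key written with the original schema
        byte[] after = frame(42, payload);  // same key after a default attribute was added

        System.out.println("same bytes: " + Arrays.equals(before, after));
        System.out.println("partition before: " + partitionFor(before, 12));
        System.out.println("partition after:  " + partitionFor(after, 12));
    }
}
```

In practice the payload encoding may change as well, which only makes the mismatch more likely. Keys serialized with a plain serde (for example a string) avoid the problem because their bytes depend only on the key value.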
28. Data Evolution (takeaways)
• Full (Forward and Backward) - easier to roll back your applications
• Full Transitive - easier to handle old data in your aggregates
• JSON, Avro, and Protobuf all have their own nuances - understand them
32. Partitioning
• Plan for growth (but…)
• Strive for even workloads
• Partitioning for storage is as important as (if not more important than) partitioning for throughput
• Selecting a partition count for your Streams applications
  • 12 partitions are better than 10 partitions
  • avoid primes (e.g., 5)
  • 24 (but at what cost?)
Even ways to divide the partitions:
12 partitions: 1, 2, 3, 4, 6, 12
10 partitions: 1, 2, 5, 10
24 partitions: 1, 2, 3, 4, 6, 8, 12, 24
5 partitions: 1, 5
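
The lists of numbers are the divisors of each partition count, that is, the instance or thread counts that share the partitions evenly. A small sketch that reproduces them (the class and method names are illustrative only):

```java
import java.util.ArrayList;
import java.util.List;

final class PartitionDivisors {
    // Returns every instance/thread count that divides the partitions evenly.
    static List<Integer> evenSplits(int partitions) {
        List<Integer> result = new ArrayList<>();
        for (int n = 1; n <= partitions; n++) {
            if (partitions % n == 0) {
                result.add(n);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (int partitions : new int[] {12, 10, 24, 5}) {
            System.out.println(partitions + " partitions: " + evenSplits(partitions));
        }
    }
}
```

With 12 partitions there are six balanced ways to scale; with 10 there are four; a prime such as 5 offers only "one instance" or "one instance per partition".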
33. Partitioning
• If repartitioning is easy
  • 4 partitions
• If repartitioning is hard
  • 8 or 12 partitions
  • 24 partitions (large state stores)
  • consider separation into multiple microservices
36. Microservices
• easier to deploy
• more uniform allocation of work
• minimize downtime during restarts
• easier to understand
• threading
• storage
43. Backup and Restore
• transformValues cannot be created before aggregate/reduce, since the DSL requires the store to be materialized first.
• aggregate and reduce do not have access to headers
  • if the DSL adopts the refactored Processor API (PAPI), they would then have access.
• understand how store caching and the commit interval work
45. Co-partitioning
• partitioning of source and restore topics must match
• co-partitioning validation isn't catching this.
• behavior is very confusing when they are not the same (speaking from experience 🤦)
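
A rough illustration of why the partition counts must match: a key's partition is its hash modulo the partition count, so the same key lands on different partition numbers when the counts differ, and a task restoring partition N reads data that belongs elsewhere. The sketch uses raw integers as key hashes, which is an assumption for readability; Kafka hashes the serialized key bytes with murmur2.

```java
final class CoPartitionCheck {
    // Partition number for a key hash, given the topic's partition count.
    static int partitionFor(int keyHash, int numPartitions) {
        return Math.floorMod(keyHash, numPartitions);
    }

    // True when every sampled key hash maps to the same partition number in both topics.
    static boolean sameLayout(int sourcePartitions, int restorePartitions, int sampleSize) {
        for (int hash = 0; hash < sampleSize; hash++) {
            if (partitionFor(hash, sourcePartitions) != partitionFor(hash, restorePartitions)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("12 vs 12: " + sameLayout(12, 12, 1000));
        System.out.println("12 vs 8:  " + sameLayout(12, 8, 1000));
    }
}
```

Because the built-in validation reportedly does not catch this case, a check of this kind against the actual topic metadata before starting a restore may be worth adding.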
49. Repartitioning
• Leverage built-in Backup and Restore
• On/Off filters so you can discard while bringing the application online
• Version your application
  • "foo.v1" → "foo.v2"
51. Repartitioning
• Considerations around making restore a separate application
  • Downtime
  • Cut-over
  • Using `application.id` for backup
  • Keeping the code up to date
52. Window Stores
| Type | Boundary | Examples | # records for key @ point in time | Fixed Size |
|---|---|---|---|---|
| Tumbling | Epoch | [8:00, 8:30), [8:30, 9:00) | single | Yes |
| Hopping | Epoch | [8:00, 8:30), [8:15, 8:45), [8:30, 9:00), [8:45, 9:15) | constant | Yes |
| Sliding | Record | [8:02, 8:32], [8:20, 8:50], [8:21, 8:51] | variable | Yes |
| Session | Record | [8:02, 8:02], [8:02, 8:10], [9:10, 12:56] | single (by tombstoning) | No |
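
"Epoch" boundaries mean that tumbling and hopping windows are aligned to multiples of the window size (or advance) counted from time zero, independent of when records arrive. The sketch below computes those boundaries. It works in minutes since midnight purely for readability, which is an assumption of this example; Kafka Streams uses milliseconds since the epoch.

```java
import java.util.ArrayList;
import java.util.List;

final class EpochWindows {
    // Start of the tumbling window [start, start + size) that contains the timestamp.
    static long tumblingStart(long timestamp, long size) {
        return timestamp - Math.floorMod(timestamp, size);
    }

    // Starts of every hopping window [start, start + size) that contains the timestamp.
    static List<Long> hoppingStarts(long timestamp, long size, long advance) {
        List<Long> starts = new ArrayList<>();
        // Earliest aligned start whose window can still contain the timestamp.
        long start = Math.max(0, timestamp - size + advance) / advance * advance;
        for (; start <= timestamp; start += advance) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        long at0820 = 8 * 60 + 20; // 8:20 as minutes since midnight

        // Tumbling, 30-minute size: 480 (8:00), i.e. the window [8:00, 8:30).
        System.out.println(tumblingStart(at0820, 30));

        // Hopping, 30-minute size, 15-minute advance: [480, 495],
        // i.e. the windows [8:00, 8:30) and [8:15, 8:45).
        System.out.println(hoppingStarts(at0820, 30, 15));
    }
}
```

A record therefore belongs to exactly one tumbling window and to a constant number (size divided by advance) of hopping windows, which matches the "single" and "constant" entries in the table.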
53. Window Stores
• Fixed windows do NOT store the window size (or end timestamp) in the message
• Release the new version and let it co-exist with the old version
• Wait to use the new version until its windows are "ready"
55. Window Stores
• New Version Challenges
  • Very long windows make it harder to wait for cut-over
  • epoch
  • hydration
  • replay incoming events
• How ("When") to have clients cut over to new version
  • earliest, latest, or specific timestamp
  • circuit breaker → moves burden to streams development team.
56. application.id & versions
• Versions should be a suffix on application.id
  • ".v1", ".v2"
• Leverage ACLs with prefix on application.id
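
The two bullets work together: Kafka Streams uses the `application.id` as the consumer group id and as the prefix of its internal topic names, so a version suffix lets one prefixed ACL cover every release. A minimal sketch of the naming scheme follows; the base name `foo` comes from the slides, while the helper names and the topic name are illustrative assumptions.

```java
final class VersionedAppId {
    // Builds an application.id with the version as a suffix, e.g. "foo.v2".
    static String applicationId(String baseName, int version) {
        return baseName + ".v" + version;
    }

    // Resource-name prefix that a prefixed ACL could use to cover every version.
    static String aclPrefix(String baseName) {
        return baseName + ".";
    }

    // A prefixed ACL matches any resource whose name starts with the prefix.
    static boolean covers(String prefix, String resourceName) {
        return resourceName.startsWith(prefix);
    }

    public static void main(String[] args) {
        String prefix = aclPrefix("foo");
        String v1 = applicationId("foo", 1);
        String v2 = applicationId("foo", 2);

        // Both versions, and their internal topics, fall under the same prefix.
        System.out.println(covers(prefix, v1));
        System.out.println(covers(prefix, v2 + "-user-table-changelog"));
        // An unrelated application does not.
        System.out.println(covers(prefix, "foobar.v1"));
    }
}
```

Had the version been a prefix instead (for example "v2.foo"), each release would need its own set of ACLs.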
58. Circuit Breakers
• Starting and stopping the Circuit Breaker application controls the flow of messages
• Unable to stop producers
• Complicated streams application
  • in-flight data needs to be handled by the same version
  • no duplicate processing between version releases
59. Circuit Breakers
• Added Complexity
  • Extra Application
  • Extra Topic
    • but can have smaller retention time (original is source-of-truth)
  • Extra Deployments
60. Circuit Breaker handy for ksqlDB
• Placing a Kafka Streams circuit-breaker application in front of ksqlDB gives control where consumer group selection is not possible
  • KSQL query starts from latest
  • KLIP-28 "create or replace" solves many issues (0.12.0)
  • KLIP-22 "add consumer group id" (proposal - no traction)
62. Switches
• Burden on our deployment, not downstream applications
  • no offset management changes
63. Circuit Breakers & Switches
• Do not adopt these without need
• Add in only if (and when) needed
64. Topics
1. Name processors
2. Name state stores
3. Minimize rebuilding of state
4. Data evolution
5. Partitioning
6. Microservices
7. Backup & Restore
8. Repartitioning
9. Windowed Stores
10. Circuit Breakers
11. Switches
65. Takeaways
• Do Right Away
  • Name your State Stores
  • Name your Processors
  • Meaningful Partition Size
  • Suffix-based versioning
• Start Planning
  • Backup/Restore & Repartitioning
  • External Applications & Teams
  • Release Scheduling
  • Data Evolution Strategy