I don't have to tell anybody how awesome Event Sourcing is as an architectural style. But those who have been using it in production must have experienced the pain of keeping upgrades fast enough, especially if you use a more traditional relational database like SQL Server. I once heard somebody say: "If it's slow, make it faster", but rebuilding a big projection is simply a very expensive operation and involves a lot of network traffic between the application and the database servers.
Over the years since we adopted Event Sourcing, we've experimented a lot and implemented various improvements to make those reprojections faster. And we haven't stopped: we're already working on some new ideas. So in this session, I'd like to share those techniques, their pros and cons, and how we implemented them at a more detailed level.
Slow Event Sourcing (re)projections - Just make them faster!
1. Just make them faster!
Slow (re)projections in Event Sourcing?
Dennis Doomen
@ddoomen | The Continuous Improver
2. About Me
Hands-on architect in the .NET space with 25 years of experience on
an everlasting quest for knowledge to build the right software the right
way at the right time
5. Projection
A (persisted) representation of a set
of events optimized for querying.
Projector
Actively or passively transposes
events into an optimized
representation
6.
Checkpoint
An unambiguous reference to a
particular commit in an ordered
event store.
Stream
An ordered collection of events,
originating from the same aggregate
Commit
A collection of ordered events for a
single aggregate, persisted at the
same time
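The three terms above can be pictured as a small data model. This is only a sketch: the class and field names (`Event`, `Commit`, `checkpoint`, `stream_id`) and the use of an integer as checkpoint are assumptions made for illustration, not the API of any particular event store.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Event:
    name: str
    payload: dict


@dataclass(frozen=True)
class Commit:
    checkpoint: int             # unambiguous position in the ordered event store
    stream_id: str              # the aggregate this commit originates from
    events: Tuple[Event, ...]   # ordered events persisted at the same time


def stream(commits: List[Commit], stream_id: str) -> List[Event]:
    """Return the ordered events of a single aggregate (its stream)."""
    ordered = sorted(commits, key=lambda c: c.checkpoint)
    return [e for c in ordered if c.stream_id == stream_id for e in c.events]
```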
7. Why do we care?
8. Reasons for
reprojections
• Fix a code bug in the
projection
• Restructure the projection
for performance reasons
• New features require
changes to projections.
10. Out-of-place / Blue-Green upgrades
Event Store
Projector
Application Application
Network Load Balancer
Event Store
Version 1 Version 2
events
Projection
Projector
Projection
bring off-line
Returns HTTP 503
(Service Unavailable)
Returns HTTP 503
(Service Unavailable)
Requires full
rebuild of the
projections
Practically no
downtime.
11. Patterns for improving (re)projection speed
12. Make projectors fully asynchronous
Domain
Event Store
Events
App
Persistent
Store
Projector
HTTP API
Projection
Freedom to go
as fast or slow
as possible
Autonomy to
handle (transient)
projection issues
independently
Can use
whatever you
need or deem fit
Requires persistent
tracking of the
subscription
Requires a way to
uniquely track a
commit within the
event store
Domain updates
will not be
immediately
visible.
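A minimal sketch of what persistent tracking of the subscription could look like. The event store is assumed to be a list of `(checkpoint, stream_id, event)` tuples and the persistent store a plain dict. Both are stand-ins for illustration, and the projection itself (an event count per stream) is hypothetical.

```python
class AsyncProjector:
    """Catches up with the event store at its own pace, remembering where it left off."""

    def __init__(self, event_store, state):
        self.event_store = event_store  # list of (checkpoint, stream_id, event), ordered
        self.state = state              # stand-in for the store: checkpoint + projection

    def catch_up(self, batch_size=100):
        last = self.state["checkpoint"]
        batch = [c for c in self.event_store if c[0] > last][:batch_size]
        for checkpoint, stream_id, _event in batch:
            counts = self.state["projection"]
            counts[stream_id] = counts.get(stream_id, 0) + 1
            # Stored next to the projection, so a restart resumes from here.
            self.state["checkpoint"] = checkpoint
        return len(batch)
```

Because the checkpoint lives with the projection, a new projector instance can pick up exactly where the previous one stopped. Until it has caught up, queries see slightly stale data.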
13. Project stream-by-stream
Domain
Event Store
Events
App
Persistent
Store
Projector
HTTP API
Projection
Single write
operation per
stream
Reads commits
stream-by-stream
instead of in order of
appearance
Only works if
projections map
1-to-1 to streams
Requires different
technique during
reprojection vs
normal operation.
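The stream-by-stream idea can be sketched as follows, assuming each projection maps 1-to-1 to a stream. The `(checkpoint, stream_id, event)` tuple shape and the `apply` and `write` callbacks are hypothetical names chosen for this illustration.

```python
from collections import defaultdict


def rebuild_stream_by_stream(commits, apply, write):
    """Fold each stream in memory and issue a single write per stream."""
    by_stream = defaultdict(list)
    for _checkpoint, stream_id, event in sorted(commits, key=lambda c: c[0]):
        by_stream[stream_id].append(event)

    for stream_id, events in by_stream.items():
        projection = {}
        for event in events:
            apply(projection, event)   # events of one stream, in order
        write(stream_id, projection)   # one write operation per stream
```

During normal operation commits arrive in order of appearance, so this grouping only fits the rebuild path. That is the different technique the slide refers to.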
14. Use an ORM’s Unit of Work
ORM
Domain
Event Store
Events
App
Persistent
Store
Projector
HTTP API
Projection
Can process as
many events in a
batch as needed
and still trigger
one DB operation
Reads commits in order
of appearance
Some ORMs
provide first and
second level
caching
Need to stay away
from typical ORM
pitfalls.
Can mix and
match with RAW
SQL
Requires custom
handling when
batch processing
fails
15. Commit
Event Event
Commit
Event Event
Commit
Event Event
RDBMS
Commit
Event Event
Commit
Event Event
Commit
Event Event
RDBMS
Unit of Work
Use an ORM’s Unit of Work
INSERT/UPDATE … WHERE …
INSERT/UPDATE … WHERE …
INSERT/UPDATE … WHERE …
INSERT/UPDATE … WHERE …
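To illustrate the batching effect without pulling in a real ORM, here is a hand-rolled unit of work. The `UnitOfWork` class, the dict standing in for a table and the `(stream_id, amount)` event shape are all assumptions of this sketch. An actual ORM session would play the same role.

```python
class UnitOfWork:
    """Tracks changed rows in memory and writes them out in one go."""

    def __init__(self, table):
        self.table = table    # dict standing in for a database table
        self.pending = {}     # rows loaded and modified in this unit of work
        self.flushes = 0      # number of round trips to the "database"

    def load(self, key):
        if key not in self.pending:
            self.pending[key] = dict(self.table.get(key, {}))
        return self.pending[key]

    def flush(self):
        if self.pending:
            self.table.update(self.pending)
            self.pending = {}
            self.flushes += 1


def project_batch(uow, events):
    """Apply a whole batch of events, then trigger a single flush."""
    for stream_id, amount in events:
        row = uow.load(stream_id)
        row["total"] = row.get("total", 0) + amount
    uow.flush()
```

If the flush fails, the whole batch is lost at once, which is why the slide mentions custom handling, for example retrying the batch event by event.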
16. Startup-time in-memory rebuilds
Domain
Event Store
Events
App
Distributed
Cache
Projector
HTTP API
Projection
Kept
in-memory
Can be both
synchronous and
asynchronous
Postpones
availability of data
at start-up.
Can help avoid
reprojections in a
server farm
Can be very fast
17. Use aggressive caching during rebuilds
MRU cache
Domain
Event Store
Events
App
Persistent
Store
Projector
HTTP API
Projection
Works well with an
ORM’s unit-of-work
May be only safe
during rebuilds.
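A small sketch of such a cache, assuming "MRU cache" means keeping the most recently used projections in memory and evicting the least recently used one when it is full. The class name and the `load` callback are hypothetical.

```python
from collections import OrderedDict


class RecentlyUsedCache:
    """Keeps the most recently used projections so a rebuild rarely hits the database."""

    def __init__(self, capacity, load):
        self.capacity = capacity
        self.load = load            # called on a miss, e.g. a database read
        self.items = OrderedDict()
        self.misses = 0

    def get(self, key):
        if key in self.items:
            self.items.move_to_end(key)     # mark as most recently used
            return self.items[key]
        self.misses += 1
        value = self.load(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used entry
        return value
```

During a rebuild the projector is the only writer, so cached projections cannot go stale. With several writers at run-time that guarantee is gone, hence "may be only safe during rebuilds".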
18. Mark streams as closed and skip reprojection
Domain
Event Store
Events
App
Persistent
Store
Projector
HTTP API
Projection
Could also be
triggered through
an explicit archive
process
Skips “closed”
commits
Closed Graph
Projector
Persistent
Store
Tracks whether all
related streams can be
closed
Marks all commits in
the stream as “closed”
Requires custom
metadata per
commit.
Only useful if your
domain has the
notion of “closed”
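The per-commit metadata could be as simple as a flag. In this sketch commits are plain dicts and the `closed` key is an assumed name for that custom metadata.

```python
def mark_stream_closed(commits, stream_id):
    """Flag every commit of the given stream as closed."""
    for commit in commits:
        if commit["stream_id"] == stream_id:
            commit["closed"] = True


def commits_to_reproject(commits):
    """A rebuild only has to read the commits that are still open."""
    return [c for c in commits if not c.get("closed", False)]
```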
19. Separate projections for “closed” streams
Persistent
Store
Persistent
Store
Domain
Event Store
Events
App
Projector
HTTP API
“Open” Projections
Moves
projections to the
“closed” store at
the right time
Never needs to be
rebuilt
“Closed” Projections
Limited choices for
searching, unless
you accept
schema changes
or use JSON
Need to “join” data
from two stores,
potentially using
different schemas.
Contains projections in
schema-aware format with
additional searchable
fields.
Only useful if your
domain has the
notion of “closed”
Only needs to
rebuild “open”
streams
20. Tombstoning “closed” streams
Persistent
Store
Lucene
Index / ES
Domain
Event Store
Events
App
Normal Projector
HTTP API
“Open” Projections
Removes the
projections when
the stream is
tombstoned
Never needs to be
rebuilt
Snapshots
Need to “join” data
from two stores,
potentially using
different schemas
Optimized for
searching
Only useful if your
domain has the
notion of “closed”
Deletes the
tombstoned
streams
Tombstone Projector
21. Event Store
Tombstone
Projector
Document #1
Marked As
Archivable Event
Lucene / ES Index
Take snapshot
Purge events
Tombstone
$tombstone
stream
Application
Normal Projector
Search
Tracks deleted streams
for future reference
Stream Tombstoned
Event
Search
Tombstoning “closed” streams
Allows projectors
to clean up
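The tombstoning flow (take snapshot, purge events, record the tombstone, let projectors clean up) might be sketched like this. The dict-based event store and search index, the `StreamTombstoned` event shape and the `take_snapshot` callback are assumptions for illustration only.

```python
TOMBSTONES = "$tombstone"  # stream that tracks deleted streams for future reference


def tombstone(store, search_index, stream_id, take_snapshot):
    """Snapshot a closed stream into the search index, then purge its events."""
    search_index[stream_id] = take_snapshot(store[stream_id])
    del store[stream_id]
    store.setdefault(TOMBSTONES, []).append(
        {"type": "StreamTombstoned", "stream_id": stream_id}
    )


def on_event(projections, event):
    """Normal projector: removes its projection when the stream is tombstoned."""
    if event["type"] == "StreamTombstoned":
        projections.pop(event["stream_id"], None)
```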
22. Clone and project the remainder
Original
Domain
Event Store
Events
App
Clone
Projector
HTTP API
Projection
Clone individual tables
or the entire database
Cannot rely on
rebuild to fix bugs
Needs to
understand all
supported
database schemas
Triggers the clone and
then projects the
remainder
Cloning tables is
very fast on SQL
Server
Event store can
still be used in
parallel
Requires
autonomy
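A sketch of the clone-then-catch-up sequence, assuming the projection store carries its own checkpoint. The dict layout (`checkpoint`, `rows`) and the `apply` callback are hypothetical. On SQL Server the deep copy would be a table or database clone instead.

```python
import copy


def clone_and_catch_up(old_store, commits, apply):
    """Clone the existing projections, then project only what the clone is missing."""
    new_store = copy.deepcopy(old_store)      # stand-in for cloning tables/database
    for checkpoint, stream_id, event in commits:
        if checkpoint > new_store["checkpoint"]:
            apply(new_store["rows"], stream_id, event)
            new_store["checkpoint"] = checkpoint
    return new_store
```

Events that were already projected are never replayed, so a bug that corrupted existing rows is copied along with them. That is why a rebuild can no longer be relied on to fix bugs.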
23. Copy on write
Old
projections
Domain
Event Store
Events
App
New
projections
Projector
HTTP API
Projection
Cannot rely on
rebuilds to fix bugs
Whenever a projection
needs to change, it is
copied and updated in
the new table/database
No rebuild
needed
Need to join &
deduplicate data
from two stores,
potentially using
different schemas
Limited choices for
searching, unless
you accept schema
changes
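Copy on write can be sketched with two dicts standing in for the old and new stores. The function names are hypothetical, and the sketch assumes both stores use the same schema, which the slide warns may not hold in practice.

```python
def update(old_store, new_store, key, change):
    """Copy the projection to the new store on first write, then change it there."""
    if key not in new_store:
        new_store[key] = dict(old_store.get(key, {}))
    change(new_store[key])


def read_all(old_store, new_store):
    """Join both stores; the copy in the new store wins over the old one."""
    merged = dict(old_store)
    merged.update(new_store)
    return merged
```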
24. What about the event store itself?
25. Out-of-place / Blue-Green upgrades
Event Store
Projector
Application Application
Network Load Balancer
Event Store
Version 1 Version 2
events
Projection
Projector
Projection
Needs to copy
all events
26. In-place / Blue-Green upgrades
Event Store
Projector
Application Application
Network Load Balancer
Version 1 Version 2
Projection
Projector
Projection
Event body,
stream ID, etc
Code
Version
Ignores events whose
code base version is
newer than supported
Writes events with the
code base version
Rolling back means
losing data
Old version can
run
simultaneously
with the new
version
Requires
asynchronous
projectors /
dispatching
Every event has a code
base version
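The version stamp could work roughly like this. The event shape and the integer `code_version` are assumptions of the sketch. The point is only that the old application version filters out what it cannot understand.

```python
def write_event(store, body, code_version):
    """Every event is written together with the code base version that produced it."""
    store.append({"body": body, "code_version": code_version})


def readable_events(store, supported_version):
    """Ignore events whose code base version is newer than this application supports."""
    return [e for e in store if e["code_version"] <= supported_version]
```

Version 1 and version 2 can then share one event store, but whatever version 2 wrote stays invisible to version 1, which is why rolling back means losing data.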