"Stateful app as an efficient way to build dispatching for riders and drivers", Oleksandr Chumak

Fwdays
Uklon in numbers
● 130+ Engineers
● 12 Product Teams
● 16M Android/iOS downloads
● 1.5M+ Riders DAU
● 30+ microservices
● 200k+ Drivers DAU
● 3 Countries
● 30 Cities
"Stateful app as an efficient way to build dispatching for riders and drivers",  Oleksandr Chumak
Uklon: RiderApp and DriverApp
What is the report about?
How to reduce CPU consumption by a factor of 10 through stateful processing, while ensuring high reliability.
Agenda
1. Workloads that make the stateless approach inefficient
2. Basic concepts
3. Scaling of stateful services
4. Reliability of stateful services
5. What are the solutions employed by our competitors?
Workloads that make the stateless approach inefficient
Several challenges within ride-hailing are:
1. Massive, frequent write operations are needed to track objects' current locations. Since drivers can move as fast as 20 meters per second, drivers' locations must be updated every second.
2. A k-nearest-neighbour (kNN) query poses tremendous challenges, compared to a simple GET query, in a key-value data store such as Redis (see the sketch below).
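A minimal sketch of the contrast, under stated assumptions (coarse grid cells stand in for geohash/S2/H3; all names are hypothetical): a plain remote KV can only GET/PUT by key, while a local spatial index answers nearest-driver queries by scanning a handful of cells:

```python
import math
from collections import defaultdict

# Hypothetical in-memory "nearest drivers" index: drivers are bucketed into
# coarse grid cells (a stand-in for geohash/S2/H3 cells). A kNN query only
# scans the query cell and its 8 neighbours instead of every driver --
# something a remote GET/PUT key-value API cannot express directly.
CELL = 0.01  # roughly 1 km; an assumption for illustration

def cell_of(lat: float, lon: float) -> tuple[int, int]:
    return (int(lat // CELL), int(lon // CELL))

class DriverIndex:
    def __init__(self):
        self.cells: dict[tuple[int, int], dict[int, tuple[float, float]]] = defaultdict(dict)
        self.where: dict[int, tuple[int, int]] = {}

    def update(self, driver_id: int, lat: float, lon: float) -> None:
        # Move the driver between cells as new GPS fixes arrive (every 2-5 s).
        old = self.where.get(driver_id)
        if old is not None:
            self.cells[old].pop(driver_id, None)
        c = cell_of(lat, lon)
        self.cells[c][driver_id] = (lat, lon)
        self.where[driver_id] = c

    def nearest(self, lat: float, lon: float, k: int = 5) -> list[int]:
        cx, cy = cell_of(lat, lon)
        candidates = []
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for driver_id, (dlat, dlon) in self.cells[(cx + dx, cy + dy)].items():
                    candidates.append((math.hypot(dlat - lat, dlon - lon), driver_id))
        return [d for _, d in sorted(candidates)[:k]]
```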
Feature #1
Orders Dispatching
Find the best driver for the order
Feature #2
Orders Broadcasting
Streaming your order to many drivers
Feature #3
Batch dispatching
The Process of Order Dispatching with Batch Windows: Greedy algorithm vs Batching algorithm (reproduced in the sketch below).
● Greedy: pickups of 2 min + 9 min → total wait time = 11 min
● Batching: pickups of 4 min + 4 min → total wait time = 8 min
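A hedged sketch of the two algorithms using the slide's numbers (the ETA matrix is reconstructed to match the 2/9 vs 4/4 minute outcome; scipy's Hungarian solver is a stand-in for the production batching algorithm):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Pickup ETA matrix (minutes): rows = orders, cols = drivers.
eta = np.array([[2, 4],
                [4, 9]])

# Greedy: each order (in arrival sequence) grabs its nearest remaining driver.
# Order 1 takes driver A (2 min), leaving order 2 with driver B (9 min).
taken, greedy_total = set(), 0
for row in eta:
    best = min((c for c in range(len(row)) if c not in taken), key=lambda c: row[c])
    taken.add(best)
    greedy_total += row[best]

# Batching: wait a short batch window, then solve the assignment jointly.
# The Hungarian algorithm picks the 4 min + 4 min pairing instead.
rows, cols = linear_sum_assignment(eta)
batch_total = eta[rows, cols].sum()

print(greedy_total, batch_total)  # 11 8
```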
Feature #4
Driver ETA Tracker
Requirements:
1. Active Orders = tens of thousands
2. Drivers send their location every
2-5 seconds
Other Workloads
1. Order offers. Find the best driver near you.
2. Order broadcasts. Fan-out orders to multiple drivers.
3. Order chaining. Find the next order for the driver while completing the current one.
4. Order batching (optimization). Reduce the total waiting time for all passengers.
5. Sector queue (airports, train stations).
6. Driver ETA tracking for the accepted order.
7. Matching the driver's GPS location to a map graph node.
Simplified Overview of the Architecture (stateful)
Stateful architectures: Open Problems
● Load balancing algorithms
● Scalability
○ Partitioning
○ Replication
● Fault tolerance and Cold start
Key concept
1. Local state is stored in in-memory KV structures.
2. The local state is restored from the durable log. In some cases, local state changes may have been checkpointed to a remote KV store (or into a separate Kafka topic).
3. Local state updates occur within a single thread: no concurrency, Monotonic Writes (see the sketch below).
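A minimal sketch of this key concept, assuming a Kafka-like durable log of change events (the event shape and names are hypothetical): a single thread restores state by replaying the log and then applies live updates through the same code path:

```python
from dataclasses import dataclass

# Hypothetical change event, as it would be read from the durable log.
@dataclass
class DriverEvent:
    driver_id: int
    payload: dict
    deleted: bool = False  # tombstone

class DriverState:
    """Local state: a plain in-memory KV, mutated by a single thread only."""
    def __init__(self):
        self.drivers: dict[int, dict] = {}

    def apply(self, event: DriverEvent) -> None:
        # The only writer is the consumer thread, so updates need no locks
        # and are applied in log order (Monotonic Writes).
        if event.deleted:
            self.drivers.pop(event.driver_id, None)
        else:
            self.drivers[event.driver_id] = event.payload

def restore(state: DriverState, log: list[DriverEvent]) -> None:
    # On restart, replay the log from the beginning (or from a checkpoint)
    # before switching over to live events.
    for event in log:
        state.apply(event)
```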
NFR (Kyiv only)
Writes
1.1) 5,000-10,000 rps
1.2) 100-500 rps
Reads
2.1) 500 rps (each request handles 100-500 drivers)
2.2) fetch 50,000-200,000 rows/sec (100-400 MB/sec)
Driver entity: 2 KB (p50) / 13 KB (p99); total size for 100K drivers ≈ 200 MB
Key differences
Stateless (remote KV):
● Provides a GET/PUT/DELETE API
● High CPU cost due to marshalling and serialization
● Additional network latency
● Frequently necessitates additional local caching
Stateful (in-memory/local KV):
● Domain-specific API, e.g.:
○ Find nearest drivers
○ Calculate ETA
● Data locality
● Shared-nothing
Access patterns for in-memory KV (all three are sketched below):
1. Key lookup
2. Index seek (Offers, Broadcast)
3. Full scans / Range scans
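A hedged sketch of the three access patterns over a local store (all names are hypothetical; a sorted id list plus bisect stands in for a real ordered index):

```python
import bisect
from collections import defaultdict

class LocalDriverStore:
    def __init__(self):
        self.by_id: dict[int, dict] = {}                         # primary KV
        self.by_sector: dict[str, set[int]] = defaultdict(set)   # secondary index
        self.sorted_ids: list[int] = []                          # ordered index

    def put(self, driver: dict) -> None:
        driver_id = driver["driver"]
        if driver_id not in self.by_id:
            bisect.insort(self.sorted_ids, driver_id)
        else:
            self.by_sector[self.by_id[driver_id]["sector"]].discard(driver_id)
        self.by_id[driver_id] = driver
        self.by_sector[driver["sector"]].add(driver_id)

    # 1. Key lookup
    def get(self, driver_id: int) -> dict | None:
        return self.by_id.get(driver_id)

    # 2. Index seek (e.g. all drivers in a sector, for Offers/Broadcast)
    def in_sector(self, sector: str) -> list[dict]:
        return [self.by_id[d] for d in self.by_sector[sector]]

    # 3. Range scan over driver ids
    def range(self, lo: int, hi: int) -> list[dict]:
        i = bisect.bisect_left(self.sorted_ids, lo)
        j = bisect.bisect_right(self.sorted_ids, hi)
        return [self.by_id[d] for d in self.sorted_ids[i:j]]
```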
Concept #1: Co-partitioning
Two topics are described as co-partitioned if (illustrated below):
1. Their keys have the same schema
2. They are materialized by topics with the same number of partitions
3. Their producers use the same 'partitioner'
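A minimal illustration of why this yields data locality (the hash here is a deterministic stand-in, not Kafka's actual murmur2 partitioner): with the same key, partition count, and partitioner, events for one driver land in the same partition number of both topics, so a single consumer instance sees both sides:

```python
import hashlib

NUM_PARTITIONS = 8  # must match on both topics for co-partitioning

def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    # Deterministic stand-in partitioner (Kafka itself uses murmur2).
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# Same key schema, same partition count, same partitioner =>
# driver-location and driver-order events for driver 12345 share
# the same partition number in both topics.
key = "12345"
assert partition_for(key) == partition_for(key)
print(partition_for(key))
```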
Concept #2: Re-keying partitions
Before re-keying:
● Related events are not co-partitioned
● Partitions are well-balanced
After re-keying:
● Data locality is achieved for the consumer (sketched below)
● Partitions, and consequently consumers, can be unbalanced
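A hedged sketch of a re-keying step using the confluent-kafka client (topic names and configuration are assumptions): events keyed one way are republished keyed by driver id, so all events for one driver hash to one partition of the output topic:

```python
import json
from confluent_kafka import Consumer, Producer

# Hypothetical topics: input keyed by city, output re-keyed by driver id.
SRC, DST = "driver-locations-by-city", "driver-locations-by-driver"

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "rekeyer",
    "auto.offset.reset": "earliest",
})
producer = Producer({"bootstrap.servers": "localhost:9092"})
consumer.subscribe([SRC])

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    event = json.loads(msg.value())
    # Republish under the driver id: the downstream consumer now owns
    # every event for its drivers (data locality).
    producer.produce(DST, key=str(event["driver"]), value=msg.value())
    producer.poll(0)  # serve delivery callbacks
```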
Concept #3: Filtering + Enriching

DriverLocation {
  "driver": 12345,
  "latitude": 50.30846,
  "longitude": 30.53419
}

DriverETA {
  "driver": 12345,
  "latitude": 50.30846,
  "longitude": 30.53419,
  "order": 98765,
  "eta": "2 min"
}
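A minimal sketch of this filter-and-enrich step (the order lookup and ETA calculation are hypothetical stand-ins; field names follow the payloads above):

```python
# Hypothetical local state: driver_id -> active order id.
active_orders = {12345: 98765}

def estimate_eta(lat: float, lon: float, order: int) -> str:
    return "2 min"  # stand-in for a real routing/ETA calculation

def enrich(location: dict) -> dict | None:
    # Filtering: drop locations of drivers without an active order.
    order = active_orders.get(location["driver"])
    if order is None:
        return None
    # Enriching: attach the order id and a computed ETA.
    return {**location, "order": order,
            "eta": estimate_eta(location["latitude"], location["longitude"], order)}

print(enrich({"driver": 12345, "latitude": 50.30846, "longitude": 30.53419}))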
How to scale?
[Diagram: multiple Driver Dispatching instances]
Scalability
Sharding strategies:
1. Geospatial indexing (geohash, S2, H3)
2. city_id (region); see the sketch below
Consider the following points when you design a data partitioning scheme:
1. Minimize cross-partition data access operations
2. Minimize cross-partition joins
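A hedged sketch of region-based sharding (the routing table and all names are hypothetical): keeping a whole city in one shard means nearest-driver queries and driver/order joins never cross partitions:

```python
# Hypothetical static routing table: every city lives entirely in one shard,
# so kNN queries and driver/order joins for that city stay partition-local.
SHARDS = {
    "kyiv": "shard-0",   # a heavy region can get a dedicated shard
    "lviv": "shard-1",
    "odesa": "shard-1",
}

def shard_for(city_id: str) -> str:
    return SHARDS[city_id]

def dispatch(order: dict) -> str:
    # Route the request to the shard that owns all state for this city.
    return shard_for(order["city_id"])

print(dispatch({"city_id": "kyiv", "order": 98765}))  # shard-0
```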
Partitioning by Region
Possible challenges:
● Downtime during rebalance: scale-out, rolling updates
● Unbalanced load: the load from Kyiv is equivalent to the load from all other cities of Ukraine combined
Try to fix: Partitioning by Region + Replication
Replication:
● Standalone consumers
● No partition rebalances
● No downtime
● Replication overhead is less than 0.1 CPU per pod
● Reduced requirements for cold recovery
Scalability
1. Scalability: add Kafka partitions and deploy separate shard instances for cities/countries
2. Elasticity: scale out consumers within a shard
Reliability?
Replica synchronization
● State-based CRDT
● Last write wins (LWW); see the sketch below
● Optimistic replication (can become temporarily inconsistent)
● Strong Eventual Consistency (SEC)
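A minimal sketch of a state-based LWW register, one of the simplest CRDTs (field names are assumptions): merge keeps the newest write and is commutative, associative, and idempotent, which gives Strong Eventual Consistency under optimistic replication:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LwwRegister:
    """State-based last-write-wins register for one driver's summary."""
    value: dict
    timestamp: float   # writer's clock; ties broken by node id
    node_id: str

    def merge(self, other: "LwwRegister") -> "LwwRegister":
        # Merge is commutative, associative, and idempotent, so replicas
        # converge regardless of the order in which states are exchanged.
        if (other.timestamp, other.node_id) > (self.timestamp, self.node_id):
            return other
        return self

a = LwwRegister({"status": "busy"}, timestamp=100.0, node_id="replica-a")
b = LwwRegister({"status": "free"}, timestamp=101.0, node_id="replica-b")
assert a.merge(b) == b.merge(a)  # both replicas converge to the newer write
```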
Problems with Replication Lag? It depends on your Domain:
● Reading Your Own Writes
● Monotonic Reads
● Consistent Prefix Reads
Fault tolerance with local state
1. Single infrastructure dependency: Kafka (a battle-tested streaming platform with high throughput, fault tolerance, and scalability).
2. When a task instance restarts, local state is repopulated by reading its own Kafka log.
3. Yes, reading and repopulating will take some time.
Controlling State Size. How long does it take to rebuild the state?
1. Key-Based Retention
a. Aggressive topic compaction
b. Tombstones (see the sketch below)
2. Time-Based Retention
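A hedged sketch of key-based retention in practice with the confluent-kafka client (broker address and payloads are assumptions): on a topic with cleanup.policy=compact, only the latest record per key survives, and a null-value tombstone lets compaction drop the key entirely, shrinking the log that must be replayed on restart:

```python
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})

# Latest state per key; compaction discards older records with the same key.
producer.produce("dispatching.driver-summary-events",
                 key="12345", value=b'{"status": "busy"}')

# Tombstone: a null value marks the key for removal during compaction,
# e.g. when a driver goes offline and should not be replayed on restart.
producer.produce("dispatching.driver-summary-events", key="12345", value=None)
producer.flush()
```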
How long does it take to rebuild the state?
1. Driver state retention: 1 hour
2. Repopulate local state:
a. Read driver-state from the beginning of the topic: 400k msgs (8 partitions)
b. Read driver-locations from 'now - 5 sec'
3. You need to implement your own "live processing started" event (sketched below), e.g.:
"Live processing started "dispatching.driver-summary-events [0]" after 00:00:01.7875633 sec (50142 msgs)"
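A hedged sketch of how such a "live processing started" signal can be produced with the confluent-kafka client (the topic name follows the log line above; apply_to_local_state and all configuration are assumptions): capture the high watermark at startup, replay from the beginning, and report once the consumer catches up:

```python
import time
from confluent_kafka import Consumer, TopicPartition

TOPIC = "dispatching.driver-summary-events"
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "dispatching-restore",
    "auto.offset.reset": "earliest",
})
consumer.assign([TopicPartition(TOPIC, 0, 0)])  # replay partition 0 from offset 0

# High watermark captured at startup: reaching it means replay is done.
_, high = consumer.get_watermark_offsets(TopicPartition(TOPIC, 0))

def apply_to_local_state(msg) -> None:
    pass  # hypothetical: same code path as live event handling

started, msgs = time.monotonic(), 0
while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    apply_to_local_state(msg)
    msgs += 1
    if msg.offset() + 1 >= high:
        elapsed = time.monotonic() - started
        print(f'Live processing started "{TOPIC} [0]" after {elapsed:.7f} sec ({msgs} msgs)')
        break
```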
An SLA level of 99.998% uptime/availability results in the following allowed downtime/unavailability:
■ Daily: 1.7 s (86,400 s/day × (1 − 0.99998) ≈ 1.73 s)
Traffic Jams requirements
1. Reduce the cost of the Google Maps API
2. High rate of writes (20k online drivers)
3. Update traffic information every 5 min
Stateful processing
● Grouping messages by partition key
● Aggregating messages in a hopping window (sketched below)
● MapReduce
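A minimal sketch of hopping-window aggregation in plain Python (segment ids, the 5-minute window, and 1-minute advance are assumptions): driver speed samples are grouped by road-segment key and averaged over overlapping windows:

```python
from collections import defaultdict

WINDOW = 300   # 5-minute windows...
ADVANCE = 60   # ...starting every minute (hopping, so windows overlap)

# (segment_id, window_start) -> [sum_of_speeds, sample_count]
aggregates: dict[tuple[str, int], list] = defaultdict(lambda: [0.0, 0])

def on_speed_sample(segment_id: str, speed_kmh: float, ts: int) -> None:
    # Each sample falls into WINDOW / ADVANCE overlapping hopping windows.
    first = (ts // ADVANCE) * ADVANCE - WINDOW + ADVANCE
    for start in range(max(first, 0), ts + 1, ADVANCE):
        agg = aggregates[(segment_id, start)]
        agg[0] += speed_kmh
        agg[1] += 1

def average_speed(segment_id: str, window_start: int) -> float:
    total, count = aggregates[(segment_id, window_start)]
    return total / count if count else float("nan")

on_speed_sample("segment-42", 30.0, ts=120)
on_speed_sample("segment-42", 20.0, ts=150)
print(average_speed("segment-42", 120))  # 25.0
```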
Driver ETA Tracker
Similar workload using Redis:
https://aws.amazon.com/blogs/database/optimize-redis-client-performance-for-amazon-elasticache/?utm_source=pocket_saves
○ Client: c5.4xlarge (16 vCPU, 32 GiB)
○ Redis: 3 nodes, r6g.2xlarge (8 vCPUs, 64 GiB)
Resources Usage
Future works
Although the current design is simple, it allows flexibility to change key aspects:
○ Replication + Sharding
Lessons learned
1. Stateful is not always difficult
2. A simple and reliable solution
3. Easy to maintain
4. Much more efficient in terms of resources: 2 vCPUs for all dispatching instead of a Redis cluster with 16-24 vCPUs
5. What about MS Orleans?
The Twelve-Factor App: Misleading
Space-based architecture?
https://www.amazon.com/_/dp/1492043451?smid=ATVPDKIKX0DER&_encoding=UTF8&tag=oreilly20-20
Contacts
Oleksandr Chumak, Solution Architect
https://www.linkedin.com/in/oleksandr-chumak-45967588/
facebook.com/achumak.dev