Speaker: Yupeng Fu, Staff Engineer, Uber
High availability and reliability are important requirements for Uber services, which must tolerate datacenter failures within a region and fail over to another region. In this talk, we will present the active-active Apache Kafka® setup at Uber and how it facilitates disaster recovery across regions for Uber services. In particular, we will highlight the key components, including topic replication, topic aggregation, and offset sync, and then walk through several use cases of disaster recovery strategies built on active-active Kafka. Lastly, we will present several interesting challenges and planned future work.
Yupeng Fu is a staff engineer in Uber's Data Org, leading the streaming data platform. Previously, he worked at Alluxio and Palantir, building distributed data analysis and storage platforms. Yupeng holds a B.S. and an M.S. from Tsinghua University and did his Ph.D. research on databases at UCSD.
6. Multi-region at Uber
● Provide business resilience and continuity as the top priority
○ Survive outages and disasters without major business impact
○ Region isolation to avoid cascading failure
● Take good care of customer experiences
○ Serve user requests in a closer region
○ Data integrity and consistency matters
● Improve infrastructure flexibility and efficiency
○ Decrease compliance and policy risks
○ Leverage both on-premise and cloud partners
7. Considerations for apps/services
● Highly available
○ Auto and on-demand region failover
● Highly flexible
○ Stateless and mobile
○ Data sharded by Geo
● Tradeoffs in SLA
○ Local data vs aggregated view
○ Latency vs consistency
● Leverage active-active storage layer for state sharing
10. Considerations for Apache Kafka
● Producer
○ Data produced locally
● Data aggregation
○ Topics replicated to agg clusters
● Active-active consumers
○ Double compute
○ Data ingestion
11. Active-active example: surge
● Real-time dynamic pricing
● Critical service with strict SLA
● Heavy distributed computation
● Large memory footprint
● Latency over consistency
[Diagram: dynamic pricing service connecting Rider and Driver]
13. Data replication - uReplicator
● Uber’s Apache Kafka replication service
● Goals
○ Stable replication, e.g. rebalance only occurs during startup
○ Operate with ease, e.g. add/remove whitelists
○ Scalable
○ High throughput
● Open sourced: https://github.com/uber/uReplicator
● Blog: https://eng.uber.com/ureplicator/
14. Considerations for Apache Kafka
● Producer
○ Data produced locally
● Data aggregation
○ Topics replicated to agg clusters
● Active-active consumers
○ Double compute
○ Data ingestion
● Active-passive consumers
○ Consistency sensitive apps
○ Challenge on offset sync
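The aggregation step above is what makes offset sync hard: once topics from several regions are merged into an agg cluster, a single agg offset no longer corresponds to one per-region position. A minimal sketch (illustrative in-memory model, not uReplicator's actual code; region names and payloads are made up):

```python
import itertools

# Two regional topics, modeled as lists of (region, src_offset, payload).
# Offsets are per-region; the agg cluster assigns its own offsets on append.
region_a = [("dca", o, f"a{o}") for o in range(3)]  # offsets 0..2 in region "dca"
region_b = [("phx", o, f"b{o}") for o in range(3)]  # offsets 0..2 in region "phx"

def replicate_to_agg(*topics):
    """Interleave regional topics into one aggregate log, assigning agg offsets.
    The interleaving order depends on replication timing; round-robin here."""
    agg = []
    for batch in itertools.zip_longest(*topics):
        for msg in batch:
            if msg is not None:
                agg.append((len(agg), msg))  # (agg_offset, (region, src_offset, payload))
    return agg

agg_log = replicate_to_agg(region_a, region_b)
for agg_offset, (region, src_offset, payload) in agg_log:
    print(agg_offset, region, src_offset, payload)
```

The printed log shows src offsets interleaving (0, 0, 1, 1, 2, 2 across two regions): each region's messages stay in order, but a committed agg offset maps to a *pair* of per-region positions, which is why checkpoints rather than raw offsets are needed for failover.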
15. Offset sync - challenges
● Requirements
○ No data loss -> cannot resume from the largest offset
○ Reduce duplicates -> cannot resume from the smallest offset
● Constraints
○ Not all messages have timestamps
○ Messages in the agg cluster are out of order due to the merge
16. Offset sync - architecture
● uReplicator reports the src-to-dst offset mapping to the offset manager
● Offset manager
○ Stores the checkpoint state
○ Translates the offset mappings
● Sync job periodically translates the offsets and pushes the new offsets
● Internal consumer looks up the offsets
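The offset manager's role can be sketched as a checkpoint store plus a conservative lookup (hypothetical class and topic names; a simplified model of the architecture described above, not the real service):

```python
import bisect
from collections import defaultdict

class OffsetManager:
    """Sketch of an offset manager: stores (src_offset -> dst_offset)
    checkpoints reported by the replicator, and translates a consumer's
    committed src offset into a safe offset on the destination cluster."""

    def __init__(self):
        # key: (topic, partition, src_cluster, dst_cluster) -> sorted checkpoints
        self._checkpoints = defaultdict(list)

    def report(self, key, src_offset, dst_offset):
        """Called by the replicator as it copies messages from src to dst."""
        bisect.insort(self._checkpoints[key], (src_offset, dst_offset))

    def translate(self, key, committed_src_offset):
        """Return the dst offset of the most recent checkpoint at or before
        the committed src offset. Resuming there may replay a few messages
        (reduced duplicates) but never skips any (no data loss)."""
        cps = self._checkpoints[key]
        i = bisect.bisect_right(cps, (committed_src_offset, float("inf"))) - 1
        if i < 0:
            return 0  # no checkpoint yet: resume from the beginning
        return cps[i][1]

mgr = OffsetManager()
key = ("trips", 0, "dca-regional", "dca-agg")
for src, dst in [(100, 205), (200, 410), (300, 612)]:
    mgr.report(key, src, dst)

print(mgr.translate(key, 250))  # falls back to checkpoint (200 -> 410): 410
```

A sync job in the architecture above would call something like `translate` periodically and push the result as the consumer group's offset on the destination cluster.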
18. Offset sync - translation
[Diagram: offset checkpoints mapping regional topic offsets to aggregate cluster offsets in each region]
● Find the mapped offset during the failover
○ Find the src offsets from the most recent checkpoints
○ Take the min of the checkpointed offsets on the failed-over agg cluster
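Slide 18's two-step failover rule can be sketched end to end. This is a simplified in-memory model under assumed data shapes (checkpoints as sorted tuples per source region), not Uber's implementation:

```python
import bisect

def latest_src_offsets(failed_checkpoints, committed_agg_offset):
    """Step 1: on the failed agg cluster, for each source region find the most
    recent checkpoint whose agg offset is <= the consumer's committed offset,
    recovering one src offset per region."""
    src = {}
    for region, cps in failed_checkpoints.items():
        # cps: sorted list of (agg_offset, src_offset)
        i = bisect.bisect_right(cps, (committed_agg_offset, float("inf"))) - 1
        if i >= 0:
            src[region] = cps[i][1]
    return src

def failover_offset(target_checkpoints, src_offsets):
    """Step 2: on the failed-over agg cluster, map each region's src offset to
    that cluster's agg offset via its own checkpoints, then take the min.
    The min guarantees no data loss, at the cost of some replay."""
    candidates = []
    for region, src_offset in src_offsets.items():
        cps = target_checkpoints.get(region, [])
        # cps: sorted list of (src_offset, agg_offset)
        i = bisect.bisect_right(cps, (src_offset, float("inf"))) - 1
        candidates.append(cps[i][1] if i >= 0 else 0)
    return min(candidates, default=0)

# Failed cluster: checkpoints (agg_offset, src_offset) per source region.
failed = {"dca": [(10, 100), (40, 200)], "phx": [(20, 500), (50, 600)]}
# Failover target cluster: checkpoints (src_offset, agg_offset) per region.
target = {"dca": [(100, 15), (200, 45)], "phx": [(500, 25), (600, 55)]}

src = latest_src_offsets(failed, committed_agg_offset=45)
resume = failover_offset(target, src)
print(src, resume)  # {'dca': 200, 'phx': 500} 25
```

In this example the consumer resumes at agg offset 25 on the target cluster: it will re-read some already-processed "dca" messages, but nothing from either region is skipped.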
19. Offset sync - active-passive producer
[Diagram: offset translation with an active-passive producer, where only one region produces at a time]
● Find the mapped offset during the failover
○ Find the src offsets from the most recent checkpoints
○ Take the min of the checkpointed offsets on the failed-over agg cluster, ignoring the checkpoints when the src offset is the latest
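The refinement on this slide — ignoring checkpoints whose src offset is already the latest for that region — matters because a region that has produced nothing new carries a stale checkpoint that would otherwise drag the min far back and force a large replay. A sketch under the same assumed data shapes as before (hypothetical helper, not the real code):

```python
import bisect

def failover_offset(target_checkpoints, src_offsets, latest_src):
    """Take the min of the mapped agg offsets, but skip any region whose
    checkpointed src offset already equals that region's latest offset:
    nothing newer can arrive from it, so its (possibly stale) checkpoint
    should not pull the resume point backwards."""
    candidates = []
    for region, src_offset in src_offsets.items():
        if src_offset >= latest_src.get(region, float("inf")):
            continue  # region fully replicated: ignore its checkpoint
        cps = target_checkpoints.get(region, [])
        # cps: sorted list of (src_offset, agg_offset)
        i = bisect.bisect_right(cps, (src_offset, float("inf"))) - 1
        candidates.append(cps[i][1] if i >= 0 else 0)
    return min(candidates, default=0)

target = {"dca": [(100, 15), (200, 45)], "phx": [(500, 25)]}
src_offsets = {"dca": 200, "phx": 500}
# "phx" has produced nothing beyond offset 500 (active-passive producer),
# so its old checkpoint is skipped when taking the min.
latest_src = {"dca": 400, "phx": 500}
print(failover_offset(target, src_offsets, latest_src))  # 45 instead of 25
```

Without the filter, the stale "phx" checkpoint would force a resume at agg offset 25; with it, the consumer resumes at 45 and replays far less.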