Streaming your Lyft Ride Prices
At Lyft we price rides dynamically using a combination of data sources, machine learning models, and streaming infrastructure for low latency, reliability, and scalability. Dynamic pricing lets us adapt quickly to real-world changes while staying fair to drivers (for example, raising rates when demand is high) and to passengers (for example, offering a cheaper rate for a pickup 10 minutes later). To accomplish this, our system consumes a massive volume of events from many different sources.
The streaming platform powers pricing by bringing together the best of two worlds with Apache Beam: ML algorithms in Python/TensorFlow and Apache Flink as the streaming engine. Enabling data science tools for machine learning, with a process that allows faster deployment, is of growing importance to the business. Topics covered in this talk include:
* Examples of dynamic pricing based on real-time event streams (driver locations, ride requests, user session events) and machine learning models
* Comparison of the legacy system and the new streaming platform for dynamic pricing
* Processing live events in real time to generate features for machine learning models
* Overview of streaming platform architecture and technology stack
* The Apache Beam portability framework as a bridge to distributed execution on a JVM-based streaming engine without code rewrites
* Lessons learned
5. Dynamic Pricing: What and Why?
● Dynamic pricing: prices change every minute in each location bucket
● Why?
○ At face value, dynamic pricing is strange
○ But Lyft's marketplace changes quickly
6. The Marketplace Affects Prices
● An imbalanced market is inefficient
○ Too many available drivers: bad
○ Too few available drivers: bad
○ Solution: the price lever controls the passenger request rate, which maintains healthy supply levels
● Result: increase the price when demand >> supply
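The price lever above can be illustrated with a toy function of the supply/demand imbalance in a location bucket; this is a sketch only, and the formula, cap, and thresholds are invented rather than Lyft's actual model:

def toy_price_multiplier(demand: float, supply: float) -> float:
    """Toy price lever: raise the multiplier as demand outstrips supply.

    Illustrative only; the formula and cap are invented and bear no
    relation to Lyft's production models.
    """
    if supply <= 0:
        return 3.0  # no available drivers: return an arbitrary cap
    imbalance = demand / supply
    multiplier = max(1.0, imbalance)  # 1.0 means no surge
    return min(multiplier, 3.0)  # clamp to the arbitrary cap

# Example: 120 ride requests vs. 60 available drivers -> 2.0x
print(toy_price_multiplier(demand=120, supply=60))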
7. What is PrimeTime?
● Belief: there exists some set of optimal price multipliers per location/time bucket
● PrimeTime: the Lyft product that sets a multiplier for each geohash-6 (gh6) cell each minute
● Example: in '9q8yyv' at 5:01pm PST, PrimeTime = 2.0
● Scale: roughly 3 million geohashes priced every minute
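To make the per-gh6, per-minute keying concrete, here is a minimal sketch; it assumes the third-party pygeohash package for encoding, and the lookup table is invented:

import pygeohash as pgh  # third-party geohash library (assumed)

# Hypothetical PrimeTime table keyed by (gh6 cell, minute of day).
primetime = {("9q8yyv", "17:01"): 2.0}

# precision=6 yields a gh6 cell; these coordinates are in San Francisco.
gh6 = pgh.encode(37.7749, -122.4194, precision=6)
multiplier = primetime.get((gh6, "17:01"), 1.0)  # default: no surge
print(gh6, multiplier)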
8. Why is PrimeTime Hard?
● 1. It needs low-latency information about supply and demand
● 2. Pricing is an unsupervised problem: the correct answer is never observed
● Solution: break the problem into multiple models that form a DAG, where the intermediate models solve supervised problems
○ Example: f(available_supply) -> pickup_times
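A sketch of the model-DAG idea: intermediate stages are supervised (their targets, like pickup times and conversion rates, are observable), and only the final pricing stage composes them; every function body below is an invented toy, not Lyft's models:

def predict_pickup_times(available_supply: float) -> float:
    """Supervised stage: pickup times are observed, so this model can be
    trained on ground truth (toy inverse-supply formula)."""
    return 30.0 / max(available_supply, 1.0)  # minutes

def predict_conversion(multiplier: float) -> float:
    """Supervised stage: request conversion at a price is also observed."""
    return max(0.0, 1.0 - 0.3 * (multiplier - 1.0))  # toy demand curve

def choose_multiplier(available_supply: float, demand: float) -> float:
    """Final unsupervised stage: search candidate multipliers, scoring
    each through the intermediate supervised models."""
    candidates = [1.0 + 0.25 * i for i in range(9)]  # 1.0 .. 3.0
    def score(m: float) -> float:
        expected_requests = demand * predict_conversion(m)
        wait = predict_pickup_times(available_supply)
        # Toy objective: keep requests near supply, penalize long waits.
        return -abs(expected_requests - available_supply) - wait
    return max(candidates, key=score)

print(choose_multiplier(available_supply=60, demand=120))  # -> 2.75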
10. Legacy Architecture: A Series of Cron Jobs
● Ingest a high volume of client app events (Kinesis, KCL)
● Compute features (e.g. demand, conversion rate, supply) from the events
● Run ML models on the features to compute PrimeTime for all regions (per minute, per gh6):
SFO: calendar_min_1: {gh6: 1.0, gh6: 2.0, ...}
NYC: calendar_min_1: {gh6: 2.0, gh6: 1.0, ...}
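One tick of the cron-style flow might look like the toy sketch below; the event shapes and scoring formula are invented for illustration:

def legacy_pricing_cron_tick(events):
    """One cron run: fold raw events into per-gh6 features, then score
    a toy model to produce the minute's multipliers (illustrative only)."""
    features = {}
    for e in events:  # e.g., ride requests, driver availability updates
        f = features.setdefault(e["gh6"], {"demand": 0, "supply": 0})
        f["demand"] += e["type"] == "ride_request"
        f["supply"] += e["type"] == "driver_available"
    return {gh6: max(1.0, f["demand"] / max(f["supply"], 1))
            for gh6, f in features.items()}

print(legacy_pricing_cron_tick([
    {"gh6": "9q8yyv", "type": "ride_request"},
    {"gh6": "9q8yyv", "type": "ride_request"},
    {"gh6": "9q8yyv", "type": "driver_available"},
]))  # -> {'9q8yyv': 2.0}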
11. Problems
1. Latency
2. Code complexity (LOC)
3. Hard to add new features involving windowing/joins (e.g. arbitrary demand windows, subregional computation); see the sketch below
4. No dynamic / smart triggers
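For contrast with problem 3, the kind of windowed feature that was hard to bolt onto cron jobs is a few lines in Beam; this sketch assumes an upstream events PCollection whose element shape ({"gh6": ...}) is invented:

import apache_beam as beam
from apache_beam.transforms.window import SlidingWindows

# Demand per gh6 over 5-minute windows that slide every minute.
demand_per_gh6 = (
    events
    | beam.Map(lambda e: (e["gh6"], 1))
    | beam.WindowInto(SlidingWindows(size=300, period=60))
    | beam.CombinePerKey(sum))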
14. Streaming and Python
● Flink and many other big data ecosystem projects are Java/JVM based
○ Teams want to adopt streaming but don't have the Java skills
○ Jython != Python
● Different use cases call for different language environments
○ Python is the primary option for machine learning
● Many API styles and runtime environments carry a cost
21. The Beam Vision
1. End users: want to write pipelines in a familiar language
2. SDK writers: want to make Beam concepts available in new languages; includes IOs (connectors to data stores)
3. Runner writers: have a distributed processing environment and want to support Beam pipelines
[Diagram: Beam Java, Beam Python, and other language SDKs feed the Beam Model (pipeline construction); Fn Runners execute the Beam Model on Apache Flink, Apache Spark, and Cloud Dataflow]
https://s.apache.org/apache-beam-project-overview
22. Multi-Language Support
● Initially Java SDK and Java Runners
● 2016: Start of cross-language support effort
● 2017: Python SDK on Dataflow
● 2018: Go SDK (for portable runners)
● 2018: Python on Flink MVP
● Next: Cross-language pipelines, more portable runners
23. Python Example
# runner and pipeline_options are assumed to be defined elsewhere.
import apache_beam as beam
from apache_beam import CombinePerKey, Map, WindowInto
from apache_beam.io import ReadFromText, WriteToText
from apache_beam.transforms.trigger import (
    AccumulationMode, AfterCount, AfterProcessingTime, AfterWatermark)
from apache_beam.transforms.window import FixedWindows

p = beam.Pipeline(runner=runner, options=pipeline_options)
(p
 | ReadFromText("/path/to/text*")                 # What: read the input
 | Map(lambda line: ...)                          # What: per-element transform
 | WindowInto(FixedWindows(120),                  # Where: 2-minute fixed windows
              trigger=AfterWatermark(             # When: at the watermark,
                  early=AfterProcessingTime(60),  #   early firings every 60s,
                  late=AfterCount(1)),            #   one late firing per element
              accumulation_mode=AccumulationMode.ACCUMULATING)  # How: accumulate panes
 | CombinePerKey(sum)
 | WriteToText("/path/to/outputs"))
result = p.run()

(What, Where, When, How)
24. Portability (originally)
[Diagram: the same "sum per key" pipeline in several languages:
Python: input | Sum.PerKey()
Java: input.apply(Sum.integersPerKey())
SQL (via Java): SELECT key, SUM(value) FROM input GROUP BY key
Pipelines were translated either into Java objects (for JVM runners such as Apache Spark, Apache Flink, Apache Apex, Gearpump, Apache Samza, Apache Nemo (incubating), and IBM Streams) or into the Dataflow JSON API (for Cloud Dataflow)]
https://s.apache.org/state-of-beam-sfo-2018
25. Portability (current)
[Diagram: the same "sum per key" pipeline, now also in Go (stats.Sum(s, input)) alongside Python (input | Sum.PerKey()), SQL via Java (SELECT key, SUM(value) FROM input GROUP BY key), and Java (input.apply(Sum.integersPerKey())), is translated into portable protos that any runner can execute: Cloud Dataflow, Apache Spark, Apache Flink, Apache Apex, Gearpump, Apache Samza, Apache Nemo (incubating), IBM Streams]
https://s.apache.org/state-of-beam-sfo-2018
28. Lyft Flink Runner Customizations
● Translator extension for streaming sources
○ Kinesis and Kafka consumers that we also use in Java Flink jobs
○ Message decoding, watermarks
● Python execution environment for SDK workers
○ Tailored to internal deployment tooling
○ Docker-free, frozen virtual envs
● https://github.com/lyft/beam/tree/release-2.11.0-lyft
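A minimal sketch of how a pipeline can request Docker-free SDK workers through Beam's portable runner options; the job endpoint and boot command below are placeholders:

from apache_beam.options.pipeline_options import PipelineOptions

# PROCESS runs SDK workers as plain processes instead of Docker
# containers; the endpoint and command path are placeholder values.
options = PipelineOptions([
    "--runner=PortableRunner",
    "--job_endpoint=localhost:8099",
    "--environment_type=PROCESS",
    '--environment_config={"command": "/opt/beam/boot"}',
])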
29. [Diagram (Robert Bradshaw, Beam Summit London, 2018): SDK workers run a Fn Runner; the runner worker launches the Control, Data, State, and Logging services that the workers connect to]
30. Fn API
How slow is this?
● Fn API overhead: ~15%?
● Fused stages
● Bundle size
● Parallel SDK workers
● TODO: Cython
● protobuf C++ bindings

Benchmark pipeline (decode, ..., window, count); messages is the upstream source PCollection and count is a user-defined function:

import random
import apache_beam as beam
from apache_beam.transforms import window
from apache_beam.transforms.trigger import (
    AccumulationMode, AfterProcessingTime, Repeatedly)

(messages
 | 'reshuffle' >> beam.Reshuffle()
 | 'decode' >> beam.Map(lambda x: (random.randint(0, 511), 1))
 | 'noop1' >> beam.Map(lambda x: x)
 | 'noop2' >> beam.Map(lambda x: x)
 | 'noop3' >> beam.Map(lambda x: x)
 | 'window' >> beam.WindowInto(
     window.GlobalWindows(),
     trigger=Repeatedly(AfterProcessingTime(5 * 1000)),
     accumulation_mode=AccumulationMode.DISCARDING)
 | 'group' >> beam.GroupByKey()
 | 'count' >> beam.Map(count)
)
31. Fast Enough for Real Python Work!
● c5.4xlarge machines (16 vCPUs, 32 GB)
● 16 SDK workers per machine
● 1000 ms or 1000 records per bundle
● 280,000 transforms/second/machine (~17,500 per worker)
● Python user code will be the gating factor
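The bundle and worker settings above correspond to portable Flink runner options; a sketch with a placeholder endpoint, assuming the flag spellings used by recent Beam releases (availability can vary by version):

from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions([
    "--runner=PortableRunner",
    "--job_endpoint=localhost:8099",     # placeholder job server
    "--max_bundle_size=1000",            # cut a bundle after 1000 records
    "--max_bundle_time_millis=1000",     # ... or after 1000 ms
    "--sdk_worker_parallelism=16",       # SDK workers per task manager
])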
32. Beam Portability Recap
● Pipelines written in non-JVM languages on JVM runners
○ Python, Go on Flink (and others)
● Full isolation of user code
○ Native CPython execution w/o library restrictions
● Flexible SDK worker execution
○ Docker, Process, Embedded, ...
● Multiple languages in a single pipeline (WIP)
○ Use Java Beam IO with Python
○ Use TFX with Java
○ <your use case here>
35. Lessons Learned
• The Python Beam SDK and the portable Flink runner are still evolving
• Keep the pipeline simple: Flink tasks / shuffles are not free
• Stateful processing is essential for complex logic (see the sketch below)
• Model execution latency matters
• Instrument everything for monitoring
• Think about pipeline restart and upgrade
• Mind your dependencies: rate-limit API calls
• Long-running Python processes may expose memory leaks
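To illustrate the stateful-processing point above, a minimal Beam Python stateful DoFn; the running-count logic is an invented example:

import apache_beam as beam
from apache_beam.transforms.userstate import CombiningValueStateSpec

class RunningCountFn(beam.DoFn):
    """Maintains a per-key running count in managed state across bundles."""
    COUNT = CombiningValueStateSpec("count", sum)

    def process(self, element, count=beam.DoFn.StateParam(COUNT)):
        key, value = element
        count.add(value)
        yield key, count.read()

# Usage on a keyed PCollection: pcoll | beam.ParDo(RunningCountFn())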
36. We're Hiring! Apply at www.lyft.com/careers or email data-recruiting@lyft.com
● Data Engineering: Engineering Manager (San Francisco); Software Engineer (San Francisco, Seattle & New York City)
● Data Infrastructure: Engineering Manager (San Francisco); Software Engineer (San Francisco & Seattle)
● Experimentation: Software Engineer (San Francisco)
● Streaming: Software Engineer (San Francisco)
● Observability: Software Engineer (San Francisco)