More Related Content Similar to Driving the Future of Smart Cities - How to Beat the Traffic (Pivotal talk at Strata 2014) (20) More from Ian Huston (10) Driving the Future of Smart Cities - How to Beat the Traffic (Pivotal talk at Strata 2014)2. Driving the Future of Smart Cities
How to Beat the Traffic
Strata Santa Clara – February 13, 2013
Alexander Kagoshima, Data Scientist, @akagoshima
Noelle Sio, Senior Data Scientist, @noellesio
Ian Huston, Data Scientist, @ianhuston
@gopivotal
© Copyright 2014 Pivotal. All rights reserved.
2013
2
3. What Matters: Apps. Data. Analytics.
Apps power businesses, and
those apps generate data
Analytic insights from that data
drive new app functionality,
which in-turn drives new data
The faster you can move
around that cycle, the faster
you learn, innovate & pull
away from the competition
@noellesio, @gopivotal
© Copyright 2014 Pivotal. All rights reserved.
3
4. Pivotal’s Opportunity
Uniquely positioned to help
enterprises modernize each
facet of this cycle today
Comprehensive portfolio of
products spanning Apps, Data
& Analytics
Converging these technologies
into a coherent, next-gen
Enterprise PaaS platform
@noellesio, @gopivotal
© Copyright 2014 Pivotal. All rights reserved.
4
5. The Connected Car Drives Innovation
Telematics
Stolen vehicle
Remote
recovery
Behaviour
diagnosis
monitoring
Remote
Car2X
Driver
activation
assistance solutions
Floating
Real-time
eMobility
car data
parking info
solutions
Share my
trip
Traffic
updates
Car
search
Social Media
Responsive
Navigation
PoIs
Next gen
Navigation
Hybrid
Predictive
Navigation
traffic info Map/PoI
updates
Vehicle
Concierge
Fleet
Geo
Tracking
services
Management
fencing
Handsfree
telephony
Music
WiFi
streaming
hotspot
Pay as
Online
you drive Payment Web
games
solutions radio
Road
Environmental
Parking
tolls
browsing
VoD
space reservation
Car
sharing
eCommerce
City
toll
Rich media
comms
Car2X
comms
Communication
Entertainment
@noellesio, @gopivotal
© Copyright 2014 Pivotal. All rights reserved.
5
6. Possible Data Science Use-Cases
! Predictive Car Maintenance
– More accurately predict part failure
– Optimize part repair and replacement schedule
! Leveraging Driving Behavior
– Useful to differentiate insurance pricing based on driving style
– Optimize car design
! Improving GPS Systems
– Establish baseline for traffic congestion
– Gain a detailed view on traffic
– Create more meaningful metrics for routing
! Predictive Power for Assistance Systems
– Optimize fuel efficiency
– Predict the future state of a car in the next 2 minutes
(starts, stops, emergency braking)
! Traffic Light Assistance
– Signal timing of traffic lights
– Crowd sourcing of traffic signals
@noellesio, @gopivotal
© Copyright 2014 Pivotal. All rights reserved.
6
9. How fast are vehicles moving?
@noellesio, @gopivotal
© Copyright 2014 Pivotal. All rights reserved.
9
10. How fast are vehicles moving?
0.015
0.000
0.005
0.010
density
0.020
0.025
0.030
Link 1000064869
0
50
100
km/h
150
200
@noellesio, @gopivotal
© Copyright 2014 Pivotal. All rights reserved.
10
11. When do disruptions happen?
@noellesio, @gopivotal
© Copyright 2014 Pivotal. All rights reserved.
11
13. Taking Lessons From Other Disciplines
Change-Point Detection can
be used to uncover regimes in
wind-turbine data.
It can also be applied to uncover
regimes in traffic light
switching patterns.
@noellesio, @gopivotal
© Copyright 2014 Pivotal. All rights reserved.
13
14. Taking Lessons From Other Disciplines
Link 1000064869
Different Cell
Populations
0.015
0.010
density
0.020
0.025
0.030
Combined
Component 1
Component 2
Component 3
Component 4
0.000
0.005
Different Driving
Conditions
0
50
100
150
200
@noellesio, @gopivotal
© Copyright 2014 Pivotal. All rights reserved.
14
15. Understanding Traffic Flow
A dynamic, more detailed
understanding of traffic is now possible.
Can we answer both ‘What velocity?’
and ‘Why?’
Context
! Current GPS systems are based on
average velocity over street segments
! Real-time traffic information (e.g. Waze)
does not deliver detailed view nor
prediction
@akagoshima, @gopivotal
© Copyright 2014 Pivotal. All rights reserved.
15
16. Our Approach – Multi-step algorithm
From our experience, real-world data often requires multi-step procedures
Step 1: Answer ‘What velocity?’
First find distinct velocity
groups
Link 1000064869
Find influencing effects
?
?
?
?
?
?
0.000
0.005
0.010
density
0.015
0.020
0.025
0.030
Combined
Component 1
Component 2
Component 3
Component 4
Step 2: Answer ‘Why?’
0
50
100
150
200
km/h
@akagoshima, @gopivotal
© Copyright 2014 Pivotal. All rights reserved.
16
17. Find Velocity Groups
! Velocity distributions can be fit well
with Gaussians
! An ‘overlay’ of multiple Gaussians is
called Gaussian Mixture Model
0.030
0.025
0.020
density
0.015
0.010
0.005
! Shapes and positions of Gaussians
determine velocity groups
Combined
Component 1
Component 2
Component 3
Component 4
0.000
! GMM fitting of the velocity
distribution is done by ExpectationMaximization algorithm
Link 1000064869
0
50
100
150
200
km/h
@akagoshima, @gopivotal
© Copyright 2014 Pivotal. All rights reserved.
17
18. Gaussian Mixture Model
Link 1000064869
Link 1000064869
0.030
Combined
Component
Component
Component
Component
1
2
3
4
0.020
density
0.010
0.005
0.000
0
50
100
150
200
0
50
100
150
200
km/h
km/h
Link 1000064869
0.025
1
2
3
4
density
0.000
0.000
0.005
0.005
0.010
0.010
0.015
0.015
0.020
0.020
0.025
0.030
Combined
Component
Component
Component
Component
0.030
Link 1000064869
density
1
2
3
4
0.015
0.020
0.015
0.000
0.005
0.010
density
Combined
Component
Component
Component
Component
0.025
1
2
3
4
0.025
0.030
Combined
Component
Component
Component
Component
0
0
50
100
km/h
150
200
50
100
150
200
km/h
@akagoshima, @gopivotal
© Copyright 2014 Pivotal. All rights reserved.
18
19. 0.04
Predict Gaussians
0.00
0.01
" Classification task!
0.02
density
0.03
! The second step seeks to explain/predict
which Gaussian a data point belongs to
Combined
Component 1
Component 2
Component 3
0
50
100
150
200
250
! Features for classification:
–
–
–
–
–
Time of day, day of week
Weather
Direction
Special Events
…
?
?
?
?…
@akagoshima, @gopivotal
© Copyright 2014 Pivotal. All rights reserved.
19
20. White and Black Boxes
! Analyze correlations between features of a data point and its
assignment to a Gaussian
! From a Machine Learning point of view, this is classification
! Generate an interpretable model
description
! Can capture more complex
correlations
" Explanation of behavior
" Prediction of Gaussian assignment
@akagoshima, @gopivotal
© Copyright 2014 Pivotal. All rights reserved.
20
21. 0.04
Putting it all together…
Combined
Component 1
Component 2
Component 3
0.02
Average velocity = 85 km/h
•
Weekdays after 8pm
•
Weekdays before 2pm, exiting
Average velocity = 120 km/h
•
Weekends
•
Weekdays before 2pm, not exiting
Two-Step algorithm:
GMM + Classification
• Identified multiple velocity
profiles for every road segment
• Intuitive and easily
interpretable results
• Highly scalable for more
features and data
0.00
0.01
density
0.03
Average velocity = 45 km/h
•
Weekdays between 2 – 8pm
0
50
100
km/h
150
200
250
@akagoshima, @gopivotal
© Copyright 2014 Pivotal. All rights reserved.
21
22. Better Travel Time Prediction
! Traffic profiles emerged from data
– Without using metadata, we uncovered road
segment traffic patterns
! Identified Bias Effects
– Inferring the impact of turns and day of week on velocity
– Able to predict rush hour by day and time by road segment
! Traffic Light Patterns
– Infer public transportation effects on traffic
– Automatically determine different switching patterns
@akagoshima, @gopivotal
© Copyright 2014 Pivotal. All rights reserved.
22
23. London Road Traffic
Disruptions
Can we predict when
unexpected incidents will end?
Publicly available data:
! Transport for London traffic feed
(refreshed every 5 minutes)
! Weather Underground reports
Photo by James Blunt Photography on
Flickr (CC BY-ND 2.0)
@ianhuston, @gopivotal
© Copyright 2014 Pivotal. All rights reserved.
23
25. Major storm hits UK (Photo: BBC)
@ianhuston, @gopivotal
© Copyright 2014 Pivotal. All rights reserved.
25
28. Durations are very
different for
different types of
incident.
Mean duration for
Surface Damage
incidents is 107
hours!
@ianhuston, @gopivotal
© Copyright 2014 Pivotal. All rights reserved.
28
29. Rain affects
duration in a
surprising way.
Incidents which start
when it is raining
finish faster than
others.
@ianhuston, @gopivotal
© Copyright 2014 Pivotal. All rights reserved.
29
30. Models
Linear Regression
• Disruption reports &
weather features
Random Forests
• Rounded categorical
• Regression
Category MAP
• Only use category of
incident
• Maximum Likelihood
estimate
@ianhuston, @gopivotal
© Copyright 2014 Pivotal. All rights reserved.
30
32. Summary
! Making use of a vibrant ecosystem of traffic data
! Innovative approaches needed to generate value
from abundant and complex sources
! Connecting predictive models to traffic in the
physical world is the future of smart cities
© Copyright 2014 Pivotal. All rights reserved.
32
33. Thank You!
Check out more of our Data Science use-cases at
www.goPivotal.com
© Copyright 2014 Pivotal. All rights reserved.
2013
33