CitySprint Fleetmapper use case - Big Data Bootcamp

Eduard Lazar

Blue signals a pick-up
Red signals a drop-off
Sample of what one driver’s journey looks like
Used for:
• Viewing the base unit of analysis

Demand heat map
Heat map of pickup location density
Used for:
• Optimising resource allocation
• Identifying areas for potential expansion
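
One way to approximate the density behind such a heat map is to bin pickup coordinates into a regular grid and count the pickups per cell. The sketch below is a minimal illustration in plain Python, not the ArcGIS rendering used in the project; the function name `grid_density`, the `cells_per_degree` parameter, and the sample coordinates are assumptions made for the example.

```python
from collections import Counter
from math import floor

def grid_density(points, cells_per_degree=10):
    """Count pickups per grid cell.

    points: iterable of (latitude, longitude) pairs in decimal degrees.
    cells_per_degree: hypothetical resolution; 10 gives cells of 0.1 degree.
    """
    counts = Counter()
    for lat, lon in points:
        cell = (floor(lat * cells_per_degree), floor(lon * cells_per_degree))
        counts[cell] += 1
    return counts

# Illustrative coordinates only, not CitySprint data.
pickups = [(51.503, -0.125), (51.514, -0.131), (51.526, -0.118), (53.483, -2.244)]
density = grid_density(pickups)
hottest_cell, hottest_count = density.most_common(1)[0]
print(hottest_cell, hottest_count)
```

Cells with the highest counts correspond to the hottest areas of the map; a finer grid gives more detail at the cost of sparser counts.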

K-means clustering analysis – 40 centres
Employed the K-means algorithm to identify clusters
of pickup points
Used for:
• Validating against current service centres map
• Identifying areas for potential expansion
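
The clustering step can be sketched with a small, self-contained version of K-means (Lloyd's algorithm). This is an illustration only, not the Spark job used in the project: it treats latitude and longitude as planar coordinates, seeds the centres with the first k points (a simplifying assumption), and uses made-up sample coordinates.

```python
def nearest_centre(point, centres):
    """Index of the centre closest to point (squared planar distance)."""
    return min(
        range(len(centres)),
        key=lambda i: (point[0] - centres[i][0]) ** 2 + (point[1] - centres[i][1]) ** 2,
    )

def kmeans(points, k, max_iterations=50):
    """Lloyd's algorithm on (latitude, longitude) pairs treated as planar."""
    centres = list(points[:k])  # assumption: seed with the first k points
    for _ in range(max_iterations):
        # Assignment step: attach every point to its nearest centre.
        groups = [[] for _ in centres]
        for p in points:
            groups[nearest_centre(p, centres)].append(p)
        # Update step: move each centre to the mean of its group.
        updated = [
            (sum(lat for lat, _ in g) / len(g), sum(lon for _, lon in g) / len(g)) if g else c
            for g, c in zip(groups, centres)
        ]
        if updated == centres:  # centres no longer move, so stop early
            break
        centres = updated
    return centres

# Illustrative coordinates only, not CitySprint data.
pickups = [
    (51.50, -0.12), (53.48, -2.24), (51.52, -0.10),
    (53.47, -2.25), (51.51, -0.14), (53.49, -2.23),
]
centres = sorted(kmeans(pickups, k=2))
print(centres)
```

The number of centres (40 here, 100 on the next slide) is an input to the algorithm, so comparing several values of k is part of the analysis.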

K-means 100 centres
Higher granularity clustering
Used for:
• Assessing the frequency of pickups for micro-clusters
(e.g. villages, neighbourhoods)
• Directing drivers to hotter waiting areas

Geographical supply & demand
Pickup locations shown against driver routes
Used for:
• Improving likelihood of parcel pickup while on-route
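
A simple way to relate routes to demand is to check which pickup hot spots lie within a short distance of any GPS point on a route. The sketch below uses the haversine great-circle distance; the helper names, the 2 km threshold, and the coordinates are assumptions for illustration and do not come from the project.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) pairs."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))

def hot_spots_near_route(route, hot_spots, max_km=2.0):
    """Hot spots within max_km of at least one GPS point on the route."""
    return [
        spot for spot in hot_spots
        if any(haversine_km(spot, point) <= max_km for point in route)
    ]

# Illustrative coordinates only, not CitySprint data.
route = [(51.88, -0.21), (51.90, -0.20), (51.92, -0.19)]
hot_spots = [(51.905, -0.20), (52.205, 0.119)]
near = hot_spots_near_route(route, hot_spots)
print(near)
```

Routes that pass few or no hot spots are candidates for small detours that raise the chance of an on-route pickup.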

Demand variation across time
[Figure: Expected parcels allocated to cluster 41 (Stevenage). x-axis: Time of day (hours 0 to 23); y-axis: Expected parcels (0.0 to 18.0)]
For each demand cluster we calculated the
frequency of pickups per hour
Used for:
• Positioning couriers in the right place at the right time
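
The per-hour frequency can be sketched as a count of pickups per hour of day for one cluster, averaged over the days observed. The averaging over days is an assumption about how "expected parcels" was derived; the function name, record layout, and timestamps below are hypothetical.

```python
from collections import Counter
from datetime import datetime

def expected_parcels_per_hour(pickups, cluster_id):
    """Mean pickups per hour of day for one cluster.

    pickups: iterable of (cluster_id, datetime) records.
    Returns {hour: pickups in that hour / number of days observed}.
    """
    counts = Counter()
    days = set()
    for cid, timestamp in pickups:
        if cid != cluster_id:
            continue
        counts[timestamp.hour] += 1
        days.add(timestamp.date())
    n_days = len(days) or 1  # avoid dividing by zero for an empty cluster
    return {hour: counts[hour] / n_days for hour in sorted(counts)}

# Illustrative records only, not CitySprint data.
records = [
    (41, datetime(2016, 3, 1, 9, 5)),
    (41, datetime(2016, 3, 1, 9, 40)),
    (41, datetime(2016, 3, 1, 14, 10)),
    (41, datetime(2016, 3, 2, 9, 15)),
    (7, datetime(2016, 3, 1, 9, 30)),
]
timetable = expected_parcels_per_hour(records, cluster_id=41)
print(timetable)
```

A timetable like this, built for each cluster, is what allows couriers to be positioned ahead of the expected demand.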

The solution outline
• Data science capabilities of Spark, easy to use with SQL knowledge
• Map plotting on ArcGIS – heat mapping, zoom in/out capabilities, real-time
• High performance due to in-memory processing capabilities of Spark
• Can work with large data sets due to high-performance disk-based data access
in Hadoop Distributed File System (HDFS)
• Can import data from EDW

Why Bigstep?
• Easy to use – Easy to deploy, redeploy, erase and rewind. Easy to experiment with
• Big Data Focus – Infrastructure, orchestration, and software ecosystem deliver
performance & ease of use for big data
• Domain Experts – Extensive hands-on experience in delivering complex big data
solutions for multiple verticals & use cases
• Consultative Approach – Direct contact and support from experienced big data, devops,
and infrastructure specialists
• Best In Class Infrastructure – The world’s highest performance cloud

Eduard Lazar - CitySprint: A geospatial and time series analysis of the CitySprint fleet

Editor's notes

Objectives:
• Take geospatial and time series data and make it easily manageable and usable by business users
• Discover new business insights to optimize operations
• Run real-time analysis on 22,626,119 records
• Test whether Spark and Hadoop are suitable data analysis tools for CitySprint
• Design a flexible, versatile environment for analyzing fleet data
• Implement the solution with enough performance that real-time data exploration is possible on the full dataset
Follows a random driver on a typical day through pickup and drop-off points. The map can zoom in and out.
Shows the hot spots of pickup points across the UK. A good overview of the overall dataset.
Compared against our service center locations, it shows a few differences. A clustering algorithm identifies ‘clusters’ of elements on its own. K-means needs to be told how many clusters to look for.
This is what happens if we tell K-means to split the dataset into 100 hot locations.
The blue dots are actual GPS information from en-route drivers. This shows typical routes, but only some routes go through hot areas.
A ‘cluster’ timetable is used to predict demand at a particular cluster at a particular time. Useful for instructing the driver whether to stay or to continue to their destination. This can help uberize the business.
Used a combination of technologies, mostly Spark on Hadoop on Bigstep. Imported data from the production Postgres DB via Sqoop into Avro, and from there via Spark into various CSV files rendered by Esri ArcGIS. Postgres collects information from mobile devices.
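
The last step of that pipeline, writing results to CSV for the mapping tool, can be sketched in plain Python as a stand-in for the Spark export. The function name, column names, and row values below are hypothetical and chosen only to show the shape of such a file.

```python
import csv
import io

def centres_to_csv(rows):
    """Serialise (cluster_id, latitude, longitude, pickups) rows to CSV text.

    The column names are hypothetical; they are not taken from the project.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["cluster_id", "latitude", "longitude", "pickups"])
    writer.writerows(rows)
    return buffer.getvalue()

# Illustrative rows only, not CitySprint data.
rows = [(41, 51.9, -0.2, 120), (7, 53.48, -2.24, 85)]
csv_text = centres_to_csv(rows)
print(csv_text)
```

A flat file with one row per point and explicit latitude and longitude columns is a common interchange format for GIS tools.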
