Eduard Lazar - CitySprint
A geospatial and time series analysis of the CitySprint fleet
Blue signals a pick-up
Red signals a drop-off
Sample of what one driver’s journey looks like
Used for:
• Viewing the base unit of analysis
Demand heat map
Heat map of pickup location density
Used for:
• Optimising resource allocation
• Identifying areas for potential expansion
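
The density behind a heat map like this can be approximated by counting pickups per grid cell. The following is a minimal sketch only: the function name `grid_density`, the 0.05-degree cell size, and the sample coordinates are assumptions for illustration, not the method or data used for the slide.

```python
import math
from collections import Counter


def grid_density(pickups, cell_deg=0.05):
    """Count pickups per grid cell, keyed by (lat_index, lon_index)."""
    counts = Counter()
    for lat, lon in pickups:
        # Snap each coordinate to the index of the cell that contains it.
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        counts[cell] += 1
    return counts


# Hypothetical (lat, lon) pickups, not CitySprint data.
pickups = [
    (51.507, -0.128),
    (51.509, -0.121),
    (51.512, -0.135),  # three pickups in central London
    (51.903, -0.202),  # one pickup near Stevenage
]

density = grid_density(pickups)
hottest_cell, hottest_count = density.most_common(1)[0]
print(hottest_cell, hottest_count)
```

Cells with the highest counts are the natural candidates for extra resources; a mapping tool would then colour each cell by its count.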
K-means clustering analysis – 40 centres
Employed the K-means algorithm to identify clusters
of pickup points
Used for:
• Validating against current service centres map
• Identifying areas for potential expansion
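
For readers unfamiliar with K-means, here is a minimal pure-Python sketch of the standard Lloyd's iteration on (lat, lon) pairs. It is illustrative only: the deck states that the analysis ran on Spark, whereas this sketch uses no Spark API, treats degrees as flat Euclidean coordinates, and runs on six made-up points with k = 2 rather than 40 centres.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Lloyd's algorithm: alternate assignment and centre-update steps."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)  # initial centres drawn from the data
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centre.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: (p[0] - centres[i][0]) ** 2
                + (p[1] - centres[i][1]) ** 2,
            )
            groups[nearest].append(p)
        # Update step: move each centre to the mean of its points.
        for i, group in enumerate(groups):
            if group:
                centres[i] = (
                    sum(p[0] for p in group) / len(group),
                    sum(p[1] for p in group) / len(group),
                )
    return centres


# Hypothetical pickups: one group in central London, one near Stevenage.
points = [
    (51.50, -0.12), (51.51, -0.13), (51.52, -0.11),
    (51.90, -0.20), (51.91, -0.21), (51.89, -0.19),
]
centres = sorted(kmeans(points, k=2))
print(centres)
```

The resulting centres can then be laid over the existing service centre map to see where the two disagree.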
K-means 100 centres
Higher granularity clustering
Used for:
• Assessing the frequency of pickups for micro-clusters (e.g. villages, neighbourhoods)
• Directing drivers to hotter waiting areas
Geographical supply & demand
Pickup locations shown against driver routes
Used for:
• Improving the likelihood of parcel pickup while en route
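
One way to quantify how well routes overlap with demand is the share of a route's GPS points that fall within some radius of a hot pickup area. The sketch below assumes this measure; the function names, the 2 km radius, and the coordinates are hypothetical and do not come from the deck.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def share_near_hotspots(route, hotspots, radius_km=2.0):
    """Fraction of route points lying within radius_km of any hotspot."""
    near = sum(
        1
        for point in route
        if any(haversine_km(point, hot) <= radius_km for hot in hotspots)
    )
    return near / len(route)


# Hypothetical hot area and route, not CitySprint data.
hotspots = [(51.507, -0.128)]
route = [(51.507, -0.128), (51.515, -0.120), (51.60, -0.20), (51.70, -0.25)]
share = share_near_hotspots(route, hotspots)
print(share)
```

A low share suggests the route rarely passes through areas where a pickup is likely, which is the situation the slide aims to improve.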
Demand variation across time
[Figure: Expected parcels allocated to cluster 41 (Stevenage). x-axis: Time of day (hour); y-axis: Expected parcels]
Used for:
• Positioning couriers in the right place at the right time
For each demand cluster we calculated the
frequency of pickups per hour
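
A per-cluster hourly timetable of this kind can be sketched by counting pickups per hour and averaging over the days observed. The deck does not say how the counts were normalised, so dividing by the number of observed days is an assumption, as are the function name `hourly_expected` and the sample records.

```python
from collections import Counter
from datetime import datetime


def hourly_expected(pickups, cluster_id):
    """Mean pickups per hour of day for one cluster, averaged over observed days.

    pickups is an iterable of (cluster_id, datetime) pairs.
    """
    counts = Counter()
    days = set()
    for cid, ts in pickups:
        if cid != cluster_id:
            continue
        counts[ts.hour] += 1
        days.add(ts.date())
    n_days = len(days) or 1  # avoid dividing by zero for an empty cluster
    return {hour: counts[hour] / n_days for hour in range(24)}


# Hypothetical pickup records, not CitySprint data.
records = [
    (41, datetime(2016, 3, 1, 9, 5)),
    (41, datetime(2016, 3, 1, 9, 40)),
    (41, datetime(2016, 3, 1, 14, 10)),
    (41, datetime(2016, 3, 2, 9, 30)),
    (7, datetime(2016, 3, 1, 9, 0)),
]
timetable = hourly_expected(records, cluster_id=41)
print(timetable[9], timetable[14])
```

Looking up the current hour in such a timetable gives a rough expectation of demand at a cluster, which is what makes courier positioning decisions possible.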
The solution outline
• Data science capabilities of Spark, easy to use with SQL knowledge
• Map plotting on ArcGIS – heat mapping, zoom in/out capabilities, real-time
• High performance due to the in-memory processing capabilities of Spark
• Can work with large data sets due to high-performance disk-based data access
in the Hadoop Distributed File System (HDFS)
• Can import data from the enterprise data warehouse (EDW)
Why Bigstep?
• Easy to use – Easy to deploy, redeploy, erase and rewind. Easy to experiment with
• Big Data Focus – Infrastructure, orchestration, and software ecosystem deliver
performance & ease of use for big data
• Domain Experts – Extensive hands-on experience in delivering complex big data
solutions for multiple verticals & use cases
• Consultative Approach – Direct contact and support from experienced big data, devops,
and infrastructure specialists
• Best In Class Infrastructure – The world’s highest performance cloud
Eduard Lazar - CitySprint

CitySprint Fleetmapper use case - Big Data Bootcamp


Editor's notes

  1. Objectives:
     • Take geospatial and time series data and make it easily manageable and usable by business users
     • Discover new business insights to optimize operations
     • Run real-time analysis on 22,626,119 records
     • Test whether Spark and Hadoop are suitable data analysis tools for CitySprint
     • Design a flexible, versatile environment for analyzing fleet data
     • Implement a solution with enough performance that real-time data exploration is possible on the full dataset
  2. Follows a randomly chosen driver on a typical day through pickup and drop-off points. The map can zoom in and out.
  3. Shows the hot spots of pickup points across the UK. A good overview of the overall dataset.
  4. Compared against our service center locations, it shows a few differences. A clustering algorithm identifies ‘clusters’ of elements on its own. K-means needs to be told how many clusters to look for.
  5. This is what happens if we tell K-means to split the dataset into 100 hot locations.
  6. The blue dots are actual GPS positions of en-route drivers. Shows typical routes, but only some routes go through hot areas.
  7. A ‘cluster’ timetable is used to predict demand at a particular cluster at a particular time. Useful for instructing the driver whether to stay or to go on to their destination. This can help uberize the business.
  8. Used a combination of technologies, mostly Spark on Hadoop on Bigstep. Imported data from the production Postgres DB via Sqoop into Avro, and from there via Spark into various CSV files rendered by Esri ArcGIS. Postgres concentrates information from mobile devices.