Jongwook Woo
HiPIC
CalStateLA
KSII The 14th Asia Pacific International Conference
on Information Science and Technology(APIC-IST),
Beijing
June 24 2019
Dalya (Dalyapraz) Dauletbak, dmanato@calstatela.edu
Jongwook Woo, PhD
Big Data AI Center (BigDAI)
California State University Los Angeles
Traffic Data Analysis and Prediction
using Big Data
Big Data Artificial Intelligence Center (BigDAI)
Jongwook Woo
CalStateLA
Contents
• Introduction
• H/W Specification
• Architecture Chart
• Implementation steps
• Data structure
• Analysis
• Prediction
• Summary
Introduction
About me:
• Graduate student in Computer Information Systems at California State University, Los Angeles
  – BS (2015): Mathematics at Nazarbayev University
  – Previously: Senior Consultant/Data Analyst, Management Consulting, KPMG Central Asia
  – Current: Community Manager, International Data Engineering and Science Association (IDEAS)
Data source:
• A GPS navigation mobile application
• Provides real-time directions and up-to-date information on:
  – Traffic
  – Accidents
  – Road closures
  – Weather hazards
  – Hidden police vehicles, etc.
Introduction
Data source:
• Navigation app traffic dataset from the LA City Department*
  – Alerts: information reported by users
  – Jams: information captured by users' devices
• We are going to find out:
  – Areas with a high volume of traffic (geography)
  – Peak hours
  – Density of alerts and incidents
  – Traffic volume by road type
  – Prediction of traffic jams
*We had limited authorization to access the full dataset (100+ GB originally), so we used a 9-day subset (Dec 31 – Jan 8, 2018) of about 2 GB.
H/W Specification
Number of nodes:  6
OCPUs:            12
CPU speed:        2195.196 MHz
Memory:           180 GB
Storage:          682 GB
Architecture Chart
Source: Lars George (EMEA Chief Architect, Cloudera), Hadoop Masterclass, Part 4 of 4: Analyzing Big Data
Implementation steps
Local Computer:
• Raw data files (JSON)
• Geo-spatial visualization (3D map)
• Dashboard for analytics
Hadoop/Hive:
• Upload dataset to HDFS
• Parse JSON files using Pandas
• Create tables' schema
• Clean data
• Create sample/summary dataset for prediction and visualization
Microsoft Azure ML Studio:
• Upload sample dataset
• Apply data transformation
• Split dataset for training and scoring
• Train model(s)
• Evaluate model(s)
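To illustrate the JSON parsing step, here is a minimal sketch of flattening nested records into flat rows that fit a table schema. The slides use Pandas for this step; the version below uses only the Python standard library to show the idea. The record layout (a "jams" array with a nested "location" object) is an assumption made for illustration, not the actual structure of the dataset.

```python
import json


def flatten(record, parent_key="", sep="_"):
    """Flatten nested dictionaries into a single-level dict with joined keys."""
    flat = {}
    for key, value in record.items():
        name = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects, prefixing child keys with the parent name
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


# Hypothetical raw file content; field names are assumptions for illustration
raw = '{"jams": [{"level": 3, "speed": 12.5, "location": {"x": -118.25, "y": 34.05}}]}'
rows = [flatten(jam) for jam in json.loads(raw)["jams"]]
print(rows)
```

Each flattened row then has column-like keys such as location_x and location_y, which is the shape a Hive table schema expects.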
Data structure
Analysis
• Information we are using:
  – Location/time
  – Level of traffic intensity
  – X and Y coordinates (longitude and latitude)
  – Counts of jams/alerts
• Tools we are using:
  – Excel: 3D map
  – Power BI: flow map, pie charts, bar charts
• What we are predicting:
  – Level of traffic (1 to 3: light, medium, heavy)
  – Based on date, time, location
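As a rough illustration of how counts of jams can reveal peak hours, the sketch below tallies jam records by hour of day. The timestamp format and the sample values are assumptions made for the example; they are not taken from the dataset.

```python
from collections import Counter
from datetime import datetime


def jams_per_hour(timestamps, fmt="%Y-%m-%d %H:%M:%S"):
    """Count jam records by hour of day (0-23)."""
    return Counter(datetime.strptime(t, fmt).hour for t in timestamps)


# Hypothetical jam timestamps in Pacific Time
sample = [
    "2018-01-03 08:05:00",
    "2018-01-03 17:10:00",
    "2018-01-03 17:40:00",
    "2018-01-04 17:20:00",
]
counts = jams_per_hour(sample)
peak_hour, peak_count = counts.most_common(1)[0]
print(peak_hour, peak_count)
```

On the full dataset the same aggregation would more likely be a Hive GROUP BY query, with the result fed into the dashboard.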
Traffic in LA (captured from users' devices)
Traffic in LA (reported by app users)
Video-Simulation of Traffic in LA (captured from users' devices)
Video-Simulation of Traffic in LA (reported by app users)
Traffic Analysis Dashboard
[Figure: dashboard charts with two traffic peaks marked]
Major areas of traffic are Downtown Los Angeles, Santa Monica, Hollywood, and the highways.
Prediction of traffic congestion with Machine Learning
Data preparation:
• Group label values
• Join additional dataset
• Apply data transformation
• Normalize data
Model building:
• Model(s) selection
• Cross validation
• Train model(s)
Model evaluation:
• Score model
• Evaluate model (Accuracy, Recall)
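The cross validation step can be sketched as follows. This is only an illustration of the general k-fold idea in plain Python; the slides do not state the number of folds or the exact settings used in Azure ML Studio, so k=5 and the seed are assumptions.

```python
import random


def k_fold_indices(n_rows, k=5, seed=42):
    """Yield (train_idx, test_idx) index lists for k-fold cross validation."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # The first n_rows % k folds get one extra row so all rows are used once
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(10, k=5))
for train_idx, test_idx in folds:
    print(len(train_idx), len(test_idx))
```

Each row appears in exactly one test fold, so every model is scored on data it was not trained on.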
Features/columns in a dataset
Feature                   Description
location_x, location_y    X and Y coordinates of the location
date_pst                  Pacific Time of the publication of the traffic report
level                     Jam level, from 1 (almost no jam) to 5 (standstill jam)
speed                     Driver's captured speed in mph
length                    Length of the traffic ahead on the user's route, in meters
*date_pst                 *The date is split into month, day, hour, min, sec, weekday
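Splitting date_pst into its parts could look like the sketch below. The timestamp format string is an assumption, and weekday numbering here follows Python's convention (Monday = 0), which may differ from the numbering used in the original tables.

```python
from datetime import datetime


def split_timestamp(date_pst, fmt="%Y-%m-%d %H:%M:%S"):
    """Split a timestamp string into the date/time parts used as features."""
    ts = datetime.strptime(date_pst, fmt)
    return {
        "month": ts.month,
        "day": ts.day,
        "hour": ts.hour,
        "min": ts.minute,
        "sec": ts.second,
        "weekday": ts.weekday(),  # Monday = 0 ... Sunday = 6
    }


# Hypothetical report time; 2018-01-03 was a Wednesday
parts = split_timestamp("2018-01-03 17:45:10")
print(parts)
```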
Data transformation
• Randomly sample about 100 MB of data
• Select relevant features
• Group level into 2 classes (label: 0 and 1)
• Join holidays dataset
  – Add attribute is_holiday (0 or 1)
• Convert cyclical attributes from polar to Cartesian coordinates (sine and cosine)
• Add is_rush, is_weekend (0 or 1)
• Normalize features
• Make categorical: is_rush, is_holiday, is_weekend, label
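A minimal sketch of the flag and label steps is shown below. Several values are assumptions chosen for illustration and are not stated in the slides: the rush-hour windows (7–9 am and 3–6 pm), the cut-off level used to form the two classes, the contents of the holiday table, and the use of min-max scaling for normalization.

```python
from datetime import date, datetime

HOLIDAYS = {date(2018, 1, 1)}                       # hypothetical holiday table
RUSH_HOURS = set(range(7, 9)) | set(range(15, 18))  # assumed rush-hour windows
HEAVY_LEVEL = 3                                     # assumed cut-off for label = 1


def add_flags(ts, level):
    """Derive the binary attributes and the two-class label for one record."""
    return {
        "is_holiday": int(ts.date() in HOLIDAYS),
        "is_weekend": int(ts.weekday() >= 5),  # Saturday = 5, Sunday = 6
        "is_rush": int(ts.hour in RUSH_HOURS),
        "label": int(level >= HEAVY_LEVEL),
    }


def min_max(values):
    """Scale numeric values to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


row = add_flags(datetime(2018, 1, 1, 8, 30), level=4)
scaled = min_max([10.0, 20.0, 40.0])
print(row, scaled)
```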
SELECT location_x, location_y,
  SIN((weekday)*(2*PI()/7)) AS sin_weekday,
  COS((weekday)*(2*PI()/7)) AS cos_weekday,
  SIN((month-1)*(2*PI()/12)) AS sin_month,
  COS((month-1)*(2*PI()/12)) AS cos_month,
  SIN((day-1)*(2*PI()/31)) AS sin_day,
  COS((day-1)*(2*PI()/31)) AS cos_day,
  SIN(hour*(2*PI()/24)) AS sin_hour,
  COS(hour*(2*PI()/24)) AS cos_hour,
  SIN(min*(2*PI()/60)) AS sin_min,
  COS(min*(2*PI()/60)) AS cos_min,
  SIN(sec*(2*PI()/60)) AS sin_sec,
  COS(sec*(2*PI()/60)) AS cos_sec,
  …
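The query above places each cyclical attribute on a unit circle, so that values at the two ends of a cycle (for example hour 23 and hour 0) end up close together. The Python sketch below mirrors the same formula and checks that property; the function names are hypothetical.

```python
import math


def encode_cyclical(value, period, offset=0):
    """Map a cyclical value to (sin, cos) coordinates on the unit circle."""
    angle = (value - offset) * (2 * math.pi / period)
    return math.sin(angle), math.cos(angle)


def gap(a, b):
    """Euclidean distance between two encoded points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


midnight = encode_cyclical(0, 24)
late_vs_midnight = gap(encode_cyclical(23, 24), midnight)
noon_vs_midnight = gap(encode_cyclical(12, 24), midnight)
january = encode_cyclical(1, 12, offset=1)  # same as (month-1) in the query
print(late_vs_midnight, noon_vs_midnight, january)
```

With raw integer hours, 23 and 0 would look 23 units apart; after encoding, 23:00 is much closer to midnight than noon is.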
Model Evaluation
Model   Accuracy   Precision   Recall   AUC (ROC)
LR      0.662      0.662       1.0      0.571
BDT     0.805      0.832       0.884    0.868
DF      0.832      0.868       0.880    0.885
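For reference, accuracy, precision, and recall can all be derived from the four cells of a confusion matrix, as in the sketch below. The counts are hypothetical and were chosen only to reproduce the pattern in the LR row, where precision equals accuracy and recall is 1.0. That pattern is consistent with a classifier that predicts the positive class for every row, though the slides do not confirm this.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical counts: 1000 rows, 662 truly positive, all predicted positive
all_positive = classification_metrics(tp=662, fp=338, fn=0, tn=0)
print(all_positive)
```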
Summary of Traffic Prediction with Machine Learning
• Model is based on a sampled dataset of about 1M rows (100 MB)
• Best model: Decision Forest
  – Accuracy: 0.832
  – Precision: 0.868
  – Recall: 0.880
  – Area under the curve (AUC): 0.885
[Figure: confusion matrix]
Summary
Traffic is densest on freeways 101, 405, and 10.
The morning rush, from 7 am to 9 am, produces a lot of traffic; the heaviest traffic starts at 3 pm and eases after 6 pm.
Major traffic areas are Downtown LA (DTLA), Santa Monica, and Hollywood.
More insights could be found by applying this traffic-analysis framework to a bigger dataset.
Such data and such a platform also make it possible to predict traffic congestion. Using a machine learning algorithm, Decision Forest, the heaviest traffic jams were predicted with an accuracy of 83%.
Questions?

Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...amitlee9823
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightDelhi Call girls
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 

Kürzlich hochgeladen (20)

Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 

Traffic Prediction Using Big Data Analytics

  • 1. Jongwook Woo HiPIC CalStateLA KSII The 14th Asia Pacific International Conference on Information Science and Technology(APIC-IST), Beijing June 24 2019 Dalya (Dalyapraz) Dauletbak, dmanato@calstatela.edu Jongwook Woo, PhD Big Data AI Center (BigDAI) California State University Los Angeles Traffic Data Analysis and Prediction using Big Data
  • 2. Big Data Artificial Intelligence Center (BigDAI) Jongwook Woo CalStateLA Contents: Introduction; H/W Specification; Architecture Chart; Implementation steps; Data structure; Analysis; Prediction; Summary
  • 3. Introduction
      About me:
      - Graduate Computer Information Systems student at California State University, Los Angeles
      - BS (2015): Mathematics at Nazarbayev University
      - Previously: Senior Consultant/Data Analyst, Management Consulting at KPMG Central Asia
      - Current: Community Manager at International Data Engineering and Science Association (IDEAS)
      Data source:
      - A GPS navigation mobile application
      - Provides real-time directions and up-to-date information on traffic, accidents, road closures, weather hazards, lurking police vehicles, etc.
  • 4. Introduction
      Data source:
      - Navigation app traffic dataset from LA City Department*
      - Information reported by users: Alerts
      - Information captured by users' devices: Jams
      We are going to find out:
      - Areas with high volume of traffic (geography)
      - Peak hours
      - Density of Alerts and Incidents
      - Traffic volume by road type
      - Prediction of traffic jams
      *Authorization to access the full dataset (100+ GB) was limited; we used a 9-day subset (Dec 31 to Jan 8, 2018), about 2 GB.
  • 6. H/W Specification
      Number of nodes: 6
      OCPUs: 12
      CPU speed: 2195.196 MHz
      Memory: 180 GB
      Storage: 682 GB
  • 7. Architecture Chart. Source: Hadoop Masterclass Part 4 of 4: Analyzing Big Data, Lars George, EMEA Chief Architect, Cloudera
  • 8. Implementation steps
      Local Computer:
      - Raw data files (JSON)
      - Geo-spatial visualization (3D map)
      - Dashboard for analytics
      Hadoop/Hive:
      - Upload dataset to HDFS
      - Parse JSON files using Pandas
      - Create table schemas
      - Clean data
      - Create sample/summary dataset for prediction and visualization
      Microsoft Azure ML Studio:
      - Upload sample dataset
      - Apply data transformation
      - Split dataset for training and scoring
      - Train model(s)
      - Evaluate model(s)
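The "Parse JSON files" step can be sketched as follows. The slide names Pandas for this step; the sketch below swaps in the Python standard library so it stays self-contained, and it is not the authors' code. The payload layout (a top-level `jams` list with a nested `location` object) and the helper name `flatten` are assumptions for illustration; the real feed's field names may differ.

```python
import json


def flatten(record, parent_key="", sep="_"):
    """Flatten nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in record.items():
        name = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


# Hypothetical raw payload; field names are assumed, not taken from the real feed.
raw = """
{"jams": [
  {"location": {"x": -118.2437, "y": 34.0522}, "level": 3, "speed": 12.5, "length": 840},
  {"location": {"x": -118.4912, "y": 34.0195}, "level": 1, "speed": 31.0, "length": 210}
]}
"""

# One flat row per jam record, ready to load into a table.
rows = [flatten(jam) for jam in json.loads(raw)["jams"]]
columns = sorted(rows[0])
```

Each flattened row maps directly onto a table column list, which is the shape needed before creating a Hive table schema.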
  • 9. Data structure
  • 11. Analysis
      Information we are using:
      - Location/Time
      - Level of traffic intensity
      - X and Y coordinates (longitude and latitude)
      - Counts of jams/alerts
      Tools we are using:
      - Excel: 3D map
      - Power BI: flow map, pie charts, bar charts
      What we are predicting:
      - Level of traffic (1 to 3: light, medium, heavy)
      - Based on date, time, location
  • 12. Traffic in LA (captured from users' devices)
  • 13. Traffic in LA (reported by app users)
  • 14. Video simulation of traffic in LA (captured from users' devices)
  • 15. Video simulation of traffic in LA (reported by app users)
  • 16. Traffic Analysis Dashboard (two peaks marked on the chart)
  • 17. Traffic Analysis Dashboard. Major areas of traffic are Downtown Los Angeles, Santa Monica, Hollywood, and highways.
  • 19. Prediction of traffic congestion with Machine Learning
      Data preparation:
      - Group label values
      - Join additional dataset
      - Apply data transformation
      - Normalize data
      Model building:
      - Model selection
      - Cross-validation
      - Train model(s)
      Model evaluation:
      - Score model
      - Evaluate model (Accuracy, Recall)
  • 20. Features/columns in the dataset
      location x, location y: X and Y coordinates of the location
      date_pst*: Pacific Time of the publication of the traffic report
      level: jam level, where 1 = almost no jam and 5 = standstill jam
      speed: driver's captured speed in mph
      length: length of the traffic ahead on the user's route, in meters
      *date_pst is split into month, day, hour, min, sec, weekday
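Splitting date_pst into its parts can be sketched as below. This is a minimal illustration, not the authors' code: the timestamp format string and the helper name `split_timestamp` are assumptions, and the weekday numbering (Monday = 0) is Python's convention, which may differ from the one used in the original tables.

```python
from datetime import datetime


def split_timestamp(date_pst, fmt="%Y-%m-%d %H:%M:%S"):
    """Split a timestamp string into the calendar parts listed on the slide.

    `fmt` is an assumed format; adjust it to match the real date_pst column.
    """
    ts = datetime.strptime(date_pst, fmt)
    return {
        "month": ts.month,
        "day": ts.day,
        "hour": ts.hour,
        "min": ts.minute,
        "sec": ts.second,
        "weekday": ts.weekday(),  # Python convention: Monday = 0 ... Sunday = 6
    }


# Example timestamp inside the 9-day window used in the slides.
parts = split_timestamp("2018-01-05 17:42:09")
```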
  • 21. Data transformation
      - Randomly selected data, about 100 MB
      - Select relevant features
      - Group level into 2 classes (label: 0 and 1)
      - Join holidays dataset
      - Add attribute is_holiday (0 or 1)
      - Change cyclical attributes from polar coordinates to Cartesian
      - Add is_rush, is_weekend (0 or 1)
      - Normalize features
      - Make categorical: is_rush, is_holiday, is_weekend, label
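The label grouping and binary flags listed above can be sketched as follows. Several values here are assumptions rather than facts from the slides: the cut-off `HEAVY_LEVEL_THRESHOLD` is hypothetical (the slides do not state where the two classes are split), the rush-hour windows are taken from the deck's summary (7 to 9 am and 3 to 6 pm), and the one-entry `HOLIDAYS` set stands in for the separate holidays dataset that the slides join.

```python
from datetime import date

# Illustrative stand-in for the joined holidays dataset.
HOLIDAYS = {date(2018, 1, 1)}
# Hypothetical cut-off between the two label classes; not stated in the slides.
HEAVY_LEVEL_THRESHOLD = 3


def add_flags(row):
    """Return a copy of `row` with label, is_holiday, is_weekend and is_rush added."""
    day = row["date"]
    hour = row["hour"]
    return {
        **row,
        "label": int(row["level"] >= HEAVY_LEVEL_THRESHOLD),
        "is_holiday": int(day in HOLIDAYS),
        "is_weekend": int(day.weekday() >= 5),  # Saturday = 5, Sunday = 6
        "is_rush": int(7 <= hour < 9 or 15 <= hour < 18),
    }


# Monday, Jan 1, 2018 at 8 am with jam level 4.
sample = add_flags({"date": date(2018, 1, 1), "hour": 8, "level": 4})
```

Keeping the flags as 0/1 integers matches the slide's note that they are later treated as categorical columns.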
  • 22. SELECT location_x, location_y,
      SIN((weekday)*(2*PI()/7)) as sin_weekday, COS((weekday)*(2*PI()/7)) as cos_weekday,
      SIN((month-1)*(2*PI()/12)) as sin_month, COS((month-1)*(2*PI()/12)) as cos_month,
      SIN((day-1)*(2*PI()/31)) as sin_day, COS((day-1)*(2*PI()/31)) as cos_day,
      SIN(hour*(2*PI()/24)) as sin_hour, COS(hour*(2*PI()/24)) as cos_hour,
      SIN(min*(2*PI()/60)) as sin_min, COS(min*(2*PI()/60)) as cos_min,
      SIN(sec*(2*PI()/60)) as sin_sec, COS(sec*(2*PI()/60)) as cos_sec, …
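The reason for the sine/cosine pairs in the query is that a raw cyclical value misrepresents closeness: hour 23 and hour 0 differ by 23 numerically but are adjacent on the clock. Mapping each value to a point on the unit circle fixes this. The Python sketch below mirrors the query's formula for illustration only; the helper names `encode_cyclical` and `circle_distance` are not from the slides.

```python
import math


def encode_cyclical(value, period):
    """Map a cyclical value to (sin, cos) coordinates on the unit circle."""
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def circle_distance(a, b, period):
    """Euclidean distance between two encoded values."""
    sin_a, cos_a = encode_cyclical(a, period)
    sin_b, cos_b = encode_cyclical(b, period)
    return math.hypot(sin_a - sin_b, cos_a - cos_b)


sin_0, cos_0 = encode_cyclical(0, 24)  # midnight sits at (0, 1)
near = circle_distance(23, 0, 24)      # 23:00 vs 00:00: adjacent hours
far = circle_distance(12, 0, 24)       # 12:00 vs 00:00: opposite sides of the circle
```

After encoding, 23:00 and 00:00 are close together while noon and midnight are as far apart as two points on the circle can be, which is the behaviour a model needs from time-of-day features.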
  • 23. Model Evaluation
      Model   Accuracy   Precision   Recall   AUC ROC
      LR      0.662      0.662       1.0      0.571
      BDT     0.805      0.832       0.884    0.868
      DF      0.832      0.868       0.880    0.885
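Accuracy, precision, and recall all derive from the four confusion-matrix counts. The sketch below computes them with a hypothetical helper, `classification_metrics`. The counts are illustrative and are not the deck's actual confusion matrix: they describe a classifier that predicts the positive class for every row of a 1,000-row set with 662 positives. That case yields recall 1.0 with accuracy equal to precision, the same pattern as the LR row, which suggests (but does not prove) that the LR model rarely or never predicted the negative class.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision and recall from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Illustrative counts: every row predicted positive, 662 of 1,000 truly positive.
always_positive = classification_metrics(tp=662, fp=338, tn=0, fn=0)
```

This is why AUC is worth reading alongside accuracy: a model that always predicts the majority class can post a respectable accuracy while separating the classes poorly.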
  • 24. Summary of Traffic Prediction with Machine Learning
      - Model is based on a sampled dataset of about 1M rows (100 MB)
      - Best model: Decision Forest
      - Accuracy: 0.832
      - Precision: 0.868
      - Recall: 0.880
      - Area under the curve: 0.885
      (Figure: Confusion Matrix)
  • 26. Summary
      - Traffic is denser on Freeways 101, 405, and 10.
      - The morning rush from 7 am to 9 am produces heavy traffic; the heaviest traffic starts at 3 pm and eases after 6 pm.
      - Major areas of traffic are DTLA, Santa Monica, and Hollywood.
      - More insights can be found by applying this analysis framework to a bigger dataset.
      - Such data and platforms also make it possible to predict traffic congestion.
      - Prediction can be performed with a machine learning algorithm: Decision Forest predicts the heaviest traffic jams with an accuracy of 83%.
  • 27. Questions?