Before we start!
Access today's content in one of two ways:
Recommended:
• Go to https://rstudio.cloud and create an account.
• Once that’s completed, go to https://rstudio.cloud/project/358879
Less Recommended:
• Go to https://github.com/rharrington31/drexel_visualization_workshop and clone or download the repository. You will need R and RStudio installed locally.
Visualizing Data with R
Ryan Harrington
@rharrington31
I’m Ryan.
Goals for today
Understand how to think about exploratory data analysis
Understand how to use R to create graphs
ggplot2
“Grammar of Graphics”
Plot specification at a high level of abstraction
Very flexible
Theme system for polishing plot appearance
Mature and complete graphics system
Source: http://tutorials.iq.harvard.edu/R/Rgraphics/Rgraphics.htm
ggplot2
Source: https://fivethirtyeight.com/features/baseballs-are-more-consistently-juiced-than-ever/
Source: https://bbc.github.io/rcookbook/
ggplot2 is all about layers.
Example: https://evamaerey.github.io/ggplot_flipbook/ggplot_flipbook_xaringan.html#35
What makes up a layer?
• data: what data is being used to build the layer?
• mapping: what field(s) from the data are being used to build the layer?
• geometry: what shape should our data take when building the layer?
• statistic: how should our data be transformed when building the layer?
• position: where should our data be placed when building the layer?
Source: https://rpubs.com/hadley/ggplot2-layers
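The five components listed above can be spelled out one by one with ggplot2's layer() function. The sketch below is only an illustration: the toy data frame and its column names are hypothetical and not part of the workshop material.

```r
library(ggplot2)

# Hypothetical toy data, invented for this sketch.
toy <- data.frame(x = c(1, 2, 3), y = c(2, 4, 3))

# Every component of a layer stated explicitly.
explicit <- ggplot() +
  layer(
    data     = toy,                 # data: what is being plotted
    mapping  = aes(x = x, y = y),   # mapping: which fields drive which aesthetics
    geom     = "point",             # geometry: the shape the data takes
    stat     = "identity",          # statistic: no transformation of the data
    position = "identity"           # position: no adjustment of placement
  )

# The everyday shorthand; geom_point() defaults to the same stat and position.
shorthand <- ggplot() +
  geom_point(data = toy, mapping = aes(x = x, y = y))
```

In practice the shorthand geom_*() functions are what most code uses; writing layer() by hand is mainly useful for seeing what those functions fill in.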
Getting Started
Data we’ll be working with
Last month, the City of Chicago began publishing anonymized rideshare data:
• Drivers
• Vehicles
• Trips
We'll focus specifically on a sample of the Trips dataset.
Goal: Can we predict whether or not a ride will be tipped?
Trips Data: https://data.cityofchicago.org/Transportation/Transportation-Network-Providers-Trips/m6dm-c72p
Background Info: https://www.chicago.gov/city/en/depts/bacp/provdrs/vehic/news/2019/april/tnpdata.html
RMarkdown
More Information: https://rmarkdown.rstudio.com
Like Jupyter Notebooks, but for R.
R Markdown documents are fully reproducible. Use a productive notebook interface to weave together narrative text and code to produce elegantly formatted output. Use multiple languages including R, Python, and SQL.
YAML Header
Occurs only once, at the top of the document.
Lets you specify metadata about your document.
Markdown
Text that is not evaluated as code. Basic text formatting is possible, from bold and italic text to lists.
Code can also be included inline with the markdown; it is evaluated when the document is “knit”.
Code Chunks
Actual code. Each chunk can be evaluated independently.
It is possible to use a variety of languages beyond R in the chunks.
Output
Output from each code chunk is included immediately below the chunk. This allows for easier exploration.
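To make the four parts concrete, here is a minimal sketch that writes a small R Markdown file from R. The title, the text, and the use of the built-in cars dataset are assumptions chosen for illustration; they do not come from the workshop files.

```r
# Three backticks, built programmatically so this example nests cleanly.
fence <- strrep("`", 3)

rmd_lines <- c(
  "---",                                     # YAML header: once, at the top
  "title: \"Hypothetical example\"",
  "output: html_document",
  "---",
  "",
  "Some **bold** and *italic* markdown.",    # markdown: not evaluated as code
  "",
  "The data has `r nrow(cars)` rows.",       # inline code: evaluated at knit time
  "",
  paste0(fence, "{r}"),                      # start of an R code chunk
  "summary(cars)",                           # output would appear below the chunk
  fence                                      # end of the chunk
)

path <- tempfile(fileext = ".Rmd")
writeLines(rmd_lines, path)

# With the rmarkdown package and pandoc installed, the file could be knit with:
# rmarkdown::render(path)
```

In RStudio the same result comes from opening the .Rmd file and pressing the Knit button.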
Tidyverse
More Information: https://tidyverse.org
• ggplot2: graphing
• tidyr: tidy data
• readr: read data
• stringr: manipulate strings
• dplyr: data manipulation
• purrr: functional programming
• tibble: better data frames
• forcats: better factors
ggplot2 Layers
More Information: https://rpubs.com/hadley/ggplot2-layers
Date    Item   Quantity  Price
3/1/19  Pants  2         $19.99
3/2/19  Shirt  1         $14.99
3/3/19  Shirt  4         $14.99
3/3/19  Belt   2         $8.99
3/4/19  Pants  1         $19.99
3/5/19  Hat    1         $12.99
3/6/19  Pants  3         $19.99
3/6/19  Belt   3         $8.99
data: data = _
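As a sketch of the data layer, the table above can be entered as a data frame and handed to ggplot(). The object name sales, the lower-case column names, and the reading of the dates as month/day/year are assumptions made for this example.

```r
library(ggplot2)

# The slide's table as a data frame (names chosen for this sketch).
sales <- data.frame(
  date = as.Date(
    c("3/1/19", "3/2/19", "3/3/19", "3/3/19",
      "3/4/19", "3/5/19", "3/6/19", "3/6/19"),
    format = "%m/%d/%y"   # assumes month/day/two-digit year
  ),
  item     = c("Pants", "Shirt", "Shirt", "Belt", "Pants", "Hat", "Pants", "Belt"),
  quantity = c(2, 1, 4, 2, 1, 1, 3, 3),
  price    = c(19.99, 14.99, 14.99, 8.99, 19.99, 12.99, 19.99, 8.99)
)

# Data only: a valid plot object, but with no mapping or geometry yet
# there is nothing to draw beyond an empty panel.
p <- ggplot(data = sales)
```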
ggplot2 Layers
More Information: https://rpubs.com/hadley/ggplot2-layers
data: data = _
+
aesthetic mapping: aes(x = _, y = _, …)
Available aesthetics: x, y, alpha, colour, fill, group, linetype, size
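A small sketch of adding the aesthetic mapping to the data. The sales data frame repeats two columns of the slide's table under assumed names, and the choice of which column goes to which aesthetic is illustrative only.

```r
library(ggplot2)

# Two columns from the slide's table (names chosen for this sketch).
sales <- data.frame(
  item     = c("Pants", "Shirt", "Shirt", "Belt", "Pants", "Hat", "Pants", "Belt"),
  quantity = c(2, 1, 4, 2, 1, 1, 3, 3)
)

# aes() ties columns to aesthetics; each aesthetic then varies with the data.
p <- ggplot(
  data    = sales,
  mapping = aes(x = item, y = quantity, fill = item)
)

# Inspect which aesthetics have been mapped so far.
mapped <- names(p$mapping)
```

Printing p at this stage would show axes for item and quantity but no marks, because no geometry has been added.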
ggplot2 Layers
More Information: https://rpubs.com/hadley/ggplot2-layers
Cheatsheet: https://www.rstudio.com/wp-content/uploads/2015/03/ggplot2-cheatsheet.pdf
data: data = _
+
aesthetic mapping: aes(x = _, y = _, …)
+
geometry: geom_bar()
Available aesthetics: x, y, alpha, colour, fill, group, linetype, size
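Putting the three pieces together might look like the sketch below. The data frame and its column names are assumptions based on the slide's table. Note that geom_bar() counts rows by default and so expects only one of x or y; geom_col() is shown as the variant to use when a y column should set the bar height.

```r
library(ggplot2)

# Two columns from the slide's table (names chosen for this sketch).
sales <- data.frame(
  item     = c("Pants", "Shirt", "Shirt", "Belt", "Pants", "Hat", "Pants", "Belt"),
  quantity = c(2, 1, 4, 2, 1, 1, 3, 3)
)

# data + aesthetic mapping + geometry.
# geom_bar() uses stat = "count": bar height is the number of rows per item.
p_count <- ggplot(data = sales, mapping = aes(x = item)) +
  geom_bar()

# geom_col() uses stat = "identity": rows for the same item are stacked,
# so bar height is the total quantity per item.
p_total <- ggplot(data = sales, mapping = aes(x = item, y = quantity)) +
  geom_col()

print(p_count)
```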
Where can you get data?
Open Data Network (opendatanetwork.com/)
OpenDataPhilly (opendataphilly.org)
Data is Plural (tinyletter.com/data-is-plural)
Kaggle (kaggle.com)
Data.World (data.world)
What we covered today
medium.com/@rharrington31

Data Visualizations with ggplot2
