Stat405
       Graphical theory & critique


                          Hadley Wickham
Monday, 2 November 2009
Exploratory graphics
                  Are for you (not others). Need to be able
                  to create rapidly because your first
                  attempt will never be the most revealing.
                  Iteration is crucial for developing the best
                  display of your data.
                  Gives rise to two key questions:



What should I plot?
                    How can I plot it?


Two general tools
                                Plot critique toolkit:
                           “graphics are like pumpkin pie”
                               Theory behind ggplot2:
                          “A layered grammar of graphics”



                           plus lots of practice...

Graphics are like
                     pumpkin pie

                          The four C’s of critiquing a graphic
                          Content
                          Construction
                          Context
                          Consumption

Content
                    What data (variables) does the
                    graph display?
                    What non-data is present?
                    What is pumpkin (essence of the
                    graphic) vs what is spice (useful
                    additional info)?

Your turn

                    Identify the data and non-data
                    on “Napoleon's march” and
                    “Building an electoral victory”.
                    Which features are the most
                    important? Which are just
                    useful background information?

Results
                    Minard’s march: (top) latitude,
                    longitude, number of troops,
                    direction, branch, city name
                    (bottom) latitude, temperature, date
                    Building an electoral victory: state,
                    number of electoral college votes,
                    winner, margin of victory

Construction

                    How many layers are on the plot?
                    What data does each layer
                    display? What sort of geometric
                    object does it use? Is it a summary
                    of the raw data? How are
                    variables mapped to aesthetics?

Perceptual mapping
                          (For continuous variables only!)

                          Best   1. Position along a common scale
                                 2. Position along nonaligned scale
                                 3. Length
                                 4. Angle/slope
                                 5. Area
                                 6. Volume
                          Worst 7. Colour
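
To make the ranking concrete, here is a minimal sketch that encodes the same continuous variable twice: once as position along a common scale (top of the ranking) and once as colour (bottom). The choice of `hwy` from the `mpg` data set and the object names are assumptions made for illustration.

```r
library(ggplot2)

# hwy encoded as position along a common scale (rank 1).
by_position <- ggplot(mpg, aes(displ, hwy)) +
  geom_point()

# The same variable encoded as colour (rank 7); y is held constant
# so that colour is the only channel carrying hwy.
by_colour <- ggplot(mpg, aes(displ, y = 0, colour = hwy)) +
  geom_point()

print(by_position)
print(by_colour)
```

Comparing the two plots side by side, small differences in `hwy` should be much easier to read from the first.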


Your turn
                    Answer the following questions
                    for “Napoleon's march” and
                    “Flight delays”:
                    How many layers are on the plot?
                    What data does the layer
                    display? How does it display it?

Results
                    Napoleon’s march: (top) (1) path plot with width
                    mapped to number of troops, colour to direction,
                    separate group for each branch (2) labels giving
                    city names (bottom) (1) line plot with longitude on
                    x-axis and temperature on y-axis (2) text labels
                    giving dates
                    Flight delays: (1) white circles showing 100%
                    cancellation, (2) outline of states, (3) points with
                    size proportional to percent cancellations at each
                    airport.


We can explain the
                          composition of a graphic
                          in words, but how do we
                                 create it?



“If any number of
                          magnitudes are each
                          the same multiple of
                          the same number of
                          other magnitudes,
                          then the sum is that
                          multiple of the sum.”
                          Euclid, ~300 BC




                          m(Σx) = Σ(mx)
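
A quick numeric check of the identity, as a sketch; the values of `m` and `x` are arbitrary and chosen only for illustration.

```r
# Arbitrary example values (not from the slides).
m <- 3
x <- c(2, 5, 11)

lhs <- m * sum(x)   # multiply the sum:       3 * 18
rhs <- sum(m * x)   # sum the multiples:      6 + 15 + 33

print(c(lhs = lhs, rhs = rhs))
print(lhs == rhs)
```

The one-line formula says the same thing as Euclid's sentence, which is the point of the slide: a good notation makes the idea easier to state and reason about.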
The grammar of graphics
                  An abstraction which makes thinking about,
                  reasoning about and communicating
                  graphics easier.
                  Developed by Leland Wilkinson, particularly
                  in “The Grammar of Graphics” 1999/2005
                  You’ve been using it in ggplot2 without
                  knowing it! But to do more, you need to
                  learn more about the theory.


What is a layer?
                  • Data
                  • Mappings from variables to aesthetics
                    (aes)
                  • A geometric object (geom)
                  • A statistical transformation (stat)
                  • A position adjustment (position)


layer(geom, stat, position, data, mapping, ...)

     layer(
       data = mpg,
       mapping = aes(x = displ, y = hwy),
       geom = "point",
       stat = "identity",
       position = "identity"
     )

     layer(
       data = diamonds,
       mapping = aes(x = carat),
       geom = "bar",
       stat = "bin",
       position = "stack"
     )

# A lot of typing!

     layer(
       data = mpg,
       mapping = aes(x = displ, y = hwy),
       geom = "point",
       stat = "identity",
       position = "identity"
     )

     # Every geom has an associated default statistic
     # (and vice versa), and position adjustment.

     geom_point(aes(displ, hwy), data = mpg)
     geom_histogram(aes(displ), data = mpg)
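
One way to see those defaults is to inspect the layer objects that the shortcut functions return. This is a sketch only: `describe_layer()` is a hypothetical helper, and reading the `geom`, `stat` and `position` fields of a layer relies on how recent ggplot2 versions store layers, which is an assumption rather than a documented interface.

```r
library(ggplot2)

# Build two layers with the shortcut functions.
points <- geom_point(aes(displ, hwy), data = mpg)
histogram <- geom_histogram(aes(displ), data = mpg)

# Hypothetical helper (not part of ggplot2): report the class names of
# the geom, stat and position objects stored on a layer.
describe_layer <- function(layer) {
  c(
    geom = class(layer$geom)[1],
    stat = class(layer$stat)[1],
    position = class(layer$position)[1]
  )
}

print(describe_layer(points))
print(describe_layer(histogram))
```

The point layer should report an identity stat and identity position, while the histogram layer should report a binning stat and a stacking position, matching the two `layer()` calls written out by hand earlier.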
# To actually create the plot
     ggplot() +
       geom_point(aes(displ, hwy), data = mpg)

     ggplot() +
       geom_histogram(aes(displ), data = mpg)




# Multiple layers
     ggplot() +
       geom_point(aes(displ, hwy), data = mpg) +
       geom_smooth(aes(displ, hwy), data = mpg)

     # Avoid redundancy:
     ggplot(mpg, aes(displ, hwy)) +
       geom_point() +
       geom_smooth()
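
As a rough check that the two forms describe the same plot, the sketch below compares the data computed for the point layer in each. It assumes a ggplot2 version that provides `layer_data()`; the object names are made up for illustration.

```r
library(ggplot2)

# Data and mapping repeated in every layer.
explicit <- ggplot() +
  geom_point(aes(displ, hwy), data = mpg) +
  geom_smooth(aes(displ, hwy), data = mpg)

# Data and mapping set once in ggplot() and inherited by the layers.
inherited <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  geom_smooth()

# Compare what each plot computes for its first (point) layer.
same_points <- isTRUE(all.equal(
  layer_data(explicit, 1),
  layer_data(inherited, 1)
))
print(same_points)
```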




# Different layers can have different aesthetics
     ggplot(mpg, aes(displ, hwy)) +
       geom_point(aes(colour = class)) +
       geom_smooth()

     ggplot(mpg, aes(displ, hwy)) +
       geom_point(aes(colour = class)) +
       geom_smooth(aes(group = class), method = "lm",
         se = FALSE)




Your turn
                  For each of the following plots created with
                  qplot, recreate the equivalent ggplot code.
                  qplot(price, carat, data = diamonds)
                  qplot(hwy, cty, data = mpg, geom = "jitter")
                  qplot(reorder(class, hwy), hwy, data = mpg,
                    geom = c("jitter", "boxplot"))
                  qplot(log10(price), log10(carat),
                    data = diamonds, colour = color) +
                    geom_smooth(method = "lm")


ggplot(diamonds, aes(price, carat)) +
       geom_point()

     ggplot(mpg, aes(hwy, cty)) +
       geom_jitter()

     ggplot(mpg, aes(reorder(class, hwy), hwy)) +
       geom_jitter() +
       geom_boxplot()

     ggplot(diamonds, aes(log10(price), log10(carat),
       colour = color)) +
       geom_point() +
       geom_smooth(method = "lm")

More geoms & stats

                  See http://had.co.nz/ggplot2 for complete
                  list with helpful icons:
                  Geoms: (0d) point, (1d) line, path, (2d)
                  boxplot, bar, tile, text, polygon
                  Stats: bin, summary, sum
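
A short sketch of one of these stats in use: `stat_summary()` draws one summary point per class on top of the jittered raw data. The `fun` argument assumes ggplot2 3.3.0 or later (older versions used `fun.y`), and the colours and sizes are arbitrary choices.

```r
library(ggplot2)

p <- ggplot(mpg, aes(class, hwy)) +
  # Layer 1: raw data, jittered horizontally only.
  geom_jitter(width = 0.2, height = 0, alpha = 0.3) +
  # Layer 2: a statistical summary (the mean of hwy for each class).
  stat_summary(fun = mean, geom = "point", colour = "red", size = 3)

print(p)
```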



Your turn

                  Go back to the descriptions of “Minard’s
                  march” and “Flight delays” that you
                  created before. Start converting your
                  textual description into working ggplot2
                  code. Use the data sets available online
                  to produce the graphics.
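
As a starting point, here is a minimal sketch of the top panel of Minard's march built from the earlier description: a path layer plus a text layer. The data frames are made up for illustration and are NOT Minard's figures; the column names are assumptions, and the `linewidth` aesthetic assumes ggplot2 3.4.0 or later (earlier versions used `size`).

```r
library(ggplot2)

# Made-up troop positions and counts, for illustration only.
troops <- data.frame(
  long = c(24, 28, 32, 36, 36, 32, 28, 24),
  lat = c(55.0, 55.2, 54.8, 55.5, 55.3, 54.5, 54.3, 54.6),
  survivors = c(300, 250, 180, 100, 90, 60, 30, 10),
  direction = rep(c("advance", "retreat"), each = 4),
  group = 1
)

# Made-up city labels, for illustration only.
cities <- data.frame(
  long = c(24, 36),
  lat = c(55.0, 55.5),
  city = c("Start", "End")
)

p <- ggplot(troops, aes(long, lat)) +
  # Layer 1: path with width mapped to troop numbers, colour to
  # direction, and a separate group for each branch.
  geom_path(
    aes(linewidth = survivors, colour = direction, group = group),
    lineend = "round"
  ) +
  # Layer 2: text labels giving city names.
  geom_text(aes(label = city), data = cities, vjust = -1)

print(p)
```

Swapping in the real data sets should only require changing the two data frames, provided the column names match.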



Other features
                  Scales. Used to override default perceptual
                  mappings. Mainly useful for polishing plot for
                  communication.
                  Coordinate systems. Rarely needed, but
                  critical when they are.
                  Facetting. Have seen in use already. Only other
                  feature of importance is interaction with scales.
                  Themes: control presentation of non-data
                  elements.
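
The four features can be sketched on a single plot as follows. The particular palette, limits, facetting variable and theme are arbitrary choices made for illustration, not recommendations from the slides.

```r
library(ggplot2)

p <- ggplot(mpg, aes(displ, hwy, colour = drv)) +
  geom_point() +
  # Scale: override the default colour mapping.
  scale_colour_brewer(palette = "Set1") +
  # Coordinate system: zoom the x axis without dropping data.
  coord_cartesian(xlim = c(1.5, 7)) +
  # Facetting: one panel per class, with y scales free to vary by panel.
  facet_wrap(~ class, scales = "free_y") +
  # Theme: change the look of the non-data elements.
  theme_bw()

print(p)
```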


Project 3 preview
                  Choose your own dataset.
                  Results will be presented as a poster during
                  a formal poster session in the last week of
                  class.
                  On November 16, Tracey Volz (a
                  communications expert) will come and
                  talk about how to create a good poster.


Data suggestions

                  http://delicious.com/hadley/data
                  http://github.com/hadley/data-housing-crisis



