
 The three rules of data analysis won’t be difficult to
remember:
1. Make a picture—things may be revealed that are not
obvious in the raw data. These will be things to think
about.
2. Make a picture—important features of and patterns in
the data will show up. You may also see things that
you did not expect.
3. Make a picture—the best way to tell others about your
data is with a well-chosen picture.
The Three Rules of Data Analysis

 We can “pile” the data by counting the number of
data values in each category of interest.
 We can organize these counts into a frequency table,
which records the totals and the category names.
Frequency Tables
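The counting step can be sketched in a few lines of Python. The survey responses below are hypothetical stand-ins (the deck does not record any class data), and `Counter` is just one convenient way to pile the values.

```python
from collections import Counter

# Hypothetical survey responses; the deck does not record real class data.
responses = ["Red", "Blue", "Blue", "Green", "Purple", "Blue", "Red", "Green"]

# "Pile" the data: count how many responses fall in each category.
frequency = Counter(responses)

# Print the frequency table in a fixed category order, with the total.
categories = ["Red", "Blue", "Green", "Purple"]
print(f"{'Color':<8}{'Frequency':>9}")
for color in categories:
    print(f"{color:<8}{frequency[color]:>9}")
print(f"{'TOTAL':<8}{sum(frequency.values()):>9}")
```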

 What is your favorite color on this list?
 Red
 Blue
 Green
 Purple
Frequency Tables Example
Color Frequency
Red
Blue
Green
Purple

 A relative frequency table is similar, but gives the
percentages (instead of counts) for each category.
 Cumulative relative frequency is the sum of the
relative frequencies for a particular category and
all previous categories.
Relative Frequency Tables

 Both types of tables show how cases are distributed across
the categories.
 They describe the distribution of a categorical variable
because they name the possible categories and tell how
frequently each occurs.
 To calculate the relative frequency:
 Divide the count by the total number of cases
 Multiply by 100 to express as a percentage
 To calculate the cumulative relative frequency:
 Take the current category’s relative frequency and add it to all
previous relative frequencies
Creating Frequency Tables
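As a rough illustration of the two calculations above, here is a minimal sketch. The counts are hypothetical placeholders, and the variable names `relative` and `cumulative` are chosen only for this example.

```python
from itertools import accumulate

# Hypothetical counts; replace them with the class's actual tallies.
counts = {"Red": 2, "Blue": 3, "Green": 2, "Purple": 1}
total = sum(counts.values())

# Relative frequency: divide the count by the total, then multiply by 100.
relative = {color: 100 * n / total for color, n in counts.items()}

# Cumulative relative frequency: running sum of the relative frequencies.
cumulative = dict(zip(relative, accumulate(relative.values())))

for color in counts:
    print(f"{color:<8}{counts[color]:>4}{relative[color]:>8.1f}%{cumulative[color]:>8.1f}%")
```

The last cumulative value should come out to 100%, which is a quick check that no category was missed.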
 Let’s use the previous data we collected to create a relative
frequency table.
Relative Frequency: Example
Color | Frequency | Relative Frequency | Cumulative Relative Frequency
Red
Blue
Green
Purple
TOTAL

 You might think that
a good way to show
the Titanic data is
with this display. Is it?
What’s Wrong With This Picture?

 The ship display makes it look like most of the
people on the Titanic were crew members, with a few
passengers along for the ride.
 When we look at each ship, we see the area taken up
by the ship, instead of the length of the ship.
 The ship display violates the area principle:
 The area occupied by a part of the graph should
correspond to the magnitude of the value it
represents.
The Area Principle

 A bar chart displays the distribution of a categorical
variable, showing the counts for each category next to
each other for easy comparison.
 A bar chart stays true
to the area principle.
 Thus, a better display
for the ship data is:
 Bar charts have spaces
between the categories,
while histograms do not.
Bar Charts

 A relative frequency bar chart displays the relative
proportion of counts for each category.
 A relative frequency bar chart also stays true to the
area principle.
 Replacing counts
with percentages
in the ship data:
Bar Charts (cont.)

 Use equal increments along the dependent (y) axis.
 Follow the area principle: give every bar the same width so that each
bar’s area is proportional to its count (the categories should also be
evenly spaced along the x-axis).
 Example: Create a bar graph based on the collected data:
How to Create Bar Charts
Color Frequency
Red
Blue
Green
Purple
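One way to see the area principle at work is a plain-text bar chart, where every bar is one row thick and only its length varies. The counts and the helper name `text_bar_chart` are hypothetical, used only for this sketch.

```python
# Hypothetical counts; replace them with the class's actual tallies.
counts = {"Red": 2, "Blue": 3, "Green": 2, "Purple": 1}

def text_bar_chart(counts, symbol="#"):
    """Return one line per category; the bar length equals the count."""
    width = max(len(name) for name in counts)
    lines = []
    for name, n in counts.items():
        # Every bar has the same thickness (one text row), so its area
        # is proportional to its length, which respects the area principle.
        lines.append(f"{name:<{width}} | {symbol * n} {n}")
    return lines

for line in text_bar_chart(counts):
    print(line)
```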

 When you are interested in parts of the whole, a pie
chart might be your display of choice.
 Pie charts show the whole
group of cases as a circle.
 They slice the circle into
pieces whose size is
proportional to the
fraction of the whole
in each category.
Pie Charts
 Convert Your Data:
 Convert all data values to percentages of the whole data set. For example,
four radishes, three cucumbers, two carrots and one pepper equals 40
percent radishes, 30 percent cucumbers, 20 percent carrots and 10 percent
peppers.
 Convert the percentages into angles. Since a full circle is 360 degrees,
multiply this by the percentages to get the angle for each section of the
pie. For the radishes, 0.4 X 360 = 144 degrees. For the cucumbers, 0.3 X
360 = 108 degrees. For the carrots, 0.2 X 360 = 72 degrees. For the peppers,
0.1 X 360 = 36 degrees.
 Make sure the angle calculations are correct by adding all the angles. The
total should be 360. 144 + 108 + 72 + 36 = 360.
 You may be off by a tenth or so due to rounding, so be careful.
How to Accurately Create Pie
Charts
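The conversion from counts to percentages to angles can be checked with a short script. This sketch uses the vegetable counts from the example above; the variable names are assumptions made for illustration.

```python
# Counts from the example above: radishes, cucumbers, carrots, peppers.
counts = {"radishes": 4, "cucumbers": 3, "carrots": 2, "peppers": 1}
total = sum(counts.values())

# Each wedge's angle is its fraction of the whole times 360 degrees.
angles = {name: 360 * n / total for name, n in counts.items()}

for name, angle in angles.items():
    percent = 100 * counts[name] / total
    print(f"{name:<10}{percent:>6.1f}%{angle:>8.1f} degrees")

# Check: the angles should add up to 360, allowing for rounding error.
assert abs(sum(angles.values()) - 360) < 1e-9
```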
 Draw the Chart:
 Draw a circle on a blank sheet of paper, using a compass. While a compass is not
necessary, using one will make the chart much neater and clearer by ensuring the
circle is even.
 Draw a radius, from the center to the right edge of the circle, using the ruler or
straight edge. This will be the first base line.
 Measure the largest angle in the data with the protractor, starting at the baseline,
and mark it on the edge of the circle. Use the ruler to draw another radius to that
point. Use this new radius as a base line for your next largest angle and continue
this process until you get to the last data point. You will only need to measure the
last angle to verify its value since both lines will already be drawn.
 Label and shade the sections of the pie chart to highlight whatever data is
important for your use.
How to Accurately Create Pie
Charts (cont.)
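Before picking up the protractor, it can help to list where each radius will fall. The sketch below assumes the angles from the radish example and measures every wedge from the first baseline at 0 degrees; the names `ordered` and `wedges` are illustrative only.

```python
from itertools import accumulate

# Angles from the radish example on the previous slide.
angles = {"radishes": 144, "cucumbers": 108, "carrots": 72, "peppers": 36}

# Work from the largest angle to the smallest, as the steps suggest.
ordered = sorted(angles.items(), key=lambda item: item[1], reverse=True)

# Each wedge starts where the previous one ended; the first starts at
# the baseline (0 degrees).
ends = list(accumulate(angle for _, angle in ordered))
starts = [0] + ends[:-1]

wedges = [(name, start, end) for (name, _), start, end in zip(ordered, starts, ends)]
for name, start, end in wedges:
    print(f"{name:<10} from {start:>3} to {end:>3} degrees")
```

The last wedge should end at exactly 360 degrees, which is the same check as verifying the final angle by measurement.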

 What’s your favorite academic subject?
 Create your own pie chart to represent the class’s
distribution of subject preferences.
Class Activity
Subject Frequency
Math
English
History
Science
Day 2

 A contingency table allows us to look at two categorical
variables together.
 It shows how individuals are distributed along each
variable, depending on the value of the other variable.
 Example: we can examine the class of ticket and
whether a person survived the Titanic:
Contingency Tables

 The margins of the table, both on the right and on the
bottom, give totals and the frequency distributions for
each of the variables.
 Each frequency distribution is called a marginal
distribution of its respective variable.
 The marginal distribution of Survival is:
Contingency Tables (cont.)
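The marginal totals can be computed directly from the cell counts. The slide's table is not reproduced in the text, so the counts below are an assumption: they are the commonly published Titanic figures, which agree with the 673 crew deaths cited in this deck.

```python
# Assumed counts from the commonly published Titanic data set; the
# slide's own table image is not available in the text.
table = {
    "Alive": {"First": 203, "Second": 118, "Third": 178, "Crew": 212},
    "Dead": {"First": 122, "Second": 167, "Third": 528, "Crew": 673},
}

# Right margin: totals for each Survival category.
row_totals = {survival: sum(row.values()) for survival, row in table.items()}

# Bottom margin: totals for each ticket Class.
col_totals = {cls: sum(row[cls] for row in table.values()) for cls in table["Alive"]}

grand_total = sum(row_totals.values())

# Marginal distribution of Survival as counts and percentages.
for survival, n in row_totals.items():
    print(f"{survival:<6}{n:>6}{100 * n / grand_total:>7.1f}%")
```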

 Each cell of the table gives the count for a combination of
values of the two variables.
 For example, the second cell in the crew column tells
us that 673 crew members died when the Titanic sank.
Contingency Tables (cont.)

 The percentage of passengers who were both in 1st
class and survived:
 The percentage of 1st
class passengers who survived:
 The percentage of the survivors who were in 1st
class:
 Contingency tables allow us to make many comparisons.
Contingency Table Example
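The three percentages use the same numerator but different denominators, which is easy to see in code. The four counts below are assumed from the commonly published Titanic data, since the slide's values are not present in the text.

```python
# Assumed counts from the commonly published Titanic data set.
first_class_survivors = 203   # in 1st class AND survived
first_class_total = 325       # everyone in 1st class
survivors_total = 711         # everyone who survived
grand_total = 2201            # everyone on board

# Same numerator each time; only the "whole" changes.
pct_of_everyone = 100 * first_class_survivors / grand_total
pct_of_first_class = 100 * first_class_survivors / first_class_total
pct_of_survivors = 100 * first_class_survivors / survivors_total

print(f"1st class and survived, of everyone on board: {pct_of_everyone:.1f}%")
print(f"Survived, of 1st class passengers:            {pct_of_first_class:.1f}%")
print(f"In 1st class, of survivors:                   {pct_of_survivors:.1f}%")
```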

 This two-way table shows the favorite leisure
activities for 50 adults - 20 men and 30 women.
 What is the marginal distribution of the responses?
Example: Finding Marginal Distributions

 A conditional distribution shows the distribution of
one variable for just the individuals who satisfy some
condition on another variable.
 The following is the conditional distribution of ticket
Class, conditional on having survived:
Conditional Distributions

 The following is the conditional distribution of ticket
Class, conditional on having perished:
Conditional Distributions (cont.)
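A conditional distribution is a row of the contingency table divided by that row's total. The sketch below assumes the commonly published Titanic counts, because the slide's tables are not reproduced in the text.

```python
# Assumed counts from the commonly published Titanic data set.
table = {
    "Alive": {"First": 203, "Second": 118, "Third": 178, "Crew": 212},
    "Dead": {"First": 122, "Second": 167, "Third": 528, "Crew": 673},
}

# Distribution of Class within each Survival group, in percent.
conditional = {}
for survival, row in table.items():
    row_total = sum(row.values())
    conditional[survival] = {cls: 100 * n / row_total for cls, n in row.items()}

for survival, dist in conditional.items():
    cells = "  ".join(f"{cls}: {pct:.1f}%" for cls, pct in dist.items())
    print(f"{survival:<6}{cells}")
```

Each row of percentages sums to 100%, because each group is treated as its own whole.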

 The conditional distributions tell us that there is a
difference in class for those who survived and
those who perished.
 This is better
shown with
pie charts of
the two
distributions:
Conditional Distributions (cont.)

 We see that the distribution of Class for the
survivors is different from that of the
nonsurvivors.
 This leads us to believe that Class and Survival
are associated, that they are not independent.
 The variables would be considered independent
when the distribution of one variable in a
contingency table is the same for all categories of
the other variable.
Conditional Distributions (cont.)
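The independence idea can be turned into a rough descriptive check: compare the conditional distributions row by row. This is only a sketch, not a significance test. The function names, the 0.01 tolerance, and the small `balanced` table are hypothetical, and the Titanic counts are assumed from the commonly published data set.

```python
def conditional_rows(table):
    """Return each row of a two-way table as proportions of its row total."""
    return {
        row_name: {col: n / sum(row.values()) for col, n in row.items()}
        for row_name, row in table.items()
    }

def looks_independent(table, tol=0.01):
    """True if every row has the same conditional distribution within tol."""
    rows = list(conditional_rows(table).values())
    first = rows[0]
    return all(abs(row[col] - first[col]) <= tol for row in rows[1:] for col in first)

# Hypothetical table in which both rows split 25% / 75%.
balanced = {"Men": {"Yes": 10, "No": 30}, "Women": {"Yes": 20, "No": 60}}

# Assumed counts from the commonly published Titanic data set.
titanic = {
    "Alive": {"First": 203, "Second": 118, "Third": 178, "Crew": 212},
    "Dead": {"First": 122, "Second": 167, "Third": 528, "Crew": 673},
}

print(looks_independent(balanced))  # rows match, so no sign of association
print(looks_independent(titanic))   # rows differ, so Class and Survival are associated
```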

1. What % of the class are girls with liberal political views?
2. What % of conservatives are boys?
3. What % of girls are moderate?
4. What is the marginal frequency distribution of political views?
Activity: Political Views
Conservative Liberal Moderate Total
Female
Male
TOTAL

 When averages are taken across different
groups, they can appear to contradict the overall
averages.
Simpson’s Paradox
Day 3

 It’s the last inning of an important game. Your team is one run
down with the bases loaded and two outs. The pitcher is due
up, so you’ll be sending in a pinch-hitter. There are two batters
available on the bench. Whom should you send in to bat?
 Do the averages line up with the overall?
Simpson’s Paradox (cont.)
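A small numeric sketch shows how the paradox can arise. The hits and at-bats below are hypothetical and are not the figures from the slide; they are chosen so that one batter has the better average in each situation while the other has the better overall average.

```python
# Hypothetical (hits, at-bats) for two batters; not taken from the slides.
records = {
    "Batter A": {"vs left-handed": (12, 30), "vs right-handed": (50, 200)},
    "Batter B": {"vs left-handed": (70, 200), "vs right-handed": (6, 30)},
}

def average(hits, at_bats):
    """Batting average: hits divided by at-bats."""
    return hits / at_bats

overall = {}
for batter, splits in records.items():
    for situation, (hits, at_bats) in splits.items():
        print(f"{batter} {situation}: {average(hits, at_bats):.3f}")
    total_hits = sum(h for h, _ in splits.values())
    total_at_bats = sum(ab for _, ab in splits.values())
    overall[batter] = average(total_hits, total_at_bats)
    print(f"{batter} overall: {overall[batter]:.3f}")

# Batter A wins every split, yet Batter B wins overall.
a_wins_every_split = all(
    average(*records["Batter A"][s]) > average(*records["Batter B"][s])
    for s in records["Batter A"]
)
b_wins_overall = overall["Batter B"] > overall["Batter A"]
print(a_wins_every_split, b_wins_overall)
```

The reversal happens because the two batters faced very different mixes of pitchers, so the overall averages weight the situations unequally.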
 A segmented bar chart
displays the same
information as a pie
chart, but in the form
of bars instead of
circles.
 Each bar is treated as
the “whole” and is
divided proportionally
into segments
corresponding to the
percentage in each
group.
 Here is the segmented
bar chart for ticket
Class by Survival status:
Segmented Bar Charts

 Don’t violate the area principle.
 While some people might like the pie chart on the left
better, it makes it harder to compare fractions of the whole,
which is what a well-done pie chart should make easy.
What Can Go Wrong?

 Keep it honest—make sure your display shows what
it says it shows.
 This plot of the percentage of high-school students who
engage in specified dangerous behaviors has a
problem. Can you see it?
What Can Go Wrong? (cont.)

 Don’t confuse similar-sounding percentages—pay
particular attention to the wording of the context.
 Don’t forget to look at the variables separately too—
examine the marginal distributions, since it is
important to know how many cases are in each
category.
What Can Go Wrong? (cont.)

 Be sure to use enough individuals!
 Do not make a report like “We found that 66.67% of
the rats improved their performance with training. The
other rat died.”
What Can Go Wrong? (cont.)

 Don’t overstate your case—don’t claim something
you can’t.
 Don’t use unfair or silly averages—this could lead to
Simpson’s Paradox, so be careful when you average
one variable across different levels of a second
variable.
What Can Go Wrong? (cont.)

 We can summarize categorical data by counting the
number of cases in each category (expressing these as
counts or percents).
 We can display the distribution in a bar chart or pie
chart.
 And, we can examine two-way tables called
contingency tables, examining marginal and/or
conditional distributions of the variables.
Recap

 Day 1: #5 – 9 ODD, 11, 13 – 16
 Day 2: # 17, 19 – 21
 Day 3: # 23, 25, 28
 Day 4: # 32, 33, 39
 Read Chapter 4
Assignments: pp. 38-39

The Three Rules of Data Visualization

  • 1.
  • 2.   The three rules of data analysis won’t be difficult to remember: 1. Make a picture—things may be revealed that are not obvious in the raw data. These will be things to think about. 2. Make a picture—important features of and patterns in the data will show up. You may also see things that you did not expect. 3. Make a picture—the best way to tell others about your data is with a well-chosen picture. The Three Rules of Data Analysis
  • 3.   We can “pile” the data by counting the number of data values in each category of interest.  We can organize these counts into a frequency table, which records the totals and the category names. Frequency Tables
  • 4.   What is your favorite color on this list?  Red  Blue  Green  Purple Frequency Tables Example Color Frequency Red Blue Green Purple
  • 5.   A relative frequency table is similar, but gives the percentages (instead of counts) for each category.  Cumulative Frequency – calculates the sum of all relative frequencies for that particular category and all previous categories Relative Frequency Tables
  • 6.   Both types of tables show how cases are distributed across the categories.  They describe the distribution of a categorical variable because they name the possible categories and tell how frequently each occurs.  To calculate the relative frequency:  Divide the count by the total number to cases  Multiply by 100 to express as a percentage  To calculate the cumulative relative frequency:  Take the current category’s relative frequency and add it all previous relative frequencies Creating Frequency Tables
  • 7.  Let’s use the previous data we collected to create a relative frequency table. Relative Frequency: Example Color Frequency Relative Frequency Cumulative Relative Frequency Red Blue Green Purple TOTAL
  • 8.   You might think that a good way to show the Titanic data is with this display. Is it? What’s Wrong With This Picture?
  • 9.   The ship display makes it look like most of the people on the Titanic were crew members, with a few passengers along for the ride.  When we look at each ship, we see the area taken up by the ship, instead of the length of the ship.  The ship display violates the area principle:  The area occupied by a part of the graph should correspond to the magnitude of the value it represents. The Area Principle
  • 10.   A bar chart displays the distribution of a categorical variable, showing the counts for each category next to each other for easy comparison.  A bar chart stays true to the area principle.  Thus, a better display for the ship data is:  Bar charts have spaces between the categories, while histograms do not. Bar Charts
  • 11.   A relative frequency bar chart displays the relative proportion of counts for each category.  A relative frequency bar chart also stays true to the area principle.  Replacing counts with percentages in the ship data: Bar Charts (cont.)
  • 12.   Use equal increments along the dependent (y) axis.  Follow the area principle and be sure to keep the area equal in each segment (the x-axis should also have equal increments).  Example: Create a bar graph based on the collected data: How to Create Bar Charts Color Frequency Red Blue Green Purple
  • 13.   When you are interested in parts of the whole, a pie chart might be your display of choice.  Pie charts show the whole group of cases as a circle.  They slice the circle into pieces whose size is proportional to the fraction of the whole in each category. Pie Charts
  • 14.  Convert Your Data:  Convert all data values to percentages of the whole data set. For example, four radishes, three cucumbers, two carrots and one pepper equals 40 percent radishes, 30 percent cucumbers, 20 percent carrots and 10 percent peppers.  Convert the percentages into angles. Since a full circle is 360 degrees, multiply this by the percentages to get the angle for each section of the pie. For the radishes, 0.4 X 360 = 144 degrees. For the cucumbers, 0.3 X 360 = 108 degrees. For the carrots, 0.2 X 360 = 72 degrees. For the peppers, 0.1 X 360 = 36 degrees.  Make sure the angle calculations are correct by adding all the angles. The total should be 360. 144 + 108 + 72 + 36 = 360.  You may be off by a tenth or so due to rounding, so be careful. How to Accurately Create Pie Charts
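As a rough check on the arithmetic above, the conversion from counts to angles can be written as a short Python sketch. The vegetable counts come from the slide; the function name `pie_angles` is an assumption made for illustration.

```python
import math

# Counts from the slide's example: 4 radishes, 3 cucumbers, 2 carrots, 1 pepper.
counts = {"radishes": 4, "cucumbers": 3, "carrots": 2, "peppers": 1}


def pie_angles(counts):
    """Map each category to its slice angle in degrees (share of 360)."""
    total = sum(counts.values())
    return {name: count * 360 / total for name, count in counts.items()}


angles = pie_angles(counts)
for name, angle in angles.items():
    print(f"{name}: {angle:.1f} degrees")

# The slices should add up to a full circle, allowing for rounding error.
assert math.isclose(sum(angles.values()), 360.0)
```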
  • 15.  Draw the Chart:  Draw a circle on a blank sheet of paper, using a compass. While a compass is not necessary, using one will make the chart much neater and clearer by ensuring the circle is even.  Draw a radius, from the center to the right edge of the circle, using the ruler or straight edge. This will be the first base line.  Measure the largest angle in the data with the protractor, starting at the baseline, and mark it on the edge of the circle. Use the ruler to draw another radius to that point. Use this new radius as a base line for your next largest angle and continue this process until you get to the last data point. You will only need to measure the last angle to verify its value since both lines will already be drawn.  Label and shade the sections of the pie chart to highlight whatever data is important for your use. How to Accurately Create Pie Charts (cont.)
  • 16.   What’s your favorite academic subject?  Create your own pie chart to represent the class’s distribution of subject preferences. Class Activity Subject Frequency Math English History Science Day 2
  • 17.   A contingency table allows us to look at two categorical variables together.  It shows how individuals are distributed along each variable, depending on the value of the other variable.  Example: we can examine the class of ticket and whether a person survived the Titanic: Contingency Tables
  • 18.   The margins of the table, both on the right and on the bottom, give totals and the frequency distributions for each of the variables.  Each frequency distribution is called a marginal distribution of its respective variable.  The marginal distribution of Survival is: Contingency Tables (cont.)
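A marginal distribution is just the row or column totals of the table. The sketch below assumes the commonly tabulated Titanic counts (2201 people in all); only the figure of 673 crew deaths appears in these slides, so treat the other cells as an assumption about the table shown on the slide.

```python
# Assumed counts by ticket class and survival (commonly cited Titanic table).
titanic = {
    "First":  {"Alive": 203, "Dead": 122},
    "Second": {"Alive": 118, "Dead": 167},
    "Third":  {"Alive": 178, "Dead": 528},
    "Crew":   {"Alive": 212, "Dead": 673},
}

# Column totals: the marginal distribution of Survival, as counts.
survival_totals = {}
for row in titanic.values():
    for status, count in row.items():
        survival_totals[status] = survival_totals.get(status, 0) + count

# Row totals: the marginal distribution of Class, as counts.
class_totals = {cls: sum(row.values()) for cls, row in titanic.items()}

grand_total = sum(survival_totals.values())
survival_percent = {
    status: 100 * count / grand_total for status, count in survival_totals.items()
}

for status, percent in survival_percent.items():
    print(f"{status}: {survival_totals[status]} ({percent:.1f}%)")
```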
  • 19.   Each cell of the table gives the count for a combination of values of the two variables.  For example, the second cell in the crew column tells us that 673 crew members died when the Titanic sank. Contingency Tables (cont.)
  • 20.   The percentage of passengers who were both in 1st class and survived:  The percentage of 1st class passengers who survived:  The percentage of the survivors who were in 1st class:  Contingency tables allow us to make many comparisons. Contingency Table Example
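The three percentages above use the same cell count but different denominators. A minimal Python sketch, again assuming the commonly tabulated Titanic counts rather than values stated in these slides:

```python
# Assumed counts by ticket class and survival (commonly cited Titanic table).
titanic = {
    "First":  {"Alive": 203, "Dead": 122},
    "Second": {"Alive": 118, "Dead": 167},
    "Third":  {"Alive": 178, "Dead": 528},
    "Crew":   {"Alive": 212, "Dead": 673},
}

first_and_alive = titanic["First"]["Alive"]
grand_total = sum(sum(row.values()) for row in titanic.values())
first_total = sum(titanic["First"].values())
alive_total = sum(row["Alive"] for row in titanic.values())

# Same numerator, three different questions.
pct_of_everyone = 100 * first_and_alive / grand_total      # 1st class AND survived
pct_of_first_class = 100 * first_and_alive / first_total   # of 1st class, survived
pct_of_survivors = 100 * first_and_alive / alive_total     # of survivors, 1st class

print(f"In 1st class and survived: {pct_of_everyone:.1f}%")
print(f"1st class who survived:    {pct_of_first_class:.1f}%")
print(f"Survivors in 1st class:    {pct_of_survivors:.1f}%")
```

Under these assumed counts the three answers are roughly 9.2%, 62.5%, and 28.6%, which shows how much the wording of the question matters.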
  • 21.   This two-way table shows the favorite leisure activities for 50 adults - 20 men and 30 women.  What is the marginal distribution of the responses? Example: Finding Marginal Distributions
  • 22.   A conditional distribution shows the distribution of one variable for just the individuals who satisfy some condition on another variable.  The following is the conditional distribution of ticket Class, conditional on having survived: Conditional Distributions
  • 23.   The following is the conditional distribution of ticket Class, conditional on having perished: Conditional Distributions (cont.)
  • 24.   The conditional distributions tell us that there is a difference in class for those who survived and those who perished.  This is better shown with pie charts of the two distributions: Conditional Distributions (cont.)
  • 25.   We see that the distribution of Class for the survivors is different from that of the nonsurvivors.  This leads us to believe that Class and Survival are associated, that they are not independent.  The variables would be considered independent when the distribution of one variable in a contingency table is the same for all categories of the other variable. Conditional Distributions (cont.)
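One way to see the association is to compute both conditional distributions and compare them side by side. This is a sketch under the assumption that the table holds the commonly cited Titanic counts; the helper name `conditional_distribution` is hypothetical.

```python
# Assumed counts by ticket class and survival (commonly cited Titanic table).
titanic = {
    "First":  {"Alive": 203, "Dead": 122},
    "Second": {"Alive": 118, "Dead": 167},
    "Third":  {"Alive": 178, "Dead": 528},
    "Crew":   {"Alive": 212, "Dead": 673},
}


def conditional_distribution(table, status):
    """Percent in each Class among only the people with the given status."""
    total = sum(row[status] for row in table.values())
    return {cls: 100 * row[status] / total for cls, row in table.items()}


survivors = conditional_distribution(titanic, "Alive")
nonsurvivors = conditional_distribution(titanic, "Dead")

for cls in titanic:
    print(f"{cls:<7}{survivors[cls]:>6.1f}%{nonsurvivors[cls]:>8.1f}%")

# If Class and Survival were independent, the two columns would match
# (up to chance variation); a large gap suggests an association.
largest_gap = max(abs(survivors[cls] - nonsurvivors[cls]) for cls in titanic)
print(f"Largest gap: {largest_gap:.1f} percentage points")
```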
  • 26.  1. What % of the class are girls with liberal political views? 2. What % of conservatives are boys? 3. What % of girls are moderate? 4. What is the marginal frequency distribution of political views? Activity: Political Views Conservative Liberal Moderate Total Female Male TOTAL
  • 27.   When averages are taken across different groups, they can appear to contradict the overall averages. Simpson’s Paradox Day 3
  • 28.   It’s the last inning of an important game. Your team is one run down with the bases loaded and two outs. The pitcher is due up, so you’ll be sending in a pinch-hitter. There are two batters available on the bench. Whom should you send in to bat?  Do the averages line up with the overall? Simpson’s Paradox (cont.)
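The reversal can be reproduced with a small Python sketch. The hit and at-bat figures below are hypothetical (the slide's own numbers are not in the text), and the two "splits" stand for any two groups of at-bats, such as different kinds of pitchers.

```python
# Hypothetical (hits, at-bats) records for two batters across two splits.
records = {
    "Batter A": {"split 1": (4, 10), "split 2": (25, 100)},
    "Batter B": {"split 1": (35, 100), "split 2": (2, 10)},
}


def batting_average(hits, at_bats):
    return hits / at_bats


def overall_average(splits):
    """Pool hits and at-bats across splits before dividing."""
    total_hits = sum(hits for hits, _ in splits.values())
    total_at_bats = sum(at_bats for _, at_bats in splits.values())
    return total_hits / total_at_bats


for batter, splits in records.items():
    by_split = ", ".join(
        f"{name}: {batting_average(*record):.3f}" for name, record in splits.items()
    )
    print(f"{batter}: {by_split}, overall: {overall_average(splits):.3f}")
```

In this made-up data Batter A has the higher average within each split, yet Batter B has the higher overall average, because the two batters have very different numbers of at-bats in each split.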
  • 29.  A segmented bar chart displays the same information as a pie chart, but in the form of bars instead of circles.  Each bar is treated as the “whole” and is divided proportionally into segments corresponding to the percentage in each group.  Here is the segmented bar chart for ticket Class by Survival status: Segmented Bar Charts
  • 30.   Don’t violate the area principle.  While some people might like the pie chart on the left better, it is harder to compare fractions of the whole, which a well-done pie chart does. What Can Go Wrong?
  • 31.   Keep it honest—make sure your display shows what it says it shows.  This plot of the percentage of high-school students who engage in specified dangerous behaviors has a problem. Can you see it? What Can Go Wrong? (cont.)
  • 32.   Don’t confuse similar-sounding percentages—pay particular attention to the wording of the context.  Don’t forget to look at the variables separately too— examine the marginal distributions, since it is important to know how many cases are in each category. What Can Go Wrong? (cont.)
  • 33.   Be sure to use enough individuals!  Do not make a report like “We found that 66.67% of the rats improved their performance with training. The other rat died.” What Can Go Wrong? (cont.)
  • 34.   Don’t overstate your case—don’t claim something you can’t.  Don’t use unfair or silly averages—this could lead to Simpson’s Paradox, so be careful when you average one variable across different levels of a second variable. What Can Go Wrong? (cont.)
  • 35.   We can summarize categorical data by counting the number of cases in each category (expressing these as counts or percents).  We can display the distribution in a bar chart or pie chart.  And, we can examine two-way tables called contingency tables, examining marginal and/or conditional distributions of the variables. Recap
  • 36.   Day 1: #5 – 9 ODD, 11, 13 – 16  Day 2: # 17, 19 – 21  Day 3: # 23, 25, 28  Day 4: # 32, 33, 39  Read Chapter 4 Assignments: pp. 38-39