Describing
Quantitative Data
with Numbers
Summarizing distributions of
univariate data
1. Measuring center: median, mean
2. Measuring spread: range, interquartile
range, standard deviation
3. Measuring position: quartiles, percentiles,
standardized scores (z-scores)
4. Using boxplots
5. The effect of changing units on summary
measures
Measuring Center
 When describing the “center” of a set of
data, we can use the mean or the median.
 Mean: “Average” value
 Median: “Center” value (Q2)
Where is the Center of the
Distribution?
 If you had to pick a single number to describe
all the data what would you pick?
 It’s easy to find the center when a histogram is
unimodal and symmetric—it’s right in the
middle.
 On the other hand, it’s not so easy to find the
center of a skewed histogram or a histogram
with more than one mode.
Mean
To find the mean
of a set of
observations, add
their values and
divide by the
number of
observations.
$$\bar{x} = \frac{\sum x_i}{n}$$
Find the mean of:
2 3 4 6 8 12
$$\bar{x} = \frac{2 + 3 + 4 + 6 + 8 + 12}{6} = \frac{35}{6} \approx 5.833$$
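The calculation above can be reproduced in a few lines of Python. This is only a minimal sketch; the function name `mean` and the variable names are chosen for illustration.

```python
def mean(values):
    """Arithmetic mean: sum of the observations divided by their count."""
    if not values:
        raise ValueError("mean requires at least one observation")
    return sum(values) / len(values)


data = [2, 3, 4, 6, 8, 12]   # the six observations from the slide
x_bar = mean(data)           # 35 / 6
print(round(x_bar, 3))       # 5.833
```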
 Although the mean is the most popular
measure of center, it is not always the most
appropriate.
 The mean is very sensitive to extreme
observations (outliers).
 Because outliers affect the mean, we say
that the mean is NOT a resistant measure of
center.
 So if the mean is not a resistant measure of
center, what is? Median
Median
The median is the value with
exactly half the data values
below it and half above it.
 It is the middle data value
(once the data values have
been ordered) that divides
the histogram into two
equal areas
 It has the same units as
the data
 The median is not
influenced by extreme
observations, so we say
that the median is a
resistant measure of
center.
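A quick way to see what "resistant" means is to add one extreme value to a small data set and compare how far each measure moves. The sketch below uses Python's standard `statistics` module; the extreme value 120 is a made-up observation added purely for illustration.

```python
from statistics import mean, median

scores = [2, 3, 4, 6, 8, 12]
with_outlier = scores + [120]   # hypothetical extreme observation

# How far does each measure of center move when the outlier is added?
mean_shift = mean(with_outlier) - mean(scores)
median_shift = median(with_outlier) - median(scores)

print("mean moved by", round(mean_shift, 2))
print("median moved by", median_shift)
```

In this example the mean jumps by more than 16 units, while the median moves by only 1.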
Finding the Median
First sort the values (arrange them in order),
then follow one of these:
1. If the number of data values is even, the
median is found by computing the mean of
the two middle numbers.
2. If the number of data values is odd, the
median is the number located in the exact
middle of the list.
5.40 1.10 0.42 0.73 0.48 1.10
0.42 0.48 0.73 1.10 1.10 5.40
(in order; even number of values, so there is no exact middle: it is shared by two numbers)
MEDIAN is (0.73 + 1.10) / 2 = 0.915
5.40 1.10 0.42 0.73 0.48 1.10 0.66
0.42 0.48 0.66 0.73 1.10 1.10 5.40
(in order - odd number of values)
exact middle MEDIAN is 0.73
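The two rules (even count, odd count) can be sketched as a small Python function. This is an illustrative implementation written for these slides, not a library routine; the standard library's `statistics.median` behaves the same way.

```python
def median(values):
    """Sort the values, then take the middle one (odd count)
    or the mean of the two middle ones (even count)."""
    if not values:
        raise ValueError("median requires at least one observation")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # exact middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # mean of the two middle values


even_case = [5.40, 1.10, 0.42, 0.73, 0.48, 1.10]
odd_case = even_case + [0.66]

print(round(median(even_case), 3))  # 0.915
print(median(odd_case))             # 0.73
```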
Mean vs Median
Mean | Median
Average value of variable | Typical value of variable
Not resistant to outliers | Resistant to outliers
A good measure when the data is symmetric | A reliable measure regardless of the shape of the distribution
Farther out in the long tail than the median when data is skewed | Close to the center even when the data is skewed
Easy to find | Less prone to mistakes
Check For Understanding
Measuring Spread
 Range
 Interquartile Range (IQR)
 Standard Deviation
Range
 Distance between largest and smallest values.
 Range = Maximum – Minimum
 Range is useful if there are no outliers.
Interquartile Range
How to find the IQR:
1. Find the median.
2. Find the median of each half of the data:
the lower median is the 1st Quartile (Q1)
the upper median is the 3rd Quartile (Q3)
3. Subtract the first quartile from the third: IQR = Q3 – Q1
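The three steps can be sketched in Python as follows. One detail the steps leave open is what to do with the median itself when the number of values is odd; this sketch assumes it is left out of both halves, which is a common textbook convention but not the only one. The function names are illustrative.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    Assumption: for an odd count, the overall median is excluded
    from both the lower and the upper half.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)


def iqr(values):
    """Interquartile range: Q3 minus Q1."""
    q1, _, q3 = quartiles(values)
    return q3 - q1


data = [2, 3, 4, 6, 8, 12]
print(quartiles(data))  # (3, 5.0, 8)
print(iqr(data))        # 5
```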
Outliers
One general rule of thumb for identifying
outliers is finding any data points that lie:
Lower than 1.5 * IQR below Q1
OR
Higher than 1.5 * IQR above Q3
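As a rough sketch, the rule of thumb amounts to computing two cut-off values (often called fences) and flagging anything outside them. The data set below is hypothetical, and its quartiles (Q1 = 3, Q3 = 12) are assumed to have been found beforehand by the median-of-halves method.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return (lower_fence, upper_fence) for the k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, q1, q3):
    """Values lying below the lower fence or above the upper fence."""
    low, high = outlier_fences(q1, q3)
    return [x for x in values if x < low or x > high]


data = [2, 3, 4, 6, 8, 12, 40]       # hypothetical data
q1, q3 = 3, 12                       # assumed quartiles for this data
print(outlier_fences(q1, q3))        # (-10.5, 25.5)
print(find_outliers(data, q1, q3))   # [40]
```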
Check For Understanding
• The “Descriptive Statistics” of test grades for a certain
class are listed below.
Mean = 74.71
Median = 76
Standard Deviation = 12.61
Minimum = 35
Maximum = 94
Q1 = 68
Q3 = 84
• (a) Determine the IQR for this data.
• (b) Using the answer from part (a), determine whether
the lowest and highest values in the data are outliers.
Standard Deviation
The standard deviation measures the typical deviation of the observations from the mean.
$$s_x = \sqrt{\frac{1}{n-1}\sum (x_i - \bar{x})^2}$$
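To make the formula concrete, here is a minimal Python sketch of the sample standard deviation (dividing by n − 1). The function name `sample_std` is illustrative; Python's built-in `statistics.stdev` computes the same quantity.

```python
from math import sqrt


def sample_std(values):
    """Sample standard deviation: square root of the sum of squared
    deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    x_bar = sum(values) / n
    squared_deviations = sum((x - x_bar) ** 2 for x in values)
    return sqrt(squared_deviations / (n - 1))


data = [2, 3, 4, 6, 8, 12]
print(round(sample_std(data), 2))  # 3.71
```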
If the data is uniform or symmetric, use:
Center: Mean
Spread: Standard deviation
If the data is skewed, use:
Center: Median
Spread: Five-number summary, Range, IQR
Distributions with Outliers
 Since outliers affect mean and standard
deviation, it is usually better to use median
and IQR
 However, if the distribution is unimodal—use
mean and median and just report outliers
separately
 However, if you find a simple reason for
the outlier, eliminate it and use the mean and
standard deviation, if the distribution is symmetric
Measuring Position
 Quartiles
 Percentiles
 Z-scores
• We can use either z-scores or percentiles to describe the location of an observation in a distribution.
• z-Scores use the mean and standard deviation.
• Percentiles use the observation's position in the ordered data, that is, the percentage of values that fall below it.
Percentiles/Quartiles
• $P_k$ is the notation for the kth percentile
• $Q_n$ is the notation for the nth quartile
$P_{25} = Q_1$
$P_{50} = Q_2 = \text{median}$
$P_{75} = Q_3$
Finding Percentiles
If you are trying to find the percentile corresponding to a certain score x:
$$\text{Percentile} = \frac{\text{number of scores} < x}{\text{total number of scores}} \times 100$$
• Percentiles are used often when reporting academic
scores such as SAT scores. Let’s say you get a 620 on
the math portion of the SAT. It might also indicate
that you are in the “78th percentile”. That means
that you scored better than 78% of all students
taking that particular SAT.
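The percentile formula translates directly into code. The sketch below counts the scores strictly below x, as the formula states; the list of scores is made up for illustration and the function name is an assumption.

```python
def percentile_of_score(scores, x):
    """Percentage of scores that are strictly less than x."""
    if not scores:
        raise ValueError("need at least one score")
    below = sum(1 for s in scores if s < x)
    return 100 * below / len(scores)


scores = [54, 61, 68, 72, 76, 80, 85, 88, 92, 99]  # hypothetical scores
print(percentile_of_score(scores, 85))  # 60.0
```

Here a score of 85 is higher than 6 of the 10 scores, so it sits at the 60th percentile.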
Measuring Relative Standing With
Standardized Values (z-Scores)
• One way to compare an individual to the whole distribution is to describe its location in the distribution relative to the mean.
• Let’s do this by describing how many standard deviations an individual is away from the mean value.
• We call this the “standardized value,” or the “z-score.”
Here is how to interpret z-scores:
 A z-score less than 0 represents an element less than
the mean.
 A z-score greater than 0 represents an element
greater than the mean.
 A z-score equal to 0 represents an element equal to
the mean.
 A z-score equal to 1 represents an element that is 1
standard deviation greater than the mean; a z-score
equal to 2, 2 standard deviations greater than the
mean; etc.
 A z-score equal to -1 represents an element that is 1
standard deviation less than the mean; a z-score equal
to -2, 2 standard deviations less than the mean; etc.
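Since a z-score counts how many standard deviations a value lies from the mean, it can be computed as z = (x − mean) / standard deviation. A minimal sketch, using a hypothetical distribution with mean 70 and standard deviation 10:

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / std_dev


# Hypothetical distribution: mean 70, standard deviation 10
print(z_score(85, 70, 10))  # 1.5  (above the mean)
print(z_score(60, 70, 10))  # -1.0 (below the mean)
print(z_score(70, 70, 10))  # 0.0  (equal to the mean)
```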
Five-Number Summary
The five-number summary of a distribution
consists of the smallest observation, the first
quartile, the median, the third quartile, and the
largest observation, written in order from
smallest to largest.
Minimum Q1 Median Q3 Maximum
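A five-number summary can be sketched in Python as shown below. The quartiles are computed as medians of the lower and upper halves, assuming the overall median is left out of both halves when the count is odd; other quartile conventions exist and can give slightly different values. The data reuse the seven values from the median example.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum).

    Assumption: quartiles are medians of the lower and upper halves,
    excluding the overall median when the count is odd.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])


data = [5.40, 1.10, 0.42, 0.73, 0.48, 1.10, 0.66]
print(five_number_summary(data))  # (0.42, 0.48, 0.73, 1.1, 5.4)
```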
Boxplots
The five-number summary divides the
distribution roughly into quarters. This leads
to a new way to display quantitative data, the
boxplot.
How to make a boxplot:
1. Draw and label a number line that includes
the range of the distribution.
2. Draw a central box from Q1 to Q3.
3. Note the median M inside the box.
4. Extend lines (whiskers) from the box out to
the minimum and maximum values that are
not outliers.
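Step 4 is the least obvious part of the recipe: the whiskers stop at the most extreme values that are not flagged as outliers. The sketch below works out where the whiskers end, assuming the 1.5 × IQR rule for outliers and quartiles that have already been computed (Q1 = 0.48 and Q3 = 1.10 for the seven values from the median example). It does not draw the plot itself.

```python
def whisker_ends(values, q1, q3):
    """Return (low_whisker, high_whisker, outliers).

    Whiskers extend to the smallest and largest values that the
    1.5 * IQR rule does not flag as outliers.
    """
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [x for x in values if low_fence <= x <= high_fence]
    outliers = [x for x in values if x < low_fence or x > high_fence]
    return min(inside), max(inside), outliers


data = [0.42, 0.48, 0.66, 0.73, 1.10, 1.10, 5.40]
print(whisker_ends(data, q1=0.48, q3=1.10))  # (0.42, 1.1, [5.4])
```

In this example the upper whisker ends at 1.10 and the value 5.40 would be plotted as a separate point.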
Comparing Boxplots
Check For Understanding
Effect of Changing Units
Sometimes, researchers change units (minutes to hours, feet to meters, etc.). Here is how measures of central tendency are affected when we change units:
 If you add a constant to every value, the mean and median increase by the same constant.
Example: Suppose you have a set of scores with a mean equal to 5 and a median equal to 6. If you add 10 to every score, the new mean will be 5 + 10 = 15, and the new median will be 6 + 10 = 16.
 If you multiply every value by a constant, the mean and the median are also multiplied by that constant.
Example: Assume that a set of scores has a mean of 5 and a median of 6. If you multiply each of these scores by 10, the new mean will be 5 * 10 = 50, and the new median will be 6 * 10 = 60.
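The rules for adding a constant and multiplying by a constant can be checked numerically. The scores below are hypothetical, chosen so that the mean is 5 and the median is 6, matching the examples on this slide.

```python
from statistics import mean, median

scores = [1, 2, 6, 7, 9]              # hypothetical scores: mean 5, median 6

shifted = [x + 10 for x in scores]    # add 10 to every score
scaled = [x * 10 for x in scores]     # multiply every score by 10

print(mean(scores), median(scores))    # mean 5, median 6
print(mean(shifted), median(shifted))  # mean 15, median 16
print(mean(scaled), median(scaled))    # mean 50, median 60
```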
Check For Understanding
The average score on a test is 150 with a
standard deviation of 15. Each score is then
increased by 25. What are the new mean and
standard deviation?
Check For Understanding
The test grades from a college statistics class are shown
below.
85 72 64 65 98 78 75 76 82 80 61 92 72 58 65 74 92 85 74 76 77 77
62 68 68 54 62 76 73 85 88 91 99 82 80 74 76 77 70 60
(a) Construct two different graphs of these data
(b) Calculate the five-number summary and the mean and
standard deviation of the data.
(c) Describe the distribution of the data, citing both the plots and the summary statistics found in questions (a) and (b).

Weitere ähnliche Inhalte

Was ist angesagt?

Measure OF Central Tendency
Measure OF Central TendencyMeasure OF Central Tendency
Measure OF Central TendencyIqrabutt038
 
MERITS AND DEMERITS OF MEAN,MEDIAN,MODE,GM,HM AND WHEN TO USE THEM
MERITS AND DEMERITS OF MEAN,MEDIAN,MODE,GM,HM AND WHEN TO USE THEMMERITS AND DEMERITS OF MEAN,MEDIAN,MODE,GM,HM AND WHEN TO USE THEM
MERITS AND DEMERITS OF MEAN,MEDIAN,MODE,GM,HM AND WHEN TO USE THEMRephelPaulManasaiS
 
CABT Math 8 measures of central tendency and dispersion
CABT Math 8   measures of central tendency and dispersionCABT Math 8   measures of central tendency and dispersion
CABT Math 8 measures of central tendency and dispersionGilbert Joseph Abueg
 
Stat3 central tendency & dispersion
Stat3 central tendency & dispersionStat3 central tendency & dispersion
Stat3 central tendency & dispersionForensic Pathology
 
Measure of central tendency
Measure of central tendency Measure of central tendency
Measure of central tendency Kannan Iyanar
 
Measures of central tendency and dispersion
Measures of central tendency and dispersionMeasures of central tendency and dispersion
Measures of central tendency and dispersionAbhinav yadav
 
Properties of Standard Deviation
Properties of Standard DeviationProperties of Standard Deviation
Properties of Standard DeviationRizwan Sharif
 
Measures of central tendency
Measures of central tendencyMeasures of central tendency
Measures of central tendencyRichard Paulino
 
Measure of-central-tendency-ppt
Measure of-central-tendency-pptMeasure of-central-tendency-ppt
Measure of-central-tendency-pptMark Jhon Dumadag
 
3.1 measures of central tendency
3.1 measures of central tendency3.1 measures of central tendency
3.1 measures of central tendencyleblance
 
Measure of Central Tendency
Measure of Central Tendency Measure of Central Tendency
Measure of Central Tendency Umme Habiba
 
Biostatistics measures of central tendency
Biostatistics   measures of central tendencyBiostatistics   measures of central tendency
Biostatistics measures of central tendencyKarmadipsinh Zala
 
Measures of central tendency
Measures of central tendencyMeasures of central tendency
Measures of central tendencyguest232a662
 
Thiyagu measures of central tendency final
Thiyagu   measures of central tendency finalThiyagu   measures of central tendency final
Thiyagu measures of central tendency finalThiyagu K
 

Was ist angesagt? (19)

Measures of central tendency
Measures of central tendencyMeasures of central tendency
Measures of central tendency
 
Central tendency
Central tendencyCentral tendency
Central tendency
 
Measure OF Central Tendency
Measure OF Central TendencyMeasure OF Central Tendency
Measure OF Central Tendency
 
Measures Of Central Tendencies
Measures Of Central TendenciesMeasures Of Central Tendencies
Measures Of Central Tendencies
 
MERITS AND DEMERITS OF MEAN,MEDIAN,MODE,GM,HM AND WHEN TO USE THEM
MERITS AND DEMERITS OF MEAN,MEDIAN,MODE,GM,HM AND WHEN TO USE THEMMERITS AND DEMERITS OF MEAN,MEDIAN,MODE,GM,HM AND WHEN TO USE THEM
MERITS AND DEMERITS OF MEAN,MEDIAN,MODE,GM,HM AND WHEN TO USE THEM
 
CABT Math 8 measures of central tendency and dispersion
CABT Math 8   measures of central tendency and dispersionCABT Math 8   measures of central tendency and dispersion
CABT Math 8 measures of central tendency and dispersion
 
Stat3 central tendency & dispersion
Stat3 central tendency & dispersionStat3 central tendency & dispersion
Stat3 central tendency & dispersion
 
Measure of central tendency
Measure of central tendency Measure of central tendency
Measure of central tendency
 
Measures of central tendency and dispersion
Measures of central tendency and dispersionMeasures of central tendency and dispersion
Measures of central tendency and dispersion
 
Properties of Standard Deviation
Properties of Standard DeviationProperties of Standard Deviation
Properties of Standard Deviation
 
Measures of central tendency
Measures of central tendencyMeasures of central tendency
Measures of central tendency
 
Measure of-central-tendency-ppt
Measure of-central-tendency-pptMeasure of-central-tendency-ppt
Measure of-central-tendency-ppt
 
3.1 measures of central tendency
3.1 measures of central tendency3.1 measures of central tendency
3.1 measures of central tendency
 
Measure of Central Tendency
Measure of Central Tendency Measure of Central Tendency
Measure of Central Tendency
 
Central tendency
Central tendencyCentral tendency
Central tendency
 
Biostatistics measures of central tendency
Biostatistics   measures of central tendencyBiostatistics   measures of central tendency
Biostatistics measures of central tendency
 
central tendency.pptx
central tendency.pptxcentral tendency.pptx
central tendency.pptx
 
Measures of central tendency
Measures of central tendencyMeasures of central tendency
Measures of central tendency
 
Thiyagu measures of central tendency final
Thiyagu   measures of central tendency finalThiyagu   measures of central tendency final
Thiyagu measures of central tendency final
 

Ähnlich wie Describing quantitative data with numbers

best for normal distribution.ppt
best for normal distribution.pptbest for normal distribution.ppt
best for normal distribution.pptDejeneDay
 
statical-data-1 to know how to measure.ppt
statical-data-1 to know how to measure.pptstatical-data-1 to know how to measure.ppt
statical-data-1 to know how to measure.pptNazarudinManik1
 
Descriptions of data statistics for research
Descriptions of data   statistics for researchDescriptions of data   statistics for research
Descriptions of data statistics for researchHarve Abella
 
2-Descriptive statistics.pptx
2-Descriptive statistics.pptx2-Descriptive statistics.pptx
2-Descriptive statistics.pptxSandipanMaji3
 
Empirics of standard deviation
Empirics of standard deviationEmpirics of standard deviation
Empirics of standard deviationAdebanji Ayeni
 
Lect 3 background mathematics
Lect 3 background mathematicsLect 3 background mathematics
Lect 3 background mathematicshktripathy
 
QT1 - 03 - Measures of Central Tendency
QT1 - 03 - Measures of Central TendencyQT1 - 03 - Measures of Central Tendency
QT1 - 03 - Measures of Central TendencyPrithwis Mukerjee
 
Describing Distributions with Numbers
Describing Distributions with NumbersDescribing Distributions with Numbers
Describing Distributions with Numbersnszakir
 
Measures of Dispersion.pptx
Measures of Dispersion.pptxMeasures of Dispersion.pptx
Measures of Dispersion.pptxVanmala Buchke
 
Lect 3 background mathematics for Data Mining
Lect 3 background mathematics for Data MiningLect 3 background mathematics for Data Mining
Lect 3 background mathematics for Data Mininghktripathy
 
Topic 2 Measures of Central Tendency.pptx
Topic 2   Measures of Central Tendency.pptxTopic 2   Measures of Central Tendency.pptx
Topic 2 Measures of Central Tendency.pptxCallplanetsDeveloper
 
3. measures of central tendency
3. measures of central tendency3. measures of central tendency
3. measures of central tendencyrenz50
 
Topic 8a Basic Statistics
Topic 8a Basic StatisticsTopic 8a Basic Statistics
Topic 8a Basic StatisticsYee Bee Choo
 
descriptive data analysis
 descriptive data analysis descriptive data analysis
descriptive data analysisgnanasarita1
 
Measures of central tendancy
Measures of central tendancy Measures of central tendancy
Measures of central tendancy Pranav Krishna
 
Basic Statistical Descriptions of Data.pptx
Basic Statistical Descriptions of Data.pptxBasic Statistical Descriptions of Data.pptx
Basic Statistical Descriptions of Data.pptxAnusuya123
 

Ähnlich wie Describing quantitative data with numbers (20)

best for normal distribution.ppt
best for normal distribution.pptbest for normal distribution.ppt
best for normal distribution.ppt
 
statical-data-1 to know how to measure.ppt
statical-data-1 to know how to measure.pptstatical-data-1 to know how to measure.ppt
statical-data-1 to know how to measure.ppt
 
Stat11t chapter3
Stat11t chapter3Stat11t chapter3
Stat11t chapter3
 
statistics
statisticsstatistics
statistics
 
Descriptions of data statistics for research
Descriptions of data   statistics for researchDescriptions of data   statistics for research
Descriptions of data statistics for research
 
Statistics
StatisticsStatistics
Statistics
 
2-Descriptive statistics.pptx
2-Descriptive statistics.pptx2-Descriptive statistics.pptx
2-Descriptive statistics.pptx
 
Empirics of standard deviation
Empirics of standard deviationEmpirics of standard deviation
Empirics of standard deviation
 
Lect 3 background mathematics
Lect 3 background mathematicsLect 3 background mathematics
Lect 3 background mathematics
 
QT1 - 03 - Measures of Central Tendency
QT1 - 03 - Measures of Central TendencyQT1 - 03 - Measures of Central Tendency
QT1 - 03 - Measures of Central Tendency
 
Describing Distributions with Numbers
Describing Distributions with NumbersDescribing Distributions with Numbers
Describing Distributions with Numbers
 
Measures of Dispersion.pptx
Measures of Dispersion.pptxMeasures of Dispersion.pptx
Measures of Dispersion.pptx
 
Lect 3 background mathematics for Data Mining
Lect 3 background mathematics for Data MiningLect 3 background mathematics for Data Mining
Lect 3 background mathematics for Data Mining
 
Data analysis
Data analysisData analysis
Data analysis
 
Topic 2 Measures of Central Tendency.pptx
Topic 2   Measures of Central Tendency.pptxTopic 2   Measures of Central Tendency.pptx
Topic 2 Measures of Central Tendency.pptx
 
3. measures of central tendency
3. measures of central tendency3. measures of central tendency
3. measures of central tendency
 
Topic 8a Basic Statistics
Topic 8a Basic StatisticsTopic 8a Basic Statistics
Topic 8a Basic Statistics
 
descriptive data analysis
 descriptive data analysis descriptive data analysis
descriptive data analysis
 
Measures of central tendancy
Measures of central tendancy Measures of central tendancy
Measures of central tendancy
 
Basic Statistical Descriptions of Data.pptx
Basic Statistical Descriptions of Data.pptxBasic Statistical Descriptions of Data.pptx
Basic Statistical Descriptions of Data.pptx
 

Mehr von Ulster BOCES

Sampling distributions
Sampling distributionsSampling distributions
Sampling distributionsUlster BOCES
 
Geometric distributions
Geometric distributionsGeometric distributions
Geometric distributionsUlster BOCES
 
Binomial distributions
Binomial distributionsBinomial distributions
Binomial distributionsUlster BOCES
 
Means and variances of random variables
Means and variances of random variablesMeans and variances of random variables
Means and variances of random variablesUlster BOCES
 
General probability rules
General probability rulesGeneral probability rules
General probability rulesUlster BOCES
 
Planning and conducting surveys
Planning and conducting surveysPlanning and conducting surveys
Planning and conducting surveysUlster BOCES
 
Overview of data collection methods
Overview of data collection methodsOverview of data collection methods
Overview of data collection methodsUlster BOCES
 
Exploring bivariate data
Exploring bivariate dataExploring bivariate data
Exploring bivariate dataUlster BOCES
 
Normal probability plot
Normal probability plotNormal probability plot
Normal probability plotUlster BOCES
 
Exploring data stemplot
Exploring data   stemplotExploring data   stemplot
Exploring data stemplotUlster BOCES
 
Exploring data other plots
Exploring data   other plotsExploring data   other plots
Exploring data other plotsUlster BOCES
 
Exploring data histograms
Exploring data   histogramsExploring data   histograms
Exploring data histogramsUlster BOCES
 
Calculating percentages from z scores
Calculating percentages from z scoresCalculating percentages from z scores
Calculating percentages from z scoresUlster BOCES
 
Standardizing scores
Standardizing scoresStandardizing scores
Standardizing scoresUlster BOCES
 
Intro to statistics
Intro to statisticsIntro to statistics
Intro to statisticsUlster BOCES
 
Displaying quantitative data
Displaying quantitative dataDisplaying quantitative data
Displaying quantitative dataUlster BOCES
 

Mehr von Ulster BOCES (20)

Sampling means
Sampling meansSampling means
Sampling means
 
Sampling distributions
Sampling distributionsSampling distributions
Sampling distributions
 
Geometric distributions
Geometric distributionsGeometric distributions
Geometric distributions
 
Binomial distributions
Binomial distributionsBinomial distributions
Binomial distributions
 
Means and variances of random variables
Means and variances of random variablesMeans and variances of random variables
Means and variances of random variables
 
Simulation
SimulationSimulation
Simulation
 
General probability rules
General probability rulesGeneral probability rules
General probability rules
 
Planning and conducting surveys
Planning and conducting surveysPlanning and conducting surveys
Planning and conducting surveys
 
Overview of data collection methods
Overview of data collection methodsOverview of data collection methods
Overview of data collection methods
 
Exploring bivariate data
Exploring bivariate dataExploring bivariate data
Exploring bivariate data
 
Normal probability plot
Normal probability plotNormal probability plot
Normal probability plot
 
Exploring data stemplot
Exploring data   stemplotExploring data   stemplot
Exploring data stemplot
 
Exploring data other plots
Exploring data   other plotsExploring data   other plots
Exploring data other plots
 
Exploring data histograms
Exploring data   histogramsExploring data   histograms
Exploring data histograms
 
Calculating percentages from z scores
Calculating percentages from z scoresCalculating percentages from z scores
Calculating percentages from z scores
 
Density curve
Density curveDensity curve
Density curve
 
Standardizing scores
Standardizing scoresStandardizing scores
Standardizing scores
 
Intro to statistics
Intro to statisticsIntro to statistics
Intro to statistics
 
Displaying quantitative data
Displaying quantitative dataDisplaying quantitative data
Displaying quantitative data
 
A.2 se and sd
A.2 se  and sdA.2 se  and sd
A.2 se and sd
 

Kürzlich hochgeladen

The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 

Kürzlich hochgeladen (20)

The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 

Describing quantitative data with numbers

  • 2. Summarizing distributions of univariate data
      1. Measuring center: median, mean
      2. Measuring spread: range, interquartile range, standard deviation
      3. Measuring position: quartiles, percentiles, standardized scores (z-scores)
      4. Using boxplots
      5. The effect of changing units on summary measures
  • 3. Measuring Center
      • When describing the “center” of a set of data, we can use the mean or the median.
      • Mean: “average” value
      • Median: “center” value (Q2)
  • 4. Where is the Center of the Distribution?
      • If you had to pick a single number to describe all the data, what would you pick?
      • It’s easy to find the center when a histogram is unimodal and symmetric: it’s right in the middle.
      • On the other hand, it’s not so easy to find the center of a skewed histogram or a histogram with more than one mode.
  • 5. Mean
      To find the mean of a set of observations, add their values and divide by the number of observations: $\bar{x} = \frac{\sum x_i}{n}$
  • 6. Find the mean of: 2 3 4 6 8 12
      $\bar{x} = \frac{2 + 3 + 4 + 6 + 8 + 12}{6} = \frac{35}{6} \approx 5.833$
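The arithmetic on this slide can be checked with a short Python sketch. The function name `mean_of` is an illustrative choice, not something from the slides.

```python
def mean_of(values):
    """Add the values and divide by the number of observations."""
    if not values:
        raise ValueError("mean of an empty data set is undefined")
    return sum(values) / len(values)

values = [2, 3, 4, 6, 8, 12]   # data from the slide
x_bar = mean_of(values)
print(round(x_bar, 3))         # 5.833
```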
  • 7. Mean and outliers
      • Although the mean is the most popular measure of center, it is not always the most appropriate.
      • The mean is very sensitive to extreme observations (outliers).
      • Because outliers affect the mean, we say that the mean is NOT a resistant measure of center.
      • So if the mean is not a resistant measure of center, what is? The median.
  • 8. Median
      The median is the value with exactly half the data values below it and half above it.
      • It is the middle data value (once the data values have been ordered) and divides the histogram into two equal areas.
      • It has the same units as the data.
      • The median is not influenced by extreme observations, so we say that the median is a resistant measure of center.
  • 9. Finding the Median
      First sort the values (arrange them in order), then follow one of these:
      1. If the number of data values is even, the median is found by computing the mean of the two middle numbers.
      2. If the number of data values is odd, the median is the number located in the exact middle of the list.
  • 10. Examples
      Even number of values: 5.40 1.10 0.42 0.73 0.48 1.10
      In order: 0.42 0.48 0.73 1.10 1.10 5.40 (no exact middle; the middle is shared by two numbers)
      MEDIAN is $\frac{0.73 + 1.10}{2} = 0.915$
      Odd number of values: 5.40 1.10 0.42 0.73 0.48 1.10 0.66
      In order: 0.42 0.48 0.66 0.73 1.10 1.10 5.40 (exact middle)
      MEDIAN is 0.73
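The two rules for finding the median can be sketched in Python as follows. The function name `median_of` is an assumption for illustration; the data are the two lists from the slide.

```python
def median_of(values):
    """Sort the values, then take the middle one (odd count)
    or the mean of the two middle ones (even count)."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 0:
        return (ordered[mid - 1] + ordered[mid]) / 2
    return ordered[mid]

even_case = [5.40, 1.10, 0.42, 0.73, 0.48, 1.10]
odd_case = even_case + [0.66]

print(round(median_of(even_case), 3))  # 0.915
print(median_of(odd_case))             # 0.73
```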
  • 11. Mean vs Median
      Mean | Median
      Average value of variable | Typical value of variable
      Not resistant to outliers | Resistant to outliers
      A good measure when the data is symmetric | A reliable measure regardless of the shape of the distribution
      Farther out in the long tail than the median when data is skewed | Close to the center even when the data is skewed
      Easy to find | Less prone to mistakes
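To see resistance in action, the sketch below compares the mean and median of the earlier data set (2 3 4 6 8 12) before and after one value is replaced by an extreme one. The value 120 is a hypothetical outlier chosen for illustration; it does not appear in the slides.

```python
from statistics import mean, median

data = [2, 3, 4, 6, 8, 12]
with_outlier = [2, 3, 4, 6, 8, 120]  # hypothetical: 12 replaced by 120

# The mean is pulled toward the extreme value; the median does not move.
print(mean(data), median(data))
print(mean(with_outlier), median(with_outlier))
```

Only the largest value changed, so the two middle values (4 and 6) are untouched and the median stays the same, while the mean shifts noticeably.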
  • 14. Measuring Spread
      • Range
      • Interquartile Range (IQR)
      • Standard Deviation
  • 15. Range
      • Distance between largest and smallest values.
      • Range = Maximum – Minimum
      • Range is useful if there are no outliers.
  • 16. Interquartile Range
      How to find the IQR:
      1. Find the median.
      2. Find the median of each half of the data: the lower median is the 1st quartile (Q1) and the upper median is the 3rd quartile (Q3).
      3. Subtract the two quartiles: IQR = Q3 – Q1.
  • 17. Outliers
      One general rule of thumb for identifying outliers is to flag any data points that lie:
      • more than 1.5 × IQR below Q1 (that is, below Q1 – 1.5 × IQR), OR
      • more than 1.5 × IQR above Q3 (that is, above Q3 + 1.5 × IQR).
  • 18. Check For Understanding
      The “Descriptive Statistics” of test grades for a certain class are listed below.
      Mean = 74.71, Median = 76, Standard Deviation = 12.61, Minimum = 35, Maximum = 94, Q1 = 68, Q3 = 84
      (a) Determine the IQR for this data.
      (b) Using the answer from part (a), determine whether the lowest and highest values in the data are outliers.
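One way to check an answer to this exercise is to apply the 1.5 × IQR rule of thumb directly. The sketch below uses the summary statistics given on the slide; the names `lower_fence`, `upper_fence`, and `is_outlier` are illustrative.

```python
q1, q3 = 68, 84            # quartiles from the slide
minimum, maximum = 35, 94  # extremes from the slide

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr  # values below this are flagged
upper_fence = q3 + 1.5 * iqr  # values above this are flagged

def is_outlier(value):
    """Apply the 1.5 * IQR rule of thumb."""
    return value < lower_fence or value > upper_fence

print(iqr, lower_fence, upper_fence)               # 16 44.0 108.0
print(is_outlier(minimum), is_outlier(maximum))    # True False
```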
  • 19. Standard Deviation
      The standard deviation measures the typical deviation of the observations from the mean: $s_x = \sqrt{\frac{1}{n-1} \sum (x_i - \bar{x})^2}$
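As a rough sketch, the sample standard deviation (with the n − 1 divisor) can be computed step by step and compared against Python's built-in `statistics.stdev`. The data set is reused from the earlier mean example; the function name `sample_sd` is an assumption.

```python
from math import sqrt
from statistics import stdev

def sample_sd(values):
    """Square root of the sum of squared deviations divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    x_bar = sum(values) / n
    squared_deviations = [(x - x_bar) ** 2 for x in values]
    return sqrt(sum(squared_deviations) / (n - 1))

values = [2, 3, 4, 6, 8, 12]
s_x = sample_sd(values)
print(round(s_x, 2))          # 3.71
print(round(stdev(values), 2))  # same value from the standard library
```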
  • 20. Choosing measures of center and spread
      If the data is uniform or symmetric, use:
      • Center: mean
      • Spread: standard deviation
      If the data is skewed, use:
      • Center: median
      • Spread: five-number summary, range, IQR
  • 21. Distributions with Outliers
      • Since outliers affect mean and standard deviation, it is usually better to use median and IQR.
      • However, if the distribution is unimodal, use mean and median and just report outliers separately.
      • However, if you find a simple reason for an outlier, eliminate it and use the mean and standard deviation, provided the distribution is symmetric.
  • 22. Measuring Position
      • Quartiles
      • Percentiles
      • Z-scores
      We can use either z-scores or percentiles to declare the location of an observation in a distribution.
      • z-scores use the mean and standard deviation.
      • Percentiles use a position relative to the starting point.
  • 23. Percentiles/Quartiles
      • $P_k$ is the notation for the kth percentile.
      • $Q_n$ is the notation for the nth quartile.
      $P_{25} = Q_1$, $P_{50} = Q_2 = \text{median}$, $P_{75} = Q_3$
  • 24. Finding Percentiles
      If you are trying to find the percentile corresponding to a certain score x:
      $\text{Percentile} = \frac{\text{number of scores} < x}{\text{total number of scores}} \times 100$
      • Percentiles are used often when reporting academic scores such as SAT scores. Let’s say you get a 620 on the math portion of the SAT. The report might also indicate that you are in the “78th percentile”. That means that you scored better than 78% of all students taking that particular SAT.
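The percentile formula can be sketched in a few lines of Python. The score list below is hypothetical, made up purely for illustration, and `percentile_of` is an assumed name.

```python
def percentile_of(x, scores):
    """Percentage of scores strictly below x."""
    below = sum(1 for s in scores if s < x)
    return 100 * below / len(scores)

# Hypothetical scores, not taken from the slides.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

print(percentile_of(85, scores))  # 60.0: six of the ten scores are below 85
```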
  • 25. Measuring Relative Standing With Standardized Values (z-Scores)
      • One way to compare an individual to the whole distribution is to describe its location in the distribution relative to the mean.
      • Let’s do this by describing how many standard deviations an individual is away from the mean value.
      • We call this the “standardized value,” or the “z-score.”
  • 26. Here is how to interpret z-scores:
      • A z-score less than 0 represents an element less than the mean.
      • A z-score greater than 0 represents an element greater than the mean.
      • A z-score equal to 0 represents an element equal to the mean.
      • A z-score equal to 1 represents an element that is 1 standard deviation greater than the mean; a z-score equal to 2, 2 standard deviations greater than the mean; etc.
      • A z-score equal to -1 represents an element that is 1 standard deviation less than the mean; a z-score equal to -2, 2 standard deviations less than the mean; etc.
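A z-score is the observation minus the mean, divided by the standard deviation. The minimal sketch below illustrates the interpretations listed above; the mean of 150 and standard deviation of 15 are illustrative values, and `z_score` is an assumed name.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

mean, sd = 150, 15  # illustrative values

print(z_score(180, mean, sd))  # 2.0: two standard deviations above the mean
print(z_score(135, mean, sd))  # -1.0: one standard deviation below the mean
print(z_score(150, mean, sd))  # 0.0: equal to the mean
```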
  • 27. Five-Number Summary
      The five-number summary of a distribution consists of the smallest observation, the first quartile, the median, the third quartile, and the largest observation, written in order from smallest to largest:
      Minimum, Q1, Median, Q3, Maximum
  • 28. Boxplots The five-number summary divides the distribution roughly into quarters. This leads to a new way to display quantitative data, the boxplot.
  • 29. How to make a boxplot:
      1. Draw and label a number line that includes the range of the distribution.
      2. Draw a central box from Q1 to Q3.
      3. Note the median M inside the box.
      4. Extend lines (whiskers) from the box out to the minimum and maximum values that are not outliers.
  • 32. Effect of Changing Units
      Sometimes, researchers change units (minutes to hours, feet to meters, etc.). Here is how measures of central tendency are affected when we change units:
      • If you add a constant to every value, the mean and median increase by the same constant. Example: suppose you have a set of scores with a mean equal to 5 and a median equal to 6. If you add 10 to every score, the new mean will be 5 + 10 = 15 and the new median will be 6 + 10 = 16.
      • If you multiply every value by a constant, the mean and the median will also be multiplied by that constant. Example: assume that a set of scores has a mean of 5 and a median of 6. If you multiply each of these scores by 10, the new mean will be 5 * 10 = 50 and the new median will be 6 * 10 = 60.
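The effect of changing units can be checked numerically. The three scores below are a hypothetical data set chosen so that the mean is 5 and the median is 6, matching the slide's example. The sketch also prints the standard deviation, which a shift leaves unchanged and a positive scale factor multiplies.

```python
from statistics import mean, median, stdev

scores = [1, 6, 8]                     # hypothetical: mean 5, median 6
shifted = [s + 10 for s in scores]     # add a constant to every value
scaled = [s * 10 for s in scores]      # multiply every value by a constant

for label, data in [("original", scores), ("shifted", shifted), ("scaled", scaled)]:
    print(label, mean(data), median(data), round(stdev(data), 3))
```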
  • 33. Check For Understanding The average score on a test is 150 with a standard deviation of 15. Each score is then increased by 25. What are the new mean and standard deviation?
  • 34. Check For Understanding
      The test grades from a college statistics class are shown below.
      85 72 64 65 98 78 75 76 82 80
      61 92 72 58 65 74 92 85 74 76
      77 77 62 68 68 54 62 76 73 85
      88 91 99 82 80 74 76 77 70 60
      (a) Construct two different graphs of these data.
      (b) Calculate the five-number summary and the mean and standard deviation of the data.
      (c) Describe the distribution of the data, citing both the plots and the summary statistics found in questions (a) and (b).
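For part (b), a minimal sketch of the five-number summary is shown below, using the "median of each half" method from the interquartile range slide. The function name `five_number_summary` is an assumption. For an odd number of values this sketch leaves the overall median out of both halves, which is one common convention; software packages may use other quartile rules and give slightly different results.

```python
from statistics import mean, median, stdev

grades = [
    85, 72, 64, 65, 98, 78, 75, 76, 82, 80,
    61, 92, 72, 58, 65, 74, 92, 85, 74, 76,
    77, 77, 62, 68, 68, 54, 62, 76, 73, 85,
    88, 91, 99, 82, 80, 74, 76, 77, 70, 60,
]

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using medians of halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Assumption: with an odd count, the middle value is excluded from both halves.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])

summary = five_number_summary(grades)
print(summary)                  # (54, 68.0, 76.0, 82.0, 99)
print(mean(grades))             # 75.575
print(round(stdev(grades), 2))  # sample standard deviation
```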