Chapter 4
Probability and Sampling Distributions
Random Variable

Definition: A random variable is a variable whose value is a numerical outcome of a random phenomenon.

- The statistic calculated from a randomly chosen sample is an example of a random variable.
- We don't know the exact outcome beforehand.
- A statistic from a random sample will take different values if we take more samples from the same population.
Section 4.4
The Sampling Distribution of a Sample Mean
Introduction

- A statistic from a random sample will take different values if we take more samples from the same population.
- The values of a statistic do not vary haphazardly from sample to sample but have a regular pattern in many samples.
- We have already seen the idea of a sampling distribution.
- We're going to discuss an important sampling distribution: the sampling distribution of the sample mean, x-bar.
Example

- Suppose that we are interested in the workout times of ISU students at the Recreation center. Let's assume that µ is the average workout time of all ISU students.
- To estimate µ, let's take a simple random sample of 100 students at ISU.
  - We record each student's workout time (x).
  - Then we find the average workout time, x-bar, for the 100 students.
- The population mean µ is the parameter of interest.
- The sample mean, x-bar, is the statistic (which is a random variable).
- Use x-bar to estimate µ (this seems like a sensible thing to do).
Example

- A SRS should be a fairly good representation of the population, so x-bar should be somewhere near µ.
- x-bar from a SRS is an unbiased estimate of µ, due to the randomization.
- We don't expect x-bar to be exactly equal to µ.
  - There is variability in x-bar from sample to sample.
  - If we take another simple random sample (SRS) of 100 students, then x-bar will probably be different.
- Why, then, can I use the results of one sample to estimate µ?
Statistical Estimation

- If x-bar is rarely exactly right and varies from sample to sample, why is x-bar a reasonable estimate of the population mean µ?
- Answer: if we keep on taking larger and larger samples, the statistic x-bar is guaranteed to get closer and closer to the parameter µ.
- We have the comfort of knowing that if we can afford to keep on measuring more subjects, eventually we will estimate the mean amount of workout time for ISU students very accurately.
The Law of Large Numbers

- Law of Large Numbers (LLN):
  - Draw independent observations at random from any population with finite mean µ.
  - As the number of observations drawn increases, the mean x-bar of the observed values gets closer and closer to the mean µ of the population.
  - If n is the sample size, then as n gets large, x-bar → µ.
- The Law of Large Numbers holds for any population, not just for special classes such as Normal distributions.
Example

- Suppose we have a bowl with 21 small pieces of paper inside. Each paper is labeled with a number 0-20. We will draw several random samples of size n out of the bowl and record the sample mean, x-bar, for each sample.
- What is the population?
- Since we know the values for each individual in the population (i.e., for each paper in the bowl), we can actually calculate the value of µ, the true population mean: µ = 10.
- Draw a random sample of size n = 1. Calculate x-bar for this sample.
- Draw a second random sample of size n = 5. Calculate x-bar for this sample.
- Draw a third random sample of size n = 10. Calculate x-bar for this sample.
- Draw a fourth random sample of size n = 15. Calculate x-bar for this sample.
- Draw a fifth random sample of size n = 20. Calculate x-bar for this sample.
- What can we conclude about the value of x-bar as the sample size increases?

THIS IS CALLED THE LAW OF LARGE NUMBERS.
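The bowl experiment above is easy to simulate. A minimal sketch in Python: the seed and the largest sample sizes are illustrative additions, and draws are made with replacement so that n can exceed the 21 papers in the bowl (i.i.d. draws, which is what the LLN assumes).

```python
import random
import statistics

random.seed(1)  # reproducible draws
bowl = list(range(21))  # papers labeled 0-20; true population mean µ = 10

# draw samples of increasing size (with replacement) and record x-bar
for n in (1, 5, 10, 15, 20, 1000, 100_000):
    xbar = statistics.mean(random.choices(bowl, k=n))
    print(f"n = {n:6d}  x-bar = {xbar:.3f}")
```

As n grows, x-bar settles near µ = 10, which is exactly the regularity the Law of Large Numbers describes.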
Another Example

- Suppose we know that the average height of all high school students in Iowa is 5.70 feet.
- We get SRS's from the population and calculate the mean height.

[Figure: the mean of the first n observations (in feet) plotted against the number of observations (0 to 20,000); the running mean settles near 5.70.]
Example 4.21 From Book

- Sulfur compounds such as dimethyl sulfide (DMS) are sometimes present in wine.
- DMS causes "off-odors" in wine, so winemakers want to know the odor threshold:
  - What is the lowest concentration of DMS that the human nose can detect?
- Different people have different thresholds, so we start by asking about the mean threshold µ in the population of all adults.
- µ is a parameter that describes this population.
Example 4.21 From Text

- To estimate µ, we present tasters with both natural wine and the same wine spiked with DMS at different concentrations, to find the lowest concentration at which they can identify the spiked wine.
- The odor thresholds for 10 randomly chosen subjects (in micrograms/liter):
  28 40 28 33 20 31 29 27 17 21
- The mean threshold for these subjects is x-bar = 27.4.
- x-bar is a statistic calculated from this sample.
- A statistic, such as the mean of a random sample of 10 adults, is a random variable.
Example

- Suppose µ = 25 is the true value of the parameter we seek to estimate.
- The first subject had threshold 28, so the line starts there.
- The second point is the mean of the first two subjects:
  x-bar = (28 + 40)/2 = 34
- This process continues many, many times, and our line begins to settle around µ = 25.
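The running mean described above can be reproduced directly from the ten thresholds listed earlier; a short sketch:

```python
import statistics

thresholds = [28, 40, 28, 33, 20, 31, 29, 27, 17, 21]

# x-bar after each new subject: 28, then (28 + 40)/2 = 34, and so on
running = [statistics.mean(thresholds[:k]) for k in range(1, len(thresholds) + 1)]
print(running[:2])  # [28, 34.0]
print(running[-1])  # 27.4 -- the full-sample mean
```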
Example 4.21 From Book

The law of large numbers in action: as we take more observations, the sample mean x-bar always approaches the mean of the population, µ = 25.
The Law of Large Numbers

- The law of large numbers is the foundation of business enterprises such as casinos and insurance companies.
- The winnings (or losses) of a gambler on a few plays are uncertain -- that's why gambling is exciting(?)
- But the "house" plays tens of thousands of times.
  - So the house, unlike individual gamblers, can count on the long-run regularity described by the Law of Large Numbers.
  - The average winnings of the house on tens of thousands of plays will be very close to the mean of the distribution of winnings.
  - Hence, the LLN guarantees the house a profit!
Thinking about the Law of Large Numbers

- The Law of Large Numbers says broadly that the average results of many independent observations are stable and predictable.
- A grocery store deciding how many gallons of milk to stock and a fast-food restaurant deciding how many beef patties to prepare can predict demand even though their customers make independent decisions.
- The Law of Large Numbers says that the many individual decisions will produce a stable result.
The “Law of Small Numbers” or “Averages”

- The Law of Large Numbers describes the regular behavior of chance phenomena in the long run.
- Many people believe in an incorrect “law of small numbers”.
- We falsely expect even short sequences of random events to show the kind of average behavior that in fact appears only in the long run.
The “Law of Small Numbers” or “Averages”

- Example: Pretend you have an average free throw success rate of 70%. One day on the free throw line, you miss 8 shots in a row. Should you hit the next shot by the mythical “law of averages”?
- No. The law of large numbers tells us that the long-run average will be close to 70%. Missing 8 shots in a row simply means you are having a bad day; 8 shots is hardly the “long run”. Furthermore, the law of large numbers says nothing about the next event. It only tells us what will happen if we keep track of the long-run average.
The Hot Hand Debate

- In some sports, if a player makes several consecutive good plays, like a few good golf shots in a row, they often claim to have the “hot hand”, which generally implies that their next shot is likely to be a good one.
- Studies suggest that runs of good or bad golf shots are no more frequent than would be expected if each shot were independent of the player's previous shots.
- Players perform consistently, not in streaks.
- Our perception of hot or cold streaks simply shows that we don't perceive random behavior very well!
The Gambling Hot Hand

- Gamblers often follow the hot-hand theory, betting that a “lucky” run will continue.
- At other times, however, they draw the opposite conclusion when confronted with a run of outcomes.
  - If a coin gives 10 straight heads, some gamblers feel that it must now produce some extra tails to get back to the average of half heads and half tails.
  - Not true! If the next 10,000 tosses give about 50% tails, those 10 straight heads will be swamped by the later thousands of heads and tails.
  - No short-run compensation is needed to get back to the average in the long run.
Need for Law of Large Numbers

- Our inability to accurately distinguish random behavior from systematic influences points out the need for statistical inference to supplement exploratory analysis of data.
- Probability calculations can help verify that what we see in the data is more than a random pattern.
How Large is a Large Number?

- The Law of Large Numbers says that the actual mean outcome of many trials gets close to the distribution mean µ as more trials are made.
- It doesn't say how many trials are needed to guarantee a mean outcome close to µ.
  - That depends on the variability of the random outcomes.
- The more variable the outcomes, the more trials are needed to ensure that the mean outcome x-bar is close to the distribution mean µ.
More Laws of Large Numbers

- The Law of Large Numbers is one of the central facts about probability.
  - The LLN explains why gambling casinos and insurance companies make money.
  - The LLN assures us that statistical estimation will be accurate if we can afford enough observations.
- The basic Law of Large Numbers applies to independent observations that all have the same distribution.
  - Mathematicians have extended the law to many more general settings.
What if Observations are not Independent?

- You are in charge of a process that manufactures video screens for computer monitors.
- Your equipment measures the tension on the metal mesh that lies behind each screen and is critical to its image quality.
- You want to estimate the mean tension µ for the process by the average x-bar of the measurements.
- The tension measurements are not independent.
AYK 4.82

- Use the Law of Large Numbers applet on the textbook website.
Sampling Distributions

- The Law of Large Numbers assures us that if we measure enough subjects, the statistic x-bar will eventually get very close to the unknown parameter µ.
Sampling Distributions

- What if we don't have a large sample?
  - Take a large number of samples of the same size from the same population.
  - Calculate the sample mean for each sample.
  - Make a histogram of the sample means.
- The histogram of values of the statistic approximates the sampling distribution that we would see if we kept on sampling forever.


- The idea of a sampling distribution is the foundation of statistical inference.
- The laws of probability can tell us about sampling distributions without the need to actually choose or simulate a large number of samples.
Mean and Standard Deviation of a Sample Mean

- Suppose that x-bar is the mean of a SRS of size n drawn from a large population with mean µ and standard deviation σ.
- The mean of the sampling distribution of x-bar is µ, and its standard deviation is σ/√n.
- Notice: averages are less variable than individual observations!
Mean and Standard Deviation of a Sample Mean

- The mean of the statistic x-bar is always the same as the mean µ of the population.
  - The sampling distribution of x-bar is centered at µ.
  - In repeated sampling, x-bar will sometimes fall above the true value of the parameter µ and sometimes below, but there is no systematic tendency to overestimate or underestimate the parameter.
  - Because the mean of x-bar is equal to µ, we say that the statistic x-bar is an unbiased estimator of the parameter µ.
Mean and Standard Deviation of a Sample Mean

- An unbiased estimator is “correct on the average” in many samples.
  - How close the estimator falls to the parameter in most samples is determined by the spread of the sampling distribution.
  - If individual observations have standard deviation σ, then sample means x-bar from samples of size n have standard deviation σ/√n.
- Again, notice that averages are less variable than individual observations.
Mean and Standard Deviation of a Sample Mean

- Not only is the standard deviation of the distribution of x-bar smaller than the standard deviation of individual observations, but it gets smaller as we take larger samples.
  - The results of large samples are less variable than the results of small samples.
- Remember, we divided by the square root of n.
Mean and Standard Deviation of a Sample Mean

- If n is large, the standard deviation of x-bar is small, and almost all samples will give values of x-bar that lie very close to the true parameter µ.
  - The sample mean from a large sample can be trusted to estimate the population mean accurately.
- Notice that the standard deviation of the sampling distribution gets smaller only at the rate √n.
  - To cut the standard deviation of x-bar in half, we must take four times as many observations, not just twice as many (the square root of 4 is 2).
Example

- Suppose we take samples of size 15 from a distribution with mean 25 and standard deviation 7.
  - The mean of x-bar is: 25
  - The standard deviation of x-bar is: 7/√15 ≈ 1.80739
  - So the distribution of x-bar has mean 25 and standard deviation 7/√15.
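The arithmetic above is just σ/√n, and it also illustrates the “four times as many observations to halve the spread” rule from the previous slide; a quick check in Python:

```python
import math

mu, sigma, n = 25, 7, 15
sd_xbar = sigma / math.sqrt(n)
print(mu, round(sd_xbar, 5))  # 25 1.80739

# quadrupling the sample size halves the standard deviation of x-bar
print(round(sigma / math.sqrt(4 * n), 5))  # half of 1.80739
```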
What About Shape?

- We have described the center and spread of the sampling distribution of a sample mean x-bar, but not its shape.
- The shape of the distribution of x-bar depends on the shape of the population distribution.
Sampling Distribution of a Sample Mean

- If a population has the N(µ, σ) distribution, then the sample mean x-bar of n independent observations has the N(µ, σ/√n) distribution.
Example

- Adults differ in the smallest amount of dimethyl sulfide they can detect in wine.
- Extensive studies have found that the DMS odor threshold of adults follows roughly a Normal distribution with mean µ = 25 micrograms per liter and standard deviation σ = 7 micrograms per liter.
Example

- Because the population distribution is Normal, the sampling distribution of x-bar is also Normal.
- If n = 10, what is the distribution of x-bar?
  N(25, 7/√10)
What if the Population Distribution is not Normal?

- As the sample size increases, the distribution of x-bar changes shape.
  - The distribution looks less like that of the population and more like a Normal distribution.
- When the sample is large enough, the distribution of x-bar is very close to Normal.
  - This result is true no matter what the shape of the population distribution is, as long as the population has a finite standard deviation σ.
Central Limit Theorem

- Draw a SRS of size n from any population with mean µ and finite standard deviation σ.
- When n is large, the sampling distribution of the sample mean x-bar is approximately Normal:
  x-bar is approximately N(µ, σ/√n)
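The theorem can be seen in a short simulation. A sketch under illustrative choices (an Exponential population with mean 1, as in the maintenance example later; the sample size n = 70, the number of repetitions, and the seed are assumptions for the demo):

```python
import random
import statistics

random.seed(0)
n, reps = 70, 5000

# simulate the sampling distribution of x-bar for an Exponential(mean 1) population
xbars = [statistics.mean(random.expovariate(1.0) for _ in range(n))
         for _ in range(reps)]

print(statistics.mean(xbars))   # close to µ = 1
print(statistics.stdev(xbars))  # close to σ/√n = 1/√70 ≈ 0.12
```

A histogram of `xbars` would look close to the N(1, 0.12) curve, even though the population itself is strongly right-skewed.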
Central Limit Theorem

- More general versions of the central limit theorem say that the distribution of a sum or average of many small random quantities is close to Normal.
- The central limit theorem suggests why the Normal distributions are common models for observed data.
How Large a Sample is Needed?

- The sample size needed depends on whether the population distribution is close to Normal.
  - We require more observations if the shape of the population distribution is far from Normal.
Example

- The time X that a technician requires to perform preventive maintenance on an air-conditioning unit is governed by the Exponential distribution (Figure 4.17(a)) with mean time µ = 1 hour and standard deviation σ = 1 hour.
- Your company operates 70 of these units.
- The distribution of the mean time your company spends on preventive maintenance is:
  N(1, 1/√70) = N(1, 0.12)
Example

- What is the probability that your company's units' average maintenance time exceeds 50 minutes?
  - 50/60 ≈ 0.83 hour, so we want to know P(x-bar > 0.83).
- Use the Normal distribution calculations we learned in Chapter 2:

  P(x-bar > 0.83)
  = P( (x-bar − µ)/(σ/√n) > (0.83 − 1)/0.12 )
  = P(z > −1.42)
  = 1 − P(z < −1.42)
  = 1 − 0.0778 = 0.9222
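The same probability can be computed without a z-table; a sketch using Python's `statistics.NormalDist`. Following the slide, 50 minutes is rounded to 0.83 hour; the exact Normal answer differs from the table answer 0.9222 only in the later decimals.

```python
import math
from statistics import NormalDist

mu, sigma, n = 1, 1, 70
xbar_dist = NormalDist(mu, sigma / math.sqrt(n))  # N(1, 1/√70)

# P(x-bar > 0.83), with 50 minutes rounded to 0.83 hour as on the slide
p = 1 - xbar_dist.cdf(0.83)
print(round(p, 2))  # ≈ 0.92
```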
4.86 ACT scores

- The scores of students on the ACT college entrance examination in a recent year had the Normal distribution with mean µ = 18.6 and standard deviation σ = 5.9.
4.86 ACT scores

- What is the probability that a single student, randomly chosen from all those taking the test, scores 21 or higher?

  P(X ≥ 21)
  = P( (X − µ)/σ ≥ (21 − 18.6)/5.9 )
  = P(z ≥ 0.4068)
  = 1 − P(z < 0.41)
  = 1 − 0.6591 = 0.3409
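The z-table steps above can be checked in one line with `statistics.NormalDist`; a sketch (exact Normal arithmetic gives about 0.342, versus the table-rounded 0.3409):

```python
from statistics import NormalDist

act = NormalDist(mu=18.6, sigma=5.9)
p_single = 1 - act.cdf(21)  # P(X ≥ 21) for one randomly chosen student
print(round(p_single, 3))   # ≈ 0.342
```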
4.86 ACT scores

- About 34% of students (from this population) scored 21 or higher on the ACT.
- The probability that a single student randomly chosen from this population would have a score of 21 or higher is 0.34.
4.86 ACT scores

- Now take a SRS of 50 students who took the test. What are the mean and standard deviation of the sample mean score x-bar of these 50 students?
  - Mean = 18.6 [same as µ]
  - Standard Deviation = 0.8344 [σ/√50]
4.86 ACT scores

- What is the probability that the mean score x-bar of these students is 21 or higher?

  P(x-bar ≥ 21)
  = P( (x-bar − µ)/(σ/√n) ≥ (21 − 18.6)/0.834 )
  = P(z ≥ 2.8778)
  = 1 − P(z < 2.88)
  = 1 − 0.9980 = 0.002
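The same calculation for the sample mean, with the σ/√n standard deviation plugged into `NormalDist`; note how much smaller this probability is than the single-student answer:

```python
import math
from statistics import NormalDist

mu, sigma, n = 18.6, 5.9, 50
xbar_dist = NormalDist(mu, sigma / math.sqrt(n))  # N(18.6, 0.8344)

p_mean = 1 - xbar_dist.cdf(21)  # P(x-bar ≥ 21)
print(round(p_mean, 3))         # 0.002
```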
4.86 ACT scores

- About 0.2% of all random samples of size 50 (from this population) would have a mean score x-bar of 21 or higher.
- The probability of having a mean score x-bar of 21 or higher from a sample of 50 students (from this population) is 0.002.
Section 4.4 Summary

- When we want information about the population mean µ for some variable, we often take a SRS and use the sample mean x-bar to estimate the unknown parameter µ.
- The Law of Large Numbers states that the actually observed mean outcome x-bar must approach the mean µ of the population as the number of observations increases.
- The sampling distribution of x-bar describes how the statistic x-bar varies in all possible samples of the same size from the same population.
- The mean of the sampling distribution is µ, so that x-bar is an unbiased estimator of µ.
- The standard deviation of the sampling distribution of x-bar is σ/√n for a SRS of size n if the population has standard deviation σ. That is, averages are less variable than individual observations.
- If the population has a Normal distribution, so does x-bar.
- The Central Limit Theorem states that for large n the sampling distribution of x-bar is approximately Normal for any population with finite standard deviation σ. That is, averages are more Normal than individual observations. We can use the fact that x-bar has an approximately Normal distribution to calculate approximate probabilities for events involving x-bar.

Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 

Kürzlich hochgeladen (20)

Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 

226 lec9 jda

  • 1. Chapter 4 Probability and Sampling Distributions 1
  • 2. Random Variable  Definition: A random variable is a variable whose value is a numerical outcome of a random phenomenon.  The statistic calculated from a randomly chosen sample is an example of a random variable.  We don’t know the exact outcome beforehand. A statistic from a random sample will take different values if we take more samples from the same population. 2
  • 3. Section 4.4 The Sampling Distribution of a Sample Mean 3
  • 4. Introduction A statistic from a random sample will take different values if we take more samples from the same population.  The values of a statistic do not vary haphazardly from sample to sample but have a regular pattern in many samples.  We already saw the idea of a sampling distribution.  We’re going to discuss an important sampling distribution: the sampling distribution of the sample mean, x-bar.
  • 5. Example  Suppose that we are interested in the workout times of ISU students at the Recreation center. Let’s assume that μ is the average workout time of all ISU students.  To estimate μ, let’s take a simple random sample of 100 students at ISU.  We will record each student’s workout time (x), then find the average workout time for the 100 students, x-bar.  The population mean μ is the parameter of interest. The sample mean, x-bar, is the statistic (which is a random variable). Use x-bar to estimate μ (this seems like a sensible thing to do).
  • 6. Example  A SRS should be a fairly good representation of the population, so x-bar should be somewhere near µ.  x-bar from a SRS is an unbiased estimate of µ due to the randomization.  We don’t expect x-bar to be exactly equal to µ:  there is variability in x-bar from sample to sample.  If we take another simple random sample (SRS) of 100 students, then x-bar will probably be different.  Why, then, can I use the results of one sample to estimate µ?
  • 7. Statistical Estimation  If x-bar is rarely exactly right and varies from sample to sample, why is x-bar a reasonable estimate of the population mean µ?  Answer: if we keep on taking larger and larger samples, the statistic x-bar is guaranteed to get closer and closer to the parameter µ  We have the comfort of knowing that if we can afford to keep on measuring more subjects, eventually we will estimate the mean amount of workout time for ISU students very accurately 7
  • 8. The Law of Large Numbers  Law of Large Numbers (LLN):  Draw independent observations at random from any population with finite mean µ.  As the number of observations drawn increases, the mean x-bar of the observed values gets closer and closer to the mean µ of the population.  If n is the sample size, then as n gets large, x-bar → µ.  The Law of Large Numbers holds for any population, not just for special classes such as Normal distributions.
  • 9. Example  Suppose we have a bowl with 21 small pieces of paper inside. Each paper is labeled with a number 0-20. We will draw several random samples out of the bowl of size n and record the sample means, x-bar for each sample.    What is the population? Since we know the values for each individual in the population (i.e. for each paper in the bowl), we can actually calculate the value of µ, the true population mean. µ = 10 Draw a random sample of size n = 1. Calculate x-bar for this sample. 9
  • 10. Example  Draw a second random sample of size n = 5. Calculate x for this sample. x  Draw a third random sample of size n = 10. Calculate sample.  Draw a fourth random sample of size n = 15. Calculate sample.  Draw a fifth random sample of size n = 20. Calculate sample.  What can we conclude about the value of increases? for this x for this x for this x as the sample size THIS IS CALLED THE LAW OF LARGE NUMBERS. 10
  • 11. Another Example  Example: Suppose we know that the average height of all high school students in Iowa is 5.70 feet. We take SRSs from the population and plot the mean of the first n observations (in feet) against the number of observations. As n grows from 0 to 20,000, the running mean stays between about 5.695 and 5.710 and closes in on 5.70.
  • 12. Example 4.21 From Book  Sulfur compounds such as dimethyl sulfide (DMS) are sometimes present in wine. DMS causes “off-odors” in wine, so winemakers want to know the odor threshold:  the lowest concentration of DMS that the human nose can detect.  Different people have different thresholds, so we start by asking about the mean threshold µ in the population of all adults.  µ is a parameter that describes this population.
  • 13. Example 4.21 From Text  To estimate µ, we present tasters with both natural wine and the same wine spiked with DMS at different concentrations to find the lowest concentration at which they can identify the spiked wine. The odor thresholds for 10 randomly chosen subjects (in micrograms/liter):  28 40 28 33 20 31 29 27 17 21.  The mean threshold for these subjects is x-bar = 27.4.  x-bar is a statistic calculated from this sample.  A statistic, such as the mean of a random sample of 10 adults, is a random variable.
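As a quick check, the sample mean of the ten thresholds can be computed directly:

```python
import statistics

# Odor thresholds (micrograms/liter) for the 10 randomly chosen subjects.
thresholds = [28, 40, 28, 33, 20, 31, 29, 27, 17, 21]
xbar = statistics.mean(thresholds)
print(xbar)  # 27.4
```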
  • 14. Example  Suppose µ = 25 is the true value of the parameter we seek to estimate.  The first subject had threshold 28, so the line starts there.  The second point is the mean of the first two subjects: x-bar = (28 + 40)/2 = 34.  This process continues many, many times, and our line begins to settle around µ = 25.
  • 15. Example 4.21 From Book  The law of large numbers in action: as we take more observations, the sample mean x-bar always approaches the mean of the population, µ = 25.
  • 16. The Law of Large Numbers  The law of large numbers is the foundation of business enterprises such as casinos and insurance companies  The winnings (or losses) of a gambler on a few plays are uncertain -- that’s why gambling is exciting(?)  But, the “house” plays tens of thousands of times  So the house, unlike individual gamblers, can count on the long-run regularity described by the Law of Large Numbers  The average winnings of the house on tens of thousands of plays will be very close to the mean of the distribution of winnings  Hence, the LLN guarantees the house a profit! 16
  • 17. Thinking about the Law of Large Numbers The Law of Large Numbers says broadly that the average results of many independent observations are stable and predictable  A grocery store deciding how many gallons of milk to stock and a fast-food restaurant deciding how many beef patties to prepare can predict demand even though their customers make independent decisions   The Law of Large Numbers says that the many individual decisions will produce a stable result 17
  • 18. The “Law of Small Numbers” or “Averages”  The Law of Large Numbers describes the regular behavior of chance phenomena in the long run.  Many people believe in an incorrect “law of small numbers”:  we falsely expect even short sequences of random events to show the kind of average behavior that in fact appears only in the long run.
  • 19. The “Law of Small Numbers” or “Averages”  Example: Pretend you have an average free throw success rate of 70%. One day on the free throw line, you miss 8 shots in a row. Should you hit the next shot by the mythical “law of averages”?  No. The law of large numbers tells us that the long run average will be close to 70%. Missing 8 shots in a row simply means you are having a bad day. 8 shots is hardly the “long run”. Furthermore, the law of large numbers says nothing about the next event. It only tells us what will happen if we keep track of the long run average.
  • 20. The Hot Hand Debate  In some sports, if a player makes several consecutive good plays, like a few good golf shots in a row, they often claim to have the “hot hand”, which generally implies that their next shot is likely to be a good one.  Studies suggest that runs of good or bad golf shots are no more frequent than would be expected if each shot were independent of the player’s previous shots.  Players perform consistently, not in streaks.  Our perception of hot or cold streaks simply shows that we don’t perceive random behavior very well!
  • 21. The Gambling Hot Hand  Gamblers often follow the hot-hand theory, betting that a “lucky” run will continue.  At other times, however, they draw the opposite conclusion when confronted with a run of outcomes:  if a coin gives 10 straight heads, some gamblers feel that it must now produce some extra tails to get back into the average of half heads and half tails.  Not true! If the next 10,000 tosses give about 50% tails, those 10 straight heads will be swamped by the later thousands of heads and tails.  No short run compensation is needed to get back to the average in the long run.
  • 22. Need for Law of Large Numbers  Our inability to accurately distinguish random behavior from systematic influences points out the need for statistical inference to supplement exploratory analysis of data  Probability calculations can help verify that what we see in the data is more than a random pattern 22
  • 23. How Large is a Large Number?  The Law of Large Numbers says that the actual mean outcome of many trials gets close to the distribution mean µ as more trials are made.  It doesn’t say how many trials are needed to guarantee a mean outcome close to µ.  That depends on the variability of the random outcomes:  the more variable the outcomes, the more trials are needed to ensure that the mean outcome x-bar is close to the distribution mean µ.
  • 24. More Laws of Large Numbers  The Law of Large Numbers is one of the central facts about probability.  LLN explains why gambling casinos and insurance companies make money.  LLN assures us that statistical estimation will be accurate if we can afford enough observations.  The basic Law of Large Numbers applies to independent observations that all have the same distribution.  Mathematicians have extended the law to many more general settings.
  • 25. What if Observations are not Independent     You are in charge of a process that manufactures video screens for computer monitors Your equipment measures the tension on the metal mesh that lies behind each screen and is critical to its image quality You want to estimate the mean tension µ for the process by the average x-bar of the measurements The tension measurements are not independent 25
  • 26. AYK 4.82  Use the Law of Large Numbers applet on the text book website 26
  • 27. Sampling Distributions  The Law of Large Numbers assures us that if we measure enough subjects, the statistic xbar will eventually get very close to the unknown parameter µ 27
  • 28. Sampling Distributions  What if we don’t have a large sample?  Take a large number of samples of the same size from the same population.  Calculate the sample mean for each sample.  Make a histogram of the sample means.  The histogram of values of the statistic approximates the sampling distribution that we would see if we kept on sampling forever…
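The procedure on this slide, many samples, one mean per sample, then a histogram, can be sketched as follows. The population (integers 0–20) and the sample size are illustrative choices, not part of the slide:

```python
import random
import statistics
from collections import Counter

random.seed(2)

# Population: integers 0-20 (true mean 10). Repeatedly sample, record x-bar.
population = list(range(21))
n = 25
means = [statistics.mean(random.choices(population, k=n)) for _ in range(5_000)]

# Crude text histogram of the sample means: they pile up around mu = 10.
hist = Counter(round(m) for m in means)
for value in sorted(hist):
    print(f"{value:3d} | {'#' * (hist[value] // 25)}")
```

The printed bars form a roughly bell-shaped pile centered at 10, an approximation of the sampling distribution of x-bar.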
  • 29.  The idea of a sampling distribution is the foundation of statistical inference  The laws of probability can tell us about sampling distributions without the need to actually choose or simulate a large number of samples 29
  • 30. Mean and Standard Deviation of a Sample Mean  Suppose that x-bar is the mean of a SRS of size n drawn from a large population with mean µ and standard deviation σ.  The mean of the sampling distribution of x-bar is µ and its standard deviation is σ/√n.  Notice: averages are less variable than individual observations!
  • 31. Mean and Standard Deviation of a Sample Mean  The mean of the statistic x-bar is always the same as the mean µ of the population:  the sampling distribution of x-bar is centered at µ.  In repeated sampling, x-bar will sometimes fall above the true value of the parameter µ and sometimes below, but there is no systematic tendency to overestimate or underestimate the parameter.  Because the mean of x-bar is equal to µ, we say that the statistic x-bar is an unbiased estimator of the parameter µ.
  • 32. Mean and Standard Deviation of a Sample Mean  An unbiased estimator is “correct on the average” in many samples.  How close the estimator falls to the parameter in most samples is determined by the spread of the sampling distribution.  If individual observations have standard deviation σ, then sample means x-bar from samples of size n have standard deviation σ/√n.  Again, notice that averages are less variable than individual observations.
  • 33. Mean and Standard Deviation of a Sample Mean  Not only is the standard deviation of the distribution of x-bar smaller than the standard deviation of individual observations, but it gets smaller as we take larger samples  The results of large samples are less variable than the results of small samples  Remember, we divided by the square root of n 33
  • 34. Mean and Standard Deviation of a Sample Mean  If n is large, the standard deviation of x-bar is small and almost all samples will give values of x-bar that lie very close to the true parameter µ.  The sample mean from a large sample can be trusted to estimate the population mean accurately.  Notice that the standard deviation of the sampling distribution gets smaller only at the rate √n.  To cut the standard deviation of x-bar in half, we must take four times as many observations, not just twice as many (the square root of 4 is 2).
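The “four times as many observations” claim is easy to check numerically; here taking σ = 7 (as in the example on slide 35) purely for illustration:

```python
import math

sigma = 7
for n in (15, 60):  # quadrupling the sample size...
    print(f"n = {n}: sigma/sqrt(n) = {sigma / math.sqrt(n):.4f}")
# ...halves the standard deviation of x-bar (from about 1.807 to about 0.904)
```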
  • 35. Example  Suppose we take samples of size 15 from a distribution with mean 25 and standard deviation 7.  The mean of x-bar is 25.  The standard deviation of x-bar is 7/√15 ≈ 1.80739.
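A simulation can confirm these numbers. This is a sketch: the Normal population is one convenient choice (the mean/standard-deviation result on slide 30 holds for any population), and the seed and number of repetitions are arbitrary:

```python
import math
import random
import statistics

random.seed(3)

mu, sigma, n = 25, 7, 15
# Draw many samples of size n; record each sample mean.
means = [statistics.mean(random.gauss(mu, sigma) for _ in range(n))
         for _ in range(20_000)]

print(f"mean of x-bar: {statistics.mean(means):.3f}")   # close to 25
print(f"sd of x-bar:   {statistics.stdev(means):.3f}")  # close to 7/sqrt(15)
print(f"sigma/sqrt(n): {sigma / math.sqrt(n):.3f}")
```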
  • 36. What About Shape?  We have described the center and spread of the sampling distribution of a sample mean x-bar, but not its shape  The shape of the distribution of x-bar depends on the shape of the population distribution 36
  • 37. Sampling Distribution of a Sample Mean  If a population has the N(µ, σ) distribution, then the sample mean x-bar of n independent observations has the N(µ, σ/√n) distribution.
  • 38. Example  Adults differ in the smallest amount of dimethyl sulfide they can detect in wine  Extensive studies have found that the DMS odor threshold of adults follows roughly a Normal distribution with mean µ = 25 micrograms per liter and standard deviation σ = 7 micrograms per liter 38
  • 39. Example  Because the population distribution is Normal, the sampling distribution of x-bar is also Normal.  If n = 10, what is the distribution of x-bar?  N(25, 7/√10)
  • 40. What if the Population Distribution is not Normal?  As the sample size increases, the distribution of x-bar changes shape:  it looks less like that of the population and more like a Normal distribution.  When the sample is large enough, the distribution of x-bar is very close to Normal.  This result is true no matter what the shape of the population distribution is, as long as the population has a finite standard deviation σ.
  • 41. Central Limit Theorem  Draw a SRS of size n from any population with mean µ and finite standard deviation σ.  When n is large, the sampling distribution of the sample mean x-bar is approximately Normal: x-bar is approximately N(µ, σ/√n).
  • 42. Central Limit Theorem  More general versions of the central limit theorem say that the distribution of a sum or average of many small random quantities is close to Normal  The central limit theorem suggests why the Normal distributions are common models for observed data 42
  • 43. How Large a Sample is Needed?  Sample Size depends on whether the population distribution is close to Normal  We require more observations if the shape of the population distribution is far from Normal 43
  • 44. Example  The time X that a technician requires to perform preventive maintenance on an air-conditioning unit is governed by the Exponential distribution (figure 4.17 (a)) with mean time µ = 1 hour and standard deviation σ = 1 hour.  Your company operates 70 of these units.  The distribution of the mean time your company spends on preventive maintenance is N(1, 1/√70) = N(1, 0.12).
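Because the population here is Exponential, not Normal, this is the central limit theorem at work. A simulation sketch (the seed and number of repetitions are arbitrary choices):

```python
import math
import random
import statistics

random.seed(4)

n = 70
# Population: Exponential with mean 1 hour (so sigma = 1 hour) -- very skewed.
means = [statistics.mean(random.expovariate(1.0) for _ in range(n))
         for _ in range(10_000)]

# CLT says x-bar is approximately N(1, 1/sqrt(70)) = N(1, 0.12).
print(f"mean of x-bar: {statistics.mean(means):.3f}")
print(f"sd of x-bar:   {statistics.stdev(means):.3f}")
print(f"1/sqrt(n):     {1 / math.sqrt(n):.3f}")
```

Even though individual maintenance times are strongly right-skewed, the 10,000 simulated sample means have mean near 1 and standard deviation near 0.12, matching the CLT approximation.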
  • 45. Example  What is the probability that your company’s units’ average maintenance time exceeds 50 minutes?  50/60 ≈ 0.83 hour, so we want to know P(x-bar > 0.83).  Use the Normal distribution calculations we learned in Chapter 2:  P(x-bar > 0.83) = P((x-bar − µ)/(σ/√n) > (0.83 − 1)/0.12) = P(z > −1.42) = 1 − P(z < −1.42) = 1 − 0.0778 = 0.9222
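The same probability can be computed without a z-table using Python's standard library. Note this uses the slide's rounded values (0.83 hour, standard deviation 0.12); keeping 50/60 and 1/√70 exactly would change the answer slightly in the third decimal:

```python
from statistics import NormalDist

# Sampling distribution of the mean maintenance time: N(1, 0.12) hours.
xbar_dist = NormalDist(mu=1.0, sigma=0.12)

# P(x-bar > 0.83), i.e. the average maintenance time exceeds 50 minutes.
p = 1 - xbar_dist.cdf(0.83)
print(f"{p:.4f}")  # about 0.92
```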
  • 46. 4.86 ACT scores  The scores of students on the ACT college entrance examination in a recent year had the Normal distribution with mean µ = 18.6 and standard deviation σ = 5.9 46
  • 47. 4.86 ACT scores  What is the probability that a single student randomly chosen from all those taking the test scores 21 or higher?  P(X ≥ 21) = P((X − µ)/σ ≥ (21 − 18.6)/5.9) = P(z ≥ 0.4068) = 1 − P(z < 0.41) = 1 − 0.6591 = 0.3409
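This probability can be checked without a z-table using the standard library (the answer differs from the slide's 0.3409 only because the slide rounds z to 0.41):

```python
from statistics import NormalDist

# ACT scores: Normal with mean 18.6 and standard deviation 5.9.
scores = NormalDist(mu=18.6, sigma=5.9)

# P(X >= 21) for a single randomly chosen student.
p = 1 - scores.cdf(21)
print(f"{p:.4f}")  # about 0.34
```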
  • 48. 4.86 ACT scores  About 34% of students (from this population) scored a 21 or higher on the ACT  The probability that a single student randomly chosen from this population would have a score of 21 or higher is 0.34 48
  • 49. 4.86 ACT scores  Now take a SRS of 50 students who took the test. What are the mean and standard deviation of the sample mean score x-bar of these 50 students?  Mean = 18.6 [same as µ]  Standard Deviation = 0.8344 [sigma/sqrt(50)] 49
  • 50. 4.86 ACT scores  What is the probability that the mean score x-bar of these students is 21 or higher?  P(x-bar ≥ 21) = P((x-bar − µ)/(σ/√n) ≥ (21 − 18.6)/0.834) = P(z ≥ 2.8778) = 1 − P(z < 2.88) = 1 − 0.9980 = 0.002
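The sample-mean version of the calculation, again with the standard library; the only change from the single-student case is using σ/√n as the standard deviation:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 18.6, 5.9, 50

# Sampling distribution of x-bar: N(mu, sigma / sqrt(n)).
xbar_dist = NormalDist(mu=mu, sigma=sigma / sqrt(n))

# P(x-bar >= 21) for a SRS of 50 students.
p = 1 - xbar_dist.cdf(21)
print(f"{p:.4f}")  # about 0.002
```

A score of 21 is unremarkable for one student (probability about 0.34) but would be very surprising as the average of 50 students (probability about 0.002), precisely because averages are less variable than individual observations.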
  • 51. 4.86 ACT scores  About 0.2 % of all random samples of size 50 (from this population) would have a mean score x-bar of 21 or higher.  The probability of having a mean score xbar of 21 or higher from a sample of 50 students (from this population) is 0.002. 51
  • 52. Section 4.4 Summary  When we want information about the population mean µ for some variable, we often take a SRS and use the sample mean x-bar to estimate the unknown parameter µ. 52
  • 53. Section 4.4 Summary  The Law of Large Numbers states that the actually observed mean outcome xbar must approach the mean µ of the population as the number of observations increases. 53
  • 54. Section 4.4 Summary  The sampling distribution of x-bar describes how the statistic x-bar varies in all possible samples of the same size from the same population. 54
  • 55. Section 4.4 Summary  The mean of the sampling distribution is µ, so that x-bar is an unbiased estimator of µ. 55
  • 56. Section 4.4 Summary  The standard deviation of the sampling distribution of x-bar is sigma over the square root of n for a SRS of size n if the population has standard deviation sigma. That is, averages are less variable than individual observations. 56
  • 57. Section 4.4 Summary  If the population has a Normal distribution, so does x-bar. 57
  • 58. Section 4.4 Summary  The Central Limit Theorem states that for large n the sampling distribution of x-bar is approximately Normal for any population with finite standard deviation sigma. That is, averages are more Normal than individual observations. We can use the fact that x-bar has a known Normal distribution to calculate approximate probabilities for events involving xbar. 58