POPULATION AND
SAMPLE MEAN
Avjinder Singh Kaler and Kristi Mai
• Estimating a Population Mean
• 𝜎 unknown
• 𝜎 known
• Estimating the difference between two population means
• Independent samples
• Dependent samples
Main Ideas:
• The sample mean is the best point estimate of the population mean
• We can use a sample mean to construct a C.I. to estimate the true value of a
population mean
• We must learn how to find the sample size necessary to estimate a population mean
Recall:
• $\bar{x} = \frac{\sum x}{n}$ : sample mean
• $\bar{x}$ targets $\mu$ and is an individual value that is used as an estimate (i.e., it is a point
estimate for $\mu$)
Notice: There are two situations when estimating a population mean
1. 𝜎, the population standard deviation, is known
2. 𝜎, the population standard deviation, is NOT known
• Margin of Error (estimating the population mean when 𝜎 is known)
• $E = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$
• Notice: The margin of error changes when what we are estimating changes!!
• Constructing a C.I.
• Requirements:
  • The sample must be an SRS
  • The value of $\sigma$ is known
  • The population is normal OR $n > 30$
• C.I.:
  • $\bar{x} - E < \mu < \bar{x} + E$
  • Same as: $\bar{x} \pm E$ and $(\bar{x} - E, \bar{x} + E)$
• Minimum required sample size
• Sample size needed:
• If 𝜎 is known:
$n = \left[ \frac{z_{\alpha/2} \cdot \sigma}{E} \right]^2$
• If not a whole number, ALWAYS round up to the nearest
whole number for minimum required sample sizes
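The z-based margin of error, confidence interval, and minimum sample size can be sketched in a few lines of Python. This is only an illustration: the function names and the numeric inputs (mean 50, σ = 15, n = 36, desired E = 3) are hypothetical and do not come from the slides.

```python
import math
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Return z_{alpha/2} for a two-sided interval at the given confidence level."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def z_interval(x_bar: float, sigma: float, n: int, confidence: float):
    """Confidence interval for mu when sigma is known: x_bar +/- z * sigma / sqrt(n)."""
    e = z_critical(confidence) * sigma / math.sqrt(n)
    return x_bar - e, x_bar + e


def min_sample_size(sigma: float, margin: float, confidence: float) -> int:
    """n = (z * sigma / E)^2, always rounded UP to the next whole number."""
    raw = (z_critical(confidence) * sigma / margin) ** 2
    return math.ceil(raw)


# Hypothetical inputs, chosen only to exercise the formulas.
low, high = z_interval(x_bar=50.0, sigma=15.0, n=36, confidence=0.95)
n_needed = min_sample_size(sigma=15.0, margin=3.0, confidence=0.95)
print(f"95% C.I.: ({low:.2f}, {high:.2f}); minimum n = {n_needed}")
```

Rounding up with `math.ceil` matches the rule above: rounding down would give a margin of error slightly larger than the one requested.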
Some Key Points:
• The sample mean is still the best point estimate of the population
mean
• We can use a sample mean to construct a C.I. to estimate the true
value of a population mean even when we do not know the
population standard deviation
• We see that if requirements are generally met but 𝜎 is unknown, we
must use a t-distribution
The Student t Distribution:
• If a population has a normal distribution, then the following statistic follows a t-distribution with $n - 1$ degrees of freedom:
$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$
• The above formula is a t-score; a measure of relative standing
• We are estimating the unknown population standard deviation with the sample standard
deviation
• This estimation would typically lead to unreliability and so we compensate for this inherent
unreliability with wider intervals and “fatter tails” displayed in the density curve
• We must utilize a t-table or t-calculator when using the t-distribution
• We NEED degrees of freedom
  • Degrees of Freedom ($df$) for a collection of sample data is the number of sample values that can vary
after certain restrictions have been imposed upon all the data values
• Recall:
• $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$ : sample standard deviation
• Margin of Error
  • $E = t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$ with $df = n - 1$
  • Notice: The margin of error also changes when the
information we have changes
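As a rough sketch of the sample standard deviation and the t-based margin of error, the snippet below uses the cell phone radiation data that appears in the worked example later in these slides. The critical value 1.812 (df = 10, 90% confidence) is the one quoted in that example; the function name is a hypothetical helper.

```python
import math
import statistics

# Cell phone radiation emissions (W/kg) from the example in these slides.
emissions = [0.38, 0.55, 1.54, 1.55, 0.50, 0.60, 0.92, 0.96, 1.00, 0.86, 1.46]


def t_margin_of_error(s: float, n: int, t_crit: float) -> float:
    """E = t_{alpha/2} * s / sqrt(n); t_crit must be looked up with df = n - 1."""
    return t_crit * s / math.sqrt(n)


n = len(emissions)
x_bar = statistics.mean(emissions)
s = statistics.stdev(emissions)  # sample standard deviation (n - 1 in the denominator)
df = n - 1
t_crit = 1.812  # t-table value for df = 10 and 90% confidence, as quoted in the slides
E = t_margin_of_error(s, n, t_crit)
print(f"x_bar = {x_bar:.3f}, s = {s:.3f}, df = {df}, E = {E:.3f}")
```

Note that `statistics.stdev` divides by n − 1, which matches the formula for s recalled above; `statistics.pstdev` would divide by n and is not the right choice here.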
Constructing a C.I.
• Requirements:
• The sample must be an SRS
• The value of $\sigma$ is NOT known
• The population is normal OR $n > 30$
• C.I.:
• $\bar{x} - E < \mu < \bar{x} + E$
• Same As: $\bar{x} \pm E$ and $(\bar{x} - E, \bar{x} + E)$
• Notice that the C.I. appears to be the same – however, it will NOT be the
same as the previous CI for 𝜇 because (with our uncertainty about 𝜎) the
margin of error changed
• The Student t distribution is different for different sample sizes
• The t distribution has the same general symmetric bell shape as the Normal
distribution, but reflects the greater variability that is expected when samples
are smaller
• The t distribution has a mean of 𝑡 = 0 just as the standard normal distribution has
a mean of 𝑧 = 0
• Is the population normal OR is $n > 30$?
  • No: use a nonparametric method or bootstrapping technique
  • Yes: is $\sigma$ known or unknown?
    • Known: use the normal distribution (Normal calculator)
    • Unknown: use the t distribution (t-calculator)
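The decision procedure above can be written as a small function. This is a minimal sketch; the function name and the returned labels are hypothetical.

```python
def choose_method(population_normal: bool, n: int, sigma_known: bool) -> str:
    """Pick the distribution for inference about one mean, following the flowchart."""
    # First question: is the population normal OR is n > 30?
    if not (population_normal or n > 30):
        return "nonparametric method or bootstrapping"
    # Second question: is sigma known?
    return "normal distribution" if sigma_known else "t distribution"


# Small sample from a normal population, sigma unknown.
print(choose_method(population_normal=True, n=11, sigma_known=False))
```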
$\sigma$ known
• Requirements:
  • The sample must be an SRS
  • The value of $\sigma$ is known
  • The population is normal OR $n > 30$
• Test Statistic: $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
• $\mu$: population mean (assumed true under $H_0$)
• Note: p-values and critical values are from the Z-table

$\sigma$ NOT known
• Requirements:
  • The sample must be an SRS
  • The value of $\sigma$ is NOT known
  • The population is normal OR $n > 30$
• Test Statistic: $t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$ ; $df = n - 1$
• $\mu$: population mean (assumed true under $H_0$)
• Note: p-values and critical values are from the t-table
Listed below are the measured radiation emissions (in W/kg) corresponding to
a sample of cell phones.
0.38 0.55 1.54 1.55 0.50 0.60 0.92 0.96 1.00 0.86 1.46
Use a 0.05 level of significance to test the claim that cell phones have a mean
radiation level that is less than 1.00 W/kg.
The summary statistics are: $\bar{x} = 0.938$ and $s = 0.423$.
Requirement Check:
1. We assume the sample is a simple random sample.
2. The sample size is n = 11, which is not greater than 30, so we must check
a normal quantile plot for normality.
Note: (See plot on the right)
The points are reasonably close to a straight line
and there is no other pattern, so we conclude that
the data appear to be from a normally distributed
population.
Step 1: The claim that cell phones have a mean radiation level less than 1.00
W/kg is expressed as μ < 1.00 W/kg.
Step 2: The alternative to the original claim is μ ≥ 1.00 W/kg.
Step 3: The hypotheses are written as:
$H_0: \mu = 1.00$ W/kg
$H_1: \mu < 1.00$ W/kg
Step 4: The stated level of significance is $\alpha = 0.05$.
Step 5: Because the claim is about a population mean μ, the statistic most
relevant to this test is the sample mean $\bar{x}$.
Step 6: Calculate the test statistic and then find the P-value or the critical value
using StatCrunch.
$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{0.938 - 1.00}{0.423 / \sqrt{11}} = -0.486$
Step 7: Critical Value Method: Because the test statistic of t = –0.486 does not
fall in the critical region bounded by the critical value of t = –1.812, fail to reject
the null hypothesis.
Step 7: P-value method:
Using StatCrunch, the P-value computed is 0.3187. Since the P-value is
greater than α = 0.05, we fail to reject the null hypothesis.
Step 8:
Because we fail to reject the null hypothesis, we conclude that there is not
sufficient evidence to support the claim that cell phones have a mean
radiation level that is less than 1.00 W/kg.
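The test above can be reproduced approximately without StatCrunch. The sketch below computes the test statistic from the summary statistics and approximates the left-tail P-value by integrating the t density numerically with Simpson's rule. That integration is an assumption made here to keep the example dependency-free; it is not how StatCrunch works, and the helper names are hypothetical.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of the Student t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x: float, df: int, steps: int = 2000) -> float:
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # area under the density between 0 and |x|
    return 0.5 + area if x >= 0 else 0.5 - area


# Summary statistics and claim from the cell phone example.
x_bar, s, n, mu_0, alpha = 0.938, 0.423, 11, 1.00, 0.05

t_stat = (x_bar - mu_0) / (s / math.sqrt(n))
p_value = t_cdf(t_stat, df=n - 1)  # left-tailed test: area to the left of t
reject_null = p_value < alpha
print(f"t = {t_stat:.3f}, P-value = {p_value:.4f}, reject H0: {reject_null}")
```

Where SciPy is available, a library routine for the t distribution would normally replace the hand-rolled `t_cdf`.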
We can use a confidence interval for testing a claim about μ.
For a two-tailed test with a 0.05 significance level, we construct a 95% confidence
interval.
For a one-tailed test with a 0.05 significance level, we construct a 90% confidence
interval.
Using the cell phone example, construct a confidence interval that can be used to
test the claim that μ < 1.00 W/kg, assuming a 0.05 significance level.
Note that a left-tailed hypothesis test with α = 0.05 corresponds to a 90%
confidence interval.
Using StatCrunch, the confidence interval is:
0.707 W/kg < μ < 1.169 W/kg
Because the value of μ = 1.00 W/kg is contained in the interval, we fail to reject the
null hypothesis that μ = 1.00 W/kg .
Based on the sample of 11 values, we do not have sufficient evidence to support
the claim that the mean radiation level is less than 1.00 W/kg.
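The confidence-interval version of the test can be sketched as follows, using the summary statistics from the example and the critical value t = 1.812 (df = 10) quoted earlier in the slides. The function name is a hypothetical helper.

```python
import math


def t_interval(x_bar: float, s: float, n: int, t_crit: float):
    """Confidence interval for mu when sigma is unknown: x_bar +/- t * s / sqrt(n)."""
    e = t_crit * s / math.sqrt(n)
    return x_bar - e, x_bar + e


# 90% interval, matching a one-tailed test at the 0.05 significance level.
low, high = t_interval(x_bar=0.938, s=0.423, n=11, t_crit=1.812)

claimed_mean = 1.00
# Reject H0 only if the claimed value falls outside the interval.
reject_null = not (low < claimed_mean < high)
print(f"{low:.3f} W/kg < mu < {high:.3f} W/kg, reject H0: {reject_null}")
```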
When σ is known, we use a test that involves the standard normal distribution.
In reality, it is very rare to test a claim about an unknown population mean
while the population standard deviation is somehow known.
The procedure is essentially the same as a t test, with the following
exception: The test statistic is
$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
The P-value and critical values can be computed using StatCrunch.
If we repeat the cell phone radiation example, with the assumption that
σ = 0.480 W/kg, the test statistic is:
$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{0.938 - 1.00}{0.480 / \sqrt{11}} = -0.43$
The example refers to a left-tailed test, so the P-value is the area to the left
of z = –0.43, which is 0.3342.
Since the P-value is greater than $\alpha = 0.05$, we fail to reject the null and
reach the same conclusion as before.
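A short sketch of this z test, using only Python's standard library. The inputs are the values from the example; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Values from the cell phone example, with sigma assumed known.
x_bar, mu_0, sigma, n, alpha = 0.938, 1.00, 0.480, 11, 0.05

z = (x_bar - mu_0) / (sigma / math.sqrt(n))
p_value = NormalDist().cdf(z)  # left-tailed test: area to the left of z
reject_null = p_value < alpha
print(f"z = {z:.2f}, P-value = {p_value:.4f}, reject H0: {reject_null}")
```

The P-value here is computed from the unrounded z, which is why it can differ in the third or fourth decimal from a table lookup at z = –0.43.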
Main Ideas:
• The sample mean is the best point estimate of the population mean
• We can use two independent sample means to construct a
confidence interval that can be used to estimate the true value of the
underlying difference in the corresponding population means
• We can also test claims about the difference between two population
means
Notation:
• Dependent samples: two samples are dependent if the sample values are paired
• Independent samples: two samples are independent if the sample values from one are
not related to or somehow naturally paired/matched with the sample values from the other
Requirements:
• Population standard deviations (𝜎1 and 𝜎2) are NOT known and
NOT assumed equal
• The two samples are independent
• Both samples are SRS
• Both 𝑛1 > 30 and 𝑛2 > 30 OR both samples come from populations
that are normal
• Margin of Error
$E = t_{\alpha/2} \cdot \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ and $df = \min(n_1 - 1, n_2 - 1)$
• C.I.: $(\bar{x}_1 - \bar{x}_2) - E < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + E$
• Notice that we are often interested in whether or not 0 is included
within the limits of the confidence interval constructed, i.e., whether or
not 𝜇1 − 𝜇2 = 0 is reasonable
• Requirements:
• Requirements and degrees of freedom (df) are the same as in the
C.I. before
• Test Statistic: $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
Researchers conducted trials to investigate the effects of color on
creativity.
Subjects with a red background were asked to think of creative uses for
a brick; other subjects with a blue background were given the same
task.
Responses were scored by a panel of judges.
Researchers make the claim that “blue enhances performance on a
creative task”. Test the claim using a 0.01 significance level.
Requirement check:
1. The values of the two population standard deviations are unknown
and assumed not equal.
2. The subject groups are independent.
3. The samples are simple random samples.
4. Both sample sizes exceed 30.
The requirements are all satisfied.
The data:
Background color    Sample size    Sample mean            Sample standard deviation
Red Background      n1 = 35        $\bar{x}_1 = 3.39$     s1 = 0.97
Blue Background     n2 = 36        $\bar{x}_2 = 3.97$     s2 = 0.63
Step 1: The claim that “blue enhances performance on a creative task”
can be restated as “people with a blue background (group 2) have a
higher mean creativity score than those in the group with a red background
(group 1)”. This can be expressed as μ1 < μ2.
Step 2: If the original claim is false, then μ1 ≥ μ2.
Step 3: The hypotheses can be written as:
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 < \mu_2$
OR
$H_0: \mu_1 - \mu_2 = 0$
$H_1: \mu_1 - \mu_2 < 0$
Step 4: The significance level is α = 0.01.
Step 5: Because we have two independent samples and we are testing a claim
about two population means, we use a t-distribution.
Step 6: Calculate the test statistic.
$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(3.39 - 3.97) - 0}{\sqrt{\frac{0.97^2}{35} + \frac{0.63^2}{36}}} = -2.979$
Step 6: Because we are using a t-distribution, the critical value of t = –2.441
is found using StatCrunch. We use df = min(35 – 1, 36 – 1) = 34 degrees of freedom.
Step 7: Because the test statistic does fall in the critical region, we reject the
null hypothesis μ1 = μ2.
P-Value Method: StatCrunch provides a P-value, and the area to the left of
the test statistic of t = –2.979 is 0.0021. Since this is less than the significance
level of 0.01, we reject the null hypothesis.
Conclusion: There is sufficient evidence to support the claim that the red
background group has a lower mean creativity score than the blue
background group.
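The test statistic and the critical-value decision for this example can be checked with a short sketch. The summary statistics and the critical value –2.441 come from the slides; the function name is a hypothetical helper.

```python
import math


def two_sample_t(x1, s1, n1, x2, s2, n2, null_diff=0.0):
    """Unpooled two-sample t statistic with the conservative df = min(n1 - 1, n2 - 1)."""
    standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
    t = ((x1 - x2) - null_diff) / standard_error
    df = min(n1 - 1, n2 - 1)
    return t, df


# Group 1 = red background, group 2 = blue background.
t_stat, df = two_sample_t(3.39, 0.97, 35, 3.97, 0.63, 36)

t_crit = -2.441  # left-tail critical value for df = 34 and alpha = 0.01 (from the slides)
reject_null = t_stat < t_crit  # left-tailed test
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_null}")
```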
Using the data from this color creativity example, construct a 98%
confidence interval estimate for the difference between the mean
creativity score for those with a red background and the mean
creativity score for those with a blue background.
Using StatCrunch, the 98% confidence interval obtained is:
−1.05 < 𝜇1 − 𝜇2 < −0.11
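By hand, with $\bar{x}_1 = 3.39$ and $\bar{x}_2 = 3.97$:
$E = t_{\alpha/2} \cdot \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = 2.441 \cdot \sqrt{\frac{0.97^2}{35} + \frac{0.63^2}{36}} = 0.475261$
$(\bar{x}_1 - \bar{x}_2) - E < (\mu_1 - \mu_2) < (\bar{x}_1 - \bar{x}_2) + E$
$-1.06 < (\mu_1 - \mu_2) < -0.10$
The hand calculation uses df = 34, so its limits differ slightly from the StatCrunch interval.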
We are 98% confident that the limits –1.05 and –0.11 actually do contain
the difference between the two population means.
Because those limits do not include 0, our interval suggests that there is
a significant difference between the two means.
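The hand calculation of the interval (with the critical value 2.441 for df = 34) can be sketched like this. The function name is hypothetical and the inputs are the summary statistics from the slides.

```python
import math


def two_sample_interval(x1, s1, n1, x2, s2, n2, t_crit):
    """Unpooled confidence interval for mu1 - mu2; returns (low, high, E)."""
    e = t_crit * math.sqrt(s1**2 / n1 + s2**2 / n2)
    diff = x1 - x2
    return diff - e, diff + e, e


low, high, e = two_sample_interval(3.39, 0.97, 35, 3.97, 0.63, 36, t_crit=2.441)

# If 0 is outside the interval, mu1 - mu2 = 0 is not a plausible value.
zero_inside = low < 0 < high
print(f"E = {e:.6f}; {low:.2f} < mu1 - mu2 < {high:.2f}; contains 0: {zero_inside}")
```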
These methods are rarely used in practice because the underlying
assumptions are usually not met.
1. The two population standard deviations are both known
• the test statistic will be a z instead of a t and use the standard
normal model.
2. The two population standard deviations are unknown but assumed
to be equal
• pool the sample variances
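When $\sigma_1$ and $\sigma_2$ are both known, the test statistic will be:
$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
P-values and critical values are found using StatCrunch.
The confidence interval is:
$(\bar{x}_1 - \bar{x}_2) - E < (\mu_1 - \mu_2) < (\bar{x}_1 - \bar{x}_2) + E$
with
$E = z_{\alpha/2} \cdot \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$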
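When $\sigma_1$ and $\sigma_2$ are unknown but assumed equal, the test statistic will be
$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}}$
Where the pooled sample variance is
$s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{(n_1 - 1) + (n_2 - 1)}$
with
$df = n_1 + n_2 - 2$
The confidence interval is:
$(\bar{x}_1 - \bar{x}_2) - E < (\mu_1 - \mu_2) < (\bar{x}_1 - \bar{x}_2) + E$
with
$E = t_{\alpha/2} \cdot \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}$ and $df = n_1 + n_2 - 2$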
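A minimal sketch of the pooled-variance calculation follows. It reuses the summary statistics from the color creativity example purely to have numbers to plug in; the slides analyze that example without pooling, so the pooled result here is for illustration only. The function names are hypothetical.

```python
import math


def pooled_variance(s1: float, n1: int, s2: float, n2: int) -> float:
    """s_p^2 = ((n1 - 1) s1^2 + (n2 - 1) s2^2) / ((n1 - 1) + (n2 - 1))."""
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / ((n1 - 1) + (n2 - 1))


def pooled_t(x1, s1, n1, x2, s2, n2, null_diff=0.0):
    """Pooled two-sample t statistic and its degrees of freedom (n1 + n2 - 2)."""
    sp2 = pooled_variance(s1, n1, s2, n2)
    standard_error = math.sqrt(sp2 / n1 + sp2 / n2)
    t = ((x1 - x2) - null_diff) / standard_error
    return t, n1 + n2 - 2


# Illustration only: summary statistics borrowed from the creativity example.
sp2 = pooled_variance(0.97, 35, 0.63, 36)
t_stat, df = pooled_t(3.39, 0.97, 35, 3.97, 0.63, 36)
print(f"pooled variance = {sp2:.4f}, t = {t_stat:.3f}, df = {df}")
```

Pooling gives more degrees of freedom than the min(n1 − 1, n2 − 1) rule, but it is only justified when the equal-variance assumption is reasonable.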
Independent Samples (Two Additional Methods)
• 𝜎1 𝑎𝑛𝑑 𝜎2 known – Z Test / Z Interval
• 𝜎1 = 𝜎2 -- Pooled Sample Variance
Dependent Samples
• When samples are paired, we use a different methodology
Main Ideas:
• The sample mean is still the best point estimate of the population mean
• We can use two dependent sample means to construct a confidence interval
that can be used to estimate the true value of the underlying difference in the
corresponding population means
• We can also test claims about the difference between two population means
• In experimental design, using dependent samples is generally better and more
practical than assuming two independent samples
Notation:
• $d$: the individual difference between two values in a single matched pair
• $n$: the number of pairs of data
• $\mu_d$: the population mean of the differences for all the pairs of data
• $\bar{d}$: the sample mean of the differences for the paired sample data
• $s_d$: the sample standard deviation of the differences for the paired sample data
Requirements
• The sample data are dependent
• Both samples are SRS
• Either 𝑛 > 30 OR the paired differences come from a population that is
normal
Margin of Error
• $E = t_{\alpha/2} \cdot \frac{s_d}{\sqrt{n}}$ with $df = n - 1$
C.I.
• $\bar{d} - E < \mu_d < \bar{d} + E$
Notice that we are often interested in whether or not 0 is included within the limits
of the confidence interval constructed, i.e., whether or not $\mu_d = 0$ is reasonable
Requirements:
• Requirements and Degrees of freedom (𝑑𝑓) are the same as in the C.I.
above
Test Statistic:
$t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}}$
Use the sample data below with a significance level of 0.05 to test the
claim that for the population of heights of presidents and their main
opponents, the differences have a mean greater than 0 cm (so presidents
tend to be taller than their opponents).
Height (cm) of President        189   173   183   180   179
Height (cm) of Main Opponent    170   185   175   180   178
Difference d                     19   -12     8     0     1
Requirement Check:
1. The samples are dependent because the values are paired.
2. The pairs of data are randomly selected.
3. The number of data points is 5, so normality should be checked (and it is
assumed the condition is met).
Step 1: The claim is that µd > 0 cm.
Step 2: If the original claim is not true, we have µd ≤ 0 cm.
Step 3: The hypotheses can be written as:
$H_0: \mu_d = 0$ cm
$H_1: \mu_d > 0$ cm
Step 4: The significance level is α = 0.05.
Step 5: We use the Student t-distribution.
The summary statistics are: $\bar{d} = 3.2$ and $s_d = 11.4$.
Step 6: Determine the value of the test statistic:
$t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} = \frac{3.2 - 0}{11.4 / \sqrt{5}} = 0.628$
with df = 5 – 1 = 4
Step 6: Using StatCrunch, the P-value is 0.282.
Using the critical value method, the critical value for a right-tailed test with
df = 4 and α = 0.05 is t = 2.132.
Step 7: Because the P-value exceeds 0.05, or because the test statistic of t = 0.628
does not fall in the critical region (t > 2.132), we fail to reject the null hypothesis.
Conclusion: There is not sufficient evidence to support the claim that for
the population of heights of presidents and their main opponent, the
differences have a mean greater than 0 cm.
In other words, presidents do not appear to be taller than their
opponents.
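The paired test can be reproduced from the raw heights with a short sketch. The critical value 2.132 (df = 4, one tail of 0.05) is the t-table value that also appears in the confidence interval calculation in these slides; the variable names are illustrative.

```python
import math
import statistics

# Heights (cm) from the example, listed pair by pair.
president = [189, 173, 183, 180, 179]
opponent = [170, 185, 175, 180, 178]

# Work with the paired differences only.
d = [p - o for p, o in zip(president, opponent)]
n = len(d)
d_bar = statistics.mean(d)
s_d = statistics.stdev(d)  # sample standard deviation of the differences

mu_d_null = 0.0
t_stat = (d_bar - mu_d_null) / (s_d / math.sqrt(n))

t_crit = 2.132  # right-tail critical value for df = 4, alpha = 0.05
reject_null = t_stat > t_crit  # right-tailed test
print(f"d_bar = {d_bar:.1f}, s_d = {s_d:.1f}, t = {t_stat:.3f}, reject H0: {reject_null}")
```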
Confidence Interval: Support the conclusions with a 90% confidence
interval estimate for µd.
$E = t_{\alpha/2} \cdot \frac{s_d}{\sqrt{n}} = 2.132 \cdot \frac{11.4}{\sqrt{5}} = 10.8694$
$\bar{d} - E < \mu_d < \bar{d} + E$
$3.2 - 10.8694 < \mu_d < 3.2 + 10.8694$
$-7.7 < \mu_d < 14.1$
We have 90% confidence that the limits of –7.7 cm and 14.1 cm contain
the true mean of the differences in height (president’s height – opponent’s
height).
See that the interval does contain the value of 0 cm, so it is very possible
that the mean of the differences is equal to 0 cm, indicating that there is
no significant difference between the heights.
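The interval for the mean of the paired differences can be sketched from the rounded summary statistics used in the slides. The function name is a hypothetical helper.

```python
import math


def paired_interval(d_bar: float, s_d: float, n: int, t_crit: float):
    """Confidence interval for mu_d: d_bar +/- t * s_d / sqrt(n); returns (low, high, E)."""
    e = t_crit * s_d / math.sqrt(n)
    return d_bar - e, d_bar + e, e


# 90% interval with df = 4, so t_{alpha/2} = 2.132.
low, high, e = paired_interval(d_bar=3.2, s_d=11.4, n=5, t_crit=2.132)

# If 0 lies inside the interval, mu_d = 0 remains a plausible value.
contains_zero = low < 0 < high
print(f"E = {e:.4f}; {low:.1f} cm < mu_d < {high:.1f} cm; contains 0: {contains_zero}")
```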
Complete the following:
• Practice Problems 5
• Practice Problems 6
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
ALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptx
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 

Population and sample mean

  • 1. POPULATION AND SAMPLE MEAN Avjinder Singh Kaler and Kristi Mai
  • 2. • Estimating a Population Mean • 𝜎 unknown • 𝜎 known • Estimating the difference between two population means • Independent samples • Dependent samples
  • 3. Main Ideas: • The sample mean is the best point estimate of the population mean • We can use a sample mean to construct a C.I. to estimate the true value of a population mean • We must learn how to find the sample size necessary to estimate a population mean Recall: • $\bar{x} = \frac{\sum x}{n}$: sample mean • $\bar{x}$ targets 𝜇 and is an individual value that is used as an estimate (i.e. it is a point estimate for 𝜇) Notice: There are two situations when estimating a population mean 1. 𝜎, the population standard deviation, is known 2. 𝜎, the population standard deviation, is NOT known
  • 4. • Margin of Error (estimating the population mean when 𝜎 is known) • $E = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$ • Notice: The margin of error changes when what we are estimating changes! • Constructing a C.I. • Requirements: • The sample must be a SRS • The value of 𝜎 is known • The population is normal OR 𝑛 > 30 • C.I.: • $\bar{x} - E < \mu < \bar{x} + E$ • Same as: $\bar{x} \pm E$ and $(\bar{x} - E, \bar{x} + E)$
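The margin of error and interval on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `z_interval` and the numeric inputs are hypothetical and do not come from the slides; the critical value $z_{\alpha/2}$ is taken from the standard library's `NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence=0.95):
    """Return (E, lower, upper) for a population mean when sigma is known."""
    alpha = 1 - confidence
    # z_{alpha/2}: the value with area alpha/2 to its right under the standard normal curve
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_crit * sigma / sqrt(n)
    return margin, xbar - margin, xbar + margin


# Hypothetical inputs chosen only to exercise the formula (not from the slides).
E, lower, upper = z_interval(xbar=50.0, sigma=10.0, n=100, confidence=0.95)
print(round(E, 2), round(lower, 2), round(upper, 2))
```

The requirements on the slide (SRS, known 𝜎, normal population or 𝑛 > 30) still have to be checked separately; the code only does the arithmetic.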
  • 5. • Minimum required sample size • Sample size needed: • If 𝜎 is known: $n = \left( \frac{z_{\alpha/2} \cdot \sigma}{E} \right)^2$ • If not a whole number, ALWAYS round up to the nearest whole number for minimum required sample sizes
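As a rough sketch of the sample-size formula, the snippet below computes $n$ and rounds up with `ceil`, as the slide requires. The function name `min_sample_size` and the inputs (𝜎 = 15, E = 3, 95% confidence) are assumptions made for illustration, not values from the slides.

```python
from math import ceil
from statistics import NormalDist


def min_sample_size(sigma, margin, confidence=0.95):
    """Smallest n whose margin of error is at most `margin` when sigma is known."""
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    # Always round UP: rounding down would give a margin larger than requested.
    return ceil((z_crit * sigma / margin) ** 2)


# Hypothetical inputs (not from the slides).
n_needed = min_sample_size(sigma=15.0, margin=3.0, confidence=0.95)
print(n_needed)
```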
  • 6. Some Key Points: • The sample mean is still the best point estimate of the population mean • We can use a sample mean to construct a C.I. to estimate the true value of a population mean even when we do not know the population standard deviation • We see that if requirements are generally met but 𝜎 is unknown, we must use a t-distribution
  • 7. The Student t Distribution: • If a population has a normal distribution, then the following formula describes the t-distribution: $t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$ • The above formula is a t-score; a measure of relative standing • We are estimating the unknown population standard deviation with the sample standard deviation • This estimation would typically lead to unreliability and so we compensate for this inherent unreliability with wider intervals and “fatter tails” displayed in the density curve • We must utilize a t-table or t-calculator when using the t-distribution • We NEED degrees of freedom • Degrees of Freedom (𝑑𝑓) for a collection of sample data is the number of sample values that can vary after certain restrictions have been imposed upon all the data values
  • 8.
  • 9. • Recall: • $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$: sample standard deviation • Margin of Error • $E = t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$ with $df = n - 1$ • Notice: The margin of error also changes when the information we have changes
  • 10. Constructing a C.I. • Requirements: • The sample must be a SRS • The value of 𝜎 is NOT known • The population is normal OR 𝑛 > 30 • C.I.: • $\bar{x} - E < \mu < \bar{x} + E$ • Same As: $\bar{x} \pm E$ and $(\bar{x} - E, \bar{x} + E)$ • Notice that the C.I. appears to be the same; however, it will NOT be the same as the previous CI for 𝜇 because (with our uncertainty about 𝜎) the margin of error changed
  • 11. • The student t distribution is different for different sample sizes • The t distribution has the same general symmetric bell shape as the Normal distribution, but reflects the greater variability that is expected when samples are smaller • The t distribution has a mean of 𝑡 = 0 just as the standard normal distribution has a mean of 𝑧 = 0
  • 12. Is the population normal OR is 𝑛 > 30? • No: Use nonparametric method or bootstrapping technique • Yes: Is 𝜎 known or unknown? • Known: Use normal distribution (Normal calculator) • Unknown: Use t distribution (t-calculator)
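The decision flow on this slide can be written as a small function. This is a sketch only; the function name `choose_method` and the returned labels are hypothetical, and the Yes/No branch assignment follows the requirements listed on the other slides.

```python
def choose_method(normal_or_large_n, sigma_known):
    """Pick the distribution for inference about one mean.

    normal_or_large_n: True if the population is normal OR n > 30.
    sigma_known: True if the population standard deviation is known.
    """
    if not normal_or_large_n:
        # Neither normality nor a large sample: the z and t methods do not apply.
        return "nonparametric or bootstrap"
    return "normal (z)" if sigma_known else "Student t"


# The cell phone example later in the deck: n = 11 but data look normal, sigma unknown.
print(choose_method(normal_or_large_n=True, sigma_known=False))
```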
  • 13. 𝜎 known: • Requirements: • The sample must be a SRS • The value of 𝜎 is known • The population is normal OR 𝑛 > 30 • Test Statistic: $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$ • 𝜇: population mean (assumed true under 𝐻0) Note: p-values and critical values are from Z-table 𝜎 NOT known: • Requirements: • The sample must be a SRS • The value of 𝜎 is NOT known • The population is normal OR 𝑛 > 30 • Test Statistic: $t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$; $df = n - 1$ • 𝜇: population mean (assumed true under 𝐻0) Note: p-values and critical values are from t-table
  • 14. Listed below are the measured radiation emissions (in W/kg) corresponding to a sample of cell phones. Use a 0.05 level of significance to test the claim that cell phones have a mean radiation level that is less than 1.00 W/kg. Data: 0.38 0.55 1.54 1.55 0.50 0.60 0.92 0.96 1.00 0.86 1.46 The summary statistics are: $\bar{x} = 0.938$ and $s = 0.423$.
  • 15. Requirement Check: 1. We assume the sample is a simple random sample. 2. The sample size is n = 11, which is not greater than 30, so we must check a normal quantile plot for normality. Note: (See plot on the right) The points are reasonably close to a straight line and there is no other pattern, so we conclude that the data appear to be from a normally distributed population.
  • 16. Step 1: The claim that cell phones have a mean radiation level less than 1.00 W/kg is expressed as μ < 1.00 W/kg. Step 2: The alternative to the original claim is μ ≥ 1.00 W/kg. Step 3: The hypotheses are written as: $H_0: \mu = 1.00$ W/kg, $H_1: \mu < 1.00$ W/kg Step 4: The stated level of significance is 𝛼 = 0.05. Step 5: Because the claim is about a population mean μ, the statistic most relevant to this test is the sample mean $\bar{x}$.
  • 17. Step 6: Calculate the test statistic and then find the P-value or the critical value using StatCrunch. $t = \frac{\bar{x} - \mu_{\bar{x}}}{s / \sqrt{n}} = \frac{0.938 - 1.00}{0.423 / \sqrt{11}} = -0.486$
  • 18. Step 7: Critical Value Method: Because the test statistic of t = –0.486 does not fall in the critical region bounded by the critical value of t = –1.812, fail to reject the null hypothesis.
  • 19. Step 7: P-value method: Using StatCrunch, the P-value computed is 0.3187. Since the P-value is greater than α = 0.05, we fail to reject the null hypothesis. Step 8: Because we fail to reject the null hypothesis, we conclude that there is not sufficient evidence to support the claim that cell phones have a mean radiation level that is less than 1.00 W/kg.
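Steps 5 through 8 of the cell phone example can be reproduced from the raw data as a quick check. This is a sketch: the variable names are mine, and the critical value −1.812 is copied from the slide rather than computed, because the Python standard library has no t-distribution.

```python
from math import sqrt
from statistics import mean, stdev

# Radiation emissions (W/kg) listed on the slide.
radiation = [0.38, 0.55, 1.54, 1.55, 0.50, 0.60, 0.92, 0.96, 1.00, 0.86, 1.46]
mu0 = 1.00        # value of the mean assumed under H0
t_crit = -1.812   # left-tail critical value for df = 10, alpha = 0.05 (taken from the slide)

n = len(radiation)
xbar = mean(radiation)
s = stdev(radiation)  # sample standard deviation (divides by n - 1)
t_stat = (xbar - mu0) / (s / sqrt(n))

# Left-tailed test: reject H0 only if t falls at or below the critical value.
reject_h0 = t_stat <= t_crit
print(round(xbar, 3), round(s, 3), round(t_stat, 3), reject_h0)
```

Working from the unrounded data gives a test statistic very close to the slide's −0.486; any small difference comes from the slide using the rounded summary statistics.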
  • 20. We can use a confidence interval for testing a claim about μ. For a two-tailed test with a 0.05 significance level, we construct a 95% confidence interval. For a one-tailed test with a 0.05 significance level, we construct a 90% confidence interval.
  • 21. Using the cell phone example, construct a confidence interval that can be used to test the claim that μ < 1.00 W/kg, assuming a 0.05 significance level. Note that a left-tailed hypothesis test with α = 0.05 corresponds to a 90% confidence interval. Using StatCrunch, the confidence interval is: 0.707 W/kg < μ < 1.169 W/kg Because the value of μ = 1.00 W/kg is contained in the interval, we fail to reject the null hypothesis that μ = 1.00 W/kg. Based on the sample of 11 values, we do not have sufficient evidence to support the claim that the mean radiation level is less than 1.00 W/kg.
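The 90% interval reported above can be checked by hand with $E = t_{\alpha/2} \cdot s / \sqrt{n}$. The sketch below uses the rounded summary statistics from the slides and assumes the critical value 1.812 (df = 10), which matches the magnitude of the critical value used in the hypothesis test.

```python
from math import sqrt

xbar, s, n = 0.938, 0.423, 11   # summary statistics from the slides
t_crit = 1.812                  # t_{alpha/2} for df = 10 and 90% confidence (assumed from the slides)
mu0 = 1.00                      # claimed boundary value

margin = t_crit * s / sqrt(n)
lower, upper = xbar - margin, xbar + margin

# If mu0 lies inside the interval, we fail to reject H0.
contains_mu0 = lower < mu0 < upper
print(round(lower, 3), round(upper, 3), contains_mu0)
```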
  • 22. When σ is known, we use a test that involves the standard normal distribution. In reality, it is very rare to test a claim about an unknown population mean while the population standard deviation is somehow known. The procedure is essentially the same as a t test, with the following exception: The test statistic is $z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma / \sqrt{n}}$ The P-value and critical values can be computed using StatCrunch.
  • 23. If we repeat the cell phone radiation example, with the assumption that σ = 0.480 W/kg, the test statistic is: $z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma / \sqrt{n}} = \frac{0.938 - 1.00}{0.480 / \sqrt{11}} = -0.43$ The example refers to a left-tailed test, so the P-value is the area to the left of z = –0.43, which is 0.3342. Since the P-value is greater than 𝛼 = 0.05, we fail to reject the null and reach the same conclusion as before.
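For the 𝜎-known case, the test statistic and the left-tail P-value can both be computed with the standard library, since `NormalDist().cdf` gives the area to the left of a z-score. This is a sketch with variable names of my choosing; the inputs are the values stated on the slide.

```python
from math import sqrt
from statistics import NormalDist

xbar, mu0, sigma, n = 0.938, 1.00, 0.480, 11  # values from the slides
alpha = 0.05

z_stat = (xbar - mu0) / (sigma / sqrt(n))
# Left-tailed test: the P-value is the area to the left of the test statistic.
p_value = NormalDist().cdf(z_stat)

reject_h0 = p_value <= alpha
print(round(z_stat, 2), round(p_value, 4), reject_h0)
```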
  • 24. Main Ideas: • The sample mean is the best point estimate of the population mean • We can use two independent sample means to construct a confidence interval that can be used to estimate the true value of the underlying difference in the corresponding population means • We can also test claims about the difference between two population means
  • 26. Dependent samples: two samples are dependent if the sample values are paired. Independent samples: two samples are independent if the sample values from one are not related to or somehow naturally paired/matched with the sample values from the other.
  • 27. Requirements: • Population standard deviations (𝜎1 and 𝜎2) are NOT known and NOT assumed equal • The two samples are independent • Both samples are SRS • Both 𝑛1 > 30 and 𝑛2 > 30 OR both samples come from populations that are normal
  • 28. • Margin of Error $E = t_{\alpha/2} \cdot \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ and $df = \min(n_1 - 1, n_2 - 1)$ • C.I.: $(\bar{x}_1 - \bar{x}_2) - E < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + E$ • Notice that we are often interested in whether or not 0 is included within the limits of the confidence interval constructed, i.e., whether or not $\mu_1 - \mu_2 = 0$ is reasonable
  • 29. • Requirements: • Requirements and degrees of freedom (df) are the same as in the C.I. before • Test Statistic: $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
  • 30. Researchers conducted trials to investigate the effects of color on creativity. Subjects with a red background were asked to think of creative uses for a brick; other subjects with a blue background were given the same task. Responses were given by a panel of judges. Researchers make the claim that “blue enhances performance on a creative task”. Test the claim using a 0.01 significance level.
  • 31. Requirement check: 1. The values of the two population standard deviations are unknown and assumed not equal. 2. The subject groups are independent. 3. The samples are simple random samples. 4. Both sample sizes exceed 30. The requirements are all satisfied.
  • 32. The data (background color: sample size, sample mean, sample standard deviation): • Red Background: n = 35, $\bar{x} = 3.39$, s = 0.97 • Blue Background: n = 36, $\bar{x} = 3.97$, s = 0.63
  • 33. Step 1: The claim that “blue enhances performance on a creative task” can be restated as “people with a blue background (group 2) have a higher mean creativity score than those in the group with a red background (group 1)”. This can be expressed as μ1 < μ2. Step 2: If the original claim is false, then μ1 ≥ μ2. Step 3: The hypotheses can be written as: $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 < \mu_2$ OR $H_0: \mu_1 - \mu_2 = 0$, $H_1: \mu_1 - \mu_2 < 0$
  • 34. Step 4: The significance level is α = 0.01. Step 5: Because we have two independent samples and we are testing a claim about two population means, we use a t-distribution. Step 6: Calculate the test statistic. $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(3.39 - 3.97) - 0}{\sqrt{\frac{0.97^2}{35} + \frac{0.63^2}{36}}} = -2.979$
  • 35. Step 6: Because we are using a t-distribution, the critical value of t = –2.441 is found using StatCrunch. We use 34 degrees of freedom.
  • 36. Step 7: Because the test statistic does fall in the critical region, we reject the null hypothesis μ1 – μ2 = 0. P-Value Method: StatCrunch provides a P-value, and the area to the left of the test statistic of t = –2.979 is 0.0021. Since this is less than the significance level of 0.01, we reject the null hypothesis. Conclusion: There is sufficient evidence to support the claim that the red background group has a lower mean creativity score than the blue background group.
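The test statistic for the color creativity example follows directly from the summary statistics. The sketch below uses the conservative degrees of freedom $\min(n_1 - 1, n_2 - 1)$ from the earlier slide and copies the critical value −2.441 from the slides instead of computing it; variable names are my own.

```python
from math import sqrt

# Summary statistics from the slides (group 1 = red, group 2 = blue).
n1, xbar1, s1 = 35, 3.39, 0.97
n2, xbar2, s2 = 36, 3.97, 0.63
t_crit = -2.441  # left-tail critical value for df = 34, alpha = 0.01 (taken from the slide)

# Standard error of the difference, without assuming equal population variances.
se = sqrt(s1**2 / n1 + s2**2 / n2)
t_stat = ((xbar1 - xbar2) - 0) / se   # 0 is the difference assumed under H0
df = min(n1 - 1, n2 - 1)

reject_h0 = t_stat <= t_crit
print(round(t_stat, 3), df, reject_h0)
```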
  • 37. Using the data from this color creativity example, construct a 98% confidence interval estimate for the difference between the mean creativity score for those with a red background and the mean creativity score for those with a blue background.
  • 38. $\bar{x}_1 = 3.39$ and $\bar{x}_2 = 3.97$ $E = t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = 2.441 \sqrt{\frac{0.97^2}{35} + \frac{0.63^2}{36}} = 0.475261$ $(\bar{x}_1 - \bar{x}_2) - E < (\mu_1 - \mu_2) < (\bar{x}_1 - \bar{x}_2) + E$ $-1.06 < (\mu_1 - \mu_2) < -0.10$ Using StatCrunch, the 98% confidence interval obtained is: −1.05 < 𝜇1 − 𝜇2 < −0.11
  • 39. We are 98% confident that the limits –1.05 and –0.11 actually do contain the difference between the two population means. Because those limits do not include 0, our interval suggests that there is a significant difference between the two means.
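The hand calculation of the 98% interval can be checked as follows. This is a sketch using the critical value 2.441 (df = 34) stated on the slides; the result differs slightly from the StatCrunch limits, presumably because software uses a different degrees-of-freedom formula than the conservative $\min(n_1 - 1, n_2 - 1)$.

```python
from math import sqrt

n1, xbar1, s1 = 35, 3.39, 0.97   # red background (group 1)
n2, xbar2, s2 = 36, 3.97, 0.63   # blue background (group 2)
t_crit = 2.441                   # t_{alpha/2} for df = 34, 98% confidence (from the slides)

margin = t_crit * sqrt(s1**2 / n1 + s2**2 / n2)
diff = xbar1 - xbar2
lower, upper = diff - margin, diff + margin

# If 0 is outside the interval, the data suggest the two means differ.
contains_zero = lower < 0 < upper
print(round(margin, 6), round(lower, 3), round(upper, 3), contains_zero)
```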
  • 40. These methods are rarely used in practice because the underlying assumptions are usually not met. 1. The two population standard deviations are both known • the test statistic will be a z instead of a t and use the standard normal model. 2. The two population standard deviations are unknown but assumed to be equal • pool the sample variances
  • 41. The test statistic will be: $z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$ P-values and critical values are found using StatCrunch.
  • 42. $(\bar{x}_1 - \bar{x}_2) - E < (\mu_1 - \mu_2) < (\bar{x}_1 - \bar{x}_2) + E$ where $E = z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
  • 43. The test statistic will be $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}}$ where the pooled sample variance is $s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{(n_1 - 1) + (n_2 - 1)}$ with $df = n_1 + n_2 - 2$
  • 44. $(\bar{x}_1 - \bar{x}_2) - E < (\mu_1 - \mu_2) < (\bar{x}_1 - \bar{x}_2) + E$ where $E = t_{\alpha/2} \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}$ with $df = n_1 + n_2 - 2$
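A minimal sketch of the pooled-variance formulas is shown below. The function name `pooled_t` is hypothetical, and the inputs reuse the color creativity summary statistics purely as an illustration; the slides do not apply the pooled method to those data, and the equal-variance assumption may not hold for them.

```python
from math import sqrt


def pooled_t(n1, xbar1, s1, n2, xbar2, s2, hypothesized_diff=0.0):
    """Return (pooled variance, t statistic, df) assuming equal population variances."""
    df = n1 + n2 - 2
    # Weighted average of the two sample variances, weights = their degrees of freedom.
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    se = sqrt(sp2 / n1 + sp2 / n2)
    t_stat = ((xbar1 - xbar2) - hypothesized_diff) / se
    return sp2, t_stat, df


# Illustration only: summary statistics borrowed from the color creativity example.
sp2, t_stat, df = pooled_t(35, 3.39, 0.97, 36, 3.97, 0.63)
print(round(sp2, 4), round(t_stat, 3), df)
```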
  • 45.
  • 46. Independent Samples (Two Additional Methods) • 𝜎1 and 𝜎2 known: Z Test / Z Interval • 𝜎1 = 𝜎2: Pooled Sample Variance Dependent Samples • When samples are paired, we use a different methodology
  • 47. Main Ideas: • The sample mean is still the best point estimate of the population mean • We can use two dependent sample means to construct a confidence interval that can be used to estimate the true value of the underlying difference in the corresponding population means • We can also test claims about the difference between two population means • In experimental design, using dependent samples is generally better and more practical than assuming two independent samples
  • 48. Notation: • $d$: the individual difference between two values in a single matched pair • $n$: the number of pairs of data • $\mu_d$: the population mean of the differences for all the pairs of data • $\bar{d}$: the sample mean of the differences for the paired sample data • $s_d$: the sample standard deviation of the differences for the paired sample data
  • 49. Requirements • The sample data are dependent • Both samples are SRS • Either 𝑛 > 30 OR the paired differences come from a population that is normal Margin of Error • $E = t_{\alpha/2} \cdot \frac{s_d}{\sqrt{n}}$ with $df = n - 1$ C.I. • $\bar{d} - E < \mu_d < \bar{d} + E$ Notice that we are often interested in whether or not 0 is included within the limits of the confidence interval constructed, i.e., whether or not $\mu_d = 0$ is reasonable
  • 50. Requirements: • Requirements and Degrees of freedom (𝑑𝑓) are the same as in the C.I. above Test Statistic: $t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}}$
  • 51. Use the sample data below with a significance level of 0.05 to test the claim that for the population of heights of presidents and their main opponents, the differences have a mean greater than 0 cm (so presidents tend to be taller than their opponents). • Height (cm) of President: 189, 173, 183, 180, 179 • Height (cm) of Main Opponent: 170, 185, 175, 180, 178 • Difference d: 19, −12, 8, 0, 1
  • 52. Requirement Check: 1. The samples are dependent because the values are paired. 2. The pairs of data are randomly selected. 3. The number of data points is 5, so normality should be checked (and it is assumed the condition is met).
  • 53. Step 1: The claim is that µd > 0 cm. Step 2: If the original claim is not true, we have µd ≤ 0 cm. Step 3: The hypotheses can be written as: $H_0: \mu_d = 0$ cm, $H_1: \mu_d > 0$ cm
  • 54. Step 4: The significance level is α = 0.05. Step 5: We use the Student t-distribution. The summary statistics are: $\bar{d} = 3.2$, $s_d = 11.4$
  • 55. Step 6: Determine the value of the test statistic: $t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} = \frac{3.2 - 0}{11.4 / \sqrt{5}} = 0.628$ with df = 5 – 1 = 4
  • 56. Step 6: Using StatCrunch, the P-value is 0.282. Using the critical value method:
  • 57. Step 7: Because the P-value exceeds 0.05, or because the test statistic does not fall in the critical region, we fail to reject the null hypothesis. Conclusion: There is not sufficient evidence to support the claim that for the population of heights of presidents and their main opponent, the differences have a mean greater than 0 cm. In other words, presidents do not appear to be taller than their opponents.
  • 58. Confidence Interval: Support the conclusions with a 90% confidence interval estimate for µd. $E = t_{\alpha/2} \frac{s_d}{\sqrt{n}} = 2.132 \cdot \frac{11.4}{\sqrt{5}} = 10.8694$ $\bar{d} - E < \mu_d < \bar{d} + E$ $3.2 - 10.8694 < \mu_d < 3.2 + 10.8694$ $-7.7 < \mu_d < 14.1$
  • 59. We have 90% confidence that the limits of –7.7 cm and 14.1 cm contain the true value of the difference in height (president’s height – opponent’s height). See that the interval does contain the value of 0 cm, so it is very possible that the mean of the differences is equal to 0 cm, indicating that there is no significant difference between the heights.
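The whole paired-samples example (differences, test statistic, and 90% interval) can be reproduced from the raw heights. This is a sketch: variable names are mine, and the critical value 2.132 (df = 4) is copied from the slides rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

# Heights (cm) from the slides.
president = [189, 173, 183, 180, 179]
opponent = [170, 185, 175, 180, 178]
t_crit = 2.132  # df = 4: right-tail alpha = 0.05, also t_{alpha/2} for 90% confidence (from the slides)

# Work with the paired differences only; the two original columns are not compared directly.
d = [p - o for p, o in zip(president, opponent)]
n = len(d)
d_bar = mean(d)
s_d = stdev(d)

t_stat = (d_bar - 0) / (s_d / sqrt(n))   # 0 is mu_d under H0
reject_h0 = t_stat >= t_crit             # right-tailed test

margin = t_crit * s_d / sqrt(n)
lower, upper = d_bar - margin, d_bar + margin
print(d, round(d_bar, 1), round(s_d, 1), round(t_stat, 3), reject_h0)
print(round(lower, 1), round(upper, 1))
```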
  • 60. Complete the following: • Practice Problems 5 • Practice Problems 6