Chapter 7 : Inference for Distributions(The t Distributions, One-Sample t Confidence Interval, One-Sample t Test, Matched Pairs t Procedures, Robustness of the t Procedures))
Ăhnlich wie Chapter 7 : Inference for Distributions(The t Distributions, One-Sample t Confidence Interval, One-Sample t Test, Matched Pairs t Procedures, Robustness of the t Procedures))
Ăhnlich wie Chapter 7 : Inference for Distributions(The t Distributions, One-Sample t Confidence Interval, One-Sample t Test, Matched Pairs t Procedures, Robustness of the t Procedures)) (20)
Study on Air-Water & Water-Water Heat Exchange in a Finned ï»żTube Exchanger
Â
Chapter 7 : Inference for Distributions(The t Distributions, One-Sample t Confidence Interval, One-Sample t Test, Matched Pairs t Procedures, Robustness of the t Procedures))
2. Chapter 7
Inference for Distributions
7.1 Inference for the Mean of a Population
7.2 Comparing Two Means
2
3. 7.1 Inference for the Mean of a
Population
3
ï The t Distributions
ï One-Sample t Confidence Interval
ï One-Sample t Test
ï Matched Pairs t Procedures
ï Robustness of the t Procedures
4. 4
Cola manufacturers want to test how much the sweetness of a new cola
drink is affected by storage. The sweetness difference due to storage was
evaluated by 10 professional tasters (by comparing the sweetness before
and after storage):
Taster Sweetness difference (D = Before â After)
1 2.0
2 0.4
3 0.7
4 2.0
5 â0.4
6 2.2
7 â1.3
8 1.2
9 1.1
10 2.3
We want to test if storage results in a loss of
sweetness:
0”:H;0”:H diffadiff0
>=
This looks familiar. However, here we do not
know the population parameter Ï.
This situation is very common with real data.
Example â Sweetening colas
5. 5
When Ï is unknown
The sample s.d. s provides an estimate of the population s.d. Ï.
When the sample size is large,
then s is a good estimate of Ï.
But when the sample size is small,
then s is a poor estimate of Ï.
The sample is likely to contain
elements representative of the
whole population.
The sample contains only a few
individuals.
6. 6
Standard deviation s â standard error s/ân
ï When Ï is not known, we estimate it with the sample standard
deviation s.
ï Then we estimate the standard deviation of by .
ï This quantity is called the standard error of the sample mean and
we denote it by SEM or .
Example: A simple random sample of five female basketball players is
selected. Their heights (in cm) are 170, 175, 169, 183, and 177.
What is the standard error of the mean of these height
measurements?
Solution: Sample mean = 174.8
Sample s.d, s= â32.2
SEM = = â(32.2/5) = 2.538.
7. 7
The t Distributions
Suppose that an SRS of size n is drawn from an N(”,Ï) population.
ï When Ï is known, the sampling distribution is N(ÎŒ, Ï/ân).
ï When Ï is estimated from the sample standard deviation s, the
sampling distribution follows a t-distribution t(ÎŒ, s/ân) with degrees of
freedom n â 1.
8. 8
When comparing the density curves of the standard Normal distribution
and t distributions, several facts are apparent:
ïŒThe density curves of the t distributions
are similar in shape to the standard
Normal curve.
ïŒThe spread of the t distributions is a bit
greater than that of the standard Normal
distribution.
ïŒThe t distributions have more probability
in the tails and less in the center than
does the standard Normal.
ïŒAs the degrees of freedom increase, the t
density curve approaches the standard
Normal curve ever more closely.
We can use Table D to determine critical values t* for t distributions with
different degrees of freedom.
The t Distributions
9. 9
Standardizing the data before using
Table D
As with the normal distribution, the first step is to standardize the data.
Then we can use Table D to obtain the area under the curve.
Here, Ό is the mean (center) of the sampling distribution,
and the standard error of the mean s/ân is its standard deviation (width).
10. 10
Using Table D
Suppose you want to construct a 95% confidence interval for the mean
” of a Normal population based on an SRS of size n = 12.
What critical t* should you use?
In Table D, we consult the row
corresponding to df = n â 1 = 11.
We move across that row to the
entry that is directly above 95%
confidence level.
Upper-tail probability p
df .05 .025 .02 .01
10 1.812 2.228 2.359 2.764
11 1.796 2.201 2.328 2.718
12 1.782 2.179 2.303 2.681
z* 1.645 1.960 2.054 2.326
90% 95% 96% 98%
Confidence level C
11. 11
11
One-Sample t Confidence
Interval
Choose an SRS of size n from a population having unknown mean ”.
A level C confidence interval for ” is:
where t* is the critical value for the t(n â 1) distribution.
The margin of error is
This interval is exact when the population distribution is Normal and
approximately correct for large n in other cases.
The One-Sample t Interval for a Population Mean
x ± t *
sx
n
t *
sx
n
ï The level C confidence interval is an interval with
probability C of containing the true population parameter.
ï C is the area between ât* and t*.
ï We find t* from Table D for df = nâ1 and confidence
level C.
12. 12
Example: Listening to music on cell phones.
On average, U.K. subscribers with 3G phones spent an average of 8.3 hours
per month listening to full-track music on their cell phones.
Suppose we want to determine a 95% confidence interval for the U.S.
average and draw the following random sample of size 8 from the U.S.
population of 3G subscribers: 5 6 0 4 11 9 2 3
.n-.sx 71df;633,d.s.;5mean,sampleHere ====
The standard error is
tâ = 2.365
The 95% CI is
13. 13
The One-Sample t Test
As in the previous chapter, a test of hypotheses requires a few
steps:
1. Stating the null and alternative hypotheses (H0 versus Ha)
2. Calculating t and its degrees of freedom
3. Finding the area under the curve with Table D
4. Stating the P-value and interpreting the result
t =
x â ”0
sx
n
14. 14
The One-Sample t Test (ContâŠ)
The P-value is calculated as the corresponding area under the curve,
one-tailed or two-tailed depending on Ha:
15. 15
The level of dissolved oxygen (DO) in a stream or river is an important
indicator of the waterâs ability to support aquatic life.
A researcher measures the DO level at 15 randomly chosen locations along
a stream. Here are the results in milligrams per liter:
We want to perform a test at the α = 0.05 significance level of:
H0: ” = 5
Ha: ” < 5
where ” is the actual mean dissolved oxygen level in this stream.
Example
4.53 5.04 3.29 5.23 4.13 5.50 4.83 4.40
5.42 6.38 4.01 4.66 2.87 5.73 5.55
A dissolved oxygen level below 5 mg/l puts aquatic life at risk.
16. 16
The P-value is between 0.15 and 0.20, which is
greater than our α = 0.05 significance level,
-ï We fail to reject H0.
-ï We donât have enough evidence to conclude
that the mean DO level in the stream is less than
5 mg/l.
The sample mean and standard deviation are:
P-value The P-value is the area to the left of
t = â0.94 under the t distribution curve with
df = 15 â 1 = 14.
But P(t < â0.94 ) = P(t > 0.94 )
Upper-tail probability p
df .25 .20 .15
13 .694 .870 1.079
14 .692 .868 1.076
15 .691 .866 1.074
50% 60% 70%
Confidence level C
Example (ContâŠ)
H0: ” = 5; Ha: ” < 5
17. Example â Sweetening colas contâŠ)
Is there evidence that storage results in
sweetness loss for the new cola
recipe at the α=5% level of significance?
H0: Ό = 0 versus Ha: Ό > 0 (one-sided test)
There is a significant loss of
sweetness, on average, following
storage.17
18. 18
Matched Pairs t Procedures
Matched Pairs t Procedures
To compare the responses to the two treatments in a matched-pairs
design, find the difference between the responses within each pair. Then
apply the one-sample t procedures to these differences.
The members of one sample are identical to, or matched (paired) with,
the members of the other sample.
Example: Pre-test and post-test studies look at data collected on the
same sample elements before and after some experiment is performed.
19. 19
Matched Pairs t Procedures
Conceptually, this is not different from tests on one population.
20. Example â Sweetening colas (revisited)
The sweetness loss due to storage was evaluated by 10 professional
tasters (comparing the sweetness before and after storage):
Although the text didnât mention it explicitly, this is a pre-/post-test design
and the variable is the difference in cola sweetness before minus after
storage. A matched pairs test of significance is indeed just like a one-
sample test.
20
21. Example: Does lack of caffeine increase depression?
Individuals diagnosed as caffeine dependent
are assigned to receive daily pills.
Sometimes, the pills contain caffeine and
other times they contain a placebo.
Depression was assessed.
For each individual in the sample, we
have calculated a difference in
depression score (placebo minus
caffeine).
Caffeine deprivation causes a significant increase in depression.21