Chapter 11 Testing a Claim
11.1 Significance Tests:	The Basics
The Pizza Problem Let us suppose that a certain pizza company claims that it delivers its pizzas in an average of 20 minutes. We are told “average time,” so it’s possible that they’ve delivered a pizza in 5 minutes, and it’s also possible that they’ve delivered a pizza in 30 minutes. If we order pizza 10 times, what average time will convince you that their claim is wrong? Welcome to significance testing!
How significance testing works (1) Assume that a claim about an average or proportion is true. (2) Compute the average or proportion of a sample. (3) Compare the sample with the sampling distribution for the claim and sample size. (4) If the probability of obtaining a sample average or proportion at least this extreme is too low, we conclude that the claim is improbable, and reject it.
How significance testing works 	In all cases, we are comparing the sample with the sampling distribution for the claim and sample size
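The comparison against a sampling distribution can be sketched with a small simulation. This is only an illustration: the slides do not give a standard deviation for delivery times, so the value sd = 5 minutes, the Normal shape of individual delivery times, and the observed average of 24 minutes are all assumptions made for the example.

```python
import random
import statistics

def simulate_sample_means(claimed_mean, sd, n, trials, seed=0):
    """Simulate many sample means, assuming the company's claim is true."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.gauss(claimed_mean, sd) for _ in range(n)])
        for _ in range(trials)
    ]

# Assumed values (not from the slides): sd = 5 minutes, observed average = 24.
means = simulate_sample_means(claimed_mean=20, sd=5, n=10, trials=50_000)
observed = 24
prob_at_least_observed = sum(m >= observed for m in means) / len(means)
print(f"P(sample mean >= {observed}) is about {prob_at_least_observed:.4f}")
```

Under these assumptions an average of 24 minutes over 10 orders would be rare if the 20-minute claim were true, which is the kind of evidence a significance test formalizes.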
PHANTOMS (a framework) As with Confidence Intervals, there is an acronym to help you remember the steps of a significance test. P: State the Parameter. H: State the Hypothesis pair. A: Check the Assumptions. N: State the Name of the test. T: Find the value of the Test Statistic. O: Obtain a p-value. M: Make a decision. S: Summarize.
State Parameters Parameters work the same way they did in Confidence Intervals. μ = the true average of the Pizza Company’s delivery times. x-bar = the average delivery time for a sample of 10 deliveries from the Pizza Company. p = the proportion of all deliveries from the Pizza Company that are delivered in less than 20 minutes. p-hat = the proportion of a sample of 10 deliveries from the Pizza Company that are delivered in less than 20 minutes.
Stating Hypotheses Hypotheses come in pairs: “the null hypothesis” H0 (“H naught”). This is the presumed claim. For our purposes, our null hypothesis will always be in the forms: “μ = ___” or “p = ___”.
Stating Hypotheses Hypotheses come in pairs: “the alternative hypothesis” Ha. This is the suspicion of the researcher. There are 3 alternative hypotheses that we can test: “μ ≠ ___” (two-sided alternative), “p > ___” (one-sided alternative), “μ < ___” (one-sided alternative).
Stating Hypotheses Notice: Hypotheses are always about the parameter (μ or p, never x-bar or p-hat). Written examples: “H0: μ = 20 minutes, Ha: μ > 20 minutes” and “H0: p = 0.5, Ha: p < 0.5”.
Checking the Assumptions Since we are comparing our samples to a sampling distribution (just like the last chapter), the assumptions are the same We will review them now:
Checking the Assumptions Assumptions for means: SRS. Independence: N > 10n. Normality (a, b, or c must be true): (a) the population is Normal, or (b) n > 30 (Central Limit Theorem), or (c) the sample is approximately Normal: (1) the histogram has a single peak and is symmetric, (2) the Normal probability plot is linear, (3) there are no outliers.
Checking the Assumptions Assumptions for proportions: SRS. Independence: N > 10n. Normality: np > 10 and nq > 10 (where q = 1 - p).
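The checks for proportions are simple arithmetic, so a short sketch may help. The function name `proportion_conditions` is hypothetical, and the values n = 55 and p = 0.22 are borrowed from the proportion example used later in these slides; the SRS condition cannot be checked by code and is left to the reader.

```python
def proportion_conditions(n, p0, population_size=None):
    """Return which of the slide's numeric conditions hold for a proportion test."""
    checks = {
        "np > 10": n * p0 > 10,
        "nq > 10": n * (1 - p0) > 10,
    }
    # The independence check needs the population size N, which is often unknown.
    if population_size is not None:
        checks["N > 10n"] = population_size > 10 * n
    return checks

result = proportion_conditions(n=55, p0=0.22)
print(result)
```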
Name of Test “one-sided z test for means” “two-sided z test for means” “one-sided t test for means” “two-sided t test for means” “one-sided z test for proportions” “two-sided z test for proportions” More on these later
Test Statistics Test statistics are always of the form: test statistic = (sample estimate - hypothesized parameter value) / (standard deviation of the sampling distribution). The standard deviation of the sampling distribution depends on the characteristic tested.
Test Statistics Std Dev for mean (σ known): σ/√n. Std Dev for mean (σ unknown): s/√n. Std Dev for proportions: √(p(1 - p)/n). Notice that we use ‘p’ (the value from H0) and not ‘p-hat’.
P-values The P-value is the probability, computed assuming H0 is true, of obtaining a test statistic at least as extreme as the one observed. At its most basic, computing the P-value is the same as computing an area under a Normal curve or Student’s t-distribution. The computation varies slightly between a 2-sided alternative and a 1-sided alternative.
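As a rough sketch, the three standard deviations and the general test-statistic form can be written as small helpers. The function names are illustrative, not from the slides; the numbers in the usage lines come from the blood pressure and proportion examples in this chapter.

```python
import math

def sd_of_mean(sd, n):
    """Standard deviation of x-bar: sigma/sqrt(n), or s/sqrt(n) when sigma is unknown."""
    return sd / math.sqrt(n)

def sd_of_proportion(p0, n):
    """Standard deviation of p-hat, using the hypothesized p (not p-hat)."""
    return math.sqrt(p0 * (1 - p0) / n)

def standardized_statistic(estimate, hypothesized, sd_of_estimate):
    """(estimate - hypothesized value) / standard deviation of the sampling distribution."""
    return (estimate - hypothesized) / sd_of_estimate

z_mean = standardized_statistic(129.93, 128, sd_of_mean(15, 72))
z_prop = standardized_statistic(0.20, 0.22, sd_of_proportion(0.22, 55))
print(round(z_mean, 2), round(z_prop, 2))
```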
P-values Two sided alternatives For these alt hyps, we calculate a p-val based on area “from two tails”
P-values Example: Let’s assume our sample of 24 has x-bar = 22 and s = 1.53. H0: μ = 20, Ha: μ ≠ 20. “2-sided t-test for means”
P-values Example (cont) Test Statistic: t = (22 - 20) / (1.53/√24) ≈ 6.40, with df = 24 - 1 = 23. P-value: 2 · P(t > 6.40), which is far below 0.001.
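The two-sided t example can be reproduced numerically. The sketch below avoids third-party libraries by integrating the Student's t density with Simpson's rule; the helper names, the integration width, and the step count are choices made for this illustration, and a statistics package would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, width=200.0, steps=20_000):
    """Approximate P(T > t) by Simpson's rule on [t, t + width] (steps must be even)."""
    h = width / steps
    total = t_pdf(t, df) + t_pdf(t + width, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(t + i * h, df)
    return total * h / 3

n, x_bar, s, mu0 = 24, 22, 1.53, 20
t_stat = (x_bar - mu0) / (s / math.sqrt(n))
p_value = 2 * t_upper_tail(abs(t_stat), df=n - 1)  # two tails for a 2-sided alternative
print(f"t = {t_stat:.2f}, two-sided P-value = {p_value:.2e}")
```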
P-values One-sided alternatives: the P-value is the area of the tail indicated by the alternative hypothesis.
P-values One-sided alternatives: If H0: p = 0.53 and Ha: p > 0.53, then P-val = P(z > test stat). If Ha: μ < 10, then P-val = P(t < test stat).
P-values Example Let’s assume: H0: p = 0.22 and Ha: p < 0.22, with p-hat = 0.20 from n = 55. Test statistic: z = (0.20 - 0.22) / √(0.22 · 0.78 / 55) ≈ -0.36.
P-values Example (cont.) P-value: P(z < -0.36) ≈ 0.36. Would you say this is “likely” or “unlikely”?
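A quick check of the one-sided proportion example, using the standard library's `statistics.NormalDist` for the Normal tail area. This is a sketch of the arithmetic only; it does not check the assumptions.

```python
import math
from statistics import NormalDist

p0, p_hat, n = 0.22, 0.20, 55

# The standard deviation uses the hypothesized p, not p-hat.
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Ha: p < 0.22, so the P-value is the left-tail area.
p_value = NormalDist().cdf(z)
print(f"z = {z:.3f}, one-sided P-value = {p_value:.3f}")
```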
Making a decision The P-value serves as the indicator. If the test statistic is likely under the presumed sampling distribution (i.e. the p-value is large), then we have no reason to reject the null hypothesis. If the test statistic is unlikely (i.e. the p-value is small), then we have reason to reject the null hypothesis. “If the p-value is low, reject the Hoe”
Making a decision Significance level (‘alpha’, α): this is the probability level at which we will reject H0. Typical significance levels: α = 0.10, 0.05, 0.01. If no significance level is given, we will generally reject at the α = 0.05 level. When p-val < α, we “reject H0 at the α = __ level.” When H0 is rejected, we say the data are “statistically significant at the α = __ level.”
Making a decision “Reject or Fail to Reject” When p-val > alpha, we “fail to reject H0.” This means that we do not have evidence to show H0 is incorrect. This does not mean H0 is “correct.” When p-val < alpha, we “reject H0.” This means that data this extreme would be unlikely if H0 were true. The new estimate for μ or p is our sample data (x-bar or p-hat).
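The decision rule is a single comparison, sketched below. The function name `decide` and the wording of the returned strings are illustrative; the default of 0.05 mirrors the slides' convention when no significance level is given.

```python
def decide(p_value, alpha=0.05):
    """Compare a P-value with the significance level and report the decision."""
    if p_value < alpha:
        return f"reject H0 at the alpha = {alpha} level"
    # Failing to reject is not the same as showing H0 is correct.
    return f"fail to reject H0 at the alpha = {alpha} level"

print(decide(0.001))
print(decide(0.36))
```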
WOW That was a lot of information! We will be going over this information again at a slower pace in the coming weeks.  We’ll work out the mechanics later Understanding the basics and the “whys” right now will help you in the future!
Assignment 11.1 Page 693 #3, 5, 7-8, 11-14
11.2 Carrying out Significance Tests
z-test for a population mean This is the appropriate test when σ is known. Test Statistic: z = (x-bar - μ0) / (σ/√n), where μ0 is the value of μ stated in H0.
z-test for a population mean P-value: for Ha: μ > μ0, P(Z ≥ z); for Ha: μ < μ0, P(Z ≤ z); for Ha: μ ≠ μ0, 2 · P(Z ≥ |z|).
Example 11.10 The mean systolic blood pressure for males 35 to 44 years is 128, and the standard deviation in this population is 15. The medical records of 72 male executives in this age group find that the mean systolic blood pressure is 129.93. Is this evidence that the mean blood pressure for all the company’s younger male executives is different from the national average?
Example 11.10 We are going to check whether our sample comes from a population with the same μ and σ as the national population. Because of this, our parameter will come from the national averages. The null hypothesis will assume that younger male executives have the same mean blood pressure as the national average. The null hypothesis will always assume “things are equal.”
Example 11.10 Parameter “Let μ = average blood pressure of all younger male executives in the company.” “Let x-bar = average blood pressure in the sample of 72 younger male executives from the company.”
Example 11.10 Hypotheses H0: μ = 128, Ha: μ ≠ 128. Notice that we will need the 2-sided P-value.
Example 11.10 Assumptions Simple Random Sample: “We are not told that our sample is from an SRS. We should check how this sample was chosen. We will proceed as though this sample was an SRS.” Independence: “We are not told the size of the population of young male executives. We should check that the population is greater than 10(72) = 720.” Normality: “Because we have a large sample, the Central Limit Theorem guarantees that the sampling distribution is approximately Normal.”
Example 11.10 Assumptions (cont.) The preceding example illustrates ‘what to do’ if you think that an assumption is not met. If you believe that an assumption is not met: (1) state the condition that must be qualified, (2) mention that it “needs to be checked,” and (3) state you will “proceed as though this assumption was met.” Always try to carry out the significance test.
Example 11.10 Name of the Test “We will conduct a z-test for a population mean”
Example 11.10 Test Statistic: z = (129.93 - 128) / (15/√72) ≈ 1.09.
Example 11.10 P-value: 2 · P(Z ≥ |z|) ≈ 0.27.
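The test statistic and two-sided P-value for Example 11.10 can be checked with the standard library. This sketch uses only the numbers stated in the example.

```python
import math
from statistics import NormalDist

mu0, sigma, n, x_bar = 128, 15, 72, 129.93

z = (x_bar - mu0) / (sigma / math.sqrt(n))

# Ha: mu != 128, so both tails count toward the P-value.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(f"z = {z:.2f}, two-sided P-value = {p_value:.3f}")
```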
Example 11.10 Make a Decision We are not given an α in this example, so we should use the standard 0.05 significance level. The p-value is larger than our α, so we should fail to reject the null hypothesis. Note: nothing needs to be written for this part of PHANTOMS.
Example 11.10 Summarize “Approximately 27% of the time, a sample of size n = 72 will produce an average at least as extreme as 129.93. Since this p-value is larger than a presumed α = 0.05, we cannot reject our null hypothesis. We have no evidence to suggest that the mean systolic blood pressure of young executives is not 128.”
Example 11.10 Summarize (cont.) Note that the summary contains 3 parts: (1) Interpret the p-value: “Approximately 27% of the time, a sample of size n = 72 will produce an average at least as extreme as 129.93.” (2) Compare the p-value with α: “Since this p-value is larger than a presumed α = 0.05, we cannot reject our null hypothesis.” (3) Interpret the conclusion in context: “We have no evidence to suggest that the mean systolic blood pressure of young executives is not 128.”
Tests and Confidence Intervals A two-sided test and a confidence interval lead to the same decision. A two-sided test at significance level α rejects the null hypothesis exactly when the hypothesized parameter value falls outside the confidence interval with confidence level CL = 1 - α. The link between confidence intervals and a two-sided test is called “duality.” Refer to example 11.12.
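Duality can be illustrated with the blood pressure numbers from Example 11.10 (example 11.12 itself is not reproduced in these slides, so it is not used here). The sketch builds the 95% z-interval and checks that the interval-based decision agrees with the P-value-based decision.

```python
import math
from statistics import NormalDist

mu0, sigma, n, x_bar, alpha = 128, 15, 72, 129.93, 0.05
std_normal = NormalDist()
se = sigma / math.sqrt(n)

# Confidence interval with confidence level 1 - alpha.
z_star = std_normal.inv_cdf(1 - alpha / 2)
lower, upper = x_bar - z_star * se, x_bar + z_star * se
reject_by_interval = not (lower <= mu0 <= upper)

# Two-sided test at the same alpha.
z = (x_bar - mu0) / se
p_value = 2 * (1 - std_normal.cdf(abs(z)))
reject_by_test = p_value < alpha

print(f"95% CI: ({lower:.2f}, {upper:.2f}); reject H0? {reject_by_interval}")
```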
Assignment 11.2 Page 709 #27, 29, 31-33
11.3 Use and Abuse of Tests
More on Significance Levels The significance level for a test is informed by the plausibility of H0. If H0 is particularly “strong” or has many years behind it, then the evidence must also be “strong” (small α). If we were trying to disprove the gravitational constant, the α would have to be very, very small!
More on Significance Levels What are the consequences of rejecting H0? There will always be a cost/benefit to rejecting H0. If it is more expensive to reject than it is to fail to reject, then the evidence must be strong (small α). Consider the Toyota brake recall of 2009.
More on Significance Levels There is no “hard line” between reject and fail to reject. There isn’t a real difference between α = 0.10 and α = 0.11. There is no sharp border between “statistically significant” and “statistically insignificant.” As the P-value decreases, the strength of the evidence increases. Although α = 0.05 is a ‘handy rule of thumb,’ it is not a universal rule.
Cautions Don’t forget to examine the data. The presence of outliers can affect whether the significance tests are plausible. “Statistically significant” is not the same thing as “important.” Lack of significance may signal an important conclusion. A test of significance is not appropriate for all data sets.
11.4 Using Inference to Make Decisions
“What if” we made the wrong decision? There are two kinds of wrong decisions: Reject an H0 that was actually true. This is a “TYPE I ERROR.” Fail to reject an H0 that was false. This is a “TYPE II ERROR.” Some students find it helpful to think: “You can reject one hoe, but who can fail to reject two hoes.” Whatever floats your boat, eh?
“What if” we made the wrong decision? TYPE I ERROR The null hypothesis was true! The probability of making this error is the same as α (since H0 was true). You will need to know how to recognize this error in context, and you will need to know the probability of making a Type I error.
“What if” we made the wrong decision? TYPE II ERROR In this case, the null hypothesis was incorrect, but we failed to reject it. The probability of making a Type II error is a “what if” calculation: “What if μ is actually 42? What’s the probability that I fail to reject?” The probability of making a Type II error is known as β.
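That the Type I error probability equals α can be seen in a simulation where H0 is true by construction. This is a sketch under assumptions: individual values are drawn from a Normal population, the blood pressure numbers (μ = 128, σ = 15, n = 72) are reused for concreteness, and the function name is hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean

def type_i_error_rate(mu0, sigma, n, alpha, trials, seed=1):
    """Fraction of two-sided z-tests that reject H0 when H0 is actually true."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    se = sigma / math.sqrt(n)
    rejections = 0
    for _ in range(trials):
        # Sample from a population whose true mean really is mu0.
        x_bar = fmean([rng.gauss(mu0, sigma) for _ in range(n)])
        z = (x_bar - mu0) / se
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
        if p_value < alpha:
            rejections += 1
    return rejections / trials

rate = type_i_error_rate(mu0=128, sigma=15, n=72, alpha=0.05, trials=5000)
print(f"Observed Type I error rate: {rate:.3f}")
```

The observed rate should land near 0.05, up to simulation noise.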
Type II Errors [Figure: sampling distributions. One curve is labeled “This is the alternative sampling distribution. Remember: H0 is (presumed) false”; a region is labeled “This is β.”]
Type II Error β is the area of the tail for the sampling distribution of the “what if” parameter value. H0: μ = 5, x-bar = 5.8, σ = 0.7, n = 40. Calculation of β when μa = 6: since μa > μ0, we need to calculate the left-tail area.
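A sketch of this β calculation follows. The slide does not state the significance level or the exact alternative hypothesis, so the code assumes a one-sided test (Ha: μ > 5) at α = 0.05; with different assumptions the number changes.

```python
import math
from statistics import NormalDist

mu0, mu_a, sigma, n = 5, 6, 0.7, 40
alpha = 0.05  # assumed; not given on the slide

se = sigma / math.sqrt(n)

# Largest x-bar that still fails to reject H0 in an upper-tailed z-test.
critical_x_bar = mu0 + NormalDist().inv_cdf(1 - alpha) * se

# beta = P(fail to reject | mu = mu_a): left-tail area under the alternative distribution.
beta = NormalDist(mu_a, se).cdf(critical_x_bar)
print(f"critical x-bar = {critical_x_bar:.3f}, beta = {beta:.2e}")
```

Because μa = 6 sits many standard deviations above the cutoff, β comes out extremely small under these assumptions.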
Type II Error Mercifully, the AP exam will never ask you to compute β. You will be asked to interpret β. Remember that β is always dependent on an alternative value of the parameter.
Power The probability that the significance test will reject H0 at an α level for an alternative value of the parameter is the power of the test against the alternative. Power = 1 - β. Power is the probability of not making a TYPE II error. Lots of power is a good thing!
How to increase power Increase the significance level (α). Consider an alternative parameter value that is further away from the null hypothesis. Increase the sample size. Decrease σ. All of the above have the effect of decreasing β. Less β = more power.
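The four levers can be compared with a small power function for an upper-tailed z-test. The baseline reuses μ0 = 5, σ = 0.7, and n = 40 from the Type II error slide; the alternative value 5.2, the one-sided form of the test, and the function name are assumptions chosen so that the baseline power is not already close to 1.

```python
import math
from statistics import NormalDist

def power_upper_tail(mu0, mu_a, sigma, n, alpha):
    """Power = 1 - beta for an upper-tailed z-test against the alternative mu_a."""
    se = sigma / math.sqrt(n)
    critical_x_bar = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    beta = NormalDist(mu_a, se).cdf(critical_x_bar)
    return 1 - beta

baseline = power_upper_tail(mu0=5, mu_a=5.2, sigma=0.7, n=40, alpha=0.05)
scenarios = {
    "larger alpha (0.10)": power_upper_tail(5, 5.2, 0.7, 40, 0.10),
    "farther alternative (5.4)": power_upper_tail(5, 5.4, 0.7, 40, 0.05),
    "larger sample (n = 80)": power_upper_tail(5, 5.2, 0.7, 80, 0.05),
    "smaller sigma (0.5)": power_upper_tail(5, 5.2, 0.5, 40, 0.05),
}
print(f"baseline power: {baseline:.3f}")
for label, value in scenarios.items():
    print(f"{label}: {value:.3f}")
```

Each scenario changes exactly one input relative to the baseline, so any gain in power can be attributed to that change alone.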
  • 3. The Pizza Problem Let us suppose that a certain pizza company claims that they deliver their pizza in an average of 20 minutes Now, we are told “average time” so it’s possible that they’ve delivered a pizza in 5 minutes, and it’s also possible that they delivered a pizza in 30 minutes If we order pizza 10 times, what average time will convince you that they’re claim is wrong? Welcome to significance testing!
  • 4. How significance testing works Assume that a claim about an average or proportion is true Compute the average or prop of a sample Compare the sample with the sampling distribution for the claim and sample size. If the probability of obtaining the sample avg or prop is too low, we conclude that our claim is improbable, and reject it.
  • 5. How significance testing works In all cases, we are comparing the sample with the sampling distribution for the claim and sample size
  • 6. PHANTOMS (a framework) As with Confidence Intervals, there is an acronym to help you remember the steps of a significance test State the Parameter State the Hypothesis pair Check the Assumptions State the Name of the test Find the value of the Test Statistic Obtain a p-value Make a decision Summarize
  • 7. State Parameters Parameters work the same way they did in Confidence Intervals = The true average of the Pizza Company’s delivery times x-bar = the average delivery time for a sample of 10 deliveries from the Pizza Company p = the proportion of all deliveries from the Pizza Company that are delivered in less than 20 minutes p-hat = the proportion of sample of 10 deliveries from the Pizza Company that are delivered in less than 20 minutes
  • 8. Stating Hypotheses Hypotheses come in pairs: “the null hypothesis” H0 “H naught” This is the presumed claim For our purposes, our null hypothesis will always be in the forms: “= __” “p = ___”
  • 9. Stating Hypotheses Hypotheses come in pairs: “the alternative hypothesis” Ha This is the suspicion of the researcher There are 3 alt hyps that we can test “ ≠ ___” (two-sided alternative) “p > ___” (one-sided alternative) “ < ___” (one-sided alternative)
  • 10. Stating Hypotheses Notice: Hypotheses are always about the parameter ( or p, neverxbar or phat) Written Examples “H0:  = 20 minutes Ha:  > 20 minutes” “H0: p = 0.5 Ha: p < 0.5”
  • 11. Checking the Assumptions Since we are comparing our samples to a sampling distribution (just like the last chapter), the assumptions are the same We will review them now:
  • 12. Checking the Assumptions Assumptions for mean SRS IndependenceN > 10n Normality (a, b, or c must be true) (a) population is Normal, or (b) n > 30; Central Limit Theorem, or (c) Sample is approximately normal: (1) histogram single peak and symmetric, (2) Normal probability plot is linear, (3) no Outliers
  • 13. Checking the Assumptions Assumptions for proportions SRS IndependenceN > 10n Normality np > 10nq > 10
  • 14. Name of Test “one-sided z test for means” “two-sided z test for means” “one-sided t test for means” “two-sided t test for means” “one-sided z test for proportions” “two-sided z test for proportions” More on these later
  • 15. Test Statistics Test Statistics are always of the form: Standard Deviation of the sampling distribution depends on the characteristic tested
  • 16. Test Statistics Std Dev for mean ( known): Std Dev for mean ( unknown): Std Dev for proportions: Notice that we use ‘p’ and not ‘p-hat’
  • 17. P-values The P-value is the probability of obtaining a measurement as extreme as the test statistic At its most basic, computing the P-Value is the same as computing area from a Normal curve or Student’s t-distribution Computation varies slightly when using 2-sided alternative vs. 1-sided alternative
  • 18. P-values Two sided alternatives For these alt hyps, we calculate a p-val based on area “from two tails”
  • 19. P-values Example: Let’s assume our sample of 24 has:x-bar = 22 and s = 1.53 H0:  = 20Ha:  20 “2-sided t-test for means”
  • 20. P-values Example (cont) Test Statistic:
  • 21. P-values Example (cont) Test Statistic:
  • 22. P-values Example (cont) Test Statistic:
  • 23. P-values Example (cont) Test Statistic:
  • 24. P-values Example (cont) Test Statistic: P-value
  • 25. P-values Example (cont) Test Statistic: P-value
  • 26. P-values Example (cont) Test Statistic: P-value
  • 27. P-values One sided alternatives Calculate the area tail indicated by the alternative hypothesis for P-value
  • 28. P-values One sided alternatives If H0 p = .53 and Ha: p > 0.53then P-val = P(z > test stat) If Ha:  < 10, then P-val = P(t < test stat)
  • 29. P-values Example Let’s assume: H0 p = .22 and Ha : p < 0.22p-hat = 0.20 from n = 55 Test statistic
  • 30. P-values Example Let’s assume: H0 p = .22 and Ha : p < 0.22 p-hat = 0.20 from n = 55 Test statistic
  • 31. P-values Example Let’s assume: H0 p = .22 and Ha : p < 0.22 p-hat = 0.20 from n = 55 Test statistic
  • 32. P-values Example Let’s assume: H0 p = .22 and Ha : p < 0.22 p-hat = 0.20 from n = 55 Test statistic
  • 33. P-values Example (cont.) P-value Would you say this is “likely” or “unlikely”?
  • 34. Making a decision The P-value serves as the indicator If the test statistic is likely under the presumed sampling distribution (i.e. the p-value is large), then we have no reason to reject the null-hypothesis If the test statistic is unlikely (i.e. the p-value is small), then we have reason to reject the null-hypothesis. “If the p-value is low, reject the Hoe”
  • 35. Making a decision Significance level (‘alpha’ ) This is the probability level at which we will reject H0 Typical sig levels  = 0.10, 0.05, 0.01 If no significance level is given, we will generally reject at the  = 0.05 level. When p-val < , then we “reject H0 at the  = __ level” When H0 is rejected, we say the data is “statistically significant at the  = __ level”
  • 36. Making a decision “Reject or Fail to Reject” When p val > alpha, we “fail to reject H0” This means that we do not have evidence to show H0 is incorrect This does not mean, H0 is “correct” When p val < alpha we “reject H0” This means that H0 is unlikely The new estimate for or p is our sample data (x-bar or p-hat)
  • 37. WOW That was a lot of information! We will be going over this information again at a slower pace in the coming weeks. We’ll work out the mechanics later Understanding the basics and the “whys” right now will help you in the future!
  • 38. Assignment 11.1 Page 693 #3, 5, 7-8, 11-14
  • 39. 11.2 Carrying out Significance Tests
  • 40. z-test for a population mean This is the appropriate test when  is known. Test Statistic:
  • 41. z-test for a population mean P-value:
  • 42. Example 11.10 The mean systolic blood pressure for males 35 to 44 years is 128, and the standard deviation in this population is 15. The medical records of 72 male executives in this age group finds the mean systolic blood pressure is 129.93. Is this evidence that the mean blood pressure for all the company’s younger male executives is different than the national average?
  • 43. Example 11.10 We are going to check to see if our sample comes from a population with the same and sigma as the national population. Because of this, our parameter will come from the national averages. The null hypothesis will assume that younger male executives have the same mean blood pressure as the national average. The null hypothesis will always assume “things are equal”
  • 44. Example 11.10 Parameter “Let  = average blood pressure of all younger male executives in the company” “Let x-bar = average blood pressure in the sample of 72 younger male executives from the company”
  • 45. Example 11.10 Hypotheses  = 128   128 Notice that we will need the 2-sided P-value
  • 46. Example 11.10 Assumptions Simple Random Sample “We are not told that our sample is from an SRS. We should check how this sample was chosen. We will proceed as though this sample was an SRS” Independence “We are not told the size the population of young male executives. We should check that the population is greater than 10(72) = 720.” Normality “Because we have a large sample, the Central Limit Theorem guarantees that the sampling distribution is approximately Normal”
  • 47. Example 11.10 Assumptions (cont.) The preceding example illustrates ‘what to do’ if you think that an assumption is not met. If you believe that an assumption is not met: (1) state the condition that must be qualified, (2) mention that it “needs to checked,” and (3) state you will “proceed as though this assumption was met” Always try to carry out the significance test.
  • 48. Example 11.10 Name of the Test “We will conduct a z-test for a population mean”
  • 49. Example 11.10 Test Statistic
  • 53. Example 11.10 Make a Decision We are not given an  in this examplewe should use the standard 0.05 significance level. The p-value is larger than our , so we should reject the null hypothesis Note: nothing needs to be written for this part of PHANTOMS
  • 54. Example 11.10 Summarize “Approximately 27% of the time, a sample of size n =72 will produce an average at least as extreme as 129.93. Since this p-value is larger than a presumed  = 0.05, we cannot reject our null hypothesis. We have no evidence to suggest that the mean systolic blood pressure of young executives is not 128.”
  • 55. Example 11.10 Summarize (cont.) Note that the summary contains 3 parts: (1) Interpret the p-value
  • 56. Example 11.10 Summarize (cont.) Note that the summary contains 3 parts: (1) Interpret the p-value Approximately 27% of the time, a sample of size n = 72 will produce an average at least as extreme as 129.93.
  • 57. Example 11.10 Summarize (cont.) Note that the summary contains 3 parts: (1) Interpret the p-value (2) Compare the p-value with α
  • 58. Example 11.10 Summarize (cont.) Note that the summary contains 3 parts: (1) Interpret the p-value (2) Compare the p-value with α Since this p-value is larger than a presumed α = 0.05, we cannot reject our null hypothesis.
  • 59. Example 11.10 Summarize (cont.) Note that the summary contains 3 parts: (1) Interpret the p-value (2) Compare the p-value with α (3) Interpret the conclusion in context
  • 60. Example 11.10 Summarize (cont.) Note that the summary contains 3 parts: (1) Interpret the p-value (2) Compare the p-value with α (3) Interpret the conclusion in context We have no evidence to suggest that the mean systolic blood pressure of young executives is not 128.
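The test statistic, p-value, and decision steps for Example 11.10 can be sketched in a few lines of Python. The hypothesized mean (128), sample mean (129.93), and sample size (72) come from the example. The population standard deviation is not stated in these slides, so `sigma = 15` below is an assumption chosen only for illustration; it happens to give a p-value near the 27% quoted in the summary.

```python
from math import sqrt
from statistics import NormalDist

# Values stated in Example 11.10
mu_0 = 128.0     # hypothesized mean systolic blood pressure (H0)
x_bar = 129.93   # sample mean
n = 72           # sample size
alpha = 0.05     # presumed significance level

# ASSUMPTION: sigma is not given in this part of the slides; 15 is illustrative
sigma = 15.0

# Test statistic for a z-test for a population mean
z = (x_bar - mu_0) / (sigma / sqrt(n))

# Two-sided p-value: probability of a result at least as extreme in either tail
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Reject H0 only when the p-value is smaller than alpha
reject_h0 = p_value < alpha

print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

Under that assumed sigma, the p-value is well above 0.05, so the sketch fails to reject H0, which matches the summary.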
  • 61. Tests and Confidence Intervals A test with a “two-sided alternative” and the corresponding “confidence interval” lead to the same conclusion. A two-sided test at significance level α will reject the null hypothesis exactly when the hypothesized parameter value is outside the confidence interval with CL = 1 - α. The link between confidence intervals and a two-sided test is called “duality.” Refer to example 11.12
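A minimal sketch of duality, reusing the numbers from Example 11.10 rather than Example 11.12 (which is not reproduced in these slides). As before, `sigma = 15` is an assumed value for illustration only. The sketch builds the 95% confidence interval and checks that "the hypothesized mean is inside the interval" agrees with "the two-sided test fails to reject."

```python
from math import sqrt
from statistics import NormalDist

mu_0, x_bar, n, alpha = 128.0, 129.93, 72, 0.05
sigma = 15.0  # ASSUMPTION: not stated in the slides, illustrative only

se = sigma / sqrt(n)  # standard deviation of the sampling distribution

# Confidence interval with confidence level 1 - alpha
z_star = NormalDist().inv_cdf(1 - alpha / 2)
ci_low, ci_high = x_bar - z_star * se, x_bar + z_star * se
mu_0_in_interval = ci_low <= mu_0 <= ci_high

# Two-sided z-test at the same alpha
p_value = 2 * (1 - NormalDist().cdf(abs((x_bar - mu_0) / se)))
test_rejects = p_value < alpha

print(f"CI = ({ci_low:.2f}, {ci_high:.2f}), mu_0 inside: {mu_0_in_interval}")
print(f"p-value = {p_value:.4f}, test rejects: {test_rejects}")
```

The two checks always agree for a z-interval and a two-sided z-test built from the same sigma, n, and alpha.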
  • 62. Assignment 11.2 Page 709 #27, 29, 31-33
  • 63. 11.3 Use and Abuse of Tests
  • 64. More on Significance Levels The significance level for a test is informed by the plausibility of H0. If H0 is particularly “strong” or has many years behind it, then the evidence must also be “strong” (small α) If we were trying to disprove the gravitational constant, the α would have to be very, very small!
  • 65. More on Significance Levels What are the consequences of rejecting H0? There will always be a cost/benefit to rejecting H0 If it is more expensive to reject than it is to fail to reject, then the evidence must be strong (small α) Consider the Toyota brake recall 2009
  • 66. More on Significance Levels There is no “hard line” between reject and fail to reject There isn’t a real difference between α = 0.10 and α = 0.11 There is no sharp border between “statistically significant” and “statistically insignificant” As P-value decreases, the strength of the evidence increases Although α = 0.05 is a ‘handy rule of thumb,’ it is not a universal rule
  • 67. Cautions Don’t forget to examine the data The presence of outliers can affect whether the significance tests are plausible “Statistically Significant” is not the same thing as “Important” Lack of significance may signal an important conclusion A Test of Significance is not appropriate for all data sets
  • 68. 11.4 Using Inference to Make Decisions
  • 69. “What if” we made the wrong decision? There are two kinds of wrong decisions: Reject a H0 that was actually true This is a “TYPE I ERROR” Fail to reject H0 that was false This is a “TYPE II ERROR” Some students find it helpful to think: “You can reject one hoe, but who can fail to reject two hoes” whatever floats your boat, eh?
  • 70. “What if” we made the wrong decision? TYPE I ERROR The null hypothesis was true! The probability that we made this error will be the same as α (since H0 was true) You will need to know how to recognize this error in context and You will need to know the probability of making a Type I error
  • 71. “What if” we made the wrong decision? TYPE II ERROR In this case, the null hypothesis was incorrect, but we failed to reject it The probability of making a Type II error is a “what if” calculation “What if μ is actually 42? What’s the probability that I fail to reject?” The probability of making a Type II error is known as β
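The claim that the Type I error probability equals α can be checked with a small simulation. This is a sketch under stated assumptions: the function name `type_i_error_rate` is hypothetical, the null value of 20 minutes and sample size of 10 borrow from the pizza example, and `sigma = 5` is invented for illustration. Every sample is drawn from a population where H0 is true, so each rejection is a Type I error.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def type_i_error_rate(mu_0, sigma, n, alpha, trials, seed=0):
    """Fraction of two-sided z-tests that reject H0 when H0 is actually true."""
    rng = random.Random(seed)                      # seeded for repeatability
    z_star = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    se = sigma / sqrt(n)
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where the null hypothesis holds
        sample = [rng.gauss(mu_0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu_0) / se
        if abs(z) > z_star:
            rejections += 1                        # a Type I error
    return rejections / trials

# ASSUMPTION: sigma = 5 minutes is a made-up value for illustration
rate = type_i_error_rate(mu_0=20.0, sigma=5.0, n=10, alpha=0.05, trials=5000)
print(f"Simulated Type I error rate: {rate:.3f}")
```

With enough trials the simulated rate should settle close to the chosen α of 0.05.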
  • 74. Type II Errors This is the alternative sampling distribution. Remember: H0 is (presumed) false
  • 77. Type II Errors This is β
  • 78. Type II Error β is the area of the tail for the sampling distribution of the “what if” parameter value. H0: μ = 5, x-bar = 5.8, σ = 0.7, n = 40. Calculation of β when μa = 6: since μa > μ0, we need to calculate the left tail area.
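The β calculation on this slide can be sketched as follows, using the stated values H0: μ = 5, σ = 0.7, n = 40, and the "what if" value μa = 6. Two things are assumptions here because the slide does not state them: the significance level `alpha = 0.05` and a one-sided alternative Ha: μ > 5. The sketch first finds the cutoff for rejecting H0 under the null distribution, then finds the area of the alternative distribution that falls on the "fail to reject" side of that cutoff.

```python
from math import sqrt
from statistics import NormalDist

mu_0, mu_a, sigma, n = 5.0, 6.0, 0.7, 40

# ASSUMPTIONS: alpha and the one-sided alternative (Ha: mu > 5) are not on the slide
alpha = 0.05

se = sigma / sqrt(n)  # standard deviation of the sampling distribution of x-bar

# Smallest sample mean that leads to rejecting H0 (right tail of the null distribution)
x_crit = mu_0 + NormalDist().inv_cdf(1 - alpha) * se

# beta = left-tail area of the alternative sampling distribution below the cutoff
beta = NormalDist(mu_a, se).cdf(x_crit)

print(f"Reject H0 when x-bar > {x_crit:.3f}")
print(f"beta = P(fail to reject | mu = {mu_a}) = {beta:.2e}")
```

Because μa = 6 is many standard errors above the cutoff, β is extremely small in this setting.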
  • 79. Type II Error Mercifully, the AP exam will never ask you to compute β You will be asked to interpret β Remember that β is always dependent on an alternative value of the parameter μ.
  • 80. Power The probability that the significance test will reject H0 at an α level for an alternative value of the parameter is the power of the test against the alternative. Power = 1 - β Power is the probability of not making a TYPE II error Lots of power is a good thing!
  • 81. How to increase power Increase the significance level (α) Consider an alternative parameter that is further away from the null hypothesis Increase the sample size Decrease σ All the above have the effect of decreasing β. Less β = More power
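The four ways to increase power listed above can be compared numerically. This is a sketch, not a general-purpose routine: the function name `power` is hypothetical, it assumes a one-sided z-test with Ha: μ > μ0 and known σ, and the baseline alternative value of 5.2 is made up so that the baseline power is not already close to 1.

```python
from math import sqrt
from statistics import NormalDist

def power(mu_0, mu_a, sigma, n, alpha):
    """Power of a one-sided z-test (Ha: mu > mu_0) against the alternative mu_a."""
    se = sigma / sqrt(n)
    x_crit = mu_0 + NormalDist().inv_cdf(1 - alpha) * se  # rejection cutoff
    beta = NormalDist(mu_a, se).cdf(x_crit)               # P(Type II error)
    return 1 - beta

# ASSUMPTION: baseline values are illustrative, loosely based on the Type II example
base = power(mu_0=5.0, mu_a=5.2, sigma=0.7, n=40, alpha=0.05)

# Change one factor at a time, as in the list above
bigger_alpha = power(5.0, 5.2, 0.7, 40, 0.10)    # increase the significance level
farther_alt = power(5.0, 5.4, 0.7, 40, 0.05)     # alternative further from H0
bigger_n = power(5.0, 5.2, 0.7, 80, 0.05)        # increase the sample size
smaller_sigma = power(5.0, 5.2, 0.5, 40, 0.05)   # decrease sigma

for label, value in [("baseline", base), ("alpha = 0.10", bigger_alpha),
                     ("mu_a = 5.4", farther_alt), ("n = 80", bigger_n),
                     ("sigma = 0.5", smaller_sigma)]:
    print(f"{label:>12}: power = {value:.3f}")
```

Each of the four changes shrinks β relative to the baseline, so each printed power is larger than the baseline value.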