Stat310            Testing


                        Hadley Wickham
Sunday, 19 April 2009
                1. Important question
                2. Recap
                3. More examples/practice
                4. Choosing a cut-off
                5. P value is a random variable too!
                6. Next time


Final

                        Which would you prefer?
                        a) a 3 hour final
                        b) a 2 hour final




Recap

                        What is a null hypothesis? What is an
                        alternative hypothesis?
                        What is the opposite of rejecting the null
                        hypothesis? Why?




Testing jargon

                        No: Null hypothesis. Nothing is
                        happening. (Thing we want to disprove)
                        Yes: Alternative hypothesis. Something
                        interesting is happening.




Absence of
     evidence is not
  evidence of absence

The lady tasting tea
                        A thought experiment by R. A. Fisher
                        (famous early statistician, 1890-1962)
                        A lady at a tea party claims that she can
                        tell the difference between putting the
                        milk in first and second.
                        How can we be sure?



Experiment

                        8 cups. 4 milk first, 4 milk second.
                        Presented in random order.
                        What is the null hypothesis?
                        How many possible outcomes are there?




Your turn

                        What would the distribution of correct
                        responses be under the null hypothesis?
                        How many would she need to get correct
                        for us to be reasonably certain that she
                        really could tell the difference?




                        Right   Wrong   Count   %
                        4       0       1       1%
                        3       1       16      23%
                        2       2       36      51%
                        1       3       16      23%
                        0       4       1       1%
                        Total           70      100%
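
The counts in this table can be reproduced with a short sketch. It assumes the lady knows there are 4 cups of each kind and names 4 cups as milk-first, so that under the null hypothesis (pure guessing) each of the C(8, 4) = 70 possible selections is equally likely. The function names `null_distribution` and `p_at_least` are illustrative, not from the slides.

```python
from math import comb

def null_distribution(cups_per_kind=4):
    """Count selections by number of milk-first cups correctly identified.

    Assumption: she picks exactly `cups_per_kind` cups as milk-first, and
    under the null every such selection is equally likely.
    """
    total = comb(2 * cups_per_kind, cups_per_kind)
    counts = {
        right: comb(cups_per_kind, right) * comb(cups_per_kind, cups_per_kind - right)
        for right in range(cups_per_kind, -1, -1)
    }
    return counts, total

counts, total = null_distribution()

def p_at_least(right):
    """Probability, under guessing, of getting at least `right` cups correct."""
    return sum(c for r, c in counts.items() if r >= right) / total

for right, count in counts.items():
    print(f"{right} right: {count:2d} of {total} ({count / total:.1%})")
```

Under these assumptions only a perfect score is rare under guessing (1 in 70), while three or more correct happens in 17 of the 70 selections.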

Another example

                        Xi ~ iid Normal(μx, 1)
                        Yi ~ iid Normal(μy, 1)
                        Do they have the same means?




1. Write down null and alternative
                   hypotheses
                2. Figure out good test statistic
                   (for this class, usually obvious)
                3. Work out distribution under the null
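
For the two-sample normal example, these three steps might be worked out as follows. This is a sketch under stated assumptions: both samples have the same size n, the two samples are independent, and the second parameter of Normal(μ, 1) is read as the variance. The difference of sample means is one natural test statistic, not necessarily the one used in class.

```latex
\begin{align*}
H_0 &: \mu_x = \mu_y \qquad \text{vs.} \qquad H_1 : \mu_x \neq \mu_y \\
T &= \bar{X} - \bar{Y} \\
\bar{X} &\sim \mathrm{Normal}(\mu_x, 1/n), \qquad \bar{Y} \sim \mathrm{Normal}(\mu_y, 1/n) \\
\text{under } H_0:\quad T &\sim \mathrm{Normal}(0, 2/n), \qquad
Z = \frac{T}{\sqrt{2/n}} \sim \mathrm{Normal}(0, 1)
\end{align*}
```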




Experiment
                        x = 7.0 5.8 2.0 5.0 6.1 5.6 4.3 4.0 4.8 6.5
                        y = 6.2 4.0 5.8 5.9 5.7 6.0 6.2 5.7 5.4 5.8
                        (mean of x = 5.67, mean of y = 5.11)
                        Are the means of the underlying
                        distributions the same?
                        (True answer?)


1. Compute test statistic
                2. Compute p-value, by evaluating F (the cdf of the
                   test statistic under the null) at the test statistic
                3. (Question: what is the distribution of
                   the p-value if the null hypothesis is
                   true?)
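
As an illustration, the first two steps can be applied to the x and y data from the experiment slide. The sketch assumes a known standard deviation of 1 in both groups, a two-sided alternative, and a z statistic based on the difference in sample means; the helper names `two_sample_z` and `two_sided_p` are made up for this example.

```python
from math import sqrt
from statistics import NormalDist, mean

x = [7.0, 5.8, 2.0, 5.0, 6.1, 5.6, 4.3, 4.0, 4.8, 6.5]
y = [6.2, 4.0, 5.8, 5.9, 5.7, 6.0, 6.2, 5.7, 5.4, 5.8]

def two_sample_z(x, y, sigma=1.0):
    """z statistic for the difference in means, assuming known sd `sigma`."""
    standard_error = sigma * sqrt(1 / len(x) + 1 / len(y))
    return (mean(x) - mean(y)) / standard_error

def two_sided_p(z):
    """Two-sided p-value: evaluate the standard normal cdf F at |z|."""
    F = NormalDist().cdf
    return 2 * (1 - F(abs(z)))

z = two_sample_z(x, y)
p = two_sided_p(z)
print(f"z = {z:.3f}, p = {p:.3f}")
```

With these data z is about -1.25 and the two-sided p-value is about 0.21, so under these assumptions the observed difference would not be unusual if the means were equal.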



P-value
                        The p-value gives us the probability, under the
                        null hypothesis, that we would have seen a
                        value equal to or more extreme than the
                        value we observed.
                        Strength of evidence for rejecting the null
                        hypothesis.
                        But we need a cut off to make a yes-no
                        decision. How do we choose that cut off?


Errors
                        What are the possible errors we can
                        make?
                        False positive. Choose alternative when
                        null is correct. (aka Type 1)
                        False negative. Choose null when
                        alternative is true. (aka Type 2)



Terminology
                        Probability of a false positive called α
                        Probability of a false negative called β
                        (power = 1 - β)


                        How are the two related?
                        Usually care more about false positives.
                        Usually pick arbitrary cut-off of what?
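
One way to explore how the two error probabilities are related is to compute the false negative probability of a two-sided z test for several cut-offs. The sketch below assumes a known standard deviation of 1, n = 10 observations per group, and a hypothetical true difference in means of 1; none of these values is stated on this slide.

```python
from math import sqrt
from statistics import NormalDist

def false_negative_prob(alpha, true_diff=1.0, n=10, sigma=1.0):
    """P(fail to reject) for a two-sided z test when the means differ by true_diff."""
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1 - alpha / 2)      # reject when |z| > critical
    shift = true_diff / (sigma * sqrt(2 / n))         # mean of z under the alternative
    power = std_normal.cdf(shift - critical) + std_normal.cdf(-shift - critical)
    return 1 - power

for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}: false negative probability = "
          f"{false_negative_prob(alpha):.3f}")
```

In this setting, making the cut-off stricter (smaller false positive probability) raises the false negative probability, so the two cannot both be made small without more data.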


Testing overview

                        Write down null and alternative
                        hypotheses.
                        Compute test statistic.
                        Convert to p-value.
                        Compare p-value to alpha cut off.
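
The whole procedure can be repeated on simulated data to see how often it rejects. The following sketch assumes normal data with standard deviation 1, n = 10 per group, a two-sided z test, and a cut-off of 0.05; the means 5 and 6, the seed, and the function names are illustrative choices, not values given in the slides.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def p_value(x, y, sigma=1.0):
    """Two-sided p-value for equal means, assuming known sd `sigma`."""
    z = (mean(x) - mean(y)) / (sigma * sqrt(1 / len(x) + 1 / len(y)))
    return 2 * (1 - NormalDist().cdf(abs(z)))

def rejection_rate(mu_x, mu_y, n=10, reps=5000, alpha=0.05, seed=310):
    """Fraction of simulated experiments whose p-value falls below alpha."""
    rng = random.Random(seed)
    rejected = 0
    for _ in range(reps):
        x = [rng.gauss(mu_x, 1.0) for _ in range(n)]
        y = [rng.gauss(mu_y, 1.0) for _ in range(n)]
        if p_value(x, y) < alpha:
            rejected += 1
    return rejected / reps

false_positive_rate = rejection_rate(5.0, 5.0)  # null is true
power = rejection_rate(5.0, 6.0)                # means really differ by 1
print(f"rejected when means are equal: {false_positive_rate:.3f}")
print(f"rejected when means differ by 1: {power:.3f}")
```

When the null is true the rejection rate should sit close to the chosen cut-off, and when the means differ it estimates the power of the test.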



Back to example




[Figure: points labelled x and y; vertical axis from about 4.5 to 6.5, horizontal axis marked 20 to 100; the y points lie mostly above the x points]




[Figure: Difference (vertical axis, 0.0 to 2.0); horizontal axis marked 20 to 100]




[Figure: |Difference| (vertical axis, 0.0 to 2.0); horizontal axis marked 20 to 100]




[Figure: z-score (vertical axis, 0 to 4); horizontal axis marked 20 to 100]




61 rejected

[Figure: vertical axis labelled "yintercept" (0.0 to 0.8); horizontal axis marked 20 to 100]




[Figure: points labelled x and y; vertical axis from about 4.5 to 5.5, horizontal axis marked 20 to 100; the x and y points are intermixed]




Sunday, 19 April 2009
[Figure: Difference; vertical axis −0.5 to 1.0, horizontal axis 20 to 100]

  Can you think of another test-statistic based on this plot?

[Figure: |Difference|; vertical axis 0.0 to 1.2, horizontal axis 20 to 100]

[Figure: z-score; vertical axis 0.0 to 2.5, horizontal axis 20 to 100]

[Figure: "1 rejected"; vertical axis labelled yintercept with ticks 0.0 to 0.8, horizontal axis 20 to 100]

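The plots above appear to follow 100 repeated experiments and count how many of them reject the null. A minimal Python sketch of that kind of simulation is given here under loudly stated assumptions: samples of size 10, known variance 1, a two-sided z-test, a cut-off of α = 0.05, and true means of 5 and 6 for the case where the alternative holds. None of these settings is confirmed by the slides, and `simulate_p_values` is a hypothetical helper name.

```python
import random
from statistics import NormalDist, mean


def simulate_p_values(n_experiments=100, n=10, mu_x=5.0, mu_y=5.0, seed=310):
    """Return two-sided z-test p-values for repeated simulated experiments.

    Assumes both samples are normal with known variance 1 (an assumption
    made for this sketch, matching the Normal(mu, 1) model in the deck).
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    # Standard deviation of (mean of y - mean of x) when both variances are 1.
    se = (2 / n) ** 0.5
    p_values = []
    for _ in range(n_experiments):
        x = [rng.gauss(mu_x, 1) for _ in range(n)]
        y = [rng.gauss(mu_y, 1) for _ in range(n)]
        z = abs(mean(y) - mean(x)) / se
        p_values.append(2 * (1 - std_normal.cdf(z)))
    return p_values


alpha = 0.05  # assumed cut-off, not stated on the slides

p_null = simulate_p_values()           # null true: equal means
p_alt = simulate_p_values(mu_y=6.0)    # alternative true: means differ by 1 (assumed)

rejected_null = sum(p < alpha for p in p_null)
rejected_alt = sum(p < alpha for p in p_alt)
print(f"{rejected_null} rejected under the null, {rejected_alt} under the alternative")
```

When the null is true, about α × 100 = 5 rejections are expected by chance alone, so the p-value behaves like a random variable rather than a fixed verdict. Exact counts depend on the seed.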
The rest of testing
                        For a given situation, need to know a good
                        test-statistic and the distribution under the
                        null.
                        Lots of standard cases, which you can
                        now derive, or look up in a book.
                        In a final, I will either explicitly ask you to
                        derive it, or I’ll give you the test statistic
                        and null distribution.
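
As one illustration of pairing a test statistic with its null distribution, the sketch below applies a two-sided z-test to the x and y values from the earlier "Experiment" slide. It assumes the model stated in the deck (independent normal samples with variance 1); the choice of a two-sided test and the function name `two_sample_z_test` are assumptions made for this example, not something the slides specify.

```python
from statistics import NormalDist, mean

# Data from the "Experiment" slide.
x = [7.0, 5.8, 2.0, 5.0, 6.1, 5.6, 4.3, 4.0, 4.8, 6.5]
y = [6.2, 4.0, 5.8, 5.9, 5.7, 6.0, 6.2, 5.7, 5.4, 5.8]


def two_sample_z_test(x, y, sigma=1.0):
    """Two-sided z-test for equal means with known common sd `sigma`."""
    diff = mean(y) - mean(x)
    # Under the null, diff ~ Normal(0, sigma^2 * (1/n_x + 1/n_y)).
    se = sigma * (1 / len(x) + 1 / len(y)) ** 0.5
    z = diff / se
    # Probability of a value at least as extreme as |z| under the null.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


z, p_value = two_sample_z_test(x, y)
print(f"z = {z:.2f}, p-value = {p_value:.2f}")
```

With a conventional cut-off of 0.05 (an assumption here), this p-value would not lead to rejecting the null, which is not the same as showing the means are equal.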


Next time
                        Graded tests back.
                        Information about the final.
                        (Incl. study session)
                        What you can do with statistics (stat405).
                        Other courses / Majoring in statistics.
                        Celebrate being done.


25 Testing

  • 1. Stat310 Testing Hadley Wickham Sunday, 19 April 2009
  • 2. 1. Import question 2. Recap 3. More examples/practice 4. Choosing a cut-off 5. P value is a random variable too! 6. Next time
  • 3. Final. Which would you prefer? a) a 3 hour final b) a 2 hour final
  • 4. Recap. What is a null hypothesis? What is an alternative hypothesis? What is the opposite of rejecting the null hypothesis? Why?
  • 5. Testing jargon. No: Null hypothesis. Nothing is happening. (Thing we want to disprove) Yes: Alternative hypothesis. Something interesting is happening.
  • 6. Absence of evidence is not evidence of absence
  • 7. The lady tasting tea. A thought experiment by R. A. Fisher (famous early statistician, 1890-1962). A lady at a tea party claims that she can tell the difference between putting the milk in first and second. How can we be sure?
  • 8. Experiment. 8 cups. 4 milk first, 4 milk second. Presented in random order. What is the null hypothesis? How many possible outcomes are there?
  • 9. Your turn. What would the distribution of correct responses be under the null hypothesis? How many would she need to get correct for us to be reasonably certain that she really could tell the difference?
  • 10.
        Right  Wrong    #     %
          4      0      1    1%
          3      1     16   23%
          2      2     36   51%
          1      3     16   23%
          0      4      1    1%
        Total          70  100%
  • 11. Another example. Xi ~ iid Normal(μx, 1), Yi ~ iid Normal(μy, 1). Do they have the same means?
  • 12. 1. Write down null and alternative hypotheses 2. Figure out good test statistic (for this class, usually obvious) 3. Work out distribution under the null
  • 13. Experiment. x = 7.0 5.8 2.0 5.0 6.1 5.6 4.3 4.0 4.8 6.5; y = 6.2 4.0 5.8 5.9 5.7 6.0 6.2 5.7 5.4 5.8 (mean of x = 5.11, mean of y = 5.67). Are the means of the underlying distributions the same? (True answer?)
  • 14. 1. Compute test statistic 2. Compute p-value, by evaluating F at the test-statistic 3. (Question: what is the distribution of the p-value if the null hypothesis is true?)
  • 15. P-value. P value gives us the probability, under the null hypothesis, that we would have seen a value equal to or more extreme than the value we observed. Strength of evidence for rejecting the null hypothesis. But we need a cut off to make a yes-no decision. How do we choose that cut off?
  • 16. Errors. What are the possible errors we can make? False positive. Choose alternative when null is correct. (aka Type 1) False negative. Choose null when alternative is true. (aka Type 2)
  • 17. Terminology. Probability of a false positive called α. Probability of false negative called 1 - β. How are the two related? Usually care more about false positives. Usually pick arbitrary cut-off of what?
  • 18. Testing overview. Write down null and alternative hypotheses. Compute test statistic. Convert to p-value. Compare p-value to alpha cut off.
  • 19. Back to example
  • 20. [Figure: scatter plot of points labelled x and y; vertical axis 4.5 to 6.5, horizontal axis 20 to 100]
  • 21. [Figure: Difference; vertical axis 0.0 to 2.0, horizontal axis 20 to 100]
  • 22. [Figure: |Difference|; vertical axis 0.0 to 2.0, horizontal axis 20 to 100]
  • 23. [Figure: z-score; vertical axis 0 to 4, horizontal axis 20 to 100]
  • 24. [Figure: "61 rejected"; vertical axis labelled yintercept with ticks 0.0 to 0.8, horizontal axis 20 to 100]
  • 26. [Figure: scatter plot of points labelled x and y; vertical axis 4.5 to 5.5, horizontal axis 20 to 100]
  • 27. [Figure: Difference; vertical axis −0.5 to 1.0, horizontal axis 20 to 100] Can you think of another test-statistic based on this plot?
  • 28. [Figure: |Difference|; vertical axis 0.0 to 1.2, horizontal axis 20 to 100]
  • 29. [Figure: z-score; vertical axis 0.0 to 2.5, horizontal axis 20 to 100]
  • 30. [Figure: "1 rejected"; vertical axis labelled yintercept with ticks 0.0 to 0.8, horizontal axis 20 to 100]
  • 31. The rest of testing. For a given situation, need to know a good test-statistic and the distribution under the null. Lots of standard cases, which you can now derive, or look up in a book. In a final, I will either explicitly ask you to derive it, or I’ll give you the test statistic and null distribution.
  • 32. Next time. Graded tests back. Information about the final. (Incl. study session) What you can do with statistics (stat405). Other courses / Majoring in statistics. Celebrate being done.
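
The counts in the slide 10 table above can be reproduced by direct counting. The sketch below assumes the lady knows there are 4 cups of each kind and therefore labels exactly 4 cups as milk-first, so "Right" is the number of milk-first cups she identifies correctly. That reading is an assumption consistent with the 70 total outcomes, not something the slides spell out, and the variable names are illustrative.

```python
from math import comb

CUPS_PER_KIND = 4  # 4 milk-first and 4 milk-second cups

# Number of ways to choose which 4 of the 8 cups are called "milk first".
total = comb(2 * CUPS_PER_KIND, CUPS_PER_KIND)

# Ways to get `right` milk-first cups correct: choose `right` of the true
# milk-first cups and the remaining picks from the milk-second cups.
null_counts = {
    right: comb(CUPS_PER_KIND, right) * comb(CUPS_PER_KIND, CUPS_PER_KIND - right)
    for right in range(CUPS_PER_KIND, -1, -1)
}

print("Right  Wrong    #     %")
for right, ways in null_counts.items():
    wrong = CUPS_PER_KIND - right
    print(f"{right:>5}  {wrong:>5}  {ways:>3}  {ways / total:>4.0%}")

# Probability of getting every cup right by pure guessing.
p_all_right = null_counts[CUPS_PER_KIND] / total
print(f"P(all correct | guessing) = {p_all_right:.3f}")
```

Under this reading, only a perfect score has a guessing probability below a conventional 5% cut-off, since 3 or more correct happens by chance in 17 of 70 outcomes.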