Stat310            Testing


                        Hadley Wickham
Sunday, 19 April 2009
                1. Important question
                2. Recap
                3. More examples/practice
                4. Choosing a cut-off
                5. P value is a random variable too!
                6. Next time


Final

                        Which would you prefer?
                        a) a 3 hour final
                        b) a 2 hour final




Recap

                        What is a null hypothesis? What is an
                        alternative hypothesis?
                        What is the opposite of rejecting the null
                        hypothesis? Why?




Testing jargon

                        No: Null hypothesis. Nothing is
                        happening. (Thing we want to disprove)
                        Yes: Alternative hypothesis. Something
                        interesting is happening.




Absence of evidence is not evidence of absence

The lady tasting tea
                        A thought experiment by R. A. Fisher
                        (famous early statistician, 1890-1962)
                        A lady at a tea party claims that she can
                        tell the difference between putting the
                        milk in first and second.
                        How can we be sure?



Experiment

                        8 cups. 4 milk first, 4 milk second.
                        Presented in random order.
                        What is the null hypothesis?
                        How many possible outcomes are there?




Your turn

                        What would the distribution of correct
                        responses be under the null hypothesis?
                        How many would she need to get correct
                        for us to be reasonably certain that she
                        really could tell the difference?




                        Right   Wrong    #      %
                          4       0      1     1%
                          3       1     16    23%
                          2       2     36    51%
                          1       3     16    23%
                          0       4      1     1%
                                        70   100%
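
The counts in this table can be reproduced by direct counting. The sketch below assumes the lady knows there are exactly 4 cups of each kind, so she labels exactly 4 cups "milk first", and that "Right" counts the milk-first cups she identifies correctly. The function name `null_distribution` is made up for illustration.

```python
from math import comb

def null_distribution(cups_per_kind=4):
    """Ways (and probabilities) of getting k milk-first cups right by pure guessing."""
    # Total number of ways to choose which cups to call "milk first".
    total = comb(2 * cups_per_kind, cups_per_kind)
    rows = []
    for right in range(cups_per_kind, -1, -1):
        wrong = cups_per_kind - right
        # Pick `right` of the true milk-first cups and `wrong` of the milk-second cups.
        ways = comb(cups_per_kind, right) * comb(cups_per_kind, wrong)
        rows.append((right, wrong, ways, ways / total))
    return total, rows

total, rows = null_distribution()
for right, wrong, ways, prob in rows:
    print(f"{right} right, {wrong} wrong: {ways:2d} ways, {prob:.1%}")
print("total outcomes:", total)
```

Under the null hypothesis of pure guessing, all 4 right happens in only 1 of the 70 equally likely outcomes, which is why a perfect score is fairly convincing.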

Another example

                        Xi ~ iid Normal(μx, 1)
                        Yi ~ iid Normal(μy, 1)
                        Do they have the same means?




                1. Write down null and alternative
                   hypotheses
                2. Figure out good test statistic
                   (for this class, usually obvious)
                3. Work out distribution under the null




Experiment
                        x = 7.0 5.8 2.0 5.0 6.1 5.6 4.3 4.0 4.8 6.5
                        y = 6.2 4.0 5.8 5.9 5.7 6.0 6.2 5.7 5.4 5.8
                        (mean of x = 5.67, mean of y = 5.11)
                        Are the means of the underlying
                        distributions the same?
                        (True answer?)


                1. Compute test statistic
                2. Compute p-value, by evaluating F (the cdf
                   of the test statistic under the null) at
                   the observed test statistic
                3. (Question: what is the distribution of
                   the p-value if the null hypothesis is
                   true?)
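
For the x and y data from the experiment, the first two steps might look like the sketch below. It assumes the model stated earlier (independent normals with known variance 1) and a two-sided alternative (the means differ in either direction), which the slides do not state explicitly. The function name `two_sample_z` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean

x = [7.0, 5.8, 2.0, 5.0, 6.1, 5.6, 4.3, 4.0, 4.8, 6.5]
y = [6.2, 4.0, 5.8, 5.9, 5.7, 6.0, 6.2, 5.7, 5.4, 5.8]

def two_sample_z(x, y, sigma=1.0):
    """z statistic and two-sided p-value for H0: mu_x == mu_y, with known sigma."""
    # Standard deviation of mean(x) - mean(y) when each observation has sd sigma.
    se = sigma * sqrt(1 / len(x) + 1 / len(y))
    z = (mean(x) - mean(y)) / se
    # Under the null, z ~ Normal(0, 1); F is its cdf.
    F = NormalDist().cdf
    # Assumption: two-sided test, so both tails count as "more extreme".
    p_value = 2 * (1 - F(abs(z)))
    return z, p_value

z, p_value = two_sample_z(x, y)
print(f"z = {z:.2f}, p-value = {p_value:.2f}")
```

With these ten observations per group, z comes out at roughly -1.25 and the two-sided p-value at roughly 0.21.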



P-value
                        P value gives us the probability, under the
                        null hypothesis, that we would have seen a
                        value equal to or more extreme than the
                        value we observed.
                        Strength of evidence for rejecting the null
                        hypothesis.
                        But we need a cut off to make a yes-no
                        decision. How do we choose that cut off?


Errors
                        What are the possible errors we can
                        make?
                        False positive. Choose alternative when
                        null is correct. (aka Type 1)
                        False negative. Choose null when
                        alternative is true. (aka Type 2)



Terminology
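                        Probability of a false positive is called α
                        Probability of a false negative is called 1 - β
                        (here β is the power of the test; many texts
                        instead call the false-negative probability β)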


                        How are the two related?
                        Usually care more about false positives.
                        Usually pick arbitrary cut-off of what?
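
One way to see how the two error probabilities are related is to compute the false-negative probability for several values of α. The sketch below does this for a two-sided two-sample z-test. The true difference in means (`delta=1.0`), the sample size (`n=10`) and the function name `false_negative_prob` are assumptions chosen for illustration, not values given in the slides.

```python
from math import sqrt
from statistics import NormalDist

def false_negative_prob(alpha, delta=1.0, n=10, sigma=1.0):
    """P(fail to reject H0) when the true means differ by delta (two-sided z-test)."""
    std_normal = NormalDist()
    # Reject when |z| exceeds this cut-off.
    cutoff = std_normal.inv_cdf(1 - alpha / 2)
    # Under the alternative, z ~ Normal(shift, 1) (sign is irrelevant by symmetry).
    shift = delta / (sigma * sqrt(2 / n))
    # We fail to reject when z lands between -cutoff and +cutoff.
    return std_normal.cdf(cutoff - shift) - std_normal.cdf(-cutoff - shift)

for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}  P(false negative) = {false_negative_prob(alpha):.2f}")
```

Under these assumptions, making α smaller makes false negatives more likely: the two errors trade off against each other for a fixed sample size.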


Testing overview

                        Write down null and alternative
                        hypotheses.
                        Compute test statistic.
                        Convert to p-value.
                        Compare p-value to alpha cut off.
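
Because the data are random, so is the p-value, and so is the yes-no decision. The sketch below repeats the whole procedure on many simulated experiments and counts how often the null is rejected. The means (5 and 6), the cut-off of 0.05, the number of repetitions and the seed are all assumptions chosen for illustration, and `rejection_rate` is a made-up helper name.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def p_value(x, y, sigma=1.0):
    """Two-sided p-value for H0: mu_x == mu_y, with known sigma."""
    z = (mean(x) - mean(y)) / (sigma * sqrt(1 / len(x) + 1 / len(y)))
    return 2 * (1 - NormalDist().cdf(abs(z)))

def rejection_rate(mu_x, mu_y, alpha=0.05, n=10, reps=2000, seed=310):
    """Fraction of simulated experiments in which H0 is rejected at level alpha."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rejected = 0
    for _ in range(reps):
        x = [rng.gauss(mu_x, 1.0) for _ in range(n)]
        y = [rng.gauss(mu_y, 1.0) for _ in range(n)]
        if p_value(x, y) < alpha:
            rejected += 1
    return rejected / reps

same = rejection_rate(5.0, 5.0)    # null true: every rejection is a false positive
differ = rejection_rate(5.0, 6.0)  # null false: every non-rejection is a false negative
print(f"means equal:  rejected {same:.1%}")
print(f"means differ: rejected {differ:.1%}")
```

When the means are equal the rejection rate should sit close to the chosen α, and when they differ it estimates how often the test detects the difference.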



Back to example




[Figure: points labelled x and y plotted against an index running to 100 (ticks at 20, 40, 60, 80, 100); vertical axis from about 4.5 to 6.5. The y points lie mostly above 5.5 and the x points mostly below.]

[Figure: Difference (vertical axis, 0.0 to 2.0) against index (20 to 100).]

[Figure: |Difference| (vertical axis, 0.0 to 2.0) against index (20 to 100).]

[Figure: z-score (vertical axis, 0 to 4) against index (20 to 100).]

61 rejected

[Figure: vertical axis from 0.0 to 0.8 (axis label not recoverable) against index (20 to 100).]

[Figure: points labelled x and y plotted against an index running to 100 (ticks at 20, 40, 60, 80, 100); vertical axis from about 4.5 to 5.5. The x and y points are intermixed.]

[Plot: Difference (vertical axis, −0.5 to 1.0) against 20 to 100 on the horizontal axis]

  Can you think of another test-statistic based on this plot?

[Plot: |Difference| (vertical axis, 0.0 to 1.2) against 20 to 100 on the horizontal axis]

[Plot: z-score (vertical axis, 0.0 to 2.5) against 20 to 100 on the horizontal axis]

[Plot: "1 rejected". Vertical axis labelled "yintercept" (0.0 to 0.8), horizontal axis 20 to 100]

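A minimal Python sketch of the kind of simulation these plots suggest: repeat the two-sample experiment 100 times with equal means, compute the z-score and p-value each time, and count how many p-values fall below the cut-off. The sample size of 10, the 0.05 cut-off, the common mean of 5, the seed and all function names are assumptions for illustration, not taken from the slides. The unit variance follows the Normal(μ, 1) setup of the example.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_p_values(n_experiments=100, n=10, mu_x=5.0, mu_y=5.0, seed=1):
    """Simulate repeated experiments and return one p-value per experiment.

    Assumes both samples are Normal with known variance 1 (an assumption
    carried over from the example setup).
    """
    rng = random.Random(seed)
    se = math.sqrt(1 / n + 1 / n)  # standard error of the difference in means
    p_values = []
    for _ in range(n_experiments):
        x = [rng.gauss(mu_x, 1) for _ in range(n)]
        y = [rng.gauss(mu_y, 1) for _ in range(n)]
        diff = sum(y) / n - sum(x) / n
        p_values.append(two_sided_p(diff / se))
    return p_values


def count_rejected(p_values, alpha=0.05):
    """Number of experiments whose p-value falls below the cut-off alpha."""
    return sum(p < alpha for p in p_values)


if __name__ == "__main__":
    null_p = simulate_p_values()
    print(count_rejected(null_p), "rejected out of", len(null_p))
```

When the null is true, each experiment is rejected with probability α, so about 5 rejections in 100 are expected at a 0.05 cut-off, though any single run will vary.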
The rest of testing
                        For a given situation, need to know a good
                        test-statistic and the distribution under the
                        null.
                        Lots of standard cases, which you can
                        now derive, or look up in a book.
                        In a final, I will either explicitly ask you to
                        derive it, or I’ll give you the test statistic
                        and null distribution.
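
As one worked instance of a standard case, here is a minimal Python sketch of the test for equal means when both samples are Normal with known variance 1. The data are the x and y values from the Experiment slide. The function name and the choice of a two-sided alternative are assumptions for illustration.

```python
import math


def z_test_equal_means(x, y, sigma=1.0):
    """Return (z, two-sided p-value) for H0: mean of x equals mean of y.

    Assumes independent Normal samples with known standard deviation sigma.
    """
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Under H0 the difference in sample means is Normal(0, se^2).
    se = sigma * math.sqrt(1 / len(x) + 1 / len(y))
    z = (mean_y - mean_x) / se
    # Two-sided p-value: P(|Z| >= |z|) for a standard normal Z.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


x = [7.0, 5.8, 2.0, 5.0, 6.1, 5.6, 4.3, 4.0, 4.8, 6.5]
y = [6.2, 4.0, 5.8, 5.9, 5.7, 6.0, 6.2, 5.7, 5.4, 5.8]
z, p = z_test_equal_means(x, y)
print(f"z = {z:.2f}, p = {p:.2f}")
```

With these data the statistic is about 1.25 and the p-value about 0.21, so the null would not be rejected at a 0.05 cut-off.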


Next time
                        Graded tests back.
                        Information about the final.
                        (Incl. study session)
                        What you can do with statistics (stat405).
                        Other courses / Majoring in statistics.
                        Celebrate being done.

