Stat310         Maximum likelihood


                          Hadley Wickham
Sunday, 11 April 2010
                1. Assessment
                2. Feedback
                3. Joint pdf
                4. Maximum likelihood




Assessment
                        • All grading now 100% up to date
                          (as far as I know)
                        • Overall grade to date in owlspace
                          (but doesn’t account for dropping
                          lowest homework)
                        • Quizzes were going to be worth 10%,
                          change to 5%?


So far

                        • 2 / 2 tests * 10% = 20%
                        • 7 / 10 homeworks * 40% = 28%
                        • 3 / 5 quizzes * 5% = 3%
                        • Total: 51% of grade




To come
                        • 1 final * 30% = 30%
                        • 3 / 10 homeworks * 40% = 12%
                        • 2 / 5 quizzes * 5% = 2%
                        • 5% TBA
                        • Total: 49% of grade



Test

                        • Bad news: It was harder
                        • Good news: I’ve figured out why, so it
                          won’t happen on the final




                        [Figure: histogram of Test 2 scores; x-axis T2 (0.0 to 1.0), y-axis count (0 to 14)]

                        [Figure: scatterplot of Test 2 score against Test 1 score, both axes from 0.0 to 1.0; the upper-left region is labelled "Better" and the lower-right region "Worse"]

                        [Figure: histogram of Overall scores; x-axis Overall (10 to 50), y-axis count (0 to 15)]

                        [Figure: histogram of Overall scores; x-axis Overall (10 to 50), y-axis count (0 to 15), with letter-grade cutoffs F, C, B and A marked and the annotation "These are minimums described in the syllabus"]

Options
                        • Do nothing.
                        • Add 3 points on to test. Distribute 5%
                          evenly across all assessment.
                        • 1 hour take home exam worth 5%.
                          2-3 problems from the book.
                        • 1 extra homework worth 5%.
                          4-5 problems from the book.


Homeworks
                        • Due Thursday in class
                        • Out of the goodness of my heart I have
                          been accepting late homeworks
                        • But it is getting excessive - I shouldn’t
                          have to deal with 15 late homeworks a
                          week
                        • Please turn in on time or I will start
                          enforcing the late homework penalty.


Feedback



Feedback about me
                        Doing well: Lectures/teaching (13), engaging/
                        interesting lectures (11), website (10),
                        examples (10), homeworks (8), help sessions
                        (6), pace (4), funny (3), being awesome (2)
                        Needs improvement: test too hard (too many
                        to count), hard to study from ppt (7), more
                        activities (5), fewer mistakes (5), too fast (4),
                        homework session should be a tutorial (3)


Changes
                        My notes are scattered between slides, the
                        board and my voice. Your notes should
                        not be!
                        Will continue to try and find interesting
                        examples and activities.
                        For final review session, will have voting
                        system and I’ll re-cover popular topics on
                        the board.


You
                        Doing well: "Marijuana?", "I’m enjoying the weather"
                        Needs improvement: "Probably read ahead, but who does
                        that anyways", "my grade"


Why do we care
                        about random
                          variables?


Experiments
                        If we capture all the relevant information
                        about an experiment, we can repeat it
                        virtually (either mathematically or
                        computationally). This is usually easier
                        and cheaper than doing the real
                        experiment!
                        The mathematical abstraction we use to
                        do this is the random variable.
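
As a rough illustration of repeating an experiment computationally, the sketch below simulates a made-up experiment with Python's standard `random` module. The experiment itself (counting successes in 10 trials with success probability 0.3), the function name `run_experiment` and the seed are all assumptions chosen for illustration; none of them come from the lecture.

```python
import random


def run_experiment(n_trials, p, rng):
    """Simulate one experiment: count successes in n_trials Bernoulli(p) trials."""
    return sum(1 for _ in range(n_trials) if rng.random() < p)


# Fixed seed so the virtual experiment gives the same results on every run.
rng = random.Random(310)

# Repeating the real experiment 10,000 times would be expensive;
# repeating the virtual one is nearly free.
replicates = [run_experiment(10, 0.3, rng) for _ in range(10_000)]
average = sum(replicates) / len(replicates)

print(average)  # should land close to the theoretical mean 10 * 0.3 = 3
```

Each call to `run_experiment` plays the role of one draw of the random variable, so the list `replicates` behaves like a sample of repeated experiments.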


So
                        The purpose of a random variable is to
                        describe (or at least approximate) the
                        behaviour of an experiment. So:
                        X ~ SomeDist(some params)
                        means we have a single experiment
                        whose behaviour is defined.



Replications
                        X1 ~ SomeDist(some params)
                        X2 ~ SomeDist(some params)
                        Means we repeat the experiment twice - it’s the
                        same distribution, which implies that the
                        experiment is repeated under identical conditions.
                        f(x1, x2) is the bivariate pdf which allows us to
                        figure out the probability of any event involving
                        the two replicates



Replicates
                        Xi ~ SomeDist(some params)
                        i = 1, 2, ..., n
                        Means we repeat the experiment n times.
                        f(x1, x2, ..., xn) is the joint pdf which allows
                        us to figure out the probability of any
                        event involving the n replicates
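
If we also assume the replicates are independent (the slide only says the conditions are identical, so independence is an extra assumption here), the joint pdf factorises into a product of the common marginal pdf, written $f_X$ below to keep it apart from the joint $f$:

```latex
f(x_1, x_2, \ldots, x_n) = f_X(x_1)\, f_X(x_2) \cdots f_X(x_n) = \prod_{i=1}^{n} f_X(x_i)
```

This product form is what makes the likelihood easy to write down for replicated experiments.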


Maximum likelihood



Your turn
                        On Tuesday I was dismayed to find that if
                        Xi ~ Binomial(n, p) then an estimator for p
                        is $\sum_{i}^{n} X_i / n^2$


                        In fact, this estimator is basically correct,
                        but there is a problem with my notation.
                        Can you spot where I went wrong?
                        (everything you need is on this slide)


Formal definition
                        The maximum likelihood estimator is a
                        value of the parameter that maximises the
                        likelihood function with respect to the
                        parameter.

                        $\hat{\theta}_{ML} = \arg\max_{\theta \in \Theta} \, l(\theta; x_1, x_2, \ldots, x_n)$

Steps
                        Write out likelihood (=joint pdf)
                        Write out log-likelihood
                        (Discard constants)
                        Find maximum:
                           Differentiate and set to 0
                           (Check second derivative is negative)
                           (Check end points)


Maximum

                        • Derivative zero
                        • Derivative undefined
                        • At boundary points
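
A standard textbook case of the boundary situation, not taken from the lecture: assume independent replicates $X_i \sim \text{Uniform}(0, \theta)$ with all $x_i \ge 0$, and write $L$ for the likelihood. The derivative of $L$ is never zero where $L$ is positive, so the maximum sits at the edge of that region:

```latex
L(\theta) = \prod_{i=1}^{n} \frac{1}{\theta}\,\mathbf{1}\{x_i \le \theta\}
          = \begin{cases}
              \theta^{-n} & \text{if } \theta \ge \max_i x_i \\
              0           & \text{otherwise}
            \end{cases}
\qquad \Longrightarrow \qquad
\hat{\theta}_{ML} = \max_i x_i
```

Because $\theta^{-n}$ is strictly decreasing in $\theta$, the largest value of $L$ is reached at the smallest admissible $\theta$, which is the largest observation.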




Your turn

                        Xi ~ Poisson(λ) i = 1,..., n
                        Use maximum likelihood to find an
                        estimator for λ
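
One way to work this through, following the usual likelihood, log-likelihood, differentiate recipe. It assumes the replicates are independent and writes $L$ for the likelihood and $\bar{x}$ for the sample mean; both are notational choices made here, not taken from the slides.

```latex
\begin{aligned}
L(\lambda) &= \prod_{i=1}^{n} \frac{e^{-\lambda} \lambda^{x_i}}{x_i!} \\
\log L(\lambda) &= -n\lambda + \Big(\sum_{i=1}^{n} x_i\Big) \log \lambda - \sum_{i=1}^{n} \log(x_i!) \\
\frac{d}{d\lambda} \log L(\lambda) &= -n + \frac{1}{\lambda} \sum_{i=1}^{n} x_i = 0
  \quad \Longrightarrow \quad
  \hat{\lambda}_{ML} = \frac{1}{n} \sum_{i=1}^{n} x_i = \bar{x} \\
\frac{d^2}{d\lambda^2} \log L(\lambda) &= -\frac{1}{\lambda^2} \sum_{i=1}^{n} x_i < 0
  \quad \text{(whenever } \textstyle\sum_i x_i > 0\text{)}
\end{aligned}
```

The last term of the log-likelihood does not involve $\lambda$, so it is one of the constants that can be discarded before differentiating.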




Invariance principle


                        One neat property of maximum likelihood
                        estimators is invariance
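
Stated as a formula, with a small Poisson example. The function $g$ and the choice of $P(X = 0)$ as the quantity of interest are illustrative assumptions; the example uses the standard result that the MLE of a Poisson rate from independent replicates is the sample mean $\bar{x}$.

```latex
\begin{aligned}
&\hat{\theta}_{ML} \text{ is the MLE of } \theta
  \quad \Longrightarrow \quad
  g(\hat{\theta}_{ML}) \text{ is the MLE of } g(\theta) \\[4pt]
&\text{Example: } X_i \sim \text{Poisson}(\lambda), \quad g(\lambda) = P(X = 0) = e^{-\lambda}
  \quad \Longrightarrow \quad
  \widehat{P(X = 0)}_{ML} = e^{-\hat{\lambda}_{ML}} = e^{-\bar{x}}
\end{aligned}
```

So there is no need to redo the maximisation for a transformed parameter: plug the MLE into the transformation.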




What else?
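                        MLEs are (asymptotically, under regularity conditions):
                        Unbiased
                        Minimum variance
                        Normally distributed!
                        $\mathrm{Var}(\hat{\theta}_{ML}) = \dfrac{-1}{E\left[\frac{\partial^2}{\partial \theta^2}\, l(X \mid \theta)\right]}$
                        (with $l$ taken to be the log-likelihood)
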
But

                        That math is too hard for this course :(
                        So we need some other ways to work out
                        how much error our estimators have.




Your turn


                        What is the variance of $\hat{\lambda}_{ML}$?
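
A sketch of one route to the answer, assuming independent Poisson($\lambda$) replicates and using the standard result that the MLE is the sample mean $\bar{X}$, so that only the basic rules for variances are needed:

```latex
\mathrm{Var}(\hat{\lambda}_{ML})
  = \mathrm{Var}\Big(\frac{1}{n} \sum_{i=1}^{n} X_i\Big)
  = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{Var}(X_i)
  = \frac{n\lambda}{n^2}
  = \frac{\lambda}{n}
```

Since $\lambda$ is unknown, in practice it is replaced by its estimate, giving an estimated standard deviation of $\sqrt{\hat{\lambda}_{ML} / n}$.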



Your turn
                        I repeated an experiment defined by
                        Poisson(λ) 10 times, and recorded the
                        following results:
                        6 11 10 6 12 7 8 5 7 10
                        What is the MLE of λ?
                        What is the standard deviation of our
                        estimate?


Answer
                        Mean = 8.2
                        SD = 0.90
                        Can you create an interval around the
                        estimate that ensures that the true value
                        will be inside it 95% of the time?
                        (Use the CLT)
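
A minimal sketch of the calculation for the ten recorded values, including one possible CLT-based interval. It assumes the replicates are independent, plugs the estimate of λ into the variance λ/n, and uses 1.96 as the normal quantile for a two-sided 95% interval; the variable names are made up for this example.

```python
import math

# The ten recorded results from the Poisson experiment.
data = [6, 11, 10, 6, 12, 7, 8, 5, 7, 10]
n = len(data)

# For Poisson replicates the MLE of lambda is the sample mean.
lambda_hat = sum(data) / n

# Var(lambda_hat) = lambda / n; plug in the estimate to get a standard deviation.
sd_hat = math.sqrt(lambda_hat / n)

# By the CLT the estimator is approximately normal, so an approximate
# 95% interval is estimate +/- 1.96 standard deviations.
z = 1.96
lower = lambda_hat - z * sd_hat
upper = lambda_hat + z * sd_hat

print(f"MLE = {lambda_hat:.2f}, SD = {sd_hat:.3f}")
print(f"approximate 95% interval: ({lower:.2f}, {upper:.2f})")
```

The interval is only approximate: with n = 10 the normal approximation is rough, and the standard deviation itself is estimated from the data.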


Reading


                        6.1, 6.1.1



