# Probabilty1.pptx

26 Oct 2022


• 1. Chapter Three: Probability Theory and Probability Distributions
• 2. Probability • Chance of observing a particular outcome • Likelihood of an event • Assumes a “stochastic” or “random” process, i.e., the outcome is not predetermined; there is an element of chance
• 3. • Probability theory developed from the study of games of chance such as dice and cards. • Processes such as flipping a coin, rolling a die, or drawing a card from a deck are probability experiments.
• 4. Why Probability in Statistics? • Results are not certain • To evaluate how accurate our results are: – Given how our data were collected, are our results accurate? – Given the level of accuracy needed, how many observations need to be collected?
• 5. • Probability theory is a foundation for statistical inference, and • It allows us to draw conclusions about a population of patients based on information obtained from a sample of patients drawn from that population.
• 6. More importantly, probability theory is needed to understand: – Probability distributions: Binomial, Poisson, and Normal – Sampling and sampling distributions – Estimation – Hypothesis testing – Advanced statistical analysis
• 7. Two Categories of Probability • Objective and subjective probabilities. • Objective probability comprises 1) classical probability and 2) relative frequency probability.
• 8. Classical Probability • Is based on gambling ideas • Rolling a die: – There are 6 possible outcomes – Set of all outcomes = {1, 2, 3, 4, 5, 6} • Each is equally likely – P(i) = 1/6, i = 1, 2, ..., 6. P(1) = 1/6, P(2) = 1/6, ..., P(6) = 1/6; SUM = 1
• 9. • Definition: If an event can occur in N mutually exclusive and equally likely ways, and if m of these possess a characteristic E, the probability of the occurrence of E is P(E) = m/N. • If we toss a die, what is the probability of 4 coming up? m = 1 (the face 4) and N = 6, so the probability of 4 coming up is 1/6.
• 10. • Another “equally likely” setting is the tossing of a coin: – There are 2 possible outcomes in the set of all possible outcomes {H, T}. P(H) = 0.5, P(T) = 0.5; SUM = 1
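The classical definition P(E) = m/N can be checked by counting outcomes directly. The following is a minimal sketch, assuming equally likely outcomes; the function name `classical_probability` is a hypothetical helper introduced here for illustration and is not part of the slides.

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """Return P(E) = m / N, assuming every outcome is equally likely."""
    # m = number of outcomes in the sample space that belong to the event
    favourable = [outcome for outcome in sample_space if outcome in event]
    return Fraction(len(favourable), len(sample_space))

die = {1, 2, 3, 4, 5, 6}
coin = {"H", "T"}

p_four = classical_probability({4}, die)      # probability of rolling a 4
p_heads = classical_probability({"H"}, coin)  # probability of heads

# The probabilities of all six faces should add up to 1.
p_total = sum(classical_probability({face}, die) for face in die)

print(p_four, p_heads, p_total)
```

Using `Fraction` keeps the results exact, so 1/6 is not replaced by a rounded decimal.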
• 11. Relative Frequency Probability • Definition: The probability that something occurs is the proportion of times it occurs when exactly the same experiment is repeated a very large (preferably infinite) number of times in independent trials. • If a process is repeated a large number of times (n), and if an event with the characteristic E occurs m times, the relative frequency of E is P(E) = m/n.
• 12. • If you toss a coin 100 times and heads comes up 40 times, P(H) = 40/100 = 0.4. • If we toss a coin 10,000 times and heads comes up 5562 times, P(H) = 5562/10,000 = 0.5562. • Therefore, the longer the series and the larger the sample size, the closer the estimate is to the true value.
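The claim that longer series give estimates closer to the true value can be illustrated with a small simulation. This is only a sketch: it assumes a fair coin (true P(H) = 0.5) and uses a seeded pseudo-random generator, so the simulated counts will not match the 40/100 or 5562/10,000 figures quoted above. The function name is hypothetical.

```python
import random

def relative_frequency_of_heads(n_tosses, seed=0):
    """Estimate P(H) as m / n from n simulated tosses of a fair coin."""
    rng = random.Random(seed)  # seeded so that the run is reproducible
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

# The estimate tends to settle near 0.5 as the number of tosses grows.
for n in (100, 10_000, 100_000):
    print(n, relative_frequency_of_heads(n))
```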
• 13. • Since trials cannot be repeated an infinite number of times, theoretical probabilities are often estimated by empirical probabilities based on a finite amount of data. • Example: Of 158 people who attended a dinner party, 99 were ill. P(Illness) = 99/158 = 0.63 = 63%.
• 14. • In 1998, there were 2,500,000 registered live births; of these, 200,000 were low birth weight (LBW) infants. • Therefore, the probability that a newborn is LBW is estimated by P(LBW) = 200,000/2,500,000 = 0.08
• 15. Subjective Probability • Personalistic (represents one’s degree of belief in the occurrence of an event). • Personal assessment of which treatment is more effective in providing a cure: traditional or modern. • Personal assessment of which sports team will win a match. • May also draw on classical and relative frequency methods to assess the likelihood of an event.
• 16. • E.g., if someone says that he is 95% certain that a cure for AIDS will be discovered within 5 years, then he means that P(discovery of cure for AIDS within 5 years) = 95% = 0.95. • Although the subjective view of probability has enjoyed increased attention over the years, it has not been fully accepted by scientists.
• 17. Definitions of some terms commonly encountered in probability • Experiment: In statistics, anything that results in a count or a measurement is called an experiment. • Sample space: The set of all possible outcomes of an experiment, for example {H, T} for a coin toss. • Event: Any subset of the sample space, for example {H} or {T}.
• 18. Mutually Exclusive Events • Two events A and B are mutually exclusive if they cannot both happen at the same time: P(A and B) = 0 • Example: – A coin toss cannot produce heads and tails simultaneously. – The weight of an individual cannot be classified simultaneously as “underweight”, “normal”, and “overweight”.
• 19. Independent Events • Two events A and B are independent if the probability of the first one happening is the same no matter how the second one turns out; that is, the outcome of one event has no effect on the occurrence or non-occurrence of the other. P(A ∩ B) = P(A) × P(B) (independent events)
• 20. Example of independent events • A classic example is n tosses of a coin and the chance that each toss lands heads. These are independent events: the chance of heads on any one toss is independent of the number of previous heads. No matter how many heads have already been observed, the chance of heads on the next toss is ½.
• 21. Intersection and union • The intersection of two events A and B, A ∩ B, is the event that A and B happen simultaneously: P(A and B) = P(A ∩ B) • Let A represent the event that a randomly selected newborn is LBW, and B the event that he or she is from a multiple birth. • The intersection of A and B is the event that the infant is both LBW and from a multiple birth.
• 22. • The union of A and B, A ∪ B, is the event that either A happens or B happens or they both happen simultaneously: P(A or B) = P(A ∪ B) • In the example above, the union of A and B is the event that the newborn is either LBW or from a multiple birth, or both.
• 23. Basic Probability Rules 1. Addition rule • If events A and B are mutually exclusive, P(A and B) = 0 and P(A or B) = P(A) + P(B) • More generally: P(A or B) = P(A) + P(B) − P(A and B), where P(A or B) is the probability that event A or event B occurs, or that they both occur.
• 24. The additive law, when applied to two mutually exclusive events, states that the probability of either of the two events occurring is obtained by adding the probabilities of each event.
• 25. Example 1: A thrown die may show a one or a two, but not both. The probability that it shows a one or a two is Pr(1 or 2) = Pr(1) + Pr(2) = 1/6 + 1/6 = 2/6 = 1/3.
• 26. Extension of the additive law to more than two events: if A, B, C, … are mutually exclusive events, Pr(A or B or C or …) = Pr(A) + Pr(B) + Pr(C) + … When A and B are not mutually exclusive, Pr(A or B) = Pr(A) + Pr(B) − Pr(A and B).
• 27. Example 2: • Of 200 seniors at a certain college, 98 are women, 34 are majoring in Biology, and 20 Biology majors are women. If one student is chosen at random from the senior class, what is the probability that the choice will be either a Biology major or a woman?
• 28. • Pr(Biology major or woman) = Pr(Biology major) + Pr(woman) − Pr(Biology major and woman) = 34/200 + 98/200 − 20/200 = 112/200 = 0.56
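The two forms of the addition rule can be written as one small function, since the mutually exclusive case is just the general rule with P(A and B) = 0. The sketch below reworks Example 1 and Example 2 with exact fractions; the function name `p_union` is a hypothetical helper, not something defined in the slides.

```python
from fractions import Fraction

def p_union(p_a, p_b, p_a_and_b=Fraction(0)):
    """General addition rule: P(A or B) = P(A) + P(B) - P(A and B).

    Leaving p_a_and_b at its default of 0 gives the rule for
    mutually exclusive events.
    """
    return p_a + p_b - p_a_and_b

# Example 1: a die shows a one or a two (mutually exclusive events).
p_one_or_two = p_union(Fraction(1, 6), Fraction(1, 6))

# Example 2: Biology major or woman among 200 seniors (not mutually exclusive).
n_seniors = 200
p_biology = Fraction(34, n_seniors)
p_woman = Fraction(98, n_seniors)
p_both = Fraction(20, n_seniors)
p_biology_or_woman = p_union(p_biology, p_woman, p_both)

print(p_one_or_two, p_biology_or_woman, float(p_biology_or_woman))
```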
• 29. 2. Multiplication rule – If A and B are independent events, then P(A and B) = P(A) × P(B)
• 30. Example: Suppose we toss a coin twice. The tosses are independent, so the probability of two heads is the product of the individual probabilities: Pr(two heads) = (1/2) × (1/2) = 1/4
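One way to see why the multiplication rule gives 1/4 is to list the sample space of two tosses and count. This is a minimal sketch assuming a fair coin, so that all four ordered outcomes are equally likely; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# All ordered outcomes of two tosses: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

# Classical probability of two heads: favourable outcomes / total outcomes.
p_two_heads = Fraction(sum(o == ("H", "H") for o in outcomes), len(outcomes))

# Multiplication rule for independent events: P(H on 1st) x P(H on 2nd).
p_head = Fraction(1, 2)
p_product = p_head * p_head

print(outcomes, p_two_heads, p_product)
```

Both routes give the same value, which is what independence of the two tosses implies.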
• 31. Conditional Probability • Refers to the probability of an event, given that another event is known to have occurred. • “What happened first is assumed.” • Hint: When thinking about conditional probabilities, think in stages. Think of the two events A and B as occurring one after the other, either in time or in space.
• 32. • The conditional probability that event B occurs given that event A has already occurred is denoted P(B|A) and is defined, provided that P(A) ≠ 0, as P(B|A) = P(A ∩ B) / P(A).
• 33. Example • Suppose in country X the chance that an infant lives to age 25 is 0.95, whereas the chance that he lives to age 65 is 0.65. • What is the chance that a person 25 years of age survives to age 65?
• 34.

| Notation | Event | Probability |
| --- | --- | --- |
| A | Survive from birth to age 25 | 0.95 |
| A and B | Survive both from birth to age 25 and from age 25 to 65 | 0.65 |
| B \| A | Survive from age 25 to 65, given survival to age 25 | ? |

• Then Pr(B|A) = Pr(A and B) / Pr(A) = 0.65/0.95 = 0.684. That is, a person aged 25 has a 68.4 percent chance of living to age 65.
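The survival calculation is a direct application of Pr(B|A) = Pr(A and B) / Pr(A). A minimal sketch follows; the function name is hypothetical, and the guard against Pr(A) = 0 reflects the condition under which the conditional probability is defined.

```python
def conditional_probability(p_a_and_b, p_a):
    """Return P(B|A) = P(A and B) / P(A); undefined when P(A) = 0."""
    if p_a == 0:
        raise ValueError("P(B|A) is not defined when P(A) = 0")
    return p_a_and_b / p_a

p_survive_to_25 = 0.95  # Pr(A)
p_survive_to_65 = 0.65  # Pr(A and B): surviving to 65 implies surviving to 25

p_25_to_65 = conditional_probability(p_survive_to_65, p_survive_to_25)
print(round(p_25_to_65, 3))
```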
• 35. Properties of Probability 1. The numerical value of a probability always lies between 0 and 1: 0 ≤ P(E) ≤ 1 • A value of 0 means the event cannot occur • A value of 1 means the event definitely will occur • A value of 0.5 means that the event is as likely to occur as not to occur.
• 36. 2. The sum of the probabilities of all mutually exclusive and exhaustive outcomes is equal to 1: P(E1) + P(E2) + ... + P(En) = 1. 3. For two mutually exclusive events A and B, P(A or B) = P(A) + P(B). If they are not mutually exclusive, P(A or B) = P(A) + P(B) − P(A and B). 4. For two independent events A and B, P(A and B) = P(A) × P(B).
• 37. 5. The complement of an event A, denoted by Ā or Aᶜ, is the event that A does not occur • It consists of all the outcomes in which event A does NOT occur: P(Ā) = P(not A) = 1 − P(A) • Ā occurs only when A does not occur. • A and Ā are complementary events.
• 38. • In the LBW example, the complement of A is the event that a newborn is not LBW • In other words, Ā is the event that the child weighs 2500 grams or more at birth (taking LBW to mean a birth weight below 2500 grams): P(Ā) = 1 − P(A), so P(not low bwt) = 1 − P(low bwt) = 1 − 0.08 = 0.92
• 39. EXERCISE: In a certain area X, the probability of having hookworm infestation is 0.5 and the probability of having schistosomiasis is 0.6. What is the probability of having hookworm or schistosomiasis?
• 40. Random Variables and Probability Distributions
• 41. • A probability distribution is a device used to describe the behavior that a random variable may have by applying the theory of probability. • It describes how the values of the variable are distributed, so that conclusions can be drawn about a set of data. • Random variable: any quantity or characteristic that is able to assume a number of different values, such that any particular outcome is determined by chance.
• 42. • Random variables can be either discrete or continuous • A discrete random variable is able to assume only a finite or countable number of outcomes • A continuous random variable can take on any value in a specified interval
• 43. A probability distribution can therefore be displayed as a table giving the values and their associated probabilities, and/or expressed as a mathematical formula giving the probability of all possible values.
• 44. A. Discrete Probability Distributions • For a discrete random variable, the probability distribution specifies each of the possible outcomes of the random variable along with the probability that each will occur
• 45. • We represent a potential outcome of the random variable X by x: 0 ≤ P(X = x) ≤ 1 and ∑ P(X = x) = 1
• 46. Example 1: The following data show the number of diagnostic services a patient receives
• 47. • What is the probability that a patient receives exactly 3 diagnostic services? P(X = 3) = 0.031 • What is the probability that a patient receives at most one diagnostic service? P(X ≤ 1) = P(X = 0) + P(X = 1) = 0.671 + 0.229 = 0.900
• 48. • What is the probability that a patient receives at least four diagnostic services? P(X ≥ 4) = P(X = 4) + P(X = 5) = 0.010 + 0.006 = 0.016
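The calculations for the diagnostic-services example amount to summing entries of a probability table. The sketch below uses only the probabilities quoted in the worked answers. The value for X = 2 is not quoted anywhere in the text, so it is derived here under the assumption that X takes only the values 0 to 5 and that the probabilities sum to 1; treat that entry as an inference, not as source data.

```python
# Probabilities quoted in the worked answers above.
known = {0: 0.671, 1: 0.229, 3: 0.031, 4: 0.010, 5: 0.006}

# ASSUMPTION: X only takes the values 0..5, so P(X = 2) is whatever is
# left over after the quoted probabilities are subtracted from 1.
p_two = round(1 - sum(known.values()), 3)
pmf = dict(sorted({**known, 2: p_two}.items()))

def prob(pmf, condition):
    """Sum P(X = x) over all values x that satisfy the condition."""
    return sum(p for x, p in pmf.items() if condition(x))

p_exactly_three = prob(pmf, lambda x: x == 3)
p_at_most_one = prob(pmf, lambda x: x <= 1)
p_at_least_four = prob(pmf, lambda x: x >= 4)

print(pmf)
print(p_exactly_three, round(p_at_most_one, 3), round(p_at_least_four, 3))
```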
• 49. Probability distributions can also be displayed using a graph. [Figure: probability P(X = x), on a vertical axis from 0 to 0.8, plotted against the number of diagnostic services, x = 0 to 5]
• 50. Example 2: Toss a coin 2 times. Let X be the number of heads obtained. Find the probability distribution of X, i.e., Pr(X = xi) for xi = 0, 1, 2. • Pr(X = 0) = 1/4 (TT) • Pr(X = 1) = 1/2 (HT, TH) • Pr(X = 2) = 1/4 (HH)
• 51. Probability distribution of X:

| X = xi | 0 | 1 | 2 |
| --- | --- | --- | --- |
| Pr(X = xi) | 1/4 | 1/2 | 1/4 |
• 52. The Expected Value of a Discrete Random Variable • The expected value, denoted by E(X) or μ, represents the “average” value of the random variable in the long run. • It is obtained by multiplying each possible value by its respective probability and summing over all the values.
• 53. $E(X) = \mu = \sum_{i=1}^{n} x_i \, P(X = x_i)$, where the $x_i$ are the values the random variable assumes with positive probability.
• 54. For the probability distribution of X, the number of heads in two tosses of a coin: E(X) = 0(1/4) + 1(1/2) + 2(1/4) = 1. Thus, on average, we would expect one head in every two tosses of a coin.
• 55. The Variance of a Discrete Random Variable • The variance represents the spread of all values that have positive probability relative to the expected value. It is obtained by multiplying the squared distance of each possible value from the expected value by its respective probability and summing over all the values that have positive probability.
• 56. $\sigma^2 = \sum_i (x_i - \mu)^2 \, P(X = x_i) = (0-1)^2 (1/4) + (1-1)^2 (1/2) + (2-1)^2 (1/4) = 0.5$. Standard deviation: $\sigma = \sqrt{0.5} \approx 0.71$
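The expected value and variance of the two-toss distribution can be computed straight from their definitions. This is a minimal sketch with hypothetical function names; the distribution is stored as a dictionary mapping each value x to P(X = x).

```python
from fractions import Fraction
from math import sqrt

# Number of heads in two tosses of a fair coin.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def expected_value(pmf):
    """E(X) = sum of x * P(X = x) over all values x."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) = sum of (x - mu)^2 * P(X = x) over all values x."""
    mu = expected_value(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

mu = expected_value(pmf)
var = variance(pmf)
sd = sqrt(var)  # the standard deviation is the square root of the variance

print(mu, var, round(sd, 2))
```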
• 57. 1. Binomial Distribution • Is the distribution followed by the number of successes in n independent trials when the probability of any single trial being a success is p. • It is one of the most widely encountered discrete probability distributions. • Consider a dichotomous (binary) random variable: – A patient survives or dies – A specimen is positive or negative – A child is vaccinated or not vaccinated
• 58. Example: • We are interested in determining whether a newborn infant will survive until his/her 70th birthday • Let Y represent the survival status of the child at age 70 years • Y = 1 if the child survives and Y = 0 if he/she does not
• 59. • The outcomes are mutually exclusive and exhaustive • Suppose that 72% of infants born survive to age 70 years: P(Y = 1) = p = 0.72 and P(Y = 0) = 1 − p = 0.28
• 61. Binomial assumptions 1. The same experiment is carried out n times (n trials are made). 2. The result of each trial is independent of the result of any other trial. 3. Each trial has exactly two possible outcomes. • These outcomes are usually called “success” and “failure”. • If p is the probability of success in one trial, then 1 − p is the probability of failure.
• 62. • If the binomial assumptions are satisfied, the probability of r successes in n trials is: $P(X = r) = \binom{n}{r} p^{r} (1 - p)^{n - r}$, r = 0, 1, 2, …, n
• 63. • $\binom{n}{r} = {}^{n}C_{r}$ is the number of ways of choosing r items from n. • The general formula for the coefficients is $\binom{n}{r} = \dfrac{n!}{r!\,(n - r)!}$
• 64. • n denotes the fixed number of trials • r denotes the number of successes in the n trials • p denotes the probability of success • q denotes the probability of failure (1 − p)
• 65. Example: • Suppose we know that 40% of a certain population are cigarette smokers. If we take a random sample of 10 people from this population, what is the probability that we will have exactly 4 smokers in our sample?
• 66. • If the probability that any individual in the population is a smoker is p = 0.40, then the probability of r = 4 smokers among n = 10 subjects selected is: $P(X = 4) = {}^{10}C_{4}\,(0.4)^{4}(1 - 0.4)^{10-4} = {}^{10}C_{4}\,(0.4)^{4}(0.6)^{6} = 210\,(0.0256)(0.04666) \approx 0.25$ • The probability of obtaining exactly 4 smokers in the sample is about 0.25.
• 67. • We can compute the probability of observing zero smokers out of 10 subjects selected at random, exactly 1 smoker, and so on, and display the results in a table, as given below. • The third column, P(X ≤ x), gives the cumulative probability. E.g., the probability of selecting 3 or fewer smokers in the sample of 10 subjects is P(X ≤ 3) = 0.3823, or about 38%.
• 68. [Table: number of smokers x, P(X = x), and cumulative probability P(X ≤ x), for x = 0 to 10]
• 69. The probabilities in the above table can be displayed as a graph. [Figure: probability, on a vertical axis from 0 to 0.3, plotted against the number of smokers, 0 to 10]
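The smoker example, including the cumulative column P(X ≤ x), can be reproduced from the binomial formula. The sketch below assumes n = 10 and p = 0.4 as in the example and uses `math.comb` from the Python standard library for the binomial coefficient; the function name `binomial_pmf` is a hypothetical helper.

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) = C(n, r) * p^r * (1 - p)^(n - r)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

n, p = 10, 0.4

# Build rows of (x, P(X = x), P(X <= x)) for x = 0, ..., n.
table = []
cumulative = 0.0
for r in range(n + 1):
    prob = binomial_pmf(r, n, p)
    cumulative += prob
    table.append((r, prob, cumulative))

for r, prob, cum in table:
    print(f"{r:2d}  {prob:.4f}  {cum:.4f}")
```

The row for x = 4 gives the "exactly 4 smokers" probability, and the cumulative entry for x = 3 gives P(X ≤ 3).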
• 70. • If the true proportion of events of interest is p, then in a sample of size n the mean of the binomial distribution is np and the standard deviation is $\sqrt{np(1 - p)}$
• 71. Example: • 70% of a certain population has been immunized against polio. If a sample of size 50 is taken, what is the expected total number in the sample who have been immunized? µ = np = 50(0.70) = 35 • This tells us that, on average, we expect to see 35 immunized subjects in a sample of 50 from this population.
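The mean np and standard deviation √(np(1 − p)) are one-line computations. The sketch below applies them to the immunization example; the slide works out only the mean, so the standard deviation printed here is an additional computed value rather than a figure from the source. The function name is hypothetical.

```python
from math import sqrt

def binomial_mean_sd(n, p):
    """Return (mean, sd) of a binomial distribution: np and sqrt(np(1 - p))."""
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    return mean, sd

# 70% immunized, sample of size 50.
mean, sd = binomial_mean_sd(50, 0.70)
print(round(mean, 2), round(sd, 2))
```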
• 72. Exercise: Suppose that in a certain malarious area, past experience indicates that the probability that a person with a high fever will be positive for malaria is 0.7. Consider 3 randomly selected patients (with high fever) in that same area.
• 73. • What is the probability that no patient will be positive for malaria? • What is the probability that exactly one patient will be positive for malaria? • What is the probability that exactly two of the patients will be positive for malaria?
• 74. • What is the probability that all three patients will be positive for malaria? • Find the mean and the SD of the probability distribution given above.
• 75. 2. The Normal Distribution • The normal distribution (ND) is the most important probability distribution in statistics. • Frequently called the “Gaussian distribution” or the bell-shaped curve. • Variables such as blood pressure, weight, height, serum cholesterol level, and IQ score are approximately normally distributed
• 76. A random variable is said to have a normal distribution if it has a probability distribution that is symmetric and bell-shaped
• 77. The important characteristics of the normal distribution are: 1. It is unimodal, bell-shaped, and symmetrical about x = μ. 2. It is determined by two quantities: its mean (μ) and SD (σ). 3. The total area under the curve above the x-axis is 1 square unit. 4. It is a probability distribution of a continuous variable. It extends from minus infinity (−∞) to plus infinity (+∞).
• 78. • We have different normal distributions depending on the values of μ and σ². • We cannot tabulate every possible distribution • Tabulated normal probability calculations are available only for the ND with µ = 0 and σ² = 1
• 79. Standard Normal Distribution • It is a normal distribution that has a mean equal to 0 and an SD equal to 1, and is denoted by N(0, 1). • The main idea is to standardize the given data by using Z-scores. • These Z-scores can then be used to find the area (and thus the probability) under the normal curve.
• 80. • The standard normal distribution (SND) has mean 0 and variance 1
• 81. • If a random variable X is normally distributed with mean μ and standard deviation σ, we can transform it to the SND with the help of the Z-transformation: $Z = \dfrac{x - \mu}{\sigma}$ • Z represents the Z-score for a given x value
• 82. Area under any normal curve: To find the area under a normal curve (with mean μ and standard deviation σ) between x = a and x = b, find the Z-scores corresponding to a and b (call them Z1 and Z2) and then find the area under the standard normal curve between Z1 and Z2 from the published table.
• 83. Z-scores are important because, given a Z-value, we can find the probability of obtaining a score this large or larger (or this low or lower) by looking up the value in a Z-table.
• 84. Example a) What is the probability that z < −1.96? (1) Sketch a normal curve (2) Draw a perpendicular line at z = −1.96 (3) Find the area in the table (4) The answer is the area to the left of the line: P(z < −1.96) = 0.0250
• 85. b) What is the probability that −1.96 < z < 1.96? The answer is the area between the two values: P(−1.96 < z < 1.96) = 1 − 0.0250 − 0.0250 = 0.9500
• 86. c) What is the probability that z > 1.96? • The answer is the area to the right of the line: P(z > 1.96) = 0.0250 • N.B. From the symmetry of the standard normal distribution, P(Z ≤ −x) = P(Z ≥ x)
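The table look-ups in parts a) to c) can be cross-checked numerically. The sketch below swaps the printed Z-table for the standard normal cumulative distribution function written in terms of the error function, Φ(z) = ½[1 + erf(z/√2)], using `math.erf` from the Python standard library. The function name `phi` is a hypothetical helper.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF: area under the N(0, 1) curve to the left of z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

p_left = phi(-1.96)                 # a) P(z < -1.96)
p_between = phi(1.96) - phi(-1.96)  # b) P(-1.96 < z < 1.96)
p_right = 1 - phi(1.96)             # c) P(z > 1.96)

print(round(p_left, 4), round(p_between, 4), round(p_right, 4))
```

The left and right tail areas agree, which is the symmetry property stated in the note above.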
• 87. Exercise 1. Compute P(−1 ≤ Z ≤ 1.5) 2. Compute P(−1.66 < Z < 2.85) 3. Find the area under the SND for P(0.83 < Z < 1.25) 4. Find the area for Z < 1.96
• 88. Example: The height of adult men in the United Kingdom is approximately normal with mean = 171.5 cm and standard deviation = 6.5 cm. 1. What is the probability that a randomly selected man is taller than 180 cm?
• 89. First find the corresponding Z-score: $Z = \dfrac{x - \mu}{\sigma} = \dfrac{180 - 171.5}{6.5} = 1.31$. P(z > 1.31) = 0.0951, or equivalently 9.51% of adult men are taller than 180 cm.
• 90. 2. What is the probability that a randomly selected man is shorter than 160 cm? $z = \dfrac{160 - 171.5}{6.5} = -1.77$. P(z < −1.77) = 0.0384; thus 3.84% of men are shorter than 160 cm.
• 91. 3. What is the probability that a randomly selected man has a height between 165 cm and 175 cm? The Z-score corresponding to 165 cm is $z = \dfrac{165 - 171.5}{6.5} = -1$; the proportion below this height is 0.1587.
• 92. The Z-score corresponding to 175 cm is $z = \dfrac{175 - 171.5}{6.5} = 0.54$. • The proportion above this height is 0.2946 • Proportion of men with height between 165 cm and 175 cm = 1 − proportion below 165 cm − proportion above 175 cm = 1 − 0.1587 − 0.2946 = 0.5467, or 54.67%.
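The three height questions can be checked by standardizing and evaluating the normal CDF in code. This sketch assumes the stated mean of 171.5 cm and SD of 6.5 cm and uses the error-function form of the CDF in place of a printed table. Because the code does not round the Z-scores to two decimals before the look-up, its answers can differ from the table-based figures in the third or fourth decimal place. The function name is hypothetical.

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    """P(X < x) for X normal with mean mu and SD sigma, via the Z-score."""
    z = (x - mu) / sigma  # Z-transformation
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma = 171.5, 6.5

p_taller_180 = 1 - normal_cdf(180, mu, sigma)
p_shorter_160 = normal_cdf(160, mu, sigma)
p_between_165_175 = normal_cdf(175, mu, sigma) - normal_cdf(165, mu, sigma)

print(round(p_taller_180, 4),
      round(p_shorter_160, 4),
      round(p_between_165_175, 4))
```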
• 93. Exercise • The diastolic blood pressures (DBP) of males 35–44 years of age are normally distributed with µ = 80 mm Hg and σ = 12 mm Hg • Individuals with a DBP above 95 mm Hg are considered to be hypertensive.
• 94. a. What is the probability that a randomly selected male has a DBP above 95 mm Hg? Ans. P(z > 1.25) = 0.1056; approximately 10.6% of this population would be classified as hypertensive. b. What is the probability that a randomly selected male has a DBP above 110 mm Hg? Ans. P(z > 2.50) = 0.0062; approximately 0.6% of the population has a DBP above 110 mm Hg.
• 95. c. What is the probability that a randomly selected male has a DBP below 60 mm Hg? Ans. P(Z < −1.67) = 0.0475; approximately 4.8% of the population has a DBP below 60 mm Hg.
• 96. Other Distributions 1. Student’s t-distribution 2. F-distribution 3. χ²-distribution 4. The Poisson distribution