More Statistics

More Statistics Andrew Martin PS 372 University of Kentucky

Inference Inference refers to reasoning from available information or facts to reach a conclusion. However, there is no guarantee the inference is correct. In fact, inferences are sometimes incorrect.

Inference In statistical inference the estimated values of unknown population parameters are sometimes incorrect. Concerning hypothesis testing, there are two types of mistakes we can make: a Type I error and a Type II error.

Type I Error A Type I error is rejecting a true null hypothesis.

Type II Error A Type II error is failing to reject a false null hypothesis.

What are the chances?

Standard Error Imagine taking an endless number of independent samples of size N from a fixed population that has a mean of μ and a standard deviation of σ. For each sample, you calculate the sample mean $\bar{Y}$ and the sample standard deviation $\hat{\sigma}$.

Standard Error The standard deviation of the sampling distribution is called the standard error of the mean, or simply the standard error: $\hat{\sigma}_{\bar{Y}} = \hat{\sigma} / \sqrt{N}$, where $\hat{\sigma}$ is the sample standard deviation and N is the sample size.

Standard Error The expression implies that as the sample size gets larger and larger, the standard error decreases in numerical value. As a result, as the sample size increases we expect $\bar{Y}$ to get closer and closer to the true value (μ).
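The shrinking standard error can be checked with a small simulation. The sketch below is illustrative only: it assumes a normally distributed population with μ = 5 and σ = 1.23 (values borrowed from the later ideology example, not a claim about the real population), and the function name `simulated_standard_error` is hypothetical. It draws many samples of size N, takes the standard deviation of their means, and compares that to σ/√N.

```python
import math
import random
import statistics


def simulated_standard_error(n, mu=5.0, sigma=1.23, draws=2000, seed=0):
    """Standard deviation of `draws` sample means, each from a sample of size n.

    mu and sigma describe an assumed normal population (illustrative values).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(draws)
    ]
    return statistics.stdev(means)


for n in (25, 100, 400):
    simulated = simulated_standard_error(n)
    formula = 1.23 / math.sqrt(n)  # sigma / sqrt(N)
    print(f"N={n:4d}  simulated={simulated:.3f}  formula={formula:.3f}")
```

Under these assumptions, quadrupling N should roughly halve the standard error, which is the pattern the formula predicts.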

Binomial Distributions Binomial distributions show how probabilities can be used to assess how likely it is that an event will or will not occur in N observations. An event happening or not happening is often described in terms of successes and failures.

Binomial Distribution Coin tosses are a perfect example, because you can specify tossing heads or tossing tails as an event. Sticking with heads as the event, it either happens or fails to happen.

Critical Regions and Values If we have established a critical region such that we will reject the null hypothesis at 0, 1, 9, or 10 heads, then the size of the critical region would be calculated as follows:
$\alpha = p_0 + p_1 + p_9 + p_{10} = .001 + .01 + .01 + .001 = .022$
So we have .022, or just a little more than 2 out of 100, chances of incorrectly rejecting the null hypothesis.
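The size of this critical region can be computed from the binomial formula directly. The following is a minimal sketch for 10 tosses of a fair coin; the helper name `binomial_pmf` is illustrative, not something from the slides.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n_tosses, p_heads = 10, 0.5          # fair coin, as under H0: P = .5
critical_region = (0, 1, 9, 10)      # head counts that lead to rejecting H0

# alpha is the total probability of landing in the critical region under H0
alpha = sum(binomial_pmf(k, n_tosses, p_heads) for k in critical_region)
print(f"alpha = {alpha:.4f}")
```

The exact sum is 22/1024, or about .0215. The .022 on the slide comes from adding probabilities that were already rounded.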

Critical Regions and Values On a practical level, the only way one would reject the null hypothesis (H0: P = .5) is if, in 10 tosses, 0, 1, 9, or 10 came up heads, none of which is likely with a fair coin.

Critical Regions and Values In political science, the size of the critical region is typically referred to as a level. In other words, if α = .05, one would typically say, “The null hypothesis can be rejected at the .05 level.” This level specifies the probability of making a Type I error (rejecting a true null hypothesis). This concept is also known as statistical significance.

Statistical Significance

One- or Two-Sided Tests

What about real-world outcomes?

Types of Distributions

Discrete vs. Continuous Distributions (Kmenta 1986)

Observed Test Statistic

Example of hypothesis testing Example: Someone tells you, “The average American has left the middle of the road and now tends to be somewhat conservative” (H0: μ = 5). You, however, are not so sure. In light of Obama's recent election, you think America is not conservative. You believe the average American is less conservative than that (HA: μ < 5).

Example of hypothesis testing Suppose you and your opponent decide to test these competing claims by examining mean voter ideology from the National Election Study (NES), which uses the following scale:
1 – Extremely liberal
2 – Very liberal
3 – Somewhat liberal
4 – Moderate
5 – Somewhat conservative
6 – Very conservative
7 – Extremely conservative

Example of hypothesis testing
1 – Extremely liberal
2 – Very liberal
3 – Somewhat liberal
4 – Moderate
5 – Somewhat conservative (opponent's claim)
6 – Very conservative
7 – Extremely conservative
H0: μ = 5 – opponent's claim
HA: μ < 5 – your claim (μ lies somewhere below 5 on the scale)

The t distribution

To use a t distribution ... At this point you would collect the sample data, find the sample mean, and compute the observed test statistic (which in this case is a t-score). The calculated t-score for the observations is then compared to a critical value t-score.

To use a t distribution ... If the absolute value of the t-score for the observations is greater than or equal to the critical value t-score, reject H0. Otherwise, do not reject.
If $|t_{obs}| \ge t_{crit}$, reject H0.
If $|t_{obs}| < t_{crit}$, do not reject H0.

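To use a t distribution ... The observed t-score is $t_{obs} = \dfrac{\bar{Y} - \mu}{\hat{\sigma} / \sqrt{N}}$, where $\bar{Y}$ is the sample mean, μ is the hypothesized population mean, $\hat{\sigma}$ is the sample standard deviation, and N is the sample size.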

To use a t distribution ...
1. Sample size: N = 25.
2. Degrees of freedom: N – 1 = 25 – 1 = 24.
3. One-tailed test; α = .05 (level of significance).
4. Look up the row for the degrees of freedom and the column for the level of significance in Appendix B for t distributions on page 576 to get the corresponding critical value.

To use a t distribution ... Now calculate the t-score for the observations. To make the calculation we need the following four pieces of information: the sample mean, the hypothesized population mean, the sample standard deviation, and the sample size.
4.44 is the sample mean
5 is the hypothesized population mean
1.23 is the sample standard deviation
25 is the sample size

To use a t distribution ... The observed t-score is −2.28. The critical value t-score is 1.711. Again, if $|t_{obs}| \ge t_{crit}$, reject H0. Since |−2.28| ≥ 1.711, and the negative sign points in the direction of HA (μ < 5), H0 is rejected.
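The arithmetic behind the observed t-score can be reproduced in a few lines. This is a sketch using the four values from the example; the function name `t_score` is illustrative, and the critical value 1.711 is copied from the slides rather than computed.

```python
import math


def t_score(sample_mean, hypothesized_mean, sample_sd, n):
    """Observed t statistic for a one-sample test of the mean."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


t_obs = t_score(sample_mean=4.44, hypothesized_mean=5.0, sample_sd=1.23, n=25)
t_crit = 1.711  # one-tailed, alpha = .05, 24 degrees of freedom (from the slides)

# HA is mu < 5, so only a sufficiently negative t-score counts against H0
reject_h0 = t_obs <= -t_crit
print(f"t_obs = {t_obs:.2f}, reject H0: {reject_h0}")
```

The standard error here is 1.23 / 5 = .246, and −.56 / .246 rounds to −2.28, matching the value on the slide.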

P-Values The p-value tells you the probability of getting a t statistic at least as extreme as the one actually observed if the null hypothesis is true. In this sample, the p-value is .016. In other words, there is only a 1.6 percent chance of observing a sample mean as low as 4.44 if the population parameter is 5.
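The .016 figure can be approximated without a statistics package. The sketch below is one possible approach, not necessarily how the slides obtained the number: it integrates the Student's t density with Simpson's rule, using 24 degrees of freedom and the sample values from the example. The function names `t_pdf` and `t_cdf` are illustrative.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """P(T <= t), from Simpson's rule on [0, |t|] plus symmetry around 0."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    return 0.5 + math.copysign(area, t)


t_obs = (4.44 - 5.0) / (1.23 / math.sqrt(25))
p_value = t_cdf(t_obs, df=24)  # lower tail, because HA is mu < 5
print(f"p-value = {p_value:.3f}")
```

Because the alternative hypothesis is one-sided (μ < 5), only the lower tail is counted; a two-sided test would double this probability.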

What about large samples? Large samples rely on the standard normal distribution, but how is the test statistic calculated? The test statistic for a normal distribution is known as a z-score, which is the number of standard deviations by which a score deviates from the mean. For example, z = 1.96 means 1.96 standard deviations above the mean.

How is a z-score calculated?

Example in Practice

To recap ....

To use the z-score table ....

Since |−15.21| > 2.58, we can reject the null hypothesis at the .01 level (99 percent confidence). In other words, if the null hypothesis were true, a test statistic this extreme would occur less than 1 percent of the time. Put yet another way, data like these would be very unlikely if the true population parameter for ideology were 5.
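The critical values 1.96 and 2.58 can be recovered from the standard normal distribution rather than read from a table. This is a minimal sketch using Python's standard library; the helper name `two_tailed_critical_z` is illustrative, and the observed z-score of −15.21 is taken from the text above.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def two_tailed_critical_z(alpha):
    """z value that leaves alpha / 2 in each tail of the standard normal."""
    return standard_normal.inv_cdf(1 - alpha / 2)


z_obs = -15.21  # observed z-score from the example
for alpha in (0.05, 0.01):
    z_crit = two_tailed_critical_z(alpha)
    print(f"alpha={alpha}: z_crit={z_crit:.2f}, reject H0: {abs(z_obs) > z_crit}")
```

An observed z-score this far from zero exceeds the critical value at either level, so the conclusion does not depend on choosing .05 or .01.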

So what does this tell us?

Difference between t and z scores
