Chahine Hypothesis Testing

Hypothesis Testing
Foundations I
September 26, 2011

Objectives
 Differentiate three measures of central tendency,
including their advantages and disadvantages
 Explain the rationale of hypothesis testing
 Define the null and alternative hypotheses
 Define and interpret: p value, test statistic, type I and II
error, alpha, beta and statistical power
 Explain how statistical power and sample size are
related and describe other factors influencing power
2011-2012 2

Levels of Measurement
 Categorical (nominal)
 Ordinal
 Interval
 Ratio


Categorical Data
 Non-ordered data
 Often represents different categories: sex, eye
colour, genotypes etc…
 An average would be meaningless
 More meaningful to talk about: different categories,
proportions, percentages or mode
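A minimal Python sketch of the summaries that do make sense for categorical data. The eye-colour values are invented for illustration and are not from the lecture.

```python
from collections import Counter

# Hypothetical eye colours for 10 students (illustrative data only).
eye_colours = ["brown", "blue", "brown", "green", "brown",
               "blue", "hazel", "brown", "blue", "brown"]

counts = Counter(eye_colours)
total = len(eye_colours)

# Proportion of the sample falling in each category.
proportions = {colour: n / total for colour, n in counts.items()}

# The mode is simply the most frequent category.
mode_colour, mode_count = counts.most_common(1)[0]

print(proportions)
print("Mode:", mode_colour)
```

Note that nothing here adds or averages the categories themselves; only the counts are used.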


Ordinal Data
 Ordered data
 The distance between the data points may vary
 E.g., Placement in a race, perceived level of pain, or
depression scale
 7 is greater than 5, and 5 is greater than 3, but the difference
between 7 & 5 may not be the same as the difference between 5 & 3
 An average is not meaningful here; the middle value (median)
may be more meaningful and more consistent

Interval Data
 Very similar to ordinal data, but the differences are
consistent
 E.g., Temperature in Celsius or Fahrenheit
 Difference between 20 and 30 is the same as the difference
between 40 and 50
 Well-designed rating scales can gather interval data
 Important to note that 0 is arbitrary in interval data: it does
not mean "none" of the quantity
 An average (mean) is meaningful unless the data are skewed


Ratio Data
 Very similar to interval data except 0 is meaningful
 E.g., Growth of bacteria, or height and weight of babies
 Someone can be twice as tall as another person; however, we
cannot say something is twice as hot or cold unless it is
measured in Kelvin (in Kelvin, a temperature of 0 is
meaningful)
 The average is very useful, and many statistical procedures for
ratio data are based on means; however, if the data are skewed,
the median is more useful
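To illustrate why the median can be more useful for skewed data, here is a small sketch. The lengths of stay are hypothetical numbers chosen so that one extreme value skews the sample.

```python
import statistics

# Hypothetical hospital stays in days (illustrative data only).
# Ratio data: 0 days is a meaningful zero. The last value is an outlier.
stays = [1, 2, 2, 3, 3, 4, 30]

mean_stay = statistics.mean(stays)
median_stay = statistics.median(stays)

# The single long stay pulls the mean upward; the median barely notices it.
print(f"mean = {mean_stay:.2f} days, median = {median_stay} days")
```

Here the mean is larger than six of the seven observations, so the median is arguably the better description of a typical stay.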


Central Tendency
 If you wanted to describe a population or a group
of people using one or two numbers you could say:
• On average, students in this class scored about 75% on
the last exam…
• In this class, the most frequent eye colour is….
• In a small sub-sample of 10 students, the middle score on
the exam was….


Mean, Median & Mode
 Depending on the type and quality of your data,
either mean, median, or mode may be more
suitable for describing the typical value (central
tendency) of your data
 Statistical analyses are built on different summaries
of the data: t-tests and Analysis of Variance compare
means, while Chi-Square Analysis works with category counts

Descriptive vs. Inferential Statistics
 Descriptive statistics describe the sample or
population, usually by providing values such as range,
maximum, minimum, central tendency and variance
(the average squared deviation from the mean)
 Inferential statistics are often used when you do
not have access to the entire population and want
to make an inference about this population from a sample

A Conjecture…..
 After doing a great deal of reading, the dean of a
well-known US medical school believed that, in
general, the students in medical programs have an
average IQ of 135
 This is a conjecture about an entire population of
undergraduate medical students


Hypothesis Testing: Step 1
 We can test the dean’s conjecture…

Null Hypothesis - H0: µ = 135
Alternative Hypothesis - HA: µ ≠ 135

We test the conjecture by making it the null
hypothesis

Role of Software
 Computer programs such as SPSS, SAS, R and Stata
have built-in algorithms to carry out what you
might do by hand
 It is important to do this by hand at first, to
understand what it means to reject, or fail to reject,
the null hypothesis

 Because we are making a prediction about a population from a
sample, the result is never exact.
 We need to select a criterion, or significance level, by which
we either reject or fail to reject the null hypothesis.
 Most often the significance level is set at .05
 The significance level is also called α (alpha); the p-value
computed from the sample is compared against it

At what point is the difference between the sample mean
and 135 too large to be attributed to chance?


- We sample 10 students, giving 10 - 1 = 9 degrees of freedom
- Area of acceptance is 95% (α = .05, split between two tails)
- Look up the critical values on a
t-score table (±2.262)
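As a cross-check on the table lookup, the critical value can be approximated numerically. This is only a sketch using the Python standard library. The helper names (`t_pdf`, `t_cdf`, `critical_t`) and the choice of Simpson's rule plus bisection are illustrative assumptions, not part of the lecture; statistical packages provide this lookup directly.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, integrating the density on [0, x] with Simpson's rule."""
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # The distribution is symmetric, so half the probability lies below 0.
    return 0.5 + total * h / 3

def critical_t(alpha, df):
    """Two-tailed critical value: the t that leaves alpha / 2 in each tail."""
    target = 1 - alpha / 2
    lo, hi = 0.0, 50.0
    for _ in range(60):  # bisection on the monotone CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

crit = critical_t(alpha=0.05, df=9)
print(round(crit, 3))  # approximately 2.262, matching the table value
```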


 We need to randomly draw a sample of 10 students

115, 140, 133, 125, 120, 126, 136, 124, 132, 129

Mean = 128


 We need to calculate Standard Deviation (SD) &
Standard Error (SE)

 How many of you have heard of standard
deviation before?
 How many of you know what it means?


IQ Score    Mean    Deviation (Mean - Score)    Squared Deviation
115         128      13                         169
140         128     -12                         144
133         128      -5                          25
125         128       3                           9
120         128       8                          64
126         128       2                           4
136         128      -8                          64
124         128       4                          16
132         128      -4                          16
129         128      -1                           1
Sum                   0                         512

Sample Variance       512 / (10 - 1) = 56.88889
Standard Deviation    √56.88889 = 7.542472
Standard Error        7.542472 / √10 = 2.385139

Before SD we need to understand variance: the sum of squared
deviations divided by n - 1
Standard Deviation: can be thought of as an average deviation
from the mean (the square root of the variance)
Standard Error: an estimate of the SD of the sample mean, used in
calculating the t-statistic
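The hand calculation in the table can be reproduced with a few lines of Python. This sketch uses the ten IQ scores from the slides; the variable names are illustrative.

```python
import math
import statistics

# The ten sampled IQ scores from the slides.
iq_scores = [115, 140, 133, 125, 120, 126, 136, 124, 132, 129]
n = len(iq_scores)

mean = statistics.mean(iq_scores)

# Sum of squared deviations from the mean (the 512 in the table).
sum_sq = sum((x - mean) ** 2 for x in iq_scores)

variance = sum_sq / (n - 1)   # sample variance divides by n - 1
sd = math.sqrt(variance)      # standard deviation
se = sd / math.sqrt(n)        # standard error of the mean

print(f"mean={mean}, sum_sq={sum_sq}")
print(f"variance={variance:.5f}, sd={sd:.6f}, se={se:.6f}")
```

The standard library's `statistics.variance` and `statistics.stdev` give the same sample variance and SD; the longer form is shown here only because it mirrors the columns of the table.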

T-Test
 The t-statistic was introduced in 1908 by William
Sealy Gosset
 A chemist working for the Guinness brewery in
Dublin, Ireland ("Student" was his pen name)
 Gosset devised the t-test as a way to cheaply
monitor the quality of stout
 Published the test in Biometrika in 1908


Hypothesis Testing: Steps 6 & 7
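t-statistic = (sample mean - hypothesized mean) / standard error

t = (128 - 135) / 2.385
t = -2.935

Because |-2.935| is larger than the critical value of 2.262,
we reject the null hypothesis:
“The hypothesis that the mean IQ
of the population is 135 was
rejected, t = -2.935, df = 9, p < .05.”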
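The same steps can be sketched in Python. The function name `one_sample_t` is an illustrative assumption; the data, the hypothesized mean of 135 and the critical value of 2.262 come from the slides.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return the one-sample t-statistic and its degrees of freedom."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)  # standard error of the mean
    t_stat = (statistics.mean(sample) - mu0) / se
    return t_stat, n - 1

iq_scores = [115, 140, 133, 125, 120, 126, 136, 124, 132, 129]
CRITICAL_T = 2.262  # two-tailed table value for df = 9, alpha = .05

t_stat, df = one_sample_t(iq_scores, mu0=135)
reject_null = abs(t_stat) > CRITICAL_T

print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_null}")
```

If `reject_null` had been false, the conclusion would be that we fail to reject the null hypothesis, not that the null is proven.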


Type I and II Error
 Remember that in step 2 we asked how large a difference
between means we are willing to attribute to chance…
 Measurement is never exact; though some journals and
papers vary, a significance level of .05 is typically used,
meaning that we accept a 5% chance of rejecting the null
hypothesis when it is actually true
 When we have rejected the null and it is actually true,
this is a type I error or “false positive” (its probability is α)
 When we have not rejected the null and it is actually
false, this is a type II error or “false negative” (its
probability is β; statistical power is 1 - β)
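A small simulation can make the type I error rate concrete. In this sketch the null hypothesis is true by construction, so every rejection is a false positive. The population SD of 15 and the number of trials are assumptions made for illustration; the sample size, hypothesized mean and critical value follow the slides.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

MU0 = 135           # null-hypothesis mean from the slides (true in this simulation)
SIGMA = 15          # assumed population SD, not given in the slides
N = 10              # sample size from the slides
CRITICAL_T = 2.262  # two-tailed critical value for df = 9, alpha = .05
TRIALS = 5000       # assumed number of simulated studies

false_positives = 0
for _ in range(TRIALS):
    # Draw a sample from a population whose mean really is MU0.
    sample = [random.gauss(MU0, SIGMA) for _ in range(N)]
    se = statistics.stdev(sample) / math.sqrt(N)
    t_stat = (statistics.mean(sample) - MU0) / se
    if abs(t_stat) > CRITICAL_T:
        false_positives += 1  # rejected a true null: type I error

type_i_rate = false_positives / TRIALS
print(f"Type I error rate: {type_i_rate:.3f}")
```

The observed rate should land close to .05, which is what choosing α = .05 means in practice.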

Power and Measures
 How much power does our test have, i.e., how likely is it to
detect a difference that really exists?
 How much can we infer?
 It depends on sample size & quality of the measure
 IQ, depression and cognitive ability are unobservable, so they
must be measured indirectly
 Growth of bacteria and cellular effects of medication are
observable: a ruler can be put to them
 The more directly we can measure, the smaller the sample we will need
 The more accurate our measurements, the smaller the error in
our inferences
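The link between power, sample size and measurement noise can be sketched with a normal approximation. This is a simplification: it treats the population SD as known (a z-test rather than the t-test used in the slides), and the function name `approx_power`, the true difference of 7 points and the SD values are all illustrative assumptions.

```python
from statistics import NormalDist

def approx_power(true_diff, sigma, n, alpha=0.05):
    """Approximate power of a two-tailed one-sample z-test.

    true_diff: assumed real difference between the population mean and H0
    sigma:     assumed population SD (larger when the measure is noisier)
    n:         sample size
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(true_diff) / (sigma / n ** 0.5)  # difference in standard errors
    # Probability that the test statistic lands in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

# Power rises with sample size...
for n in (10, 30, 100):
    print(f"n={n:3d}, sigma=15: power={approx_power(7, 15, n):.2f}")

# ...and with a more precise measure (smaller sigma) at the same sample size.
for sigma in (15, 10, 5):
    print(f"n= 10, sigma={sigma:2d}: power={approx_power(7, sigma, 10):.2f}")
```

When the true difference is zero, the function returns α, which ties power back to the type I error rate.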

Contact
 Dr. Saad Chahine
Saad.Chahine@msvu.ca
