z, t, and F tests
Making inferences from
experimental sample to population
using statistical tests
CONTENTS OF TALK
• Distributions and probability – what is a statistical test?
• The normal distribution
• Inferences from sample to population: hypothesis testing
• Central limit theorem
• z tests and example
• t tests and example
• F tests/ANOVA and example
…. and finally….
• Statistical tests and SPM
Distribution & Probability
• If we know something about the distribution of events, we know something about the probability that a given event will occur.
• e.g. I know that 75% of people have brown eyes, therefore there is a probability of .75 that the next
person I meet will have brown eyes.
• We can use information about distributions to decide how probable it is that the results of
an experiment looking at variable x support a particular hypothesis about the distribution of
variable x in the population.
• = central aim of experimental science
• This is how statistical tests work: test a sample distribution (our experimental results)
against a hypothesised distribution, resulting in a ‘p’ value for how likely it is that we would
obtain our results under the null hypothesis (null hypothesis = there is no effect or
difference between conditions) – i.e. how likely it is that our results were a fluke!
• e.g. in an experiment I measure RT in two different conditions and find a difference between
conditions. I want to know whether my data can statistically support the hypothesis that
there is a genuine difference in RT between these two conditions in the population as a
whole, i.e. that the data come from a population where the means of the two conditions are
different. The null hypothesis is therefore that the means are the same, and I want a
probability of less than .05 of getting the results we obtained under the null hypothesis
A statistical test allows me to ‘test’ how likely it is that the sample data
come from a parent population with a particular characteristic
The Normal Distribution
Mean and standard deviation tell you the basic features of a distribution
mean = average value of all members of the group
standard deviation = a measure of how much the values of individual members vary
in relation to the mean
• The normal distribution is symmetrical about the mean
• 68% of the normal distribution lies within 1 s.d. of the mean
[Figure: normal curve P(x) against x, symmetrical about the mean, with the shaded region within ±1 s.d. covering 68% of the distribution]
Many continuous variables follow a normal distribution, and it plays a special role
in the statistical tests we are interested in:
•The x-axis represents the values of a particular
variable
•The y-axis represents the proportion of members
of the population that have each value of the
variable
•The area under the curve represents probability –
e.g. area under the curve between two values on
the x-axis represents the probability of an individual
having a value in that range
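The percentages quoted here can be checked numerically; a minimal Python sketch (scipy assumed, and the N(50, 10) example values are invented, not from the slide):

```python
# A numerical check of the "area under the curve = probability" claim.
from scipy.stats import norm

# Probability mass within 1 s.d. of the mean (any normal distribution):
p_within_1sd = norm.cdf(1) - norm.cdf(-1)   # ~0.683, the 68% on the slide

# Probability of a value between two points on the x-axis, e.g. between
# 40 and 60 for a hypothetical N(mean=50, sd=10) variable:
p_40_60 = norm.cdf(60, loc=50, scale=10) - norm.cdf(40, loc=50, scale=10)
```

Because 40 and 60 are exactly one s.d. either side of the mean here, the two probabilities coincide.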
Sample to Population
Testing Hypotheses
• t, z, and F tests mathematically compare the distribution of an experimental sample – i.e. the mean and standard deviation of your results – to a normal distribution whose parameters represent some hypothesised feature of the population, which you think your results support
• How does this work? (without going
through the derivation of the equations…!)
• …CENTRAL LIMIT THEOREM
Central Limit Theorem
• Special feature of normal distribution which underlies its use in statistical tests…
• Take k samples from a population, and calculate the mean of each sample. The
distribution of those means will approximate a normal distribution, provided the
population has finite variance. As the size n of each sample tends to infinity, the
distribution of sample means tends to a normal distribution
• Because the means of samples tend towards a normal distribution in this way,
we can convert the mean of our sample distribution (the experimental results)
into a value from a standardised normal distribution.
[Figure: distribution of sample means P(x̄) centred on the population mean μ, with 68% of the distribution within ±1 s.d. and one sample mean x̄ marked on the x-axis]
• A z-test achieves this conversion by
performing a linear transformation – the
equation is given on the next slide
•This can be thought of as expressing
your results and your hypothesis in the
same ‘units’.
•so the z-statistic represents a value on
the x-axis of the standard distribution, for
which we know all the p-values
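The theorem is easy to demonstrate by simulation; a minimal Python sketch with numpy (the sample size and sample count are arbitrary choices, not from the slides):

```python
# Simulating the central limit theorem: sample means drawn from a
# non-normal population become approximately normally distributed.
import numpy as np

rng = np.random.default_rng(0)
k, n = 10_000, 30                          # k samples, each of size n

# A decidedly non-normal population: uniform on [0, 1]
samples = rng.uniform(0, 1, size=(k, n))
sample_means = samples.mean(axis=1)

# The means cluster around the population mean (0.5) with spread
# sigma / sqrt(n), where sigma = sqrt(1/12) is the s.d. of U(0, 1).
mean_of_means = sample_means.mean()
sd_of_means = sample_means.std()
```

Plotting a histogram of `sample_means` would show the familiar bell shape, even though the underlying population is flat.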
z-tests:
What are they?
formula:

z = (x̄ − μ) / σ

where x̄ = sample mean, μ = population mean, σ = population standard deviation

•Plug in the values, and get a ‘z-value’ which corresponds to a location on the x-
axis of a standardised normal distribution (μ = 0, σ = 1)
•For the standardised normal distribution we know the probability of any
particular value coming from it (area under the curve)
• this is what you read off from a table of z-values
•Because we are dealing with the probabilities of hypotheses about our sample,
there is always a chance you are wrong…. Choosing the significance level
represents how big you want this chance to be…
•P<.05 = a 5% chance that you would obtain your result under the null
hypothesis (Type 1 error)
z-tests:
Worked Example
• Battery of psychological tests to judge IQ from which we have obtained
distribution:
– Mean = 50
– S.D. = 10
– Represents the distribution of the entire population
– We would like to find out the probability of various scores, e.g. which scores
are so high that they can only be obtained by 10% of the population
• Need to transform the distribution to a STANDARD NORMAL
DISTRIBUTION:
– Thus we now have a z distribution: z = (X − μ)/σ = (X − 50)/10
– No change in the data since new distribution has same shape + observations
stand in same relation to each other (same as converting inches to
centimeters) – we have performed a LINEAR TRANSFORMATION
• Now, a score that was 60 becomes 1, i.e. the score is 1 S.D. above the mean
• A z score represents the number of S.D. that observation Xi is above or
below the mean.
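The whole example can be reproduced in a few lines; a Python sketch (scipy assumed; the 10% cutoff is computed here rather than taken from the slide):

```python
# The IQ example in code: population N(mean=50, sd=10), values from the slide.
from scipy.stats import norm

mean, sd = 50, 10

# A score of 60 standardises to z = 1, i.e. one s.d. above the mean:
z = (60 - mean) / sd

# Which scores are so high that only 10% of the population obtains them?
cutoff = norm.ppf(0.90, loc=mean, scale=sd)   # roughly 62.8 and above
```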
t-tests:
Testing Hypotheses About Means
• formula:

t = (x̄ − μ) / (s / √n)

where x̄ = sample mean, μ = population mean, s = sample standard deviation, n = sample size
• For a z-test you need to know the population mean and s.d. Often you don’t know the s.d. of the
hypothesised or comparison population, and so you use a t-test. This uses the sample s.d. instead.
•This introduces a source of error, which decreases as your sample size increases
•Therefore, the t statistic is distributed differently depending on the size of the sample, like a family of normal
curves. The degrees of freedom (d.f. = sample size – 1) represents which of these curves you are relating
your t-value to. There are different tables of p-values for different degrees of freedom.
 larger sample = more ‘squashed’ t-statistic distribution = easier to get significance
Kinds of t-tests (formula is slightly different for these different kinds):
• Single-sample: tests whether a sample mean is significantly different from a hypothesised population mean (often 0)
• Independent-samples: tests the difference between the means of two independent groups
• Paired-samples: tests the difference between two linked sets of scores, for example means obtained in
two conditions by a single group of participants
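These three kinds map directly onto scipy functions; a hedged sketch on made-up data (the condition names and values are invented):

```python
# The three kinds of t-test via scipy, on made-up reaction-time data.
import numpy as np
from scipy.stats import ttest_1samp, ttest_ind, ttest_rel

rng = np.random.default_rng(1)
cond_a = rng.normal(102, 10, size=20)   # RTs in condition A
cond_b = rng.normal(95, 10, size=20)    # RTs in condition B

# Single-sample: is the mean of cond_a different from a hypothesised mean?
t1, p1 = ttest_1samp(cond_a, popmean=100)

# Independent-samples: two separate groups of participants
t2, p2 = ttest_ind(cond_a, cond_b)

# Paired-samples: the same participants measured in two conditions
t3, p3 = ttest_rel(cond_a, cond_b)
```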
t-tests:
Worked Example of Single Sample t-test
• We know that finger tapping speed in normal population:
– Mean=100ms per tap
• Finger tapping speed in 8 subjects with caffeine addiction:
– Mean = 89.4ms
– Standard deviation = 20ms
• Does this prove that caffeine addiction has an effect on tapping speed?
• Null Hypothesis H0: tapping speed not faster after caffeine
• Preselected significance level was 0.05
• Calculate the t value, e.g. t(7) = √8 × (89.4 − 100) / 20 ≈ −1.5
• Find the area below t(7) = −1.5: p ≈ .09, i.e. about 9% of the time we would
expect a score as low as this under H0
• This value is above 0.05 => we can NOT reject H0!
• We can’t conclude that caffeine addiction has an effect on tapping speed
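The same numbers can be checked from the summary statistics alone; a Python sketch (scipy assumed, values from the slide):

```python
# The tapping example computed from its summary statistics.
import math
from scipy.stats import t as t_dist

mu0, sample_mean, sd, n = 100, 89.4, 20, 8

t_value = (sample_mean - mu0) / (sd / math.sqrt(n))   # about -1.5
p_one_tailed = t_dist.cdf(t_value, df=n - 1)          # about .09, above .05
```

Since the one-tailed p exceeds the preselected .05 level, H0 stands, matching the slide's conclusion.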
F-tests / ANOVAs:
What are they?
ANOVA = analysis of variance
involves calculating an F value whose significance is tested (similarly to a z or t value)
• Like t-tests, F-tests deal with differences between or among sample means, but can handle any
number of means (each mean corresponding to a level of a ‘factor’)
• Q/ do the k means differ? A/ yes, if the F value is significant
• Q/ how do the factors influence each other? A/ look at the interaction effects
• ANOVA calculates F values by comparing the variability between two conditions
with the variability within each condition (this is what the formula does)
– e.g. we give a drug that we believe will improve memory to a group of people and
give a placebo to another group. We then take dependent measures of their
memory performance, e.g. mean number of words recalled from memorised lists.
– An ANOVA compares the variability that we observe between the two conditions to
the variability observed within each condition. Variability is measured as the sum of
squared differences of each score from the mean.
– Thus, when the variability that we predict (between the two groups) is much greater
than the variability we don't predict (within each group) then we will conclude that
our treatments produce different results.
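This two-group example can be run directly; a sketch on invented recall scores, which also shows that with exactly two groups a one-way ANOVA reduces to the independent-samples t-test (F = t²):

```python
# The memory-drug example with made-up word-recall scores.
import numpy as np
from scipy.stats import f_oneway, ttest_ind

drug    = np.array([12., 15, 11, 14, 13, 16, 12, 15])  # words recalled
placebo = np.array([10., 11,  9, 12, 10, 11, 13,  9])

F, p_f = f_oneway(drug, placebo)      # one-way ANOVA, two conditions
t, p_t = ttest_ind(drug, placebo)     # equivalent pooled-variance t-test
```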
F-tests / ANOVAs:
What are they?
• ANOVA calculates an F value, which has a distribution related to the sample size and
number of conditions (degrees of freedom)
• The formula compares the variance between and within conditions or ‘factors’ as
discussed above – we won’t worry about the derivation! (n.b. MS = mean squares)
• If the F statistic is significant, this tells us that the means of the factors differ significantly
=> are not likely to have come from the same ‘population’ = our variable is having an effect
• When can we use ANOVAs?
• The formula is based on a model of what contributes to the value of any particular data
point, and how the variance in the data is composed. This model makes a number of
assumptions that must be met in order to allow us to use ANOVA
– homogeneity of variance
– normality
– independence of observations
• Remember: when you get a significant F value, this just tells you that there is a significant
difference somewhere between the means of the factors in the ANOVA. Therefore, you
often need to do planned or post-hoc comparisons in order to test more specific
hypotheses and probe interaction effects
F = MS_factors / MS_error
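The MS ratio can be computed by hand and checked against a library routine; a Python sketch on invented data (scipy assumed):

```python
# Computing F = MS_between / MS_within by hand for three made-up groups,
# then checking the result against scipy's one-way ANOVA.
import numpy as np
from scipy.stats import f_oneway

groups = [np.array([4., 5, 6, 5]),
          np.array([7., 8, 6, 7]),
          np.array([5., 6, 7, 6])]
k = len(groups)
n_total = sum(len(g) for g in groups)
grand_mean = np.concatenate(groups).mean()

# Sums of squared differences from the relevant means:
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within  = sum(((g - g.mean()) ** 2).sum() for g in groups)

ms_between = ss_between / (k - 1)        # variability between conditions
ms_within  = ss_within / (n_total - k)   # variability within conditions
F = ms_between / ms_within

F_scipy, p = f_oneway(*groups)
```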
ANOVAs:
Worked Example
• Testing Differences between independent sample means: Following rTMS
over the Right Parietal cortex, are the incorrectly cued trials in a cued RT
task slowed down compared to the correctly cued trials?
• “Repeated measures” ANOVA:
– 1 group of 14 healthy volunteers
– Perform 100 trials pre- and 100 trials post- stimulation
– Real vs Sham rTMS on two separate days
• Within-session factors:
– Correct vs Incorrect trials
– Pre vs Post
• Between-session factors:
– Real vs Sham rTMS
• Null Hypothesis H0: there is no difference in the RTs of incorrectly cued trials across conditions
• Many possibilities if H0 is rejected:
– All means are different from each other: meanICpreR vs. meanICpostR vs.
meanICpreS vs. meanICpostS
– Means in the Real condition are different from means in the Sham
– Interaction of means might be different (pre_post in Real diff. pre_post in Sham)
Why do we care?
Statistical tests in SPM
• Example in a simple block design of the effect of a
drug on right hand movement versus rest:
• fMRI: 8 measurements acquired, 2 of each condition
• Factorial design: 2×2 – DRUG (real vs placebo) × TASK (move vs rest)
• Subjects: 12 healthy volunteers, counterbalanced order
Why do we care?
Statistical tests in SPM
• We perform t-tests and F-tests (ANOVAs) when we create a design matrix and specify
contrasts
• Reminder: GLM equation to explain our data y
– y = X b + e
– X is the design matrix: enter this into SPM to tell the program how to divide the
imaging data into the different conditions
– Each column in the matrix represents one condition:
• Column 1 = right movement with drug
• Column 2 = rest with drug
• Column 3 = right movement with placebo
• Column 4 = rest with placebo
– b are the regressors: allocate specific values to the regressors to test specific
hypotheses (i.e. CONTRASTS) between conditions
– e = error
– In this case: y = (b1·x1 + b2·x2 + b3·x3 + b4·x4) + e
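The GLM above can be sketched on toy numbers; everything below (data, effect sizes, scan ordering) is invented, and real SPM design matrices contain much more (confound regressors, filtering, convolution with the hrf):

```python
# A toy version of y = Xb + e for the 2x2 example: 8 scans, 2 per condition.
import numpy as np

# One column per condition: move+drug, rest+drug, move+placebo, rest+placebo
X = np.kron(np.eye(4), np.ones((2, 1)))       # shape (8, 4)

rng = np.random.default_rng(2)
b_true = np.array([3.0, 1.0, 2.5, 1.0])       # hypothetical condition effects
y = X @ b_true + rng.normal(0, 0.1, size=8)   # y = Xb + e

# Least-squares estimates of the regressors b:
b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# A t-contrast is c'b, e.g. movement vs rest across both sessions
# (the true value of this combination is (3.0-1.0) + (2.5-1.0) = 3.5):
c = np.array([1., -1, 1, -1])
contrast_value = c @ b_hat
```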
Why do we care?
t-tests in SPM
• A t-contrast is a linear combination of
parameters: c′b
• If we think that 1 regressor in our design
matrix (e.g. b1) could lead to an interesting
activation, we compute:
1·b1 + 0·b2 + 0·b3 + 0·b4, and divide by the standard error of this estimate
• Our question: is the mean activity in
condition 1 significantly different from zero?
Why do we care?
t-tests in SPM
• In SPM, we make the weights sum to 0
when testing specific hypotheses
• T-tests in our study would include:
– Main effects of movement across all sessions:
1 -1 1 -1
– Main effects of the drug:
• Increases: 1 1 -1 -1
• Decreases: -1 -1 1 1
– Interaction increases: 1 -1 -1 1
– Interaction decreases: -1 1 1 -1
Why do we care?
F-tests in SPM
• An F-test models multiple linear hypotheses: does the design matrix X
model anything?
• F-contrasts in our previous example…
– Are there any differences between drug and placebo altogether? (i.e. increases
AND decreases)
1 0 -1 0
0 1 0 -1
• F-contrasts are used when we want to make more general inferences about the data:
– 1) when an effect might not be found by simple averaging (cancellations between conditions)
– 2) to test for effects that are jointly expressed by several one-dimensional contrasts
• an ‘all effects of interest’ contrast checks whether there is any effect at all
– 3) when the data are modelled in a more complex way (e.g. hrf & derivatives)
– 4) when you have multiple regressors and think that the effect expresses itself in
several of them, not only one
– 5) when you do not have a very clear hypothesis: an F-contrast can suggest more
specific hypotheses to be tested with t-contrasts
– => more details will be given later in the course…
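One way to see what an F-contrast does is as a comparison of a full model against a reduced model in which the contrast is forced to zero; a Python sketch on invented data (this mirrors the extra-sum-of-squares logic, not SPM's actual code):

```python
# An F-contrast as a full-vs-reduced model comparison on toy data.
import numpy as np
from scipy.stats import f as f_dist

rng = np.random.default_rng(3)
n = 8
X_full = np.kron(np.eye(4), np.ones((2, 1)))   # 8 scans, 2 per condition
b_true = np.array([3.0, 1.0, 2.0, 1.0])        # drug/placebo conditions differ
y = X_full @ b_true + rng.normal(0, 0.5, size=n)

def rss(X, y):
    """Residual sum of squares after a least-squares fit."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ b
    return r @ r

# The contrast [[1,0,-1,0],[0,1,0,-1]] asks: do drug and placebo differ at all?
# Imposing it (b1 = b3 and b2 = b4) pools the corresponding columns:
X_reduced = np.column_stack([X_full[:, 0] + X_full[:, 2],
                             X_full[:, 1] + X_full[:, 3]])

q, p_cols = 2, 4                               # constraints, full-model columns
F = ((rss(X_reduced, y) - rss(X_full, y)) / q) / (rss(X_full, y) / (n - p_cols))
p_value = f_dist.sf(F, q, n - p_cols)
```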
Weitere ähnliche Inhalte

Ähnlich wie FandTtests.ppt

Ähnlich wie FandTtests.ppt (20)

Stats - Intro to Quantitative
Stats -  Intro to Quantitative Stats -  Intro to Quantitative
Stats - Intro to Quantitative
 
Malimu statistical significance testing.
Malimu statistical significance testing.Malimu statistical significance testing.
Malimu statistical significance testing.
 
Parametric tests
Parametric testsParametric tests
Parametric tests
 
Aron chpt 8 ed
Aron chpt 8 edAron chpt 8 ed
Aron chpt 8 ed
 
Aron chpt 8 ed
Aron chpt 8 edAron chpt 8 ed
Aron chpt 8 ed
 
Statistics-3 : Statistical Inference - Core
Statistics-3 : Statistical Inference - CoreStatistics-3 : Statistical Inference - Core
Statistics-3 : Statistical Inference - Core
 
Parametric Test
Parametric TestParametric Test
Parametric Test
 
Basic statistics
Basic statisticsBasic statistics
Basic statistics
 
Basic of Statistical Inference Part-V: Types of Hypothesis Test (Parametric)
Basic of Statistical Inference Part-V: Types of Hypothesis Test (Parametric) Basic of Statistical Inference Part-V: Types of Hypothesis Test (Parametric)
Basic of Statistical Inference Part-V: Types of Hypothesis Test (Parametric)
 
Analysis of Variance
Analysis of VarianceAnalysis of Variance
Analysis of Variance
 
Sampling fundamentals
Sampling fundamentalsSampling fundamentals
Sampling fundamentals
 
Epidemiology Lectures for UG
Epidemiology Lectures for UGEpidemiology Lectures for UG
Epidemiology Lectures for UG
 
RESEARCH METHODOLOGY - 2nd year ppt
RESEARCH METHODOLOGY - 2nd year pptRESEARCH METHODOLOGY - 2nd year ppt
RESEARCH METHODOLOGY - 2nd year ppt
 
Analysis of Variance (ANOVA)
Analysis of Variance (ANOVA)Analysis of Variance (ANOVA)
Analysis of Variance (ANOVA)
 
tests of significance
tests of significancetests of significance
tests of significance
 
Statistical analysis
Statistical  analysisStatistical  analysis
Statistical analysis
 
Chapter 7 sampling distributions
Chapter 7 sampling distributionsChapter 7 sampling distributions
Chapter 7 sampling distributions
 
Statistical Methods in Research
Statistical Methods in ResearchStatistical Methods in Research
Statistical Methods in Research
 
lecture-2.ppt
lecture-2.pptlecture-2.ppt
lecture-2.ppt
 
Anova, ancova
Anova, ancovaAnova, ancova
Anova, ancova
 

Mehr von UMAIRASHFAQ20

jhghgjhgjhgjhfhcgjfjhvjhjgjkggjhgjhgjhfjgjgfgfhgfhg
jhghgjhgjhgjhfhcgjfjhvjhjgjkggjhgjhgjhfjgjgfgfhgfhgjhghgjhgjhgjhfhcgjfjhvjhjgjkggjhgjhgjhfjgjgfgfhgfhg
jhghgjhgjhgjhfhcgjfjhvjhjgjkggjhgjhgjhfjgjgfgfhgfhgUMAIRASHFAQ20
 
CHEMISTRY OF F-BLOCK ELEMENTS BY K.N.S.SWAMI..pdf473.pdf.pdf
CHEMISTRY OF F-BLOCK ELEMENTS BY K.N.S.SWAMI..pdf473.pdf.pdfCHEMISTRY OF F-BLOCK ELEMENTS BY K.N.S.SWAMI..pdf473.pdf.pdf
CHEMISTRY OF F-BLOCK ELEMENTS BY K.N.S.SWAMI..pdf473.pdf.pdfUMAIRASHFAQ20
 
bentrule-200416181539.pdf
bentrule-200416181539.pdfbentrule-200416181539.pdf
bentrule-200416181539.pdfUMAIRASHFAQ20
 
3center4electronsbond-150331130400-conversion-gate01.pdf
3center4electronsbond-150331130400-conversion-gate01.pdf3center4electronsbond-150331130400-conversion-gate01.pdf
3center4electronsbond-150331130400-conversion-gate01.pdfUMAIRASHFAQ20
 
columnchromatography-131008023940-phpapp02.pdf
columnchromatography-131008023940-phpapp02.pdfcolumnchromatography-131008023940-phpapp02.pdf
columnchromatography-131008023940-phpapp02.pdfUMAIRASHFAQ20
 
f-test-200513110014 (1).pdf
f-test-200513110014 (1).pdff-test-200513110014 (1).pdf
f-test-200513110014 (1).pdfUMAIRASHFAQ20
 
physics114_lecture11.ppt
physics114_lecture11.pptphysics114_lecture11.ppt
physics114_lecture11.pptUMAIRASHFAQ20
 
physics114_lecture11 (1).ppt
physics114_lecture11 (1).pptphysics114_lecture11 (1).ppt
physics114_lecture11 (1).pptUMAIRASHFAQ20
 
M.Sc. Part I QUALITY IN ANALYTICAL CHEMISTRY PPT.ppsx
M.Sc. Part I QUALITY IN ANALYTICAL CHEMISTRY PPT.ppsxM.Sc. Part I QUALITY IN ANALYTICAL CHEMISTRY PPT.ppsx
M.Sc. Part I QUALITY IN ANALYTICAL CHEMISTRY PPT.ppsxUMAIRASHFAQ20
 
1133611095_375749.pptx
1133611095_375749.pptx1133611095_375749.pptx
1133611095_375749.pptxUMAIRASHFAQ20
 

Mehr von UMAIRASHFAQ20 (20)

jhghgjhgjhgjhfhcgjfjhvjhjgjkggjhgjhgjhfjgjgfgfhgfhg
jhghgjhgjhgjhfhcgjfjhvjhjgjkggjhgjhgjhfjgjgfgfhgfhgjhghgjhgjhgjhfhcgjfjhvjhjgjkggjhgjhgjhfjgjgfgfhgfhg
jhghgjhgjhgjhfhcgjfjhvjhjgjkggjhgjhgjhfjgjgfgfhgfhg
 
9618821.ppt
9618821.ppt9618821.ppt
9618821.ppt
 
CHEMISTRY OF F-BLOCK ELEMENTS BY K.N.S.SWAMI..pdf473.pdf.pdf
CHEMISTRY OF F-BLOCK ELEMENTS BY K.N.S.SWAMI..pdf473.pdf.pdfCHEMISTRY OF F-BLOCK ELEMENTS BY K.N.S.SWAMI..pdf473.pdf.pdf
CHEMISTRY OF F-BLOCK ELEMENTS BY K.N.S.SWAMI..pdf473.pdf.pdf
 
bentrule-200416181539.pdf
bentrule-200416181539.pdfbentrule-200416181539.pdf
bentrule-200416181539.pdf
 
9-LFER.pdf
9-LFER.pdf9-LFER.pdf
9-LFER.pdf
 
9-LFER(U).pptx.pdf
9-LFER(U).pptx.pdf9-LFER(U).pptx.pdf
9-LFER(U).pptx.pdf
 
h-bonding.pdf
h-bonding.pdfh-bonding.pdf
h-bonding.pdf
 
11553.pdf
11553.pdf11553.pdf
11553.pdf
 
3center4electronsbond-150331130400-conversion-gate01.pdf
3center4electronsbond-150331130400-conversion-gate01.pdf3center4electronsbond-150331130400-conversion-gate01.pdf
3center4electronsbond-150331130400-conversion-gate01.pdf
 
SMT1105-1.pdf
SMT1105-1.pdfSMT1105-1.pdf
SMT1105-1.pdf
 
columnchromatography-131008023940-phpapp02.pdf
columnchromatography-131008023940-phpapp02.pdfcolumnchromatography-131008023940-phpapp02.pdf
columnchromatography-131008023940-phpapp02.pdf
 
ttest_intro.pdf
ttest_intro.pdfttest_intro.pdf
ttest_intro.pdf
 
f-test-200513110014 (1).pdf
f-test-200513110014 (1).pdff-test-200513110014 (1).pdf
f-test-200513110014 (1).pdf
 
BSC 821 Ch 1.pdf
BSC 821 Ch 1.pdfBSC 821 Ch 1.pdf
BSC 821 Ch 1.pdf
 
9618821.pdf
9618821.pdf9618821.pdf
9618821.pdf
 
physics114_lecture11.ppt
physics114_lecture11.pptphysics114_lecture11.ppt
physics114_lecture11.ppt
 
9773985.pdf
9773985.pdf9773985.pdf
9773985.pdf
 
physics114_lecture11 (1).ppt
physics114_lecture11 (1).pptphysics114_lecture11 (1).ppt
physics114_lecture11 (1).ppt
 
M.Sc. Part I QUALITY IN ANALYTICAL CHEMISTRY PPT.ppsx
M.Sc. Part I QUALITY IN ANALYTICAL CHEMISTRY PPT.ppsxM.Sc. Part I QUALITY IN ANALYTICAL CHEMISTRY PPT.ppsx
M.Sc. Part I QUALITY IN ANALYTICAL CHEMISTRY PPT.ppsx
 
1133611095_375749.pptx
1133611095_375749.pptx1133611095_375749.pptx
1133611095_375749.pptx
 

Kürzlich hochgeladen

Digital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfDigital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfNicoChristianSunaryo
 
Statistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdfStatistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdfnikeshsingh56
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksdeepakthakur548787
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxaleedritatuxx
 
Rithik Kumar Singh codealpha pythohn.pdf
Rithik Kumar Singh codealpha pythohn.pdfRithik Kumar Singh codealpha pythohn.pdf
Rithik Kumar Singh codealpha pythohn.pdfrahulyadav957181
 
Non Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdfNon Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdfPratikPatil591646
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Boston Institute of Analytics
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...Jack Cole
 
Decoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectDecoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectBoston Institute of Analytics
 
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdf
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdfWorld Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdf
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdfsimulationsindia
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data VisualizationKianJazayeri1
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBoston Institute of Analytics
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Boston Institute of Analytics
 
Introduction to Mongo DB-open-­‐source, high-­‐performance, document-­‐orient...
Introduction to Mongo DB-open-­‐source, high-­‐performance, document-­‐orient...Introduction to Mongo DB-open-­‐source, high-­‐performance, document-­‐orient...
Introduction to Mongo DB-open-­‐source, high-­‐performance, document-­‐orient...boychatmate1
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfblazblazml
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...Dr Arash Najmaei ( Phd., MBA, BSc)
 
knowledge representation in artificial intelligence
knowledge representation in artificial intelligenceknowledge representation in artificial intelligence
knowledge representation in artificial intelligencePriyadharshiniG41
 

Kürzlich hochgeladen (20)

Digital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfDigital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdf
 
Statistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdfStatistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdf
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing works
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 
Rithik Kumar Singh codealpha pythohn.pdf
Rithik Kumar Singh codealpha pythohn.pdfRithik Kumar Singh codealpha pythohn.pdf
Rithik Kumar Singh codealpha pythohn.pdf
 
2023 Survey Shows Dip in High School E-Cigarette Use
2023 Survey Shows Dip in High School E-Cigarette Use2023 Survey Shows Dip in High School E-Cigarette Use
2023 Survey Shows Dip in High School E-Cigarette Use
 
Non Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdfNon Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdf
 
Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
 
Decoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectDecoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis Project
 
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdf
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdfWorld Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdf
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdf
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data Visualization
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
 
Introduction to Mongo DB-open-­‐source, high-­‐performance, document-­‐orient...
Introduction to Mongo DB-open-­‐source, high-­‐performance, document-­‐orient...Introduction to Mongo DB-open-­‐source, high-­‐performance, document-­‐orient...
Introduction to Mongo DB-open-­‐source, high-­‐performance, document-­‐orient...
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
 
knowledge representation in artificial intelligence
knowledge representation in artificial intelligenceknowledge representation in artificial intelligence
knowledge representation in artificial intelligence
 
Data Analysis Project: Stroke Prediction
Data Analysis Project: Stroke PredictionData Analysis Project: Stroke Prediction
Data Analysis Project: Stroke Prediction
 

FandTtests.ppt

  • 1. z, t, and F tests Making inferences from experimental sample to population using statistical tests
  • 2. CONTENTS OF TALK • Distributions and probability – what is a statistical test? • The normal distribution • Inferences from sample to population: hypothesis testing • Central limit theorem • z tests and example • t tests and example • F tests/ANOVA and example …. and finally…. • Statistical tests and SPM
  • 3. Distribution & Probability • If we know something about the distribution of events, we know something about the probability that one of these events is likely to occur. • e.g. I know that 75% of people have brown eyes, therefore there is a probability of .75 that the next person I meet will have brown eyes. • We can use information about distributions to decide how probable it is that the results of an experiment looking at variable x support a particular hypothesis about the distribution of variable x in the population. • = central aim of experimental science • This is how statistical tests work: test a sample distribution (our experimental results) against a hypothesised distribution, resulting in a ‘p’ value for how likely it is that we would obtain our results under the null hypothesis (null hypothesis = there is no effect or difference between conditions) – i.e. how likely it is that our results were a fluke! • e.g. in an experiment I measure RT in two different conditions and find a difference between conditions. I want to know whether my data can statistically support the hypothesis that there is a genuine difference in RT between these two conditions in the population as a whole, i.e. that the data come from a population where the means of the two conditions are different. The null hypothesis is therefore that the means are the same, and I want a probability of less than .05 of getting the results we obtained under the null hypothesis A statistical test allows me to ‘test’ how likely it is that the sample data come from a parent population with a particular characteristic
• 4. The Normal Distribution
• The mean and standard deviation tell you the basic features of a distribution: the mean is the average value of all members of the group; the standard deviation is a measure of how much the values of individual members vary in relation to the mean.
• Many continuous variables follow a normal distribution, and it plays a special role in the statistical tests we are interested in:
• The x-axis represents the values of a particular variable
• The y-axis represents the proportion of members of the population that have each value of the variable
• The area under the curve represents probability – e.g. the area under the curve between two values on the x-axis represents the probability of an individual having a value in that range
• The normal distribution is symmetrical about the mean, and 68% of the distribution lies within 1 s.d. of the mean
[Figure: a normal curve P(x) plotted against x, with the central 68% of the distribution shaded between 1 s.d. below and 1 s.d. above the mean]
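The "area under the curve = probability" idea from this slide can be checked numerically. A minimal sketch using scipy's normal distribution (the standard normal here, purely for illustration):

```python
# Probability as area under the normal curve: the chance of a value
# falling within 1 s.d. of the mean is the area between mu - sigma
# and mu + sigma, which should come out near 68%.
from scipy.stats import norm

mu, sigma = 0.0, 1.0  # standard normal distribution

p_within_1sd = norm.cdf(mu + sigma, mu, sigma) - norm.cdf(mu - sigma, mu, sigma)
print(round(p_within_1sd, 3))  # 0.683
```

`norm.cdf` gives the cumulative area up to a point, so the difference of two CDF values is exactly the shaded area described on the slide.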
• 5. Sample to Population: Testing Hypotheses
• t, z, and F tests mathematically compare the distribution of an experimental sample – i.e. the mean and standard deviation of your results – to a normal distribution whose parameters represent some hypothesised feature of the population, which you think your results support.
• How does this work? (without going through the derivation of the equations…!)
• …CENTRAL LIMIT THEOREM
• 6. Central Limit Theorem
• A special feature of the normal distribution which underlies its use in statistical tests…
• Take k samples from a population and calculate the mean of each sample. The distribution of those means will approximate a normal distribution (for certain variable types). As k tends to infinity, the distribution of sample means tends to a normal distribution.
• Because the means of samples tend towards a normal distribution in this way, we can convert the mean of our sample distribution (the experimental results) into a value from a standardised normal distribution.
• A z-test achieves this conversion by performing a linear transformation – the equation is given on the next slide.
• This can be thought of as expressing your results and your hypothesis in the same ‘units’, so the z-statistic represents a value on the x-axis of the standard distribution, for which we know all the p-values.
[Figure: the distribution of sample means, centred near the population mean, with 68% of the distribution within 1 s.d. either side of it]
• 7. z-tests: What are they?
• Formula: z = (x̄ − μ) / σ, where x̄ = sample mean, μ = population mean, σ = population standard deviation
• Plug in the values, and get a ‘z-value’ which corresponds to a location on the x-axis of a standardised normal distribution (μ = 0, σ = 1)
• For the standardised normal distribution we know the probability of any particular value coming from it (area under the curve) – this is what you read off from a table of z-values
• Because we are dealing with the probabilities of hypotheses about our sample, there is always a chance you are wrong…. Choosing the significance level represents how big you want this chance to be…
• P < .05 = a 5% chance that you would obtain your result under the null hypothesis (Type 1 error)
• 8. z-tests: Worked Example
• A battery of psychological tests to judge IQ gives us this distribution: mean = 50, S.D. = 10. This represents the distribution of the entire population.
• We would like to find out the probability of various scores, e.g. which scores are so high that they can only be obtained by 10% of the population
• We need to transform the distribution to a STANDARD NORMAL DISTRIBUTION: z = (X − μ) / σ = (X − 50) / 10
• This does not change the data, since the new distribution has the same shape and the observations stand in the same relation to each other (just as when converting inches to centimetres) – we have performed a LINEAR TRANSFORMATION
• Now, a score that was 60 becomes 1, i.e. the score is 1 S.D. above the mean
• A z score represents the number of S.D.s that observation Xi is above or below the mean.
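This worked example can be reproduced with scipy; a sketch using the slide's μ = 50, σ = 10 (the top-10% cutoff is computed here, answering the question posed above):

```python
# z-score worked example: IQ-style scores with mu = 50, sigma = 10.
from scipy.stats import norm

mu, sigma = 50, 10

# A raw score of 60 is exactly 1 s.d. above the mean.
z = (60 - mu) / sigma
print(z)  # 1.0

# Which raw score do only 10% of the population exceed?
# ppf is the inverse of cdf, so this finds the 90th percentile.
cutoff = norm.ppf(0.90, loc=mu, scale=sigma)
print(round(cutoff, 1))  # 62.8
```

So scores above roughly 62.8 are reached by only 10% of this population.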
• 9. t-tests: Testing Hypotheses About Means
• Formula: t = (x̄ − μ) / (s / √n), where x̄ = sample mean, μ = population mean, s = sample standard deviation, n = size of sample
• For a z-test you need to know the population mean and s.d. Often you don’t know the s.d. of the hypothesised or comparison population, and so you use a t-test, which uses the sample s.d. instead.
• This introduces a source of error, which decreases as your sample size increases.
• Therefore, the t statistic is distributed differently depending on the size of the sample, like a family of normal curves. The degrees of freedom (d.f. = sample size − 1) represent which of these curves you are relating your t-value to. There are different tables of p-values for different degrees of freedom.
• Larger sample = more ‘squashed’ t-statistic distribution = easier to get significance
• Kinds of t-tests (the formula differs slightly between them):
• Single-sample: tests whether a sample mean differs significantly from a hypothesised population mean
• Independent-samples: tests whether the means of two independent groups differ
• Paired-samples: tests the difference between two linked sets of scores, for example means obtained in two conditions by a single group of participants
• 10. t-tests: Worked Example of a Single-Sample t-test
• We know that finger tapping speed in the normal population has mean = 100 ms per tap
• Finger tapping speed in 8 subjects with caffeine addiction: mean = 89.4 ms, standard deviation = 20 ms
• Does this show that caffeine addiction has an effect on tapping speed?
• Null hypothesis H0: tapping speed is not faster after caffeine
• Preselected significance level: 0.05
• Calculate the t value: t(7) = (89.4 − 100) / (20 / √8) = −1.5
• Find the area below t(7) = −1.5, which is ≈ 0.09: i.e. about 9% of the time we would expect a score as low as this
• This value is above 0.05 => we could NOT reject H0!
• We can’t conclude that caffeine addiction has an effect on tapping speed
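Because only summary statistics are given (no raw scores), the test can be sketched by computing t by hand and reading the one-tailed p-value from scipy's t distribution with n − 1 = 7 degrees of freedom:

```python
# Single-sample t-test from summary statistics (the caffeine example).
from math import sqrt
from scipy.stats import t as t_dist

mu0, xbar, s, n = 100, 89.4, 20, 8

t_val = (xbar - mu0) / (s / sqrt(n))          # ~ -1.5
p_one_tailed = t_dist.cdf(t_val, df=n - 1)    # P(T <= t) under H0

print(round(t_val, 2))         # -1.5
print(round(p_one_tailed, 2))  # 0.09, above .05, so H0 is not rejected
```

With raw data you would normally call `scipy.stats.ttest_1samp` instead; the by-hand version is used here only because the slide provides just the mean and s.d.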
• 11. F-tests / ANOVAs: What are they?
• ANOVA = analysis of variance; it involves calculating an F value whose significance is tested (similarly to a z or t value)
• Like t-tests, F-tests deal with differences between or among sample means, but with any number of means (each mean corresponding to a level of a ‘factor’)
• Q/ do the k means differ? A/ yes, if the F value is significant
• Q/ how do the k factors influence each other? A/ look at the interaction effects
• ANOVA calculates F values by comparing the variability between conditions with the variability within each condition (this is what the formula does)
• e.g. we give a drug that we believe will improve memory to one group of people and a placebo to another group. We then take dependent measures of their memory performance, e.g. mean number of words recalled from memorised lists.
• An ANOVA compares the variability that we observe between the two conditions to the variability observed within each condition. Variability is measured as the sum of squared differences of each score from the mean.
• Thus, when the variability that we predict (between the two groups) is much greater than the variability we don’t predict (within each group), we conclude that our treatments produce different results.
• 12. F-tests / ANOVAs: What are they?
• ANOVA calculates an F value, which has a distribution related to the sample size and number of conditions (degrees of freedom)
• The formula compares the variance between and within conditions or ‘factors’ as discussed above – we won’t worry about the derivation: F = MS_factors / MS_error (n.b. MS = mean squares)
• If the F statistic is significant, this tells us that the means of the factors differ significantly => they are not likely to have come from the same ‘population’ = our variable is having an effect
• When can we use ANOVAs? The formula is based on a model of what contributes to the value of any particular data point, and how the variance in the data is composed. This model makes a number of assumptions that must be met in order to allow us to use ANOVA:
• homogeneity of variance
• normality
• independence of observations
• Remember: a significant F value just tells you that there is a significant difference somewhere between the means of the factors in the ANOVA. Therefore, you often need to do planned or post-hoc comparisons in order to test more specific hypotheses and probe interaction effects
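The drug vs placebo memory example from the previous slide can be sketched with a one-way ANOVA; all recall scores below are invented purely for illustration:

```python
# One-way ANOVA on made-up word-recall scores: large between-group
# variability relative to within-group variability gives a large F.
from scipy.stats import f_oneway

drug    = [14, 15, 13, 16, 15, 14]  # words recalled, drug group
placebo = [11, 12, 10, 12, 11, 13]  # words recalled, placebo group

F, p = f_oneway(drug, placebo)
print(round(F, 1))  # 24.5: between-group variance >> within-group variance
print(p < 0.05)     # True: the group means differ significantly
```

With only two groups this is equivalent to an independent-samples t-test (F = t²); `f_oneway` extends directly to any number of groups.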
• 13. ANOVAs: Worked Example
• Testing differences between independent sample means: following rTMS over the right parietal cortex, are incorrectly cued trials in a cued RT task slowed down compared to correctly cued trials?
• “Repeated measures” ANOVA: 1 group of 14 healthy volunteers; 100 trials pre- and 100 trials post-stimulation; real vs sham rTMS on two separate days
• Within-session factors: correct vs incorrect trials; pre vs post
• Between-session factor: real vs sham rTMS
• Null hypothesis H0: there is no difference in the RTs of incorrectly cued trials
• Many possibilities if H0 is rejected:
• All means are different from each other: meanICpreR vs. meanICpostR vs. meanICpreS vs. meanICpostS
• Means in the real condition are different from means in the sham condition
• The interaction might be significant (the pre–post difference in real rTMS differs from the pre–post difference in sham)
• 14. Why do we care? Statistical tests in SPM
• Example: a simple block design testing the effect of a drug on right-hand movement versus rest
• fMRI: 8 measurements acquired, 2 of each condition
• Factorial design: 2×2 (DRUG: real vs placebo; TASK: move vs rest)
• Subjects: 12 healthy volunteers, counterbalanced order
• 15. Why do we care? Statistical tests in SPM
• We perform ANOVAs, t-tests, and F-tests when we create a design matrix and specify contrasts
• Reminder: the GLM equation to explain our data y is y = Xb + e
• X is the design matrix: enter this into SPM to tell the program how to divide up the imaging data into the different conditions. Each column in the matrix represents one condition: column 1 = right movement with drug; column 2 = rest with drug; column 3 = right movement with placebo; column 4 = rest with placebo
• b are the regression coefficients (betas), one per regressor: we assign specific contrast weights to them to test specific hypotheses (i.e. CONTRASTS) between conditions
• e = error
• In this case: y = (b1x1 + b2x2 + b3x3 + b4x4) + e
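The GLM equation above can be sketched as a toy least-squares fit. The design matrix, beta values, and noise level here are all invented; SPM does the same estimation voxel-wise with convolved regressors and far larger matrices:

```python
# Toy GLM y = Xb + e: 8 scans, 4 conditions, one indicator column per
# condition, fit by ordinary least squares.
import numpy as np

X = np.array([
    [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1],
    [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1],
], dtype=float)  # columns: move+drug, rest+drug, move+placebo, rest+placebo

true_b = np.array([5.0, 1.0, 3.0, 1.0])      # hypothetical condition effects
rng = np.random.default_rng(1)
y = X @ true_b + rng.normal(0, 0.1, size=8)  # signal plus noise e

b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares estimate of b
print(np.round(b_hat, 1))  # recovers values close to true_b
```

With this simple block design each estimated beta is just the mean signal of its condition's scans, which is why the contrasts on the next slides reduce to comparisons of condition means.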
• 16. Why do we care? t-tests in SPM
• A t-contrast is a linear combination of parameters: c′b
• If we think that one regressor in our design matrix (e.g. b1) could lead to an interesting activation, we compute 1×b1 + 0×b2 + 0×b3 + 0×b4 and divide by its standard error
• Our question: is the mean activity in condition 1 significantly greater than zero, i.e. does this regressor explain activity?
• 17. Why do we care? t-tests in SPM
• In SPM, we make the weights sum to 0 when testing specific hypotheses between conditions
• t-tests in our study would include (weights for columns 1–4):
• Main effects of movement across all sessions: 1 −1 1 −1
• Main effects of the drug: increases: 1 1 −1 −1; decreases: −1 −1 1 1
• Interaction increases: 1 −1 −1 1
• Interaction decreases: −1 1 1 −1
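Applying one of these contrasts is just the c′b dot product from the previous slide. A sketch using invented beta estimates (in SPM the result would additionally be divided by its standard error to give a t statistic):

```python
# Applying the "main effect of movement" t-contrast from the slide to
# a set of hypothetical beta estimates.
import numpy as np

b = np.array([5.0, 1.0, 3.0, 1.0])  # invented betas: move+drug, rest+drug,
                                    # move+placebo, rest+placebo
c = np.array([1, -1, 1, -1])        # movement > rest, across drug and placebo

effect = c @ b  # the contrast value c'b
print(effect)   # 6.0: movement betas exceed rest betas

assert c.sum() == 0  # weights sum to 0, as required for condition comparisons
```

A positive contrast value means the positively weighted conditions have higher estimated activity; significance then depends on dividing by the contrast's standard error.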
• 18. Why do we care? F-tests in SPM
• An F-test models multiple linear hypotheses: does the design matrix X model anything?
• F-contrasts in our previous example… Are there any differences between drug and placebo altogether (i.e. increases AND decreases)? 1 0 −1 0 / 0 1 0 −1
• Used if we want to make more general inferences about data that:
• 1) might not be found with simple averaging (cancellations?)
• 2) to test for effects that are jointly expressed by several one-dimensional contrasts – e.g. ‘all effects of interest’ to check whether there is any effect at all
• 3) in case the data are modelled in a more complex way (hrf & derivatives)
• 4) when you have multiple regressors and think that the effect expresses itself in some of them, not only one
• 5) if you do not have a very clear hypothesis: might be useful to derive more hypotheses to be tested with t-contrasts
• => more details will be given later in the course…
• 19. References
• http://obelia.jde.aca.mmu.ac.uk/rd/arsham/opre330.htm#ranova
• ‘Statistical Methods for Psychology’ (2001), by David Howell
• SPM website: http://www.fil.ion.ucl.ac.uk/spm/