SlideShare ist ein Scribd-Unternehmen logo
1 von 43
Downloaden Sie, um offline zu lesen
50 shades of gray: A research story
Brian Nosek, Jeffrey Spies, and Matt Motyl:
Participants from the political left, right and center (N =
1,979) completed a perceptual judgment task in which
words were presented in different shades of gray . . . The
results were stunning. Moderates perceived the shades of
gray more accurately than extremists on the left and right
(p = .01).
They continue:
Our design and follow-up analyses ruled out obvious
alternative explanations such as time spent on task and a
tendency to select extreme responses. Enthused about
the result, we identified Psychological Science as our fall
back journal after we toured the Science, Nature, and
PNAS rejection mills . . .
The preregistered replication
Nosek, Spies, and Motyl:
We conducted a direct replication while we prepared the
manuscript. We ran 1,300 participants, giving us .995
power to detect an effect of the original effect size at
alpha = .05.
The result:
The effect vanished (p = .59).
2/45
3/45
4/45
5/45
6/45
The famous study of social priming
7/45
8/45
Daniel Kahneman (2011):
“When I describe priming
studies to audiences, the
reaction is often disbelief
. . . The idea you should focus
on, however, is that disbelief is
not an option. The results are
not made up, nor are they
statistical flukes. You have no
choice but to accept that the
major conclusions of these
studies are true.”
10/45
The attempted replication
11/45
Daniel Kahneman (2011):
“When I describe
priming studies to
audiences, the reaction
is often disbelief . . . The
idea you should focus
on, however, is that
disbelief is not an
option. The results are
not made up, nor are
they statistical flukes.
You have no choice but
to accept that the
major conclusions of
these studies are true.”
Wagenmakers et al. (2014):
“[After] a long series
of failed replications
. . . disbelief does in fact
remain an option.”
12/45
Alan Turing (1950):
“I assume that the reader is
familiar with the idea of
extra-sensory perception, and
the meaning of the four items
of it, viz. telepathy,
clairvoyance, precognition and
psycho-kinesis. These
disturbing phenomena seem to
deny all our usual scientific
ideas. How we should like to
discredit them! Unfortunately
the statistical evidence, at
least for telepathy, is
overwhelming.”
14/45
This week in Psychological Science
“Turning Body and Self Inside Out: Visualized Heartbeats
Alter Bodily Self-Consciousness and Tactile Perception”
“Aging 5 Years in 5 Minutes: The Effect of Taking a Memory
Test on Older Adults’ Subjective Age”
“The Double-Edged Sword of Grandiose Narcissism:
Implications for Successful and Unsuccessful Leadership
Among U.S. Presidents”
“On the Nature and Nurture of Intelligence and Specific
Cognitive Abilities: The More Heritable, the More Culture
Dependent”
“Beauty at the Ballot Box: Disease Threats Predict
Preferences for Physically Attractive Leaders”
“Shaping Attention With Reward: Effects of Reward on Space-
and Object-Based Selection”
“It Pays to Be Herr Kaiser: Germans With Noble-Sounding
Surnames More Often Work as Managers Than as Employees”
This week in Psychological Science
N = 17
N = 57
N = 42
N = 7,582
N = 123 + 156 + 66
N = 47
N = 222,924
17/45
The “That which does not destroy my statistical significance
makes it stronger” fallacy
Charles Murray: “To me, the experience of early childhood
intervention programs follows the familiar, discouraging pattern
. . . small-scale experimental efforts [N = 123 and N = 111] staffed
by highly motivated people show effects. When they are subject to
well-designed large-scale replications, those promising signs
attenuate and often evaporate altogether.”
James Heckman: “The effects reported for the programs I discuss
survive batteries of rigorous testing procedures. They are conducted
by independent analysts who did not perform or design the original
experiments. The fact that samples are small works against finding
any effects for the programs, much less the statistically significant
and substantial effects that have been found.”
What’s going on?
The paradigm of routine discovery
The garden of forking paths
The “law of small numbers” fallacy
The “That which does not destroy my statistical significance
makes it stronger” fallacy
Correlation does not even imply correlation
19/45
Why is psychology particularly difficult?
Indirect and noisy measurement
Human variation
Noncompliance and missing data
Experimental subjects trying to figure out what you’re doing
20/45
What to do?
Look at everything
Interactions
Multilevel modeling
Within-person studies
Design analysis
Bayesian inference
22/45
Living in the multiverse
23/45
Choices!
1. Exclusion criteria based on cycle length (3 options)
2. Exclusion criteria based on “How sure are you?” response (2)
3. Cycle day assessment (3)
4. Fertility assessment (4)
5. Relationship status assessment (3)
168 possibilities (after excluding some contradictory combinations)
24/45
Living in the multiverse
25/45
Living in the multiverse
26/45
27/45
28/45
29/45
Interactions and the freshman fallacy
From an email I received:
30/45
Why it’s hard to study comparisons and interactions
Standard error for a proportion: 0.5/
√
n
Standard error for a comparison: 0.52/n
2 + 0.52/n
2 = 1/
√
n
Twice the standard error . . . and the effect is probably smaller!
31/45
32/45
Within-person studies
33/45
34/45
35/45
Power Design analysis
I’ve never made a type 1 error in my life
I’ve never made a type 2 error in my life
I make Type S (sign) errors
I make Type M (magnitude) errors
36/45
What can we learn from statistical significance?
37/45
This is what "power = 0.06" looks like.
Get used to it.
Estimated effect size
−30 −20 −10 0 10 20 30
True
effect
size
(assumed)Type S error probability:
If the estimate is
statistically significant,
it has a 24% chance of
having the wrong sign.
Exaggeration ratio:
If the estimate is
statistically significant,
it must be at least 9
times higher than the
true effect size.
38/45
The paradox of publication
39/45
40/45
41/45
Let us have
the serenity to embrace the variation that we cannot reduce,
the courage to reduce the variation we cannot embrace,
and the wisdom to distinguish one from the other.
42/45
The Statistical Crisis in Science
Andrew Gelman, John Carlin, Eric Loken, Francis Tuerlinckx,
Sara Steegen, Wolf Vanpaemel
Department of Statistics and Department of Political Science
Columbia University, New York
Department of Psychology, Harvard University, 29 Jan 2015
43/45

Weitere ähnliche Inhalte

Was ist angesagt?

Exploratory Research is More Reliable Than Confirmatory Research
Exploratory Research is More Reliable Than Confirmatory ResearchExploratory Research is More Reliable Than Confirmatory Research
Exploratory Research is More Reliable Than Confirmatory Research
jemille6
 
D. Mayo: Philosophical Interventions in the Statistics Wars
D. Mayo: Philosophical Interventions in the Statistics WarsD. Mayo: Philosophical Interventions in the Statistics Wars
D. Mayo: Philosophical Interventions in the Statistics Wars
jemille6
 
Mayo O&M slides (4-28-13)
Mayo O&M slides (4-28-13)Mayo O&M slides (4-28-13)
Mayo O&M slides (4-28-13)
jemille6
 

Was ist angesagt? (20)

Replication Crises and the Statistics Wars: Hidden Controversies
Replication Crises and the Statistics Wars: Hidden ControversiesReplication Crises and the Statistics Wars: Hidden Controversies
Replication Crises and the Statistics Wars: Hidden Controversies
 
Mayo & parker spsp 2016 june 16
Mayo & parker   spsp 2016 june 16Mayo & parker   spsp 2016 june 16
Mayo & parker spsp 2016 june 16
 
Controversy Over the Significance Test Controversy
Controversy Over the Significance Test ControversyControversy Over the Significance Test Controversy
Controversy Over the Significance Test Controversy
 
Statistical skepticism: How to use significance tests effectively
Statistical skepticism: How to use significance tests effectively Statistical skepticism: How to use significance tests effectively
Statistical skepticism: How to use significance tests effectively
 
D. G. Mayo: Your data-driven claims must still be probed severely
D. G. Mayo: Your data-driven claims must still be probed severelyD. G. Mayo: Your data-driven claims must still be probed severely
D. G. Mayo: Your data-driven claims must still be probed severely
 
D. Mayo: The Science Wars and the Statistics Wars: scientism, popular statist...
D. Mayo: The Science Wars and the Statistics Wars: scientism, popular statist...D. Mayo: The Science Wars and the Statistics Wars: scientism, popular statist...
D. Mayo: The Science Wars and the Statistics Wars: scientism, popular statist...
 
Exploratory Research is More Reliable Than Confirmatory Research
Exploratory Research is More Reliable Than Confirmatory ResearchExploratory Research is More Reliable Than Confirmatory Research
Exploratory Research is More Reliable Than Confirmatory Research
 
D. G. Mayo: The Replication Crises and its Constructive Role in the Philosoph...
D. G. Mayo: The Replication Crises and its Constructive Role in the Philosoph...D. G. Mayo: The Replication Crises and its Constructive Role in the Philosoph...
D. G. Mayo: The Replication Crises and its Constructive Role in the Philosoph...
 
beyond objectivity and subjectivity; a discussion paper
beyond objectivity and subjectivity; a discussion paperbeyond objectivity and subjectivity; a discussion paper
beyond objectivity and subjectivity; a discussion paper
 
Discussion a 4th BFFF Harvard
Discussion a 4th BFFF HarvardDiscussion a 4th BFFF Harvard
Discussion a 4th BFFF Harvard
 
Mayo: Evidence as Passing a Severe Test (How it Gets You Beyond the Statistic...
Mayo: Evidence as Passing a Severe Test (How it Gets You Beyond the Statistic...Mayo: Evidence as Passing a Severe Test (How it Gets You Beyond the Statistic...
Mayo: Evidence as Passing a Severe Test (How it Gets You Beyond the Statistic...
 
Yoav Benjamini, "In the world beyond p<.05: When & How to use P<.0499..."
Yoav Benjamini, "In the world beyond p<.05: When & How to use P<.0499..."Yoav Benjamini, "In the world beyond p<.05: When & How to use P<.0499..."
Yoav Benjamini, "In the world beyond p<.05: When & How to use P<.0499..."
 
D. Mayo: Philosophy of Statistics & the Replication Crisis in Science
D. Mayo: Philosophy of Statistics & the Replication Crisis in ScienceD. Mayo: Philosophy of Statistics & the Replication Crisis in Science
D. Mayo: Philosophy of Statistics & the Replication Crisis in Science
 
D. Mayo: Philosophical Interventions in the Statistics Wars
D. Mayo: Philosophical Interventions in the Statistics WarsD. Mayo: Philosophical Interventions in the Statistics Wars
D. Mayo: Philosophical Interventions in the Statistics Wars
 
Mayo: Day #2 slides
Mayo: Day #2 slidesMayo: Day #2 slides
Mayo: Day #2 slides
 
April 3 2014 slides mayo
April 3 2014 slides mayoApril 3 2014 slides mayo
April 3 2014 slides mayo
 
Fusion Confusion? Comments on Nancy Reid: "BFF Four-Are we Converging?"
Fusion Confusion? Comments on Nancy Reid: "BFF Four-Are we Converging?"Fusion Confusion? Comments on Nancy Reid: "BFF Four-Are we Converging?"
Fusion Confusion? Comments on Nancy Reid: "BFF Four-Are we Converging?"
 
Mayo O&M slides (4-28-13)
Mayo O&M slides (4-28-13)Mayo O&M slides (4-28-13)
Mayo O&M slides (4-28-13)
 
Statistical Flukes, the Higgs Discovery, and 5 Sigma
Statistical Flukes, the Higgs Discovery, and 5 Sigma Statistical Flukes, the Higgs Discovery, and 5 Sigma
Statistical Flukes, the Higgs Discovery, and 5 Sigma
 
D. Mayo: Putting the brakes on the breakthrough: An informal look at the argu...
D. Mayo: Putting the brakes on the breakthrough: An informal look at the argu...D. Mayo: Putting the brakes on the breakthrough: An informal look at the argu...
D. Mayo: Putting the brakes on the breakthrough: An informal look at the argu...
 

Andere mochten auch (6)

Jorge luis borges
Jorge luis borgesJorge luis borges
Jorge luis borges
 
Spurious correlation (updated)
Spurious correlation (updated)Spurious correlation (updated)
Spurious correlation (updated)
 
Validity in psychological testing
Validity in psychological testingValidity in psychological testing
Validity in psychological testing
 
Medical errors
Medical errorsMedical errors
Medical errors
 
psychological assessment standardization, evaluation etc
psychological assessment standardization, evaluation etc psychological assessment standardization, evaluation etc
psychological assessment standardization, evaluation etc
 
Psychometric Assessment
Psychometric Assessment Psychometric Assessment
Psychometric Assessment
 

Ähnlich wie Gelman psych crisis_2

Critical Analytical ThinkingPart II Heuristics and Bias.docx
Critical Analytical ThinkingPart II Heuristics and Bias.docxCritical Analytical ThinkingPart II Heuristics and Bias.docx
Critical Analytical ThinkingPart II Heuristics and Bias.docx
annettsparrow
 
475 media effects methods 2012 up
475 media effects methods 2012 up475 media effects methods 2012 up
475 media effects methods 2012 up
mpeffl
 
Chapter 3 part1-Design of Experiments
Chapter 3 part1-Design of ExperimentsChapter 3 part1-Design of Experiments
Chapter 3 part1-Design of Experiments
nszakir
 

Ähnlich wie Gelman psych crisis_2 (20)

Chapter 1 (thinking critically)
Chapter 1 (thinking critically)Chapter 1 (thinking critically)
Chapter 1 (thinking critically)
 
009906275.pdf
009906275.pdf009906275.pdf
009906275.pdf
 
Regression shrinkage: better answers to causal questions
Regression shrinkage: better answers to causal questionsRegression shrinkage: better answers to causal questions
Regression shrinkage: better answers to causal questions
 
Critical Analytical ThinkingPart II Heuristics and Bias.docx
Critical Analytical ThinkingPart II Heuristics and Bias.docxCritical Analytical ThinkingPart II Heuristics and Bias.docx
Critical Analytical ThinkingPart II Heuristics and Bias.docx
 
Security Is Like An Onion, That's Why It Makes You Cry
Security Is Like An Onion, That's Why It Makes You CrySecurity Is Like An Onion, That's Why It Makes You Cry
Security Is Like An Onion, That's Why It Makes You Cry
 
Causal modelling - Series of lectures on causal modelling in the social sciences
Causal modelling - Series of lectures on causal modelling in the social sciencesCausal modelling - Series of lectures on causal modelling in the social sciences
Causal modelling - Series of lectures on causal modelling in the social sciences
 
Empirische wetenschap onder vuur
Empirische wetenschap onder vuurEmpirische wetenschap onder vuur
Empirische wetenschap onder vuur
 
Quantitative Methods: Experimental Design
Quantitative Methods: Experimental DesignQuantitative Methods: Experimental Design
Quantitative Methods: Experimental Design
 
StudyDesign.ppt
StudyDesign.pptStudyDesign.ppt
StudyDesign.ppt
 
Big five
Big fiveBig five
Big five
 
475 media effects methods 2012 up
475 media effects methods 2012 up475 media effects methods 2012 up
475 media effects methods 2012 up
 
Insights from psychology on lack of reproducibility
Insights from psychology on lack of reproducibilityInsights from psychology on lack of reproducibility
Insights from psychology on lack of reproducibility
 
Virtues and vices of causal modelling. A primer for the next generation of sc...
Virtues and vices of causal modelling. A primer for the next generation of sc...Virtues and vices of causal modelling. A primer for the next generation of sc...
Virtues and vices of causal modelling. A primer for the next generation of sc...
 
SAMPLE_AND_OTHER.ppt
SAMPLE_AND_OTHER.pptSAMPLE_AND_OTHER.ppt
SAMPLE_AND_OTHER.ppt
 
Chapter 3 part1-Design of Experiments
Chapter 3 part1-Design of ExperimentsChapter 3 part1-Design of Experiments
Chapter 3 part1-Design of Experiments
 
Psychics
PsychicsPsychics
Psychics
 
Research methods wccc 9 14-15
Research methods wccc 9 14-15Research methods wccc 9 14-15
Research methods wccc 9 14-15
 
Bergman Psych- ch 01
Bergman Psych- ch 01Bergman Psych- ch 01
Bergman Psych- ch 01
 
Unit 2
Unit 2Unit 2
Unit 2
 
Power, Effect Sizes, Confidence Intervals, & Academic Integrity
Power, Effect Sizes, Confidence Intervals, & Academic IntegrityPower, Effect Sizes, Confidence Intervals, & Academic Integrity
Power, Effect Sizes, Confidence Intervals, & Academic Integrity
 

Mehr von jemille6

What are questionable research practices?
What are questionable research practices?What are questionable research practices?
What are questionable research practices?
jemille6
 
What's the question?
What's the question? What's the question?
What's the question?
jemille6
 
On Severity, the Weight of Evidence, and the Relationship Between the Two
On Severity, the Weight of Evidence, and the Relationship Between the TwoOn Severity, the Weight of Evidence, and the Relationship Between the Two
On Severity, the Weight of Evidence, and the Relationship Between the Two
jemille6
 
Comparing Frequentists and Bayesian Control of Multiple Testing
Comparing Frequentists and Bayesian Control of Multiple TestingComparing Frequentists and Bayesian Control of Multiple Testing
Comparing Frequentists and Bayesian Control of Multiple Testing
jemille6
 
Good Data Dredging
Good Data DredgingGood Data Dredging
Good Data Dredging
jemille6
 
The Duality of Parameters and the Duality of Probability
The Duality of Parameters and the Duality of ProbabilityThe Duality of Parameters and the Duality of Probability
The Duality of Parameters and the Duality of Probability
jemille6
 
Error Control and Severity
Error Control and SeverityError Control and Severity
Error Control and Severity
jemille6
 
The Statistics Wars and Their Causalities (refs)
The Statistics Wars and Their Causalities (refs)The Statistics Wars and Their Causalities (refs)
The Statistics Wars and Their Causalities (refs)
jemille6
 
The Statistics Wars and Their Casualties (w/refs)
The Statistics Wars and Their Casualties (w/refs)The Statistics Wars and Their Casualties (w/refs)
The Statistics Wars and Their Casualties (w/refs)
jemille6
 
The role of background assumptions in severity appraisal (
The role of background assumptions in severity appraisal (The role of background assumptions in severity appraisal (
The role of background assumptions in severity appraisal (
jemille6
 

Mehr von jemille6 (20)

“The importance of philosophy of science for statistical science and vice versa”
“The importance of philosophy of science for statistical science and vice versa”“The importance of philosophy of science for statistical science and vice versa”
“The importance of philosophy of science for statistical science and vice versa”
 
Statistical Inference as Severe Testing: Beyond Performance and Probabilism
Statistical Inference as Severe Testing: Beyond Performance and ProbabilismStatistical Inference as Severe Testing: Beyond Performance and Probabilism
Statistical Inference as Severe Testing: Beyond Performance and Probabilism
 
D. Mayo JSM slides v2.pdf
D. Mayo JSM slides v2.pdfD. Mayo JSM slides v2.pdf
D. Mayo JSM slides v2.pdf
 
reid-postJSM-DRC.pdf
reid-postJSM-DRC.pdfreid-postJSM-DRC.pdf
reid-postJSM-DRC.pdf
 
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
 
Causal inference is not statistical inference
Causal inference is not statistical inferenceCausal inference is not statistical inference
Causal inference is not statistical inference
 
What are questionable research practices?
What are questionable research practices?What are questionable research practices?
What are questionable research practices?
 
What's the question?
What's the question? What's the question?
What's the question?
 
The neglected importance of complexity in statistics and Metascience
The neglected importance of complexity in statistics and MetascienceThe neglected importance of complexity in statistics and Metascience
The neglected importance of complexity in statistics and Metascience
 
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
 
On Severity, the Weight of Evidence, and the Relationship Between the Two
On Severity, the Weight of Evidence, and the Relationship Between the TwoOn Severity, the Weight of Evidence, and the Relationship Between the Two
On Severity, the Weight of Evidence, and the Relationship Between the Two
 
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
 
Comparing Frequentists and Bayesian Control of Multiple Testing
Comparing Frequentists and Bayesian Control of Multiple TestingComparing Frequentists and Bayesian Control of Multiple Testing
Comparing Frequentists and Bayesian Control of Multiple Testing
 
Good Data Dredging
Good Data DredgingGood Data Dredging
Good Data Dredging
 
The Duality of Parameters and the Duality of Probability
The Duality of Parameters and the Duality of ProbabilityThe Duality of Parameters and the Duality of Probability
The Duality of Parameters and the Duality of Probability
 
Error Control and Severity
Error Control and SeverityError Control and Severity
Error Control and Severity
 
The Statistics Wars and Their Causalities (refs)
The Statistics Wars and Their Causalities (refs)The Statistics Wars and Their Causalities (refs)
The Statistics Wars and Their Causalities (refs)
 
The Statistics Wars and Their Casualties (w/refs)
The Statistics Wars and Their Casualties (w/refs)The Statistics Wars and Their Casualties (w/refs)
The Statistics Wars and Their Casualties (w/refs)
 
On the interpretation of the mathematical characteristics of statistical test...
On the interpretation of the mathematical characteristics of statistical test...On the interpretation of the mathematical characteristics of statistical test...
On the interpretation of the mathematical characteristics of statistical test...
 
The role of background assumptions in severity appraisal (
The role of background assumptions in severity appraisal (The role of background assumptions in severity appraisal (
The role of background assumptions in severity appraisal (
 

Kürzlich hochgeladen

1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
QucHHunhnh
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
QucHHunhnh
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
kauryashika82
 

Kürzlich hochgeladen (20)

INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writing
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
PROCESS RECORDING FORMAT.docx
PROCESS      RECORDING        FORMAT.docxPROCESS      RECORDING        FORMAT.docx
PROCESS RECORDING FORMAT.docx
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
Advance Mobile Application Development class 07
Advance Mobile Application Development class 07Advance Mobile Application Development class 07
Advance Mobile Application Development class 07
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
 

Gelman psych crisis_2

  • 1. 50 shades of gray: A research story Brian Nosek, Jeffrey Spies, and Matt Motyl: Participants from the political left, right and center (N = 1,979) completed a perceptual judgment task in which words were presented in different shades of gray . . . The results were stunning. Moderates perceived the shades of gray more accurately than extremists on the left and right (p = .01). They continue: Our design and follow-up analyses ruled out obvious alternative explanations such as time spent on task and a tendency to select extreme responses. Enthused about the result, we identified Psychological Science as our fall back journal after we toured the Science, Nature, and PNAS rejection mills . . .
  • 2. The preregistered replication Nosek, Spies, and Motyl: We conducted a direct replication while we prepared the manuscript. We ran 1,300 participants, giving us .995 power to detect an effect of the original effect size at alpha = .05. The result: The effect vanished (p = .59). 2/45
  • 7. The famous study of social priming 7/45
  • 9. Daniel Kahneman (2011): “When I describe priming studies to audiences, the reaction is often disbelief . . . The idea you should focus on, however, is that disbelief is not an option. The results are not made up, nor are they statistical flukes. You have no choice but to accept that the major conclusions of these studies are true.”
  • 10. 10/45
  • 12. Daniel Kahneman (2011): “When I describe priming studies to audiences, the reaction is often disbelief . . . The idea you should focus on, however, is that disbelief is not an option. The results are not made up, nor are they statistical flukes. You have no choice but to accept that the major conclusions of these studies are true.” Wagenmakers et al. (2014): “[After] a long series of failed replications . . . disbelief does in fact remain an option.” 12/45
  • 13. Alan Turing (1950): “I assume that the reader is familiar with the idea of extra-sensory perception, and the meaning of the four items of it, viz. telepathy, clairvoyance, precognition and psycho-kinesis. These disturbing phenomena seem to deny all our usual scientific ideas. How we should like to discredit them! Unfortunately the statistical evidence, at least for telepathy, is overwhelming.”
  • 14. 14/45
  • 15. This week in Psychological Science “Turning Body and Self Inside Out: Visualized Heartbeats Alter Bodily Self-Consciousness and Tactile Perception” “Aging 5 Years in 5 Minutes: The Effect of Taking a Memory Test on Older Adults’ Subjective Age” “The Double-Edged Sword of Grandiose Narcissism: Implications for Successful and Unsuccessful Leadership Among U.S. Presidents” “On the Nature and Nurture of Intelligence and Specific Cognitive Abilities: The More Heritable, the More Culture Dependent” “Beauty at the Ballot Box: Disease Threats Predict Preferences for Physically Attractive Leaders” “Shaping Attention With Reward: Effects of Reward on Space- and Object-Based Selection” “It Pays to Be Herr Kaiser: Germans With Noble-Sounding Surnames More Often Work as Managers Than as Employees”
  • 16. This week in Psychological Science N = 17 N = 57 N = 42 N = 7,582 N = 123 + 156 + 66 N = 47 N = 222,924
  • 17. 17/45
  • 18. The “That which does not destroy my statistical significance makes it stronger” fallacy Charles Murray: “To me, the experience of early childhood intervention programs follows the familiar, discouraging pattern . . . small-scale experimental efforts [N = 123 and N = 111] staffed by highly motivated people show effects. When they are subject to well-designed large-scale replications, those promising signs attenuate and often evaporate altogether.” James Heckman: “The effects reported for the programs I discuss survive batteries of rigorous testing procedures. They are conducted by independent analysts who did not perform or design the original experiments. The fact that samples are small works against finding any effects for the programs, much less the statistically significant and substantial effects that have been found.”
  • 19. What’s going on? The paradigm of routine discovery The garden of forking paths The “law of small numbers” fallacy The “That which does not destroy my statistical significance makes it stronger” fallacy Correlation does not even imply correlation 19/45
  • 20. Why is psychology particularly difficult? Indirect and noisy measurement Human variation Noncompliance and missing data Experimental subjects trying to figure out what you’re doing 20/45
  • 21. What to do? Look at everything Interactions Multilevel modeling Within-person studies Design analysis Bayesian inference
  • 22. 22/45
  • 23. Living in the multiverse 23/45
  • 24. Choices! 1. Exclusion criteria based on cycle length (3 options) 2. Exclusion criteria based on “How sure are you?” response (2) 3. Cycle day assessment (3) 4. Fertility assessment (4) 5. Relationship status assessment (3) 168 possibilities (after excluding some contradictory combinations) 24/45
  • 25. Living in the multiverse 25/45
  • 26. Living in the multiverse 26/45
  • 27. 27/45
  • 28. 28/45
  • 29. 29/45
  • 30. Interactions and the freshman fallacy From an email I received: 30/45
  • 31. Why it’s hard to study comparisons and interactions Standard error for a proportion: 0.5/ √ n Standard error for a comparison: 0.52/n 2 + 0.52/n 2 = 1/ √ n Twice the standard error . . . and the effect is probably smaller! 31/45
  • 32. 32/45
  • 34. 34/45
  • 35. 35/45
  • 36. Power Design analysis I’ve never made a type 1 error in my life I’ve never made a type 2 error in my life I make Type S (sign) errors I make Type M (magnitude) errors 36/45
  • 37. What can we learn from statistical significance? 37/45
  • 38. This is what "power = 0.06" looks like. Get used to it. Estimated effect size −30 −20 −10 0 10 20 30 True effect size (assumed)Type S error probability: If the estimate is statistically significant, it has a 24% chance of having the wrong sign. Exaggeration ratio: If the estimate is statistically significant, it must be at least 9 times higher than the true effect size. 38/45
  • 39. The paradox of publication 39/45
  • 40. 40/45
  • 41. 41/45
  • 42. Let us have the serenity to embrace the variation that we cannot reduce, the courage to reduce the variation we cannot embrace, and the wisdom to distinguish one from the other. 42/45
  • 43. The Statistical Crisis in Science Andrew Gelman, John Carlin, Eric Loken, Francis Tuerlinckx, Sara Steegen, Wolf Vanpaemel Department of Statistics and Department of Political Science Columbia University, New York Department of Psychology, Harvard University, 29 Jan 2015 43/45