8. Sampling & case selection challenges
• Population Size
• Sampling Bias
  • probability of selection correlated with IV; will get the same relationship,
    but there is systematic non-representativeness
• Selection Bias
  • subset of sampling bias; probability of selection correlated with DV
  • underestimates the relationship (regression line b instead of a)
• Non-response Bias
  • possibility that you are unable to collect data; data set is unrepresentative
[Figure: two plots of y against x with regression lines a and b, marking which parts of the population ("pop") each sample "gets" and "misses"]
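The contrast between selecting on the IV and selecting on the DV can be checked with a small simulation. The sketch below is illustrative only: the population (y = 2x + noise), the cutoffs at zero, and the helper names `ols_slope` and `slope_of` are assumptions made for the example, not part of the slides.

```python
import random

def ols_slope(xs, ys):
    """Ordinary least squares slope of y regressed on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def slope_of(cases):
    """Slope for a list of (x, y) cases."""
    xs, ys = zip(*cases)
    return ols_slope(xs, ys)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population in which the true slope is 2.
population = []
for _ in range(20000):
    x = rng.gauss(0, 1)
    y = 2.0 * x + rng.gauss(0, 1)
    population.append((x, y))

slope_full = slope_of(population)
# Selection correlated with the IV: keep only cases with x > 0.
slope_on_iv = slope_of([c for c in population if c[0] > 0])
# Selection correlated with the DV: keep only cases with y > 0.
slope_on_dv = slope_of([c for c in population if c[1] > 0])

print(f"full population: {slope_full:.2f}")
print(f"selected on IV:  {slope_on_iv:.2f}")
print(f"selected on DV:  {slope_on_dv:.2f}")
```

Under these assumptions the IV-selected sample should recover roughly the same slope as the full population, while the DV-selected sample should give a flatter line, which corresponds to line b rather than line a in the figure.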
9. Causal inference for small-N research
• properties of small-N research
• case study purposes & types
• strategies
10. Case selection
• For quantitative research, selection should be random
• For qualitative research, selection often must be done intentionally (King,
Keohane and Verba, 1994).
12. Properties of small-N research
• intensive
• field research in natural settings
• many kinds of data: observation, interview, archives
• typically: case-centered, not variable centered
14. Case studies and research design
from Gerring and McDermott (2007)
15. Gerring on case studies
                            Case Study      Cross-Case Study
Research Goals
1. Hypothesis               Generating      Testing
2. Validity                 Internal        External
3. Causal Insight           Mechanisms      Effects
4. Scope of Proposition     Deep            Broad
Empirical Factors
5. Populations of Cases     Heterogeneous   Homogenous
6. Causal Strength          Strong          Weak
7. Useful Variation         Rare            Common
8. Data Availability        Concentrated    Dispersed
Additional Factors
1. Causal Complexity        ?               ?
2. State of the Field       ?               ?
18. Case study purposes & types: case selection as sampling
1. Descriptive Case Study: atheoretical; goal is to understand the case itself
2. Plausibility Probe: does the empirical phenomenon exist? Focus on availability of data;
concern with plausibility of finding relationships between variables of interest
3. Hypothesis-Generating Case Study: seeks to find a generalization about cause and
effect
4. Hypothesis-Testing Case Studies
4.1. Critical Case
4.2. Rival Hypotheses
4.3. ....
20. Extreme cases
• Represent unusual values of the dependent or independent variables
• Used for hypothesis generation
• Not intended to be representative
21. Deviant cases
• Cases that deviate from the typical population
• A “high residual” case (outlier)
• Useful for generating hypotheses, especially new explanations for the outcome
(dependent variable) of interest
22. Hypothesis-Testing Strategies: case selection
1. goal: establish the relationship between two or more variables
2. selection advice:
2.1. choose cases that minimize variability in the other variables that might
impact the relationship you are investigating
2.2. representative sample
24. Selecting the typical case
• Look for cases that are “typical” of other cases
• Idea is that these cases are “low residual” cases
• Useful for hypothesis testing.
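The "low residual" and "high residual" language can be made concrete by fitting a line across all cases and ranking them by distance from it. The sketch below is a minimal illustration under assumed data: the case labels A to E, their (x, y) values, and the helper `fit_line` are hypothetical.

```python
def fit_line(points):
    """Return (intercept, slope) of the least squares line through (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical cases: name -> (independent variable, dependent variable)
cases = {
    "A": (1.0, 2.1),
    "B": (2.0, 3.9),
    "C": (3.0, 6.2),
    "D": (4.0, 12.0),
    "E": (5.0, 9.8),
}

intercept, slope = fit_line(list(cases.values()))

# Residual = observed y minus the y predicted by the cross-case line.
residuals = {
    name: y - (intercept + slope * x) for name, (x, y) in cases.items()
}

typical_case = min(residuals, key=lambda name: abs(residuals[name]))
deviant_case = max(residuals, key=lambda name: abs(residuals[name]))

print("typical (lowest |residual|):", typical_case)
print("deviant (highest |residual|):", deviant_case)
```

The ranking depends entirely on the model used to define "typical", so a different specification could point to different cases.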
25. Select diverse cases
• Select cases that represent the full range of variation
• Useful for hypothesis generation and hypothesis testing
• Represent variation in the population but not necessarily the distribution of that
population
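One simple way to operationalize "range, not distribution" is to pick cases at evenly spaced positions along the ordered values of a variable. This is only a sketch of one possible rule: the function `diverse_cases` and the scores below are hypothetical.

```python
def diverse_cases(scores, k=3):
    """Pick k case names spread evenly across the ordered range of scores."""
    ordered = sorted(scores, key=scores.get)
    if k < 2 or k > len(ordered):
        raise ValueError("k must be between 2 and the number of cases")
    last = len(ordered) - 1
    # Evenly spaced ranks from the lowest-scoring to the highest-scoring case.
    positions = [round(i * last / (k - 1)) for i in range(k)]
    return [ordered[p] for p in positions]

# Hypothetical scores on the variable of interest; most cases cluster at low values.
scores = {"A": 0.10, "B": 0.12, "C": 0.15, "D": 0.18, "E": 0.20, "F": 0.55, "G": 0.90}

chosen = diverse_cases(scores, k=3)
print(chosen)
```

In this made-up data, five of the seven cases score at or below 0.20, yet one of the three selected cases sits at 0.90. The selection covers the range of variation without mirroring how the population is distributed.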
26. Influential case
• Cases with influential configurations of the independent variables are chosen
• Useful for verifying the status of a highly influential case
• Not necessarily representative
27. Crucial case
• Cases that are likely to represent an outcome of interest
• Choice usually requires qualitative assessment of crucialness
• Useful for hypothesis testing
• Should be highly representative
28. Selecting cases on the Independent Variable
• You select cases based on the values of one or more independent variables
• Requires that you know a little bit about all of the potential cases
• Requires that you act as if you don’t know the values of the dependent variable
30. Most Similar cases
• Cases are selected based on their similarity on variables other than the
independent variable the hypothesis is testing and the outcome of interest
• Useful for hypothesis testing and generation
• Not necessarily representative of the broader population
• Most Similar Systems analysis involves a non-equivalent group design:
N O X O
N O   O
31. Thad’s example: income inequality and civil war
[Diagram of variables: Income Inequality, Poverty, Colonial Past, External Threat, Civil War]
32.
Case          Income Inequality   Poverty   Colonial Past   External Threat   Civil War?
Costa Rica    Moderate            Yes       Yup             Nope              No
El Salvador   High                Yes       Yup             Nope              Yes
Cuba          High                Yes       Yup             Nope              Yes
adapted from Thad Kousser, UCSD
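The logic of the most-similar comparison in this table can be sketched in a few lines: variables that take the same value in every case are held constant, and a variable that varies in step with the outcome remains a candidate explanation. The values below are copied from the table; the function `classify_variables` and the key names are hypothetical.

```python
cases = {
    "Costa Rica": {"income_inequality": "Moderate", "poverty": "Yes",
                   "colonial_past": "Yup", "external_threat": "Nope",
                   "civil_war": "No"},
    "El Salvador": {"income_inequality": "High", "poverty": "Yes",
                    "colonial_past": "Yup", "external_threat": "Nope",
                    "civil_war": "Yes"},
    "Cuba": {"income_inequality": "High", "poverty": "Yes",
             "colonial_past": "Yup", "external_threat": "Nope",
             "civil_war": "Yes"},
}

def classify_variables(cases, outcome):
    """Split explanatory variables into those held constant and those that
    vary consistently with the outcome across the selected cases."""
    rows = list(cases.values())
    variables = [name for name in rows[0] if name != outcome]
    held_constant, candidates = [], []
    for name in variables:
        if len({row[name] for row in rows}) == 1:
            held_constant.append(name)
            continue
        # A varying variable stays a candidate only if each of its values
        # always appears with the same outcome value.
        seen = {}
        consistent = all(
            seen.setdefault(row[name], row[outcome]) == row[outcome]
            for row in rows
        )
        if consistent:
            candidates.append(name)
    return held_constant, candidates

held_constant, candidates = classify_variables(cases, "civil_war")
print("held constant:", held_constant)
print("candidate explanations:", candidates)
```

With three cases this is a consistency check, not a statistical test; it shows only that income inequality is the one listed variable that differs between the cases with and without civil war.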
34. Case study challenges
• Motive behind the selection of case studies is not obvious (Is it convenience? Or is
it because they make good stories?). Without understanding this, the project is at best
useless and at worst terribly misleading.
• Generalizability – Can the lessons learned from this case be applied to a larger
class?
• Falsifiability – Results are presented in such a way that it would be difficult for an
impartial researcher to replicate the project and arrive at the same result.
• No or Negative Degrees of Freedom: The researcher has more explanatory
variables (moving pieces) than observations.
• Selection on the Dependent Variable: Choosing cases because of their
performance on the outcome of interest.
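The degrees-of-freedom problem can be stated as simple arithmetic. The sketch below uses the usual linear-model convention (observations minus estimated parameters, counting an intercept); the function name and the numbers in the usage line are illustrative assumptions.

```python
def residual_degrees_of_freedom(n_cases, n_explanatory, intercept=True):
    """Observations left over after estimating one parameter per
    explanatory variable, plus an intercept if one is included."""
    return n_cases - n_explanatory - (1 if intercept else 0)

# Illustrative: three cases and four explanatory variables.
df_small_n = residual_degrees_of_freedom(n_cases=3, n_explanatory=4)
# Illustrative: a large-N design with the same four variables.
df_large_n = residual_degrees_of_freedom(n_cases=150, n_explanatory=4)

print(df_small_n, df_large_n)
```

A zero or negative value means the cases alone cannot separate the influence of the competing explanatory variables, which is one reason the strategies on later slides try to add observations or hold variables constant.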
35. Strategies: remember threats to internal & external validity!
• History, maturation, instrumentation (data limitations)
• Selection bias
• KKV give the example of a business school student who wants a high-paying job and
selects for his study sample only those graduates earning high salaries. He then
relates salary to number of accounting courses. By excluding graduates with low
salaries, he paradoxically underestimates the effect of additional accounting
courses on income.
38. Strategies: combining with large-N
1. Goal: Increase number of observations
1.1. Comparative case with large-N analysis of embedded units
2. Goal: Study causal mechanisms
2.1. Large-N study establishes relationships between variables (causal effect)
2.2. Small-N study establishes causal mechanism, looking at intervening steps (causal mechanism)
2.3. Note: causal explanation requires an understanding of both the causal effect and the causal
mechanism
3. Goal: Study of spuriousness
3.1. Large-N study establishes relationships between variables (causal effect)
3.2. Small-N study engages claims of spuriousness
4. Goal: Study of deviant cases
4.1. Large-N study establishes deviant cases
4.2. Small-N study examines deviant cases
5. Goal: Establish generality of findings
5.1. Small-N study suggests X causes Y, but lacks external validity
5.2. Large-N study looks to establish the generality of findings
39. Strategies: increasing leverage for causal inference in case studies
1. Congruence Method: Test a hypothesis by understanding a case; looks for fit between
theory and case; involves multiple independent variables
2. Pattern Matching: Type of congruence testing, usually focused on a single
independent variable; compares alternative theories with respect to multiple outcomes
3. Process Tracing: Focus is on establishing the causal mechanism, by examining fit of
theory to intervening causal steps; how does “X” produce a series of conditions that
come together in some way (or don’t) to produce “Y”?
4. Counterfactual Analysis: Gain leverage through rigorous, disciplined thought
experiments
40. Strategies: structured, focused comparison
1. “the comparison is focused because it deals
selectively with only certain aspects of a historical
case... and structured because it employs general
questions to guide the data collection and analysis in that
historical case” - Alexander George
2. Steps (Kaarbo and Beasley)
2.1. Identify the research question
2.2. Identify variables (usually from existing theory)
2.3. Select cases: comparable cases with variation in
the values of the dependent variable, selected
from across population subgroups (aids external
validity)
2.4. Define and specify your measurement strategy for
concepts, including a “codebook” for the
questions you employ in data collection
2.5. “Code-write cases”
2.6. Comparison (search for patterns) and implications
for theory