Sampling and case selection

Course: Honors Thesis Capstone
Department: Government
Date: Fall 2010
Professor: Michael Nelson

This time...
empirical methods
sampling
small-N causal inference

sampling
probability sampling
non-probability sampling
sampling “challenges”

Groups in Sampling

The Theoretical Population

The Study Population

The Sampling Frame

The Sample

probability sampling from Henry

general sampling strategies from Patton

sampling & case selection challenges

• Population Size
• Sampling Bias
  • probability of selection correlated with IV; will get the same relationship, but there is systematic non-representativeness
• Selection Bias
  • subset of sampling bias; probability of selection correlated with DV
  • underestimates the relationship (regression line b instead of a)
• Non-response Bias
  • possibility that you are unable to collect data; data set is unrepresentative

[Figure: plots of y against x showing the population (pop), the cases a sample gets and misses, and regression lines a and b]
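The contrast between selecting on the IV and selecting on the DV can be checked with a small simulation. This is a minimal sketch under assumed conditions: the population, the true slope of 2, and both cutoffs are hypothetical choices made for illustration, not values from the slides.

```python
import numpy as np

def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    return np.polyfit(x, y, 1)[0]

rng = np.random.default_rng(0)

# Hypothetical population: y depends linearly on x (true slope 2) plus noise.
n = 20_000
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(scale=2.0, size=n)

slope_pop = ols_slope(x, y)

# Selection correlated with the IV: keep only units with high x.
keep_iv = x > 0.5
slope_iv = ols_slope(x[keep_iv], y[keep_iv])

# Selection correlated with the DV: keep only units with high y.
keep_dv = y > 1.0
slope_dv = ols_slope(x[keep_dv], y[keep_dv])

print(f"population slope:        {slope_pop:.2f}")
print(f"selected on IV (x>0.5):  {slope_iv:.2f}")
print(f"selected on DV (y>1.0):  {slope_dv:.2f}")
```

Under these assumptions the sample selected on x is unrepresentative but recovers roughly the population slope, while the sample selected on y yields a visibly flatter line, which is the "b instead of a" pattern.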

Causal inference for small-N research
properties of small-N research
case study purposes & types
strategies

Case selection

• For quantitative research, selection should be random

• For qualitative research, selection often must be done intentionally (King,
Keohane and Verba, 1994).

properties of small-n research

• intensive
• field research in natural settings
• many kinds of data: observation, interview, archives
• typically: case-centered, not variable centered

Case studies and research design
from Gerring and McDermott (2007)

Gerring on case studies
                           Case Study      Cross-Case Study
Research Goals
1. Hypothesis              Generating      Testing
2. Validity                Internal        External
3. Causal Insight          Mechanisms      Effects
4. Scope of Proposition    Deep            Broad
Empirical Factors
5. Populations of Cases    Heterogeneous   Homogeneous
6. Causal Strength         Strong          Weak
7. Useful Variation        Rare            Common
8. Data Availability       Concentrated    Dispersed
Additional Factors
1. Causal Complexity       ?               ?
2. State of the Field      ?               ?

Case study purposes & types:
case selection as sampling

1. Descriptive Case Study: atheoretical; goal is to understand the case itself
2. Plausibility Probe: asks whether the empirical phenomenon exists; focus on availability of data;
concern with plausibility of finding relationships between variables of interest
3. Hypothesis-Generating Case Study: seeks to find a generalization about cause and
effect
4. Hypothesis-Testing Case Studies
4.1. Critical Case
4.2. Rival Hypotheses
4.3. ....

Extreme cases

• Represent unusual values of
the dependent or independent
variables

• Used for hypothesis generation

• Not intended to be
representative

Deviant cases

• Cases that deviate from the
typical population

• A “high residual” case (outlier)

• Useful for generating
hypotheses, especially new
explanations for the outcome
(dependent variable) of interest
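The "high residual" idea can be made concrete by fitting a cross-case regression and ranking cases by how far they fall from the fitted line. A minimal sketch, assuming a single independent variable and ordinary least squares; the case names and data are hypothetical.

```python
import numpy as np

def rank_by_residual(names, x, y):
    """Fit y = a + b*x by least squares and return (name, residual) pairs,
    sorted from largest to smallest absolute residual."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (intercept + slope * x)
    order = np.argsort(-np.abs(residuals))
    return [(names[i], float(residuals[i])) for i in order]

# Hypothetical cases: most follow a rising trend, case E does not.
cases = ["A", "B", "C", "D", "E", "F"]
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 8.0, 3.0, 12.1]

ranked = rank_by_residual(cases, x, y)
deviant = ranked[0][0]
print(deviant)  # -> E
```

The case at the top of the ranking is the candidate deviant case: the cross-case model explains it worst, so studying it closely may suggest a new explanation for the outcome.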

Hypothesis-Testing Strategies: case selection

1. goal: establish the relationship between two or more variables

2. selection advice:

2.1. choose cases that minimize variability in the other variables that might
impact the relationship you are investigating

2.2. representative sample

hypothesis-testing case studies

critical case

rival hypotheses

Selecting the typical case

• Look for cases that are
“typical” of other cases

• Idea is that these cases are
“low residual” cases

• Useful for hypothesis testing.

Select diverse cases

• Select cases that represent
the full range of variation

• Useful for hypothesis
generation and hypothesis
testing

• Represent variation in the
population but not necessarily
the distribution of that
population
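One way to operationalize diverse-case selection on a single continuous variable is to pick the cases closest to evenly spaced points between the observed minimum and maximum. This is a sketch of that one heuristic only; the function name, case names, and values are hypothetical.

```python
import numpy as np

def pick_diverse(names, values, k=3):
    """Pick k distinct cases whose values lie closest to k evenly spaced
    targets between the minimum and maximum of `values`."""
    values = np.asarray(values, dtype=float)
    targets = np.linspace(values.min(), values.max(), k)
    chosen = []
    for target in targets:
        # Walk the cases from nearest to farthest and take the first unused one.
        for i in np.argsort(np.abs(values - target)):
            if names[i] not in chosen:
                chosen.append(names[i])
                break
    return chosen

# Hypothetical population: most cases cluster at low values.
names = ["A", "B", "C", "D", "E", "F", "G", "H"]
values = [0.10, 0.12, 0.15, 0.11, 0.50, 0.14, 0.90, 0.13]

selected = pick_diverse(names, values, k=3)
print(selected)  # -> ['A', 'E', 'G']
```

Six of the eight hypothetical cases sit near the bottom of the range, yet only one of the three selected cases does. The selection covers the variation without mirroring the distribution.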

Influential case

• Cases with influential
configurations of the
independent variables are
chosen

• Useful for verifying the status of
a highly influential case

• Not necessarily representative

Crucial case

• Cases that are likely to represent an outcome of interest

• Choice usually requires qualitative assessment of crucialness

• Useful for hypothesis testing

• Should be highly representative

Selecting cases on the Independent Variable

• You select cases based on the values of an independent variable(s)

• Requires that you know a little bit about all of the potential cases

• Requires you act as if you don’t know the values of the dependent variable

Mill’s Methods

agreement

difference

Most Similar cases

• Cases are selected based on their similarity on variables other than the
independent variable the hypothesis is testing and the outcome of interest

• Useful for hypothesis testing and generation

• Not necessarily representative of the broader population

• Most Similar Systems analysis involves a non-equivalent group design:

N  O  X  O
N  O     O

Thad’s example: income inequality and civil war

[Diagram: Income Inequality, Poverty, Colonial Past, and External Threat in relation to Civil War]

Case          Income Inequality   Poverty   Colonial Past   External Threat   Civil War?
Costa Rica    Moderate            Yes       Yup             Nope              No
El Salvador   High                Yes       Yup             Nope              Yes
Cuba          High                Yes       Yup             Nope              Yes

adapted from Thad Kousser, UCSD
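The most-similar logic in this example can be written as a search for pairs of cases that match on every control variable but differ on both the independent variable and the outcome. A minimal sketch using the three cases from the table; the function and field names are hypothetical.

```python
from itertools import combinations

# Values copied from the table above.
cases = {
    "Costa Rica":  {"income_inequality": "Moderate", "poverty": "Yes",
                    "colonial_past": "Yup", "external_threat": "Nope",
                    "civil_war": "No"},
    "El Salvador": {"income_inequality": "High", "poverty": "Yes",
                    "colonial_past": "Yup", "external_threat": "Nope",
                    "civil_war": "Yes"},
    "Cuba":        {"income_inequality": "High", "poverty": "Yes",
                    "colonial_past": "Yup", "external_threat": "Nope",
                    "civil_war": "Yes"},
}

def most_similar_pairs(cases, iv, dv, controls):
    """Return pairs of cases identical on all controls but different
    on both the independent variable and the dependent variable."""
    pairs = []
    for (a, row_a), (b, row_b) in combinations(cases.items(), 2):
        same_controls = all(row_a[c] == row_b[c] for c in controls)
        if same_controls and row_a[iv] != row_b[iv] and row_a[dv] != row_b[dv]:
            pairs.append((a, b))
    return pairs

pairs = most_similar_pairs(
    cases,
    iv="income_inequality",
    dv="civil_war",
    controls=["poverty", "colonial_past", "external_threat"],
)
print(pairs)  # -> [('Costa Rica', 'El Salvador'), ('Costa Rica', 'Cuba')]
```

El Salvador and Cuba do not form a pair because they share the same level of income inequality, so comparing them gives no leverage on the hypothesis.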

Case study challenges

• Motive behind the selection of case studies is not obvious (Is it convenience? Or is
it because they are good stories?). Without understanding this, the project is at best
useless and at worst terribly misleading.
• Generalizability – Can the lessons learned from this case be applied to a larger
class?
• Falsifiability – Results are presented in such a way that it would be difficult for an
impartial researcher to replicate the project and arrive at the same result.
• No or Negative Degrees of Freedom: The researcher has more explanatory
variables (moving pieces) than observations.
• Selection on the Dependent Variable: Choosing cases because of their
performance on the outcome of interest.
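The degrees-of-freedom problem listed above can be seen numerically: with more explanatory variables than cases, a linear model reproduces any set of outcomes exactly, so a perfect fit tells you nothing. A minimal sketch with hypothetical random data (3 cases, 4 explanatory variables).

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical study: 3 cases, 4 explanatory variables plus an intercept.
n_cases, n_vars = 3, 4
X = np.column_stack([np.ones(n_cases), rng.normal(size=(n_cases, n_vars))])

# Degrees of freedom: observations minus estimated coefficients.
dof = n_cases - X.shape[1]
print(dof)  # -> -2

# Two unrelated, randomly generated outcome vectors.
outcomes = [rng.normal(size=n_cases), rng.normal(size=n_cases)]

perfect_fits = []
for y in outcomes:
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    # Check whether the fitted values reproduce the outcomes exactly.
    perfect_fits.append(bool(np.allclose(X @ beta, y)))

print(perfect_fits)
```

Both outcome vectors are fitted perfectly even though they are pure noise, which is why additional observations (or fewer moving pieces) are needed before the cases can discriminate between explanations.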

Strategies: remember threats to internal & external validity!

• History, maturation, instrumentation (data limitations)
• Selection bias
• KKV give the example of a business school student who wants a high-paying job and
selects for his study sample only those graduates earning high salaries. He then
relates salary to number of accounting courses. By excluding graduates with low
salaries, he paradoxically underestimates the effect of additional accounting
courses on income.

Strategies: combining with large-N
1. Goal: Increase number of observations
1.1. Comparative case with large-N analysis of embedded units
2. Goal: Study causal mechanisms
2.1. Large-N study establishes relationships between variables (causal effect)
2.2. Small-N study establishes causal mechanism, looking at intervening steps (causal mechanism)
2.3. Note: causal explanation requires an understanding of both the causal effect and the causal
mechanism
3. Goal: Study of spuriousness
3.1. Large-N study establishes relationships between variables (causal effect)
3.2. Small-N study engages claims of spuriousness
4. Goal: Study of deviant cases
4.1. Large-N study establishes deviant cases
4.2. Small-N study examines deviant cases
5. Goal: Establish generality of findings
5.1. Small-N study suggests X causes Y, but lacks external validity
5.2. Large-N study looks to establish the generality of findings

Strategies:
Increasing leverage for causal inference in case studies

1. Congruence Method: Test a hypothesis by understanding a case; looks for fit between
theory and case; involves multiple independent variables
2. Pattern Matching: Type of congruence testing, usually focused on a single
independent variable; compares alternative theories with respect to multiple outcomes
3. Process Tracing: Focus is on establishing the causal mechanism, by examining fit of
theory to intervening causal steps; how does “X” produce a series of conditions that
come together in some way (or don’t) to produce “Y”?
4. Counterfactual Analysis: Gain leverage through rigorous, disciplined thought
experiments

Strategies: structured, focused comparison
1. “the comparison is focused because it deals
selectively with only certain aspects of a historical
case... and structured because it employs general
questions to guide the data collection analysis in that
historical case” - Alexander and George

2. Steps (Kaarbo and Beasley)
2.1. Identify the research question
2.2. Identify variables (usually from existing theory)
2.3. Select cases: comparable cases with variation in
the values of the dependent variable, selected
from across population subgroups (aids external
validity)
2.4. Define and specify your measurement strategy for
concepts, including a “codebook” for the
questions you employ in data collection
2.5. “Code-write cases”
2.6. Comparison (search for patterns) and implications
for theory
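Steps 2.4 to 2.6 can be sketched as a small codebook that asks the same questions of every case and then tabulates the answers. Everything here is hypothetical and for illustration only: the variable names, allowed values, and case codings are not from Kaarbo and Beasley.

```python
from collections import Counter

# Hypothetical codebook: each question has a fixed set of allowed answers,
# and every question must be answered for every case.
CODEBOOK = {
    "advisory_process": {"open", "closed"},
    "external_threat": {"low", "high"},
    "outcome": {"escalation", "restraint"},
}

def code_case(name, answers):
    """Validate one case's answers against the codebook and return a record."""
    missing = CODEBOOK.keys() - answers.keys()
    if missing:
        raise ValueError(f"{name}: unanswered questions {sorted(missing)}")
    for question, answer in answers.items():
        if question not in CODEBOOK:
            raise ValueError(f"{name}: {question!r} is not in the codebook")
        if answer not in CODEBOOK[question]:
            raise ValueError(f"{name}: {answer!r} not allowed for {question!r}")
    return {"case": name, **answers}

def cross_tab(records, variable, outcome="outcome"):
    """Count cases for each (variable value, outcome value) combination."""
    return Counter((r[variable], r[outcome]) for r in records)

# Hypothetical codings for three cases.
records = [
    code_case("Case 1", {"advisory_process": "open", "external_threat": "high", "outcome": "restraint"}),
    code_case("Case 2", {"advisory_process": "closed", "external_threat": "high", "outcome": "escalation"}),
    code_case("Case 3", {"advisory_process": "closed", "external_threat": "low", "outcome": "escalation"}),
]

table = cross_tab(records, "advisory_process")
print(dict(table))
```

Forcing every case through the same validated questions is what makes the comparison structured; limiting the codebook to variables tied to the research question is what keeps it focused.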
