2. preamble
• Likely a wide range of expertise in
the audience
• This is a deep topic, and I’ll only
scratch the surface. Warning: if
you’re in the far LHS of the
distribution, this talk will be just
enough for you to be a danger to
yourself and others.
• The goal is to provide tools for
interpreting LMs, and a basic
vocabulary for pursuing deeper
topics.
3. overview
• What are LMs?
• Fitting and interpreting LMs
• Transforming data
• Hypothesis testing
• Mixed-effect models
7. what is a linear model?
• multiple regression
• multi-way ANOVA
8. what is a linear model?
• mtcars: Let’s pretend that we would like to model 1/4 mile
time (Y, the “response”) as a function of horsepower (X, the
“predictor”) plus random noise
Y = f(X)+e
9. what is a linear model?
The LM: y_i = b0 + x_i b1 + e_i
10. what is a linear model?
y_i = b0 + x_i b1 + e_i
• Now our task becomes a search for the parameters that minimize the sum of the squared residuals
• The R function that does this magic is lm()
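As a minimal sketch, the mtcars fit looks like this (qsec is the built-in 1/4 mile time column, hp the horsepower):

```r
# Fit 1/4 mile time (qsec) as a linear function of horsepower (hp),
# using the mtcars data set that ships with R.
fit <- lm(qsec ~ hp, data = mtcars)

coef(fit)      # b0 (intercept) and b1 (slope for hp)
summary(fit)   # standard errors, t-statistics, R^2, ...
```

More horsepower means a faster quarter mile, so the hp slope comes out negative.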
11. what is a linear model?
y_i = b0 + x_i b1 + e_i
(b0: intercept; b1: slope; e_i: residuals)
12. what is a linear model?
The LM: y_i = b0 + x_i b1 + e_i
In matrix notation: Y = Xb + e
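The design matrix X in the matrix form can be inspected directly in R; a quick sketch using the mtcars model:

```r
# The X in Y = Xb + e for the model qsec ~ hp:
# a column of 1s (for the intercept) next to the hp values.
fit <- lm(qsec ~ hp, data = mtcars)
X <- model.matrix(fit)

head(X)
dim(X)   # 32 rows (one per car), 2 columns (intercept, hp)
```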
13. what is a linear model?
• Quick note: the “linear” in linear model refers to the fact that the model is linear in its parameters, not necessarily in the predictors
y = b0 + log(x) b1 + e            ✔ valid
y = b0 + x^b1 + e                 ✗ not valid
y = b0 + (x^2 + tanh(x)) b1 + e   ✔ valid
14. overview
• What are LMs?
• Fitting and interpreting LMs
• Transforming data
• Hypothesis testing
• Mixed-effect models
30. ANOVA
The broom package tidies your LMs
• Summarize model outputs into
tidy data frames: tidy()
• Quickly view model-scale
summaries: glance()
• See the original data augmented
with model statistics: augment()
• There’s more to broom, so have a
look for yourself.
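A quick sketch of the three verbs on the mtcars model (assuming the broom package is installed):

```r
library(broom)

fit <- lm(qsec ~ hp, data = mtcars)

tidy(fit)            # one row per coefficient: estimate, std.error, statistic, p.value
glance(fit)          # one-row model-level summary: r.squared, AIC, ...
head(augment(fit))   # the original data plus .fitted, .resid, ...
```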
34. some things to be aware of
• LMs make several assumptions about your data; look them up. You want to be sure your data meets those assumptions reasonably well.
– Homoscedasticity and normality of the residuals are the only assumptions we will discuss.
• Look into “generalized linear models” (GLMs) and/or quantile regression for non-normally distributed data.
35. overview
• What are LMs?
• Fitting and interpreting LMs
• Transforming data
• Hypothesis testing
• Mixed-effect models
38. testing for heteroscedasticity
The ‘car’ package (Companion to Applied Regression) is your friend.
Use car::ncvTest() to check for heteroscedasticity with the Breusch-Pagan test (ncv = non-constant variance).
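A minimal sketch (assuming the car package is installed), reusing the mtcars model:

```r
library(car)

fit <- lm(qsec ~ hp, data = mtcars)
res <- ncvTest(fit)   # Breusch-Pagan test for non-constant variance
res                   # a small p-value suggests heteroscedasticity
```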
40. variance-stabilizing transformations
• Variance-stabilizing transformations make it so that the variance of Y is not correlated with its mean value.
• Take the Poisson distribution: its mean is equal to its variance. The square root is the variance-stabilizing transformation of a Poisson RV.
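A quick simulation sketch of that claim:

```r
set.seed(1)

# Variance of raw Poisson draws grows with the mean...
means    <- c(5, 50, 500)
raw_var  <- sapply(means, function(m) var(rpois(1e4, m)))

# ...but after a square-root transform it is roughly constant (about 1/4).
sqrt_var <- sapply(means, function(m) var(sqrt(rpois(1e4, m))))

round(raw_var)      # tracks the mean
round(sqrt_var, 2)  # roughly 0.25 across all three means
```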
43. the Box-Cox transformation
• Helps alleviate non-normality and heteroscedasticity of residuals
• Find a lambda that normalizes the data (maximum likelihood estimation)
y(λ) = (y^λ - 1) / λ   if λ ≠ 0
y(λ) = log(y)          if λ = 0
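In R the lambda profile can be computed with MASS::boxcox() (MASS ships with R); a sketch on the mtcars model:

```r
library(MASS)

fit <- lm(qsec ~ hp, data = mtcars)
bc <- boxcox(fit, plotit = FALSE)   # profile log-likelihood over a grid of lambdas
lambda <- bc$x[which.max(bc$y)]     # maximum-likelihood estimate of lambda
lambda
```

The response would then be transformed with the estimated lambda before refitting.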
48. transformations for “curvy” data
• You can often use linear models to fit “curvy” data; you
just need to transform the predictors, the responses, or
both.
50. transformations for “curvy” data
exponential model: log(Y) = Xb + e, equivalently Y = e^(Xb + e)
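A sketch with simulated exponential data (the coefficients 0.5 and 0.3 are made up for the example):

```r
set.seed(42)

# Simulate y = exp(0.5 + 0.3 x) with multiplicative noise.
x <- seq(1, 10, length.out = 50)
y <- exp(0.5 + 0.3 * x) * exp(rnorm(50, sd = 0.1))

fit <- lm(log(y) ~ x)   # an ordinary linear model on the log scale
coef(fit)               # estimates close to the true 0.5 and 0.3
```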
52. additional thoughts
• Not everything can be transformed to be normal / homoscedastic, and not everything necessarily needs to be.
– Consider nonparametric methods or GLMs.
– ANOVA is somewhat robust to heteroscedasticity when n and/or effect size is relatively large.
• Use QQ plots to assess normality: qqnorm(); also the Shapiro-Wilk test: shapiro.test()
• The poly() function in conjunction with lm() can be used to fit n-degree polynomials.
– Generally you want to use raw = FALSE with poly()
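The checks above, sketched in base R on the mtcars model:

```r
fit <- lm(qsec ~ hp, data = mtcars)

# Normality of residuals: QQ plot plus Shapiro-Wilk test.
qqnorm(resid(fit)); qqline(resid(fit))
shapiro.test(resid(fit))   # a small p-value is evidence against normality

# Quadratic fit via orthogonal polynomials (raw = FALSE is the default).
fit2 <- lm(qsec ~ poly(hp, 2), data = mtcars)
coef(fit2)
```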
53. overview
• What are LMs?
• Fitting and interpreting LMs
• Transforming data
• Hypothesis testing
• Mixed-effect models
56. handling multiple comparisons
• The p.adjust() function is useful
– method = "bonferroni" controls the “familywise error rate” (FWER)
– method = "BH" controls the “false discovery rate” (FDR)
• The multcomp package provides a general framework for simultaneous hypothesis testing
– Simultaneous Inference in General Parametric Models, Hothorn et al., Biometrical Journal, 2008.
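A small sketch with made-up p-values:

```r
# Five made-up p-values from a family of tests.
p <- c(0.001, 0.008, 0.039, 0.041, 0.27)

p.adjust(p, method = "bonferroni")   # FWER control: each p times 5, capped at 1
p.adjust(p, method = "BH")           # FDR control: less conservative
```

Note that the method names are matched case-sensitively ("bonferroni", not "Bonferroni").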
59. the multcomp package
• Can specify contrasts with shortcuts, e.g., "Dunnett" and "Tukey"
• Can specify contrasts as strings, e.g., "tx 7 - ctl = 0"
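A sketch using the built-in warpbreaks data (assuming multcomp is installed; the tension factor has levels L, M, H):

```r
library(multcomp)

fit <- lm(breaks ~ tension, data = warpbreaks)

# Shortcut: all pairwise comparisons among tension levels.
tk <- glht(fit, linfct = mcp(tension = "Tukey"))
summary(tk)   # simultaneous (family-adjusted) p-values

# The same machinery accepts a contrast written as a string.
ml <- glht(fit, linfct = mcp(tension = "M - L = 0"))
```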
63. lots of glaring omissions
• Experimental designs
• Interaction terms
• Model parameterization
• Variable selection
• Confidence intervals
• ANCOVA models
• Random effects vs fixed effects
• Much more…
64. resources
• MOOCs: Lots of good LM courses out there
• Books:
– Linear models with R – Julian Faraway
– Extending the linear model with R – Julian Faraway
– Mixed-Effects Models in S and S-PLUS – Jose Pinheiro & Doug
Bates
– Mixed-Effects Models and Extensions in Ecology with R – Alain Zuur
• http://bbolker.github.io/mixedmodels-misc/glmmFAQ.html
– Ben Bolker’s GLMM FAQ (author of lme4)