3. Previously in this group
- Introduction
- Reading Data into R (1)
- Reading Data into R (2)
- Descriptive, continuous
- Descriptive, categorical
- Deducer
- Graphics
- Groupwise, continuous
9. We will use the lowbwt dataset used in BIO213
lowbwt.dat
http://www.umass.edu/statdata/statdata/data/lowbwt.txt
http://www.umass.edu/statdata/statdata/data/lowbwt.dat
10. Load dataset from web
lbw <- read.table("http://www.umass.edu/statdata/statdata/data/lowbwt.dat",
                  header = TRUE, skip = 4)
skip = 4: skip the first 4 rows
header = TRUE: pick up variable names
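A minimal sketch of what skip and header do, using made-up text in place of the real file (the note lines and values below are hypothetical, not taken from lowbwt.dat):

```r
# Made-up file contents: four note lines, a header row, then two data rows.
raw <- "note line 1
note line 2
note line 3
note line 4
ID LOW AGE
1 0 25
2 1 31"

# skip = 4 drops the note lines; header = TRUE reads the next line as names.
demo <- read.table(text = raw, header = TRUE, skip = 4)
print(names(demo))
print(nrow(demo))
```

Without skip = 4, the note lines would be read as data and the column names would be lost.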
11. “Fix” dataset
lbw[c(10,39), "BWT"] <- c(2655, 3035)
c(10,39): 10th and 39th rows
"BWT": BWT column
Replace data points to make the dataset identical to the BIO213 dataset
12. Lower case variable names
names(lbw) <- tolower(names(lbw))
tolower(names(lbw)): convert variable names to lower case
names(lbw) <- : put them back into variable names
17. dataset <-
        within(dataset, {
            _variable manipulations_
        })
Left of <- : name of newly created dataset (here replacing original)
First argument of within(): take dataset
Inside { }: perform variable manipulation
You can specify by variable name only. No need for dataset$var_name
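A sketch of the within() pattern on a toy data frame. The values are made up, and the cut points and category definitions are assumptions for illustration, not necessarily the ones used to build ftv.cat, race.cat and preterm in the course:

```r
# Toy data with made-up values; column names follow the lowbwt variables.
toy <- data.frame(race = c(1, 2, 3, 1),
                  ftv  = c(0, 1, 4, 2),
                  ptl  = c(0, 0, 1, 2))

toy <- within(toy, {
  # Columns are referred to by bare name inside the braces.
  # Assumed coding: 1 = White, 2 = Black, 3 = Other.
  race.cat <- factor(race, levels = 1:3,
                     labels = c("White", "Black", "Other"))
  # Assumed cut points: 0 visits = None, 1-2 = Normal, 3+ = Many.
  ftv.cat <- cut(ftv, breaks = c(-Inf, 0, 2, Inf),
                 labels = c("None", "Normal", "Many"))
  # Make "Normal" the reference level.
  ftv.cat <- relevel(ftv.cat, ref = "Normal")
  # Assumed definition: any previous preterm labor.
  preterm <- factor(ptl >= 1, levels = c(FALSE, TRUE),
                    labels = c("0", "1+"))
})
str(toy)
```

Because the result is assigned back to toy, the new columns are added to the original data frame.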
26. formula
outcome ~ predictor1 + predictor2 + predictor3
SAS equivalent:
model outcome = predictor1 predictor2 predictor3;
27. In the case of t-test
age ~ zyg
age: continuous variable to be compared (variable to be explained)
zyg: grouping variable to separate groups (variable used to explain)
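A sketch of the formula interface to t.test(). The data are made up, and the group labels "A" and "B" are hypothetical stand-ins for the levels of zyg:

```r
# Made-up data: 'zyg' is a two-level grouping variable with hypothetical labels.
toy <- data.frame(age = c(21, 25, 30, 28, 35, 40, 33, 38),
                  zyg = factor(rep(c("A", "B"), each = 4)))

# Left of ~ : continuous outcome; right of ~ : grouping variable.
res <- t.test(age ~ zyg, data = toy)
print(res$estimate)   # the two group means
print(res$p.value)
```

By default t.test() runs the Welch (unequal variance) version; var.equal = TRUE requests the pooled-variance test.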
29. .       All variables except for the outcome
    + X2    Add X2 term
    - 1     Remove intercept
    X1:X2   Interaction term between X1 and X2
    X1*X2   Main effects and interaction term
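One way to check what each symbol does is to look at the columns of the design matrix. A sketch with made-up numeric data (terms_of is a hypothetical helper defined here, not an R function):

```r
# Made-up numeric data.
toy <- data.frame(y  = c(1.0, 2.5, 2.0, 4.5, 5.0, 6.5),
                  x1 = c(1, 2, 3, 4, 5, 6),
                  x2 = c(2, 1, 4, 3, 6, 5))

# Hypothetical helper: column names of the design matrix for a formula.
terms_of <- function(f) colnames(model.matrix(f, data = toy))

dot      <- terms_of(y ~ .)            # all variables except the outcome
no_int   <- terms_of(y ~ x1 + x2 - 1)  # intercept removed
inter    <- terms_of(y ~ x1:x2)        # interaction term only
main_int <- terms_of(y ~ x1 * x2)      # main effects and interaction

print(dot)
print(no_int)
print(inter)
print(main_int)
```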
32. On-the-fly variable manipulation
Y ~ X1 + I(X2 * X3)
I(): inhibit formula interpretation. For math manipulation
I(X2 * X3): new variable (X2 times X3) created on-the-fly and used
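A sketch contrasting I(x2 * x3) with a bare x2 * x3, again by inspecting the design matrix (made-up data):

```r
# Made-up numeric data.
toy <- data.frame(y  = c(3, 5, 4, 8, 9, 12),
                  x1 = c(1, 2, 3, 4, 5, 6),
                  x2 = c(2, 1, 4, 3, 6, 5),
                  x3 = c(1, 3, 2, 5, 4, 6))

# With I(): one arithmetic product column.
mm_I <- model.matrix(y ~ x1 + I(x2 * x3), data = toy)
# Without I(): * is formula syntax (main effects plus interaction).
mm_star <- model.matrix(y ~ x1 + x2 * x3, data = toy)

print(colnames(mm_I))
print(colnames(mm_star))
```

The I() version adds a single column holding the product; the bare version expands into x2, x3 and x2:x3.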
33. Fit a model
lm.full <- lm(bwt ~ age + lwt + smoke + ht + ui +
                  ftv.cat + race.cat + preterm,
              data = lbw)
37. Call: command repeated
Residual distribution
Coef/SE = t
Dummy variables created
Model R^2 and adjusted R^2
F-test
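A sketch of pulling these pieces out of a summary object. It uses R's built-in mtcars data as a stand-in, since lbw has to be downloaded first; the model mpg ~ wt + hp is only an illustration:

```r
# Stand-in model on built-in data (not the lbw model).
fit <- lm(mpg ~ wt + hp, data = mtcars)
s <- summary(fit)

# Coefficient table: Estimate, Std. Error, t value, Pr(>|t|).
coefs <- coef(s)
print(coefs)

# The t value is the estimate divided by its standard error.
t_by_hand <- coefs[, "Estimate"] / coefs[, "Std. Error"]
print(all.equal(coefs[, "t value"], t_by_hand))

# Model R^2, adjusted R^2, and the overall F statistic with its DFs.
print(c(r.squared = s$r.squared, adj.r.squared = s$adj.r.squared))
print(s$fstatistic)
```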
38. ftv.catNone: people with no 1st trimester visits compared to people with a normal number of 1st trimester visits (reference level)
ftv.catMany: people with many 1st trimester visits compared to people with a normal number of 1st trimester visits (reference level)
39. race.catBlack: Black people compared to White people (reference level)
race.catOther: people of other races compared to White people (reference level)
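A sketch of how the coefficient names arise from factor levels. The values are made up; the first level of each factor is the reference level and gets no dummy column:

```r
# Made-up factors; the first level of each is the reference level.
toy <- data.frame(
  race.cat = factor(c("White", "Black", "Other", "White", "Black"),
                    levels = c("White", "Black", "Other")),
  ftv.cat  = factor(c("None", "Normal", "Many", "Normal", "None"),
                    levels = c("None", "Normal", "Many"))
)

# Move "Normal" to the front so it becomes the reference level.
toy$ftv.cat <- relevel(toy$ftv.cat, ref = "Normal")

# Dummy columns are named <variable><level>, one per non-reference level.
dummies <- colnames(model.matrix(~ ftv.cat + race.cat, data = toy))
print(dummies)
```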
43. ANOVA table (type I)
Degrees of freedom
Sequential SS
Mean SS = SS/DF
F = Mean SS / Mean SS of residual
44. Type I = Sequential SS
[Figure: overlapping circles for 1 age, 2 lwt, 3 smoke. The 1st gets all in type I, including the overlap; the 2nd gets only what remains; the last gets only the part not overlapping with the earlier variables.]
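A sketch showing that sequential (type I) SS depend on the order of the terms. It uses the built-in mtcars data as a stand-in for lbw; the variables wt and hp are chosen only because they are correlated:

```r
# Same two predictors, entered in different orders.
fit_a <- lm(mpg ~ wt + hp, data = mtcars)
fit_b <- lm(mpg ~ hp + wt, data = mtcars)

# anova() on a single lm fit gives the sequential (type I) table.
tab_a <- anova(fit_a)
tab_b <- anova(fit_b)
print(tab_a)
print(tab_b)

# SS credited to wt: larger when it enters first, smaller when it enters last.
print(c(wt_first = tab_a["wt", "Sum Sq"], wt_last = tab_b["wt", "Sum Sq"]))
```

The total of the Sum Sq column is the same in both tables; only the split between the predictors changes.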
46. ANOVA table (type III)
Marginal SS
Degrees of freedom
Multi-category variables tested as one
F = Mean SS / Mean SS of residual
47. Type III = Marginal SS
[Figure: overlapping circles for 1 age, 2 lwt, 3 smoke. Labels: "1st gets only margin in type III"; "2nd gets only margin in type III"; "last gets only margin in type III".]
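A sketch of marginal SS using base R's drop1() as a stand-in: for a model without interaction terms, dropping each term in turn gives the "each term entered last" SS. This is an assumption about how to reproduce the table, not necessarily the function used to produce it, and mtcars again stands in for lbw:

```r
# Same two predictors, entered in different orders.
fit_a <- lm(mpg ~ wt + hp, data = mtcars)
fit_b <- lm(mpg ~ hp + wt, data = mtcars)

# drop1() removes each term in turn from the full model (marginal test).
marg_a <- drop1(fit_a, test = "F")
marg_b <- drop1(fit_b, test = "F")
print(marg_a)

# Marginal SS for wt does not depend on the order of terms ...
print(c(order_a = marg_a["wt", "Sum of Sq"], order_b = marg_b["wt", "Sum of Sq"]))
# ... and equals the sequential SS wt gets when it is entered last.
seq_last <- anova(fit_b)["wt", "Sum Sq"]
print(seq_last)
```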