1. The 19th Epidemiology Seminar: Using the Statistical Analysis Software R, "Linear Models in R"
Masafumi Okada
2012/1/26
2. Contents
Linear models in R: lm()
Generalized linear models in R: glm()
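4. Example data: Edgar Anderson's Iris Data
One of the data sets bundled with R; the full list is shown by
help(package="datasets")
iris has 150 rows: 50 flowers from each of three species (column: Species),
with sepal measurements (Sepal.Length, Sepal.Width)
and petal measurements (Petal.Length, Petal.Width).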
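As a quick check of the description above, the following sketch inspects the bundled data set using base R only. The variable names n_per_species and measurements are illustrative and do not come from the slides.

```r
# iris ships with R in the datasets package, so nothing has to be installed.
str(iris)                                  # structure: 150 rows, 5 columns

# Number of flowers recorded for each species
n_per_species <- table(iris$Species)
print(n_per_species)

# Names of the numeric measurement columns (sepal and petal length/width)
measurements <- names(iris)[sapply(iris, is.numeric)]
print(measurements)
```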
5. > m1 <- lm(Sepal.Length ~ Petal.Length, data=iris)
> m1
Call:
lm(formula = Sepal.Length ~ Petal.Length, data = iris)
Coefficients:
(Intercept) Petal.Length
4.3066 0.4089
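• Fit a linear model to the iris data with Sepal.Length as the response and Petal.Length as the explanatory variable, and store the result in m1.
• Typing m1 prints the call and the estimated coefficients.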
6. > summary(m1)
Call:
lm(formula = Sepal.Length ~ Petal.Length, data = iris)
Residuals:
Min 1Q Median 3Q Max
-1.24675 -0.29657 -0.01515 0.27676 1.00269
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.30660 0.07839 54.94 <2e-16 ***
Petal.Length 0.40892 0.01889 21.65 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
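(continued)
• summary() shows the fitted model in detail.
• Besides summary(), the extractor functions coef(), fitted(), and residuals() can be applied to the fitted model.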
7. (continued)
Residual standard error: 0.4071 on 148 degrees of freedom
Multiple R-squared: 0.76, Adjusted R-squared: 0.7583
F-statistic: 468.6 on 1 and 148 DF, p-value: < 2.2e-16
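• summary() reports the distribution of the residuals, the coefficient estimates with their standard errors, the t-test of each coefficient = 0 (with its p-value), the residual standard error, R2, adjusted R2, and the F-statistic.
• coef() returns the coefficients, fitted() the fitted values, and residuals() the residuals.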
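The extractor functions named above can be tried on m1 as follows. This is a minimal sketch; the object names b, yhat, e, and coef_table are illustrative and not from the slides.

```r
m1 <- lm(Sepal.Length ~ Petal.Length, data = iris)

b    <- coef(m1)        # named vector: intercept and slope
yhat <- fitted(m1)      # fitted value for each of the 150 flowers
e    <- residuals(m1)   # observed minus fitted

# Fitted values plus residuals reproduce the observed response.
check <- all.equal(unname(yhat + e), iris$Sepal.Length)
print(check)

# The coefficient table and R-squared printed by summary() can also be
# extracted as ordinary R objects.
coef_table <- coef(summary(m1))   # matrix: Estimate, Std. Error, t value, Pr(>|t|)
r2         <- summary(m1)$r.squared
print(coef_table)
print(r2)
```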
8. > m2 <- lm(Sepal.Length ~ Petal.Length + Petal.Width, data=iris)
> m2
Call:
lm(formula = Sepal.Length ~ Petal.Length + Petal.Width, data = iris)
Coefficients:
(Intercept) Petal.Length Petal.Width
4.1906 0.5418 -0.3196
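• Petal.Width is added as a second explanatory variable.
• Fit a linear model to the iris data with Sepal.Length as the response and Petal.Length and Petal.Width as explanatory variables, and store the result in m2.
• Typing m2 prints the call and the estimated coefficients.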
9. > summary(m2)
(output omitted)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.19058 0.09705 43.181 < 2e-16 ***
Petal.Length 0.54178 0.06928 7.820 9.41e-13 ***
Petal.Width -0.31955 0.16045 -1.992 0.0483 *
(output omitted)
Residual standard error: 0.4031 on 147 degrees of freedom
Multiple R-squared: 0.7663, Adjusted R-squared: 0.7631
F-statistic: 241 on 2 and 147 DF, p-value: < 2.2e-16
• summary() gives the detailed results for m2 as well.
10. > m2 <- lm(Sepal.Length ~ Petal.Length + Petal.Width, data=iris)
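"~": the response goes on the left-hand side and the explanatory variables on the right-hand side.
"+": separates multiple explanatory variables.
data: the data frame that contains the variables.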
11. > m3 <- lm(Sepal.Length ~ Petal.Length + Petal.Width +
Petal.Length:Petal.Width , data=iris)
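":" denotes the interaction between 2 variables.
"*" is shorthand for the main effects plus their interaction:
> m3d <- lm(Sepal.Length ~ Petal.Length * Petal.Width ,data=iris)
m3 and m3d are the same model.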
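That m3 and m3d describe the same model can be confirmed by comparing their coefficients. A minimal sketch (the name same_fit is illustrative):

```r
# Interaction written out term by term ...
m3  <- lm(Sepal.Length ~ Petal.Length + Petal.Width +
            Petal.Length:Petal.Width, data = iris)
# ... and with the "*" shorthand.
m3d <- lm(Sepal.Length ~ Petal.Length * Petal.Width, data = iris)

# Both formulas expand to the same set of terms.
print(attr(terms(m3),  "term.labels"))
print(attr(terms(m3d), "term.labels"))

# The estimated coefficients therefore agree.
same_fit <- all.equal(coef(m3), coef(m3d))
print(same_fit)
```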
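12. > m4 <- lm(Sepal.Length ~ Petal.Length^2 , data=iris)   # does NOT fit a squared term
> m4 <- lm(Sepal.Length ~ I(Petal.Length^2) , data=iris)   # fits Petal.Length squared
Inside a formula, "^", "+", "-", "*", and ":" are formula operators, not arithmetic operators.
To use them arithmetically, wrap the expression in I() (an upper-case i).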
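The difference between the two calls shows up in the coefficient names. In this sketch the names m4_plain and m1 are used only to keep the fits apart; the slides reuse m4 for both.

```r
# "^" inside a formula is a formula operator: Petal.Length^2 expands to
# Petal.Length alone, so this fit is the same as the simple regression m1.
m4_plain <- lm(Sepal.Length ~ Petal.Length^2, data = iris)
m1       <- lm(Sepal.Length ~ Petal.Length,   data = iris)

# I() protects the arithmetic, so the regressor really is Petal.Length squared.
m4 <- lm(Sepal.Length ~ I(Petal.Length^2), data = iris)

print(names(coef(m4_plain)))   # no squared term appears
print(names(coef(m4)))         # the I(...) term appears
print(all.equal(coef(m4_plain), coef(m1)))
```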
13. > m5 <- lm(Sepal.Length ~ Petal.Length + Species, data=iris)
> summary(m5)
(output omitted)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.68353 0.10610 34.719 < 2e-16 ***
Petal.Length 0.90456 0.06479 13.962 < 2e-16 ***
Speciesversicolor -1.60097 0.19347 -8.275 7.37e-14 ***
Speciesvirginica -2.11767 0.27346 -7.744 1.48e-12 ***
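(output omitted)
Species is a factor with three levels ("setosa", "versicolor", "virginica").
A factor enters the model as (number of levels - 1) dummy variables
(Speciesversicolor, Speciesvirginica); the first level, setosa, is the reference.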
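The dummy variables that lm() creates for Species can be looked at directly through the design matrix. A minimal sketch using model.matrix(); the name X is illustrative.

```r
m5 <- lm(Sepal.Length ~ Petal.Length + Species, data = iris)

print(levels(iris$Species))   # three levels; the first one is the reference

# Design matrix actually used in the fit: two 0/1 columns for three levels
X <- model.matrix(m5)
print(colnames(X))

# Rows 1, 51 and 101 are the first setosa, versicolor and virginica flowers.
print(X[c(1, 51, 101), ])
```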
14. > m2
Call:
lm(formula = Sepal.Length ~ Petal.Length + Petal.Width, data = iris)
Coefficients:
(Intercept) Petal.Length Petal.Width
4.1906 0.5418 -0.3196
> coef(m2)
(Intercept) Petal.Length Petal.Width
4.1905824 0.5417772 -0.3195506
> sd(iris$Petal.Length) / sd(iris$Sepal.Length)
[1] 2.131832
> sd(iris$Petal.Width) / sd(iris$Sepal.Length)
[1] 0.9205034
> 0.5417772 * 2.131832
[1] 1.154978
> -0.3195506 * 0.9205034
[1] -0.2941474
> coef(m2)[-1] * apply(m2$model[-1],2,sd) / sd(m2$model[,1])
Petal.Length Petal.Width
1.1549781 -0.2941474
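The calculation above multiplies each coefficient by sd(explanatory variable) / sd(response), which gives standardized coefficients. As a cross-check that is not in the slides, the same values should result from refitting the model after standardizing every variable with scale(). The names z, m2z, and by_hand are illustrative.

```r
m2 <- lm(Sepal.Length ~ Petal.Length + Petal.Width, data = iris)

# Rescaling by hand, as in the slide
by_hand <- coef(m2)[-1] * apply(m2$model[-1], 2, sd) / sd(m2$model[, 1])

# Alternative: standardize all three variables (mean 0, sd 1) and refit
z   <- as.data.frame(scale(iris[, c("Sepal.Length", "Petal.Length", "Petal.Width")]))
m2z <- lm(Sepal.Length ~ Petal.Length + Petal.Width, data = z)

print(by_hand)
print(coef(m2z)[-1])   # slopes of the standardized fit
```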
15. > cor(iris[,1:4])
Sepal.Length Sepal.Width Petal.Length Petal.Width
Sepal.Length 1.0000000 -0.1175698 0.8717538 0.8179411
Sepal.Width -0.1175698 1.0000000 -0.4284401 -0.3661259
Petal.Length 0.8717538 -0.4284401 1.0000000 0.9628654
Petal.Width 0.8179411 -0.3661259 0.9628654 1.0000000
The 5th column (Species) is a factor, so it is excluded:
cor() is applied to the numeric columns iris[,1:4] only.
16. > m6 <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width
+ Species, data = iris)
> step(m6)
Start: AIC=-348.57
Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width + Species
Df Sum of Sq RSS AIC
<none> 13.556 -348.57
- Petal.Width 1 0.4090 13.966 -346.11
- Species 2 0.8889 14.445 -343.04
- Sepal.Width 1 3.1250 16.681 -319.45
- Petal.Length 1 13.7853 27.342 -245.33
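step() selects variables stepwise using AIC (Akaike's Information Criterion).
Here dropping any term would raise the AIC, so the full model is kept.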
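The selected model can also be captured as an object instead of only being printed. A minimal sketch, assuming the default behaviour of step() when no scope is given (backward elimination); trace = 0 merely silences the table, and the names selected and aic_full are illustrative.

```r
m6 <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width + Species,
         data = iris)

# step() returns the final model; trace = 0 suppresses the printed steps.
selected <- step(m6, trace = 0)
print(formula(selected))

# extractAIC() returns the equivalent degrees of freedom and the AIC value
# that step() uses for its comparisons.
aic_full <- extractAIC(m6)
print(aic_full)
```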
17. Residuals vs. fitted values (Residual-Fitted plot)
> m1 <- lm(Sepal.Length ~ Petal.Length, data=iris)
> m6 <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width
+ Species, data = iris)
> plot(m1, which=1)
> plot(m6, which=1)
[Figure: two "Residuals vs Fitted" panels (x-axis: Fitted values, y-axis: Residuals). Left: lm(Sepal.Length ~ Petal.Length), labelled points 15, 132, 107. Right: lm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width + Species), labelled points 15, 136, 85.]
18. Normality of the residuals (Q-Q plot)
> m1 <- lm(Sepal.Length ~ Petal.Length, data=iris)
> m6 <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width
+ Species, data = iris)
> plot(m1, which=2)
> plot(m6, which=2)
[Figure: two "Normal Q-Q" panels (x-axis: Theoretical Quantiles, y-axis: Standardized residuals). Left: lm(Sepal.Length ~ Petal.Length), labelled points 132, 15, 107. Right: lm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width + Species), labelled points 15, 136, 85.]
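The four diagnostic plots from the last two slides can be drawn on one page. This is a sketch under one assumption: pdf(file = NULL) opens a null graphics device so that the code also runs without a display; in an interactive session the pdf() and dev.off() lines can be dropped. The names rs1 and rs6 are illustrative.

```r
m1 <- lm(Sepal.Length ~ Petal.Length, data = iris)
m6 <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width + Species,
         data = iris)

pdf(file = NULL)                   # null device: nothing is written to disk
old_par <- par(mfrow = c(2, 2))    # 2 x 2 grid of panels
plot(m1, which = 1)                # residuals vs fitted, simple model
plot(m6, which = 1)                # residuals vs fitted, full model
plot(m1, which = 2)                # normal Q-Q plot, simple model
plot(m6, which = 2)                # normal Q-Q plot, full model
par(old_par)
invisible(dev.off())

# Standardized residuals, the quantity shown on the Q-Q plot's vertical axis
rs1 <- rstandard(m1)
rs6 <- rstandard(m6)
print(range(rs1))
print(range(rs6))
```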
20. Example data: data from a case-control study of esophageal cancer in Ile-et-Vilaine, France
Available in R as esoph.
5 variables: agegp (Age group, 6 levels), alcgp (Alcohol consumption, 4 levels),
tobgp (Tobacco consumption, 4 levels), ncases (number of cases),
ncontrols (number of controls).
Each row gives the numbers of cases and controls for one combination of agegp, alcgp, and tobgp.
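The structure described above can be checked directly. A minimal sketch using base R only; the name n_levels is illustrative.

```r
# esoph ships with R in the datasets package.
str(esoph)

# Number of categories in each of the three grouping factors
n_levels <- sapply(esoph[c("agegp", "alcgp", "tobgp")], nlevels)
print(n_levels)

# The first rows: one line per combination of age, alcohol and tobacco group
print(head(esoph))
```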
21. > m7 <- glm(cbind(ncases, ncontrols) ~ agegp+tobgp+alcgp,
data=esoph, family=binomial,contrasts=list(agegp="contr.treatment",
tobgp="contr.treatment", alcgp="contr.treatment"))
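• Fit a logistic regression model to the esoph data with agegp, tobgp, and alcgp as explanatory variables, and store the result in m7.
• Because each row holds counts, the response is cbind(ncases, ncontrols). With one row per subject and an outcome coded 0 or 1, the formula is written as for iris: "outcome ~ explanatory variables".
• "family=binomial" requests logistic regression.
• contrasts sets how each factor is coded. The three factors in esoph are ordered factors, which R would otherwise code with polynomial contrasts; "contr.treatment" gives dummy variables relative to the first (reference) level of each factor.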
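What the contrasts argument changes can be seen by fitting the model with and without it. A minimal sketch; m7_default is an illustrative name for the fit that relies on R's default coding of ordered factors.

```r
print(is.ordered(esoph$agegp))   # the esoph grouping factors are ordered factors

# Default coding for ordered factors: polynomial contrasts (.L, .Q, .C, ...)
m7_default <- glm(cbind(ncases, ncontrols) ~ agegp + tobgp + alcgp,
                  data = esoph, family = binomial)

# Treatment (dummy) coding, as in the slide
m7 <- glm(cbind(ncases, ncontrols) ~ agegp + tobgp + alcgp,
          data = esoph, family = binomial,
          contrasts = list(agegp = "contr.treatment",
                           tobgp = "contr.treatment",
                           alcgp = "contr.treatment"))

print(names(coef(m7_default)))   # polynomial terms
print(names(coef(m7)))           # one term per non-reference level

# Only the parameterization differs; the overall fit is the same.
print(c(deviance(m7_default), deviance(m7)))
```

Treatment coding is what makes each coefficient a log odds ratio against the reference category, which is why the slide sets it explicitly.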
22. > summary(m7)
(output omitted)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.6891 -0.5618 -0.2168 0.2314 2.0642
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -5.9108 1.0302 -5.737 9.61e-09 ***
agegp35-44 1.6095 1.0676 1.508 0.131652
agegp45-54 2.9752 1.0242 2.905 0.003675 **
agegp55-64 3.3584 1.0198 3.293 0.000991 ***
agegp65-74 3.7270 1.0253 3.635 0.000278 ***
agegp75+ 3.6818 1.0645 3.459 0.000543 ***
tobgp10-19 0.3407 0.2054 1.659 0.097159 .
tobgp20-29 0.3962 0.2456 1.613 0.106708
tobgp30+ 0.8677 0.2765 3.138 0.001701 **
alcgp40-79 1.1216 0.2384 4.704 2.55e-06 ***
alcgp80-119 1.4471 0.2628 5.506 3.68e-08 ***
alcgp120+ 2.1154 0.2876 7.356 1.90e-13 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 227.241 on 87 degrees of freedom
Residual deviance: 53.973 on 76 degrees of freedom
AIC: 225.45
Number of Fisher Scoring iterations: 6
24. > exp(cbind(coef(m7),confint(m7)))
Waiting for profiling to be done...
2.5 % 97.5 %
(Intercept) 0.002710046 0.0001500676 0.01309911
agegp35-44 5.000426461 0.9048631872 93.44857822
agegp45-54 19.592860766 4.1082301889 351.60508496
agegp55-64 28.741838956 6.1156513661 513.64343522
agegp65-74 41.554820823 8.6954216578 746.54178414
agegp75+ 39.716132031 7.3282690051 740.73696224
tobgp10-19 1.405982889 0.9376683713 2.10030180
tobgp20-29 1.486221090 0.9107730044 2.39108827
tobgp30+ 2.381435327 1.3753149494 4.07656452
alcgp40-79 3.069638047 1.9438541677 4.96418587
alcgp80-119 4.250811157 2.5569142185 7.18622034
alcgp120+ 8.292938857 4.7505293148 14.70778864
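confint() returns the 95% confidence intervals of the coefficients.
Applying exp() to the coefficients and their 95% confidence limits
gives the odds ratios with their 95% confidence intervals.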
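The same table can be built with labelled columns, which makes it easier to read and reuse. A minimal sketch; the column names OR, lower95, and upper95 and the object name or_table are illustrative. On older R versions confint() for glm objects relies on the recommended MASS package being installed.

```r
m7 <- glm(cbind(ncases, ncontrols) ~ agegp + tobgp + alcgp,
          data = esoph, family = binomial,
          contrasts = list(agegp = "contr.treatment",
                           tobgp = "contr.treatment",
                           alcgp = "contr.treatment"))

# Profile-likelihood 95% confidence limits on the log odds scale
ci <- confint(m7)

# Exponentiate estimates and limits to obtain odds ratios
or_table <- exp(cbind(coef(m7), ci))
colnames(or_table) <- c("OR", "lower95", "upper95")

print(round(or_table, 2))
```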
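26. About 3500 add-on packages are available for R.
Packages for epidemiology: Epi, epicalc, epitools, epiR, epibasix
They cover, among other things, analysis of 2x2 tables, APC analysis,
and Mantel-Haenszel methods.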
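27. Information on the Web
RjpWiki: Japanese-language wiki on R, including a Q and A board
R-bloggers: collected blog posts about R
inside-R: R information site from the makers of Revolution R
Twitter: #rstats, #rstatsj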
28. “R ”
Material about R is shared through slideshare.net, Twitter, and Ustream.