SlideShare ist ein Scribd-Unternehmen logo
1 von 40
Downloaden Sie, um offline zu lesen
Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions
lavaan.survey: An R package for complex
survey analysis of structural equation models
Daniel Oberski
Department of methodology and statistics
lavaan.survey doberski@uvt.nl
Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions
lavaan.survey doberski@uvt.nl
Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions
What is "complex" about a sample survey?
lavaan.survey doberski@uvt.nl
Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions
1. Clustering
3. Weighting
2. Stratification
lavaan.survey doberski@uvt.nl
Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions
Why account for complex sampling in
structural equation modeling?
• Clustering, stratification, and weighting all affect the
standard errors.
• Weighting affects the point estimates;
(Skinner, Holt & Smith 1989; Pfefferman 1993; Muthén & Satorra 1995; Asparouhov
2005; Asparouhov, & Muthén 2005; Stapleton 2006; Asparouhov & Muthén 2007;
Fuller 2009; Bollen, Tueller, & Oberski 2013)
lavaan.survey doberski@uvt.nl
Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions
lavaan.survey
Input:
1. A lavaan fit object (SEM analysis)
2. A survey design object (complex sample design)
Output:
A lavaan fit object (SEM analysis),
accounting for the complex sample design
lavaan.survey doberski@uvt.nl
Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions
lavaan.survey
Input:
1. A lavaan fit object (SEM analysis)
2. A survey design object (complex sample design)
Output:
A lavaan fit object (SEM analysis),
accounting for the complex sample design
Example call:
lavaan.survey(lavaan.fit, survey.design)
lavaan.survey doberski@uvt.nl
Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions
How to fit a structural equation model accounting for
complex sampling using lavaan.survey in 3 steps:
..1 Fit the SEM using lavaan as you normally would
→ lavaan fit object;
..2 Define the complex sampling design to the survey
package
→ svydesign or svrepdesign object;
..3 Call lavaan.survey with the two objects obtained in (1)
and (2).
→ corrected lavaan fit object.
Afterwards, usual lavaan apparatus available.
lavaan.survey doberski@uvt.nl
Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions
Example complex sample SEM analysis from the literature
Ferla, Valcke & Cai (2009). Academic self-efficacy and academic
self-concept: Reconsidering structural relationships. Learning and
Individual Differences, 19 (4).
lavaan.survey doberski@uvt.nl
Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions
Structural equation model of Belgian PISA data
lavaan.survey doberski@uvt.nl
Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions
PISA data in the lavaan.survey package
load("pisa.be.2003.rda")
> head(pisa.be.2003, 5)[,1:20]
PV1MATH1 PV1MATH2 PV1MATH3 PV1MATH4 ST31Q01 ST31Q02 ST31Q03 ST31Q04
1 1.160 1.26 1.19 1.190 2 1 2 2
2 1.293 1.27 1.22 1.247 1 1 1 1
3 1.383 1.44 1.37 1.347 2 1 1 1
4 0.965 1.02 1.10 0.979 2 2 2 2
5 1.375 1.44 1.31 1.471 3 1 2 2
ST31Q05 ST31Q06 ST31Q07 ST31Q08 ST32Q02 ST32Q04 ST32Q06 ST32Q07
1 2 2 2 3 2 3 3 3
2 1 2 1 4 2 3 3 4
3 1 1 2 3 4 2 1 1
4 2 1 2 2 3 3 2 2
5 1 3 3 3 3 2 3 3
ST32Q09 ESCS male school.type
1 3 1.221 1 1
2 4 1.899 2 1
3 2 0.884 2 1
4 3 1.305 2 1
5 3 0.958 2 1
lavaan.survey doberski@uvt.nl
Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions
Belgium is a complex country.
lavaan.survey doberski@uvt.nl
Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions
In addition to the observed variables, the raw data also
contain 80 replicate weights generated by
``balanced-repeated replication'' (BRR)
> pisa.be.2003[sample(1:8796), 21:101]
W_FSTUWT W_FSTR1 W_FSTR2 W_FSTR3 ... W_FSTR80
2988 1.12 1.64 0.57 1.70 ... 0.528
8666 11.90 17.84 5.95 5.95 ... 5.948
290 16.89 25.33 8.44 8.44 ... 25.333
3505 14.80 8.25 7.90 20.67 ... 8.206
4289 22.70 32.40 10.87 10.80 ... 11.755
2352 13.09 20.62 6.26 18.79 ... 6.301
....
lavaan.survey doberski@uvt.nl
Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions
Step 1: Fitting the model with lavaan
model.pisa <- "
math =~ PV1MATH1 + PV1MATH2 + PV1MATH3 + PV1MATH4
neg.efficacy =~ ST31Q01 + ST31Q02 + ST31Q03 + ST31Q04 +
ST31Q05 + ST31Q06 + ST31Q07 + ST31Q08
neg.selfconcept =~ ST32Q02 + ST32Q04 + ST32Q06 + ST32Q07 + ST32Q09
neg.selfconcept ~ neg.efficacy + ESCS + male
neg.efficacy ~ neg.selfconcept + school.type + ESCS + male
math ~ neg.selfconcept + neg.efficacy + school.type + ESCS + male
"
fit <- lavaan(model.pisa, data=pisa.be.2003, auto.var=TRUE, std.lv=TRUE,
meanstructure=TRUE, int.ov.free=TRUE, estimator="MLM")
lavaan.survey doberski@uvt.nl
Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions
Step 2: Letting survey know
about the replicate weight design
des.rep <- svrepdesign(ids=~1, weights=~W_FSTUWT,
data=pisa.be.2003,
repweights="W_FSTR[0-9]+",
type="Fay", rho=0.5)
More information on defining complex sampling objects:
Lumley (2004). Complex Surveys: A Guide to Analysis using R. New
York: Wiley.
Thomas Lumley's website with tutorials:
http://r-survey.r-forge.r-project.org/survey/index.html
lavaan.survey doberski@uvt.nl
Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions
Complex survey SEM
fit.surv <-
lavaan.survey(lavaan.fit=fit, survey.design=des.rep)
lavaan.survey doberski@uvt.nl
Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions
Point and standard error estimates using robust ML with and
without correction for the sampling design using BRR replicate
weights.
Estimate s.e.
Naive PML Naive PML creff
neg.selfconcept ∼ neg.efficacy -0.021 -0.050 0.032 0.046 1.415
neg.efficacy ∼ neg.selfconcept 0.568 0.609 0.046 0.065 1.421
neg.efficacy ∼ school.type 0.530 0.518 0.022 0.022 1.009
math ∼ neg.selfconcept -0.179 -0.177 0.015 0.021 1.362
math ∼ neg.efficacy -0.239 -0.237 0.015 0.018 1.216
math ∼ school.type -0.606 -0.596 0.019 0.035 1.858
lavaan.survey doberski@uvt.nl
Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions
Conclusions
lavaan.survey doberski@uvt.nl
Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions
Conclusions
• When there are clusters, strata, and/or weights, you
should usually adjust the SEM analysis for them;
• Especially important for correct standard errors and
model fit statistics.
• lavaan.survey leverages the power of the lavaan and
survey packages to let you do this.
• Future: categorical data.
lavaan.survey doberski@uvt.nl
Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions
doberski@uvt.nl
http://daob.nl
Oberski, D.L. (2014). lavaan.survey: An R Package for Complex Survey
Analysis of Structural Equation Models. Journal of Statistical Software,
57 (1). http://www.jstatsoft.org/v57/i01
lavaan.survey doberski@uvt.nl
Appendix*
Appendix
lavaan.survey doberski@uvt.nl
Appendix*
Specifying regression relations in lavaan
Dependent variable
``regressed
on''
independent variables
selfconcept ~ efficacy + ESCS + male
efficacy ~ selfconcept + school.type +
ESCS + male
math ~ selfconcept + efficacy +
school.type + ESCS + male
lavaan.survey doberski@uvt.nl
Appendix*
PISA data: Math self-concept (ST32Q02,4,6,7,9)
lavaan.survey doberski@uvt.nl
Appendix*
Measurement model for self-concept
neg.selfconcept =~ ST32Q02 + ST32Q04 +
ST32Q06 + ST32Q07 +
ST32Q09
lavaan.survey doberski@uvt.nl
Appendix*
Measurement model for self-concept
Latent variable
neg.selfconcept =~ ST32Q02 + ST32Q04 +
ST32Q06 + ST32Q07 +
ST32Q09
lavaan.survey doberski@uvt.nl
Appendix*
Measurement model for self-concept
Latent variable ``measured by''
neg.selfconcept =~ ST32Q02 + ST32Q04 +
ST32Q06 + ST32Q07 +
ST32Q09
lavaan.survey doberski@uvt.nl
Appendix*
Measurement model for self-concept
Latent variable ``measured by'' obs. variables
neg.selfconcept =~ ST32Q02 + ST32Q04 +
ST32Q06 + ST32Q07 +
ST32Q09
lavaan.survey doberski@uvt.nl
Appendix*
PISA data: math self-efficicacy (ST31Q01--08)
lavaan.survey doberski@uvt.nl
Appendix*
Measurement model for self-efficacy
neg.efficacy =~ ST31Q01 + ST31Q02 + ST31Q03 +
ST31Q04 + ST31Q05 + ST31Q06 +
ST31Q07 + ST31Q08
lavaan.survey doberski@uvt.nl
Appendix*
[1/3] PISA sampling design: clustering
• Target population in each country: 15-year-old students
attending educational institutions in grades 7+;
• Minimum of 150 schools selected/country;
• Within each participating school, (usually 35) students
randomly selected with equal probability;
• Total sample size at least 4,500+ students.
• Other complications (with sampling):
• Getting a population frame
• School nonresponse
• Student nonresponse
• Small schools (< 35 students): representation vs. costs
• Countries with few schools
lavaan.survey doberski@uvt.nl
Appendix*
[2/3] PISA sampling design: stratification
• Before sampling, schools stratified in the sampling frame;
• Schools classified into ``like'' groups, for purpose:
• improve efficiency of the sample design;
• disproportionate sample allocations to specific groups of
schools, e.g. states, provinces, or other regions;
• ensure all parts of population included in the sample;
• representation of specific groups in sample.
lavaan.survey doberski@uvt.nl
Appendix*
PISA sampling: stratification in Belgium
• Region × Public/private education → 23 strata
• Then sampling within stratum with probability
proportional to:
• (Flanders) ISCED; Retention Rate; Vocational/Special
Education; Percentage of Girls;
• (French Community) National/International School;
Retention Rate; Vocational-Special Education/Other;
• (German Community) Public/Private.
lavaan.survey doberski@uvt.nl
Appendix*
[3/3] Weighting
wij = w1iw2ijf1if2ijt1i
w1i: Base weight for school i;
w2ij: Base weight for student j in school i;
f2ij, f1i: Nonresponse factor for student and school
nonparticipation;
t1i: Weight trimming factor.
lavaan.survey doberski@uvt.nl
Appendix*
Weighting
wij = wbrr
i w1iw2ijf1if2ijt1i
w1i: Base weight for school i;
w2ij: Base weight for student j in school i;
f2ij, f1i: Nonresponse factor for student and school
nonparticipation;
t1i: Weight trimming factor.
wbrr
ij : ``Balanced-repeated replication'' weight (80 replications
per case) created by the WesVar software
lavaan.survey doberski@uvt.nl
Appendix*
Balanced-repeated replication (BRR), Fay's ρ = 0.5
lavaan.survey doberski@uvt.nl
Appendix*
Balanced-repeated replication (BRR), Fay's ρ = 0.5
lavaan.survey doberski@uvt.nl
Appendix*
Balanced-repeated replication (BRR), Fay's ρ = 0.5
lavaan.survey doberski@uvt.nl
Appendix*
Structural parameter estimates and standard errors
taking nonnormality into account
Estimate Std.err Z-value P(>|z|)
Regressions:
neg.selfconcept ~
neg.efficacy -0.021 0.032 -0.657 0.511
ESCS -0.062 0.018 -3.381 0.001
male -0.389 0.027 -14.607 0.000
neg.efficacy ~
neg.selfcncpt 0.568 0.046 12.354 0.000
school.type 0.530 0.022 24.331 0.000
ESCS -0.252 0.016 -16.199 0.000
male -0.360 0.031 -11.668 0.000
math ~
neg.selfcncpt -0.179 0.015 -11.644 0.000
neg.efficacy -0.239 0.015 -16.413 0.000
school.type -0.606 0.019 -32.154 0.000
ESCS 0.312 0.014 22.511 0.000
male 0.009 0.024 0.375 0.708
lavaan.survey doberski@uvt.nl
Appendix*
Model fit taking nonnormality into account
Used Total
Number of observations 7785 8796
Estimator ML Robust
Minimum Function Test Statistic 8088.256 7275.544
Degrees of freedom 158 158
P-value (Chi-square) 0.000 0.000
Scaling correction factor 1.112
for the Satorra-Bentler correction
Comparative Fit Index (CFI) 0.921 0.921
Tucker-Lewis Index (TLI) 0.907 0.907
RMSEA 0.080 0.076
P-value RMSEA <= 0.05 0.000 0.000
SRMR 0.056 0.056
lavaan.survey doberski@uvt.nl
Appendix*
Complex survey SEM: fit measures
Number of observations 7785
Estimator ML Robust
Minimum Function Test Statistic 8187.514 5642.873
Degrees of freedom 158 158
P-value (Chi-square) 0.000 0.000
Scaling correction factor 1.451
for the Satorra-Bentler correction
Comparative Fit Index (CFI) 0.921 0.894
Tucker-Lewis Index (TLI) 0.906 0.875
RMSEA 0.081 0.067
P-value RMSEA <= 0.05 0.000 0.000
SRMR 0.058 0.058
lavaan.survey doberski@uvt.nl

Weitere ähnliche Inhalte

Was ist angesagt?

An Introduction to Causal Discovery, a Bayesian Network Approach
An Introduction to Causal Discovery, a Bayesian Network ApproachAn Introduction to Causal Discovery, a Bayesian Network Approach
An Introduction to Causal Discovery, a Bayesian Network Approach
COST action BM1006
 
KNN Algorithm - How KNN Algorithm Works With Example | Data Science For Begin...
KNN Algorithm - How KNN Algorithm Works With Example | Data Science For Begin...KNN Algorithm - How KNN Algorithm Works With Example | Data Science For Begin...
KNN Algorithm - How KNN Algorithm Works With Example | Data Science For Begin...
Simplilearn
 

Was ist angesagt? (20)

Introduction to Machine Learning Classifiers
Introduction to Machine Learning ClassifiersIntroduction to Machine Learning Classifiers
Introduction to Machine Learning Classifiers
 
Isaac newton
Isaac newtonIsaac newton
Isaac newton
 
Naive bayes
Naive bayesNaive bayes
Naive bayes
 
Machine Learning in 10 Minutes | What is Machine Learning? | Edureka
Machine Learning in 10 Minutes | What is Machine Learning? | EdurekaMachine Learning in 10 Minutes | What is Machine Learning? | Edureka
Machine Learning in 10 Minutes | What is Machine Learning? | Edureka
 
An Introduction to Causal Discovery, a Bayesian Network Approach
An Introduction to Causal Discovery, a Bayesian Network ApproachAn Introduction to Causal Discovery, a Bayesian Network Approach
An Introduction to Causal Discovery, a Bayesian Network Approach
 
Self-supervised Learning from Video Sequences - Xavier Giro - UPC Barcelona 2019
Self-supervised Learning from Video Sequences - Xavier Giro - UPC Barcelona 2019Self-supervised Learning from Video Sequences - Xavier Giro - UPC Barcelona 2019
Self-supervised Learning from Video Sequences - Xavier Giro - UPC Barcelona 2019
 
Isaac newton
Isaac newtonIsaac newton
Isaac newton
 
Classification metrics
Classification metrics Classification metrics
Classification metrics
 
Lecture 5 Decision tree.pdf
Lecture 5 Decision tree.pdfLecture 5 Decision tree.pdf
Lecture 5 Decision tree.pdf
 
Intro to Biomedical Informatics 701
Intro to Biomedical Informatics 701 Intro to Biomedical Informatics 701
Intro to Biomedical Informatics 701
 
Roadmap: How to Learn Machine Learning in 6 Months
Roadmap: How to Learn Machine Learning in 6 MonthsRoadmap: How to Learn Machine Learning in 6 Months
Roadmap: How to Learn Machine Learning in 6 Months
 
Simple linear regression
Simple linear regressionSimple linear regression
Simple linear regression
 
ML - Simple Linear Regression
ML - Simple Linear RegressionML - Simple Linear Regression
ML - Simple Linear Regression
 
Sir isaac newton ppt
Sir isaac newton pptSir isaac newton ppt
Sir isaac newton ppt
 
Introduction to Maximum Likelihood Estimator
Introduction to Maximum Likelihood EstimatorIntroduction to Maximum Likelihood Estimator
Introduction to Maximum Likelihood Estimator
 
Lecture 2.pptx
Lecture 2.pptxLecture 2.pptx
Lecture 2.pptx
 
KNN Algorithm - How KNN Algorithm Works With Example | Data Science For Begin...
KNN Algorithm - How KNN Algorithm Works With Example | Data Science For Begin...KNN Algorithm - How KNN Algorithm Works With Example | Data Science For Begin...
KNN Algorithm - How KNN Algorithm Works With Example | Data Science For Begin...
 
Ada Lovelace-The First Programmer
Ada Lovelace-The First ProgrammerAda Lovelace-The First Programmer
Ada Lovelace-The First Programmer
 
Measurement PPT
Measurement PPTMeasurement PPT
Measurement PPT
 
Introduction to Principle Component Analysis
Introduction to Principle Component AnalysisIntroduction to Principle Component Analysis
Introduction to Principle Component Analysis
 

Andere mochten auch

Multidirectional survey measurement errors: the latent class MTMM model
Multidirectional survey measurement errors: the latent class MTMM modelMultidirectional survey measurement errors: the latent class MTMM model
Multidirectional survey measurement errors: the latent class MTMM model
Daniel Oberski
 

Andere mochten auch (9)

ESRA2015 course: Latent Class Analysis for Survey Research
ESRA2015 course: Latent Class Analysis for Survey ResearchESRA2015 course: Latent Class Analysis for Survey Research
ESRA2015 course: Latent Class Analysis for Survey Research
 
Complex sampling in latent variable models
Complex sampling in latent variable modelsComplex sampling in latent variable models
Complex sampling in latent variable models
 
How good are administrative register data and what can we do about it?
How good are administrative register data and what can we do about it?How good are administrative register data and what can we do about it?
How good are administrative register data and what can we do about it?
 
A measure to evaluate latent variable model fit by sensitivity analysis
A measure to evaluate latent variable model fit by sensitivity analysisA measure to evaluate latent variable model fit by sensitivity analysis
A measure to evaluate latent variable model fit by sensitivity analysis
 
Multidirectional survey measurement errors: the latent class MTMM model
Multidirectional survey measurement errors: the latent class MTMM modelMultidirectional survey measurement errors: the latent class MTMM model
Multidirectional survey measurement errors: the latent class MTMM model
 
Predicting the quality of a survey question from its design characteristics: SQP
Predicting the quality of a survey question from its design characteristics: SQPPredicting the quality of a survey question from its design characteristics: SQP
Predicting the quality of a survey question from its design characteristics: SQP
 
Predicting the quality of a survey question from its design characteristics
Predicting the quality of a survey question from its design characteristicsPredicting the quality of a survey question from its design characteristics
Predicting the quality of a survey question from its design characteristics
 
An introduction to structural equation models in R using the Lavaan package
An introduction to structural equation models in R using the Lavaan packageAn introduction to structural equation models in R using the Lavaan package
An introduction to structural equation models in R using the Lavaan package
 
Detecting local dependence in latent class models
Detecting local dependence in latent class modelsDetecting local dependence in latent class models
Detecting local dependence in latent class models
 

Ähnlich wie lavaan.survey: An R package for complex survey analysis of structural equation models

SQCFramework: SPARQL Query containment Benchmark Generation Framework
SQCFramework: SPARQL Query containment  Benchmark Generation Framework SQCFramework: SPARQL Query containment  Benchmark Generation Framework
SQCFramework: SPARQL Query containment Benchmark Generation Framework
Muhammad Saleem
 
A WHIRLWIND TOUR OF ACADEMIC TECHNIQUES FOR REAL-WORLD SECURITY RESEARCHERS
A WHIRLWIND TOUR OF ACADEMIC TECHNIQUES FOR REAL-WORLD SECURITY RESEARCHERSA WHIRLWIND TOUR OF ACADEMIC TECHNIQUES FOR REAL-WORLD SECURITY RESEARCHERS
A WHIRLWIND TOUR OF ACADEMIC TECHNIQUES FOR REAL-WORLD SECURITY RESEARCHERS
Silvio Cesare
 
Declarative data analysis
Declarative data analysisDeclarative data analysis
Declarative data analysis
South West Data Meetup
 

Ähnlich wie lavaan.survey: An R package for complex survey analysis of structural equation models (20)

SQCFramework: SPARQL Query Containment Benchmarks Generation Framework
SQCFramework: SPARQL Query Containment Benchmarks Generation FrameworkSQCFramework: SPARQL Query Containment Benchmarks Generation Framework
SQCFramework: SPARQL Query Containment Benchmarks Generation Framework
 
[poster] A Compare-Aggregate Model with Latent Clustering for Answer Selection
[poster] A Compare-Aggregate Model with Latent Clustering for Answer Selection[poster] A Compare-Aggregate Model with Latent Clustering for Answer Selection
[poster] A Compare-Aggregate Model with Latent Clustering for Answer Selection
 
Data mining 8 estimasi linear regression
Data mining 8   estimasi linear regressionData mining 8   estimasi linear regression
Data mining 8 estimasi linear regression
 
On the effectiveness of feature set augmentation using clusters of word embed...
On the effectiveness of feature set augmentation using clusters of word embed...On the effectiveness of feature set augmentation using clusters of word embed...
On the effectiveness of feature set augmentation using clusters of word embed...
 
Essay on-data-analysis
Essay on-data-analysisEssay on-data-analysis
Essay on-data-analysis
 
SQCFramework: SPARQL Query containment Benchmark Generation Framework
SQCFramework: SPARQL Query containment  Benchmark Generation Framework SQCFramework: SPARQL Query containment  Benchmark Generation Framework
SQCFramework: SPARQL Query containment Benchmark Generation Framework
 
A WHIRLWIND TOUR OF ACADEMIC TECHNIQUES FOR REAL-WORLD SECURITY RESEARCHERS
A WHIRLWIND TOUR OF ACADEMIC TECHNIQUES FOR REAL-WORLD SECURITY RESEARCHERSA WHIRLWIND TOUR OF ACADEMIC TECHNIQUES FOR REAL-WORLD SECURITY RESEARCHERS
A WHIRLWIND TOUR OF ACADEMIC TECHNIQUES FOR REAL-WORLD SECURITY RESEARCHERS
 
Multimodal Residual Learning for Visual Question-Answering
Multimodal Residual Learning for Visual Question-AnsweringMultimodal Residual Learning for Visual Question-Answering
Multimodal Residual Learning for Visual Question-Answering
 
2020: Prototype-based classifiers and relevance learning: medical application...
2020: Prototype-based classifiers and relevance learning: medical application...2020: Prototype-based classifiers and relevance learning: medical application...
2020: Prototype-based classifiers and relevance learning: medical application...
 
K-Nearest Neighbor Classifier
K-Nearest Neighbor ClassifierK-Nearest Neighbor Classifier
K-Nearest Neighbor Classifier
 
Machine Learning - Simple Linear Regression
Machine Learning - Simple Linear RegressionMachine Learning - Simple Linear Regression
Machine Learning - Simple Linear Regression
 
ECCV WS 2012 (Frank)
ECCV WS 2012 (Frank)ECCV WS 2012 (Frank)
ECCV WS 2012 (Frank)
 
Fast Distributed Online Classification
Fast Distributed Online ClassificationFast Distributed Online Classification
Fast Distributed Online Classification
 
[M3A4] Data Analysis and Interpretation Specialization
[M3A4] Data Analysis and Interpretation Specialization[M3A4] Data Analysis and Interpretation Specialization
[M3A4] Data Analysis and Interpretation Specialization
 
analysis part 02.pptx
analysis part 02.pptxanalysis part 02.pptx
analysis part 02.pptx
 
Declarative data analysis
Declarative data analysisDeclarative data analysis
Declarative data analysis
 
Barga Data Science lecture 5
Barga Data Science lecture 5Barga Data Science lecture 5
Barga Data Science lecture 5
 
MM - KBAC: Using mixed models to adjust for population structure in a rare-va...
MM - KBAC: Using mixed models to adjust for population structure in a rare-va...MM - KBAC: Using mixed models to adjust for population structure in a rare-va...
MM - KBAC: Using mixed models to adjust for population structure in a rare-va...
 
Topic Set Size Design with Variance Estimates from Two-Way ANOVA
Topic Set Size Design with Variance Estimates from Two-Way ANOVATopic Set Size Design with Variance Estimates from Two-Way ANOVA
Topic Set Size Design with Variance Estimates from Two-Way ANOVA
 
Slides for "Do Deep Generative Models Know What They Don't know?"
Slides for "Do Deep Generative Models Know What They Don't know?"Slides for "Do Deep Generative Models Know What They Don't know?"
Slides for "Do Deep Generative Models Know What They Don't know?"
 

Kürzlich hochgeladen

Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learning
levieagacer
 
biology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGYbiology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGY
1301aanya
 
Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.
Silpa
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
Areesha Ahmad
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Sérgio Sacani
 
Human genetics..........................pptx
Human genetics..........................pptxHuman genetics..........................pptx
Human genetics..........................pptx
Silpa
 
POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.
Silpa
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Digital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxDigital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptx
MohamedFarag457087
 

Kürzlich hochgeladen (20)

Selaginella: features, morphology ,anatomy and reproduction.
Selaginella: features, morphology ,anatomy and reproduction.Selaginella: features, morphology ,anatomy and reproduction.
Selaginella: features, morphology ,anatomy and reproduction.
 
GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)
 
pumpkin fruit fly, water melon fruit fly, cucumber fruit fly
pumpkin fruit fly, water melon fruit fly, cucumber fruit flypumpkin fruit fly, water melon fruit fly, cucumber fruit fly
pumpkin fruit fly, water melon fruit fly, cucumber fruit fly
 
Molecular markers- RFLP, RAPD, AFLP, SNP etc.
Molecular markers- RFLP, RAPD, AFLP, SNP etc.Molecular markers- RFLP, RAPD, AFLP, SNP etc.
Molecular markers- RFLP, RAPD, AFLP, SNP etc.
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx
 
Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.
 
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIACURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learning
 
biology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGYbiology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGY
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)
 
Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
 
Factory Acceptance Test( FAT).pptx .
Factory Acceptance Test( FAT).pptx       .Factory Acceptance Test( FAT).pptx       .
Factory Acceptance Test( FAT).pptx .
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
 
Human genetics..........................pptx
Human genetics..........................pptxHuman genetics..........................pptx
Human genetics..........................pptx
 
POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical Science
 
Digital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxDigital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptx
 

lavaan.survey: An R package for complex survey analysis of structural equation models

  • 1. Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions lavaan.survey: An R package for complex survey analysis of structural equation models Daniel Oberski Department of methodology and statistics lavaan.survey doberski@uvt.nl
  • 2. Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions lavaan.survey doberski@uvt.nl
  • 3. Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions What is "complex" about a sample survey? lavaan.survey doberski@uvt.nl
  • 4. Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions 1. Clustering 3. Weighting 2. Stratification lavaan.survey doberski@uvt.nl
  • 5. Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions Why account for complex sampling in structural equation modeling? • Clustering, stratification, and weighting all affect the standard errors. • Weighting affects the point estimates; (Skinner, Holt & Smith 1989; Pfefferman 1993; Muthén & Satorra 1995; Asparouhov 2005; Asparouhov, & Muthén 2005; Stapleton 2006; Asparouhov & Muthén 2007; Fuller 2009; Bollen, Tueller, & Oberski 2013) lavaan.survey doberski@uvt.nl
  • 6. Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions lavaan.survey Input: 1. A lavaan fit object (SEM analysis) 2. A survey design object (complex sample design) Output: A lavaan fit object (SEM analysis), accounting for the complex sample design lavaan.survey doberski@uvt.nl
  • 7. Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions lavaan.survey Input: 1. A lavaan fit object (SEM analysis) 2. A survey design object (complex sample design) Output: A lavaan fit object (SEM analysis), accounting for the complex sample design Example call: lavaan.survey(lavaan.fit, survey.design) lavaan.survey doberski@uvt.nl
  • 8. Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions How to fit a structural equation model accounting for complex sampling using lavaan.survey in 3 steps: ..1 Fit the SEM using lavaan as you normally would → lavaan fit object; ..2 Define the complex sampling design to the survey package → svydesign or svrepdesign object; ..3 Call lavaan.survey with the two objects obtained in (1) and (2). → corrected lavaan fit object. Afterwards, usual lavaan apparatus available. lavaan.survey doberski@uvt.nl
  • 9. Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions Example complex sample SEM analysis from the literature Ferla, Valcke & Cai (2009). Academic self-efficacy and academic self-concept: Reconsidering structural relationships. Learning and Individual Differences, 19 (4). lavaan.survey doberski@uvt.nl
  • 10. Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions Structural equation model of Belgian PISA data lavaan.survey doberski@uvt.nl
  • 11. Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions PISA data in the lavaan.survey package load("pisa.be.2003.rda") > head(pisa.be.2003, 5)[,1:20] PV1MATH1 PV1MATH2 PV1MATH3 PV1MATH4 ST31Q01 ST31Q02 ST31Q03 ST31Q04 1 1.160 1.26 1.19 1.190 2 1 2 2 2 1.293 1.27 1.22 1.247 1 1 1 1 3 1.383 1.44 1.37 1.347 2 1 1 1 4 0.965 1.02 1.10 0.979 2 2 2 2 5 1.375 1.44 1.31 1.471 3 1 2 2 ST31Q05 ST31Q06 ST31Q07 ST31Q08 ST32Q02 ST32Q04 ST32Q06 ST32Q07 1 2 2 2 3 2 3 3 3 2 1 2 1 4 2 3 3 4 3 1 1 2 3 4 2 1 1 4 2 1 2 2 3 3 2 2 5 1 3 3 3 3 2 3 3 ST32Q09 ESCS male school.type 1 3 1.221 1 1 2 4 1.899 2 1 3 2 0.884 2 1 4 3 1.305 2 1 5 3 0.958 2 1 lavaan.survey doberski@uvt.nl
  • 12. Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions Belgium is a complex country. lavaan.survey doberski@uvt.nl
  • 13. Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions In addition to the observed variables, the raw data also contain 80 replicate weights generated by ``balanced-repeated replication'' (BRR) > pisa.be.2003[sample(1:8796), 21:101] W_FSTUWT W_FSTR1 W_FSTR2 W_FSTR3 ... W_FSTR80 2988 1.12 1.64 0.57 1.70 ... 0.528 8666 11.90 17.84 5.95 5.95 ... 5.948 290 16.89 25.33 8.44 8.44 ... 25.333 3505 14.80 8.25 7.90 20.67 ... 8.206 4289 22.70 32.40 10.87 10.80 ... 11.755 2352 13.09 20.62 6.26 18.79 ... 6.301 .... lavaan.survey doberski@uvt.nl
  • 14. Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions Step 1: Fitting the model with lavaan model.pisa <- " math =~ PV1MATH1 + PV1MATH2 + PV1MATH3 + PV1MATH4 neg.efficacy =~ ST31Q01 + ST31Q02 + ST31Q03 + ST31Q04 + ST31Q05 + ST31Q06 + ST31Q07 + ST31Q08 neg.selfconcept =~ ST32Q02 + ST32Q04 + ST32Q06 + ST32Q07 + ST32Q09 neg.selfconcept ~ neg.efficacy + ESCS + male neg.efficacy ~ neg.selfconcept + school.type + ESCS + male math ~ neg.selfconcept + neg.efficacy + school.type + ESCS + male " fit <- lavaan(model.pisa, data=pisa.be.2003, auto.var=TRUE, std.lv=TRUE, meanstructure=TRUE, int.ov.free=TRUE, estimator="MLM") lavaan.survey doberski@uvt.nl
  • 15. Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions Step 2: Letting survey know about the replicate weight design des.rep <- svrepdesign(ids=~1, weights=~W_FSTUWT, data=pisa.be.2003, repweights="W_FSTR[0-9]+", type="Fay", rho=0.5) More information on defining complex sampling objects: Lumley (2004). Complex Surveys: A Guide to Analysis using R. New York: Wiley. Thomas Lumley's website with tutorials: http://r-survey.r-forge.r-project.org/survey/index.html lavaan.survey doberski@uvt.nl
  • 16. Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions Complex survey SEM fit.surv <- lavaan.survey(lavaan.fit=fit, survey.design=des.rep) lavaan.survey doberski@uvt.nl
  • 17. Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions Point and standard error estimates using robust ML with and without correction for the sampling design using BRR replicate weights. Estimate s.e. Naive PML Naive PML creff neg.selfconcept ∼ neg.efficacy -0.021 -0.050 0.032 0.046 1.415 neg.efficacy ∼ neg.selfconcept 0.568 0.609 0.046 0.065 1.421 neg.efficacy ∼ school.type 0.530 0.518 0.022 0.022 1.009 math ∼ neg.selfconcept -0.179 -0.177 0.015 0.021 1.362 math ∼ neg.efficacy -0.239 -0.237 0.015 0.018 1.216 math ∼ school.type -0.606 -0.596 0.019 0.035 1.858 lavaan.survey doberski@uvt.nl
  • 18. Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions Conclusions lavaan.survey doberski@uvt.nl
  • 19. Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions Conclusions • When there are clusters, strata, and/or weights, you should usually adjust the SEM analysis for them; • Especially important for correct standard errors and model fit statistics. • lavaan.survey leverages the power of the lavaan and survey packages to let you do this. • Future: categorical data. lavaan.survey doberski@uvt.nl
  • 20. Complex sample surveys Using lavaan.survey Example lavaan.survey analysis Conclusions doberski@uvt.nl http://daob.nl Oberski, D.L. (2014). lavaan.survey: An R Package for Complex Survey Analysis of Structural Equation Models. Journal of Statistical Software, 57 (1). http://www.jstatsoft.org/v57/i01 lavaan.survey doberski@uvt.nl
  • 22. Appendix* Specifying regression relations in lavaan Dependent variable ``regressed on'' independent variables selfconcept ~ efficacy + ESCS + male efficacy ~ selfconcept + school.type + ESCS + male math ~ selfconcept + efficacy + school.type + ESCS + male lavaan.survey doberski@uvt.nl
  • 23. Appendix* PISA data: Math self-concept (ST32Q02,4,6,7,9) lavaan.survey doberski@uvt.nl
  • 24. Appendix* Measurement model for self-concept neg.selfconcept =~ ST32Q02 + ST32Q04 + ST32Q06 + ST32Q07 + ST32Q09 lavaan.survey doberski@uvt.nl
  • 25. Appendix* Measurement model for self-concept Latent variable neg.selfconcept =~ ST32Q02 + ST32Q04 + ST32Q06 + ST32Q07 + ST32Q09 lavaan.survey doberski@uvt.nl
  • 26. Appendix* Measurement model for self-concept Latent variable ``measured by'' neg.selfconcept =~ ST32Q02 + ST32Q04 + ST32Q06 + ST32Q07 + ST32Q09 lavaan.survey doberski@uvt.nl
  • 27. Appendix* Measurement model for self-concept Latent variable ``measured by'' obs. variables neg.selfconcept =~ ST32Q02 + ST32Q04 + ST32Q06 + ST32Q07 + ST32Q09 lavaan.survey doberski@uvt.nl
  • 28. Appendix* PISA data: math self-efficicacy (ST31Q01--08) lavaan.survey doberski@uvt.nl
  • 29. Appendix* Measurement model for self-efficacy neg.efficacy =~ ST31Q01 + ST31Q02 + ST31Q03 + ST31Q04 + ST31Q05 + ST31Q06 + ST31Q07 + ST31Q08 lavaan.survey doberski@uvt.nl
  • 30. Appendix* [1/3] PISA sampling design: clustering • Target population in each country: 15-year-old students attending educational institutions in grades 7+; • Minimum of 150 schools selected/country; • Within each participating school, (usually 35) students randomly selected with equal probability; • Total sample size at least 4,500+ students. • Other complications (with sampling): • Getting a population frame • School nonresponse • Student nonresponse • Small schools (< 35 students): representation vs. costs • Countries with few schools lavaan.survey doberski@uvt.nl
  • 31. Appendix* [2/3] PISA sampling design: stratification • Before sampling, schools stratified in the sampling frame; • Schools classified into ``like'' groups, for purpose: • improve efficiency of the sample design; • disproportionate sample allocations to specific groups of schools, e.g. states, provinces, or other regions; • ensure all parts of population included in the sample; • representation of specific groups in sample. lavaan.survey doberski@uvt.nl
  • 32. Appendix* PISA sampling: stratification in Belgium • Region × Public/private education → 23 strata • Then sampling within stratum with probability proportional to: • (Flanders) ISCED; Retention Rate; Vocational/Special Education; Percentage of Girls; • (French Community) National/International School; Retention Rate; Vocational-Special Education/Other; • (German Community) Public/Private. lavaan.survey doberski@uvt.nl
  • 33. Appendix* [3/3] Weighting wij = w1iw2ijf1if2ijt1i w1i: Base weight for school i; w2ij: Base weight for student j in school i; f2ij, f1i: Nonresponse factor for student and school nonparticipation; t1i: Weight trimming factor. lavaan.survey doberski@uvt.nl
  • 34. Appendix* Weighting wij = wbrr i w1iw2ijf1if2ijt1i w1i: Base weight for school i; w2ij: Base weight for student j in school i; f2ij, f1i: Nonresponse factor for student and school nonparticipation; t1i: Weight trimming factor. wbrr ij : ``Balanced-repeated replication'' weight (80 replications per case) created by the WesVar software lavaan.survey doberski@uvt.nl
  • 35. Appendix* Balanced-repeated replication (BRR), Fay's ρ = 0.5 lavaan.survey doberski@uvt.nl
  • 36. Appendix* Balanced-repeated replication (BRR), Fay's ρ = 0.5 lavaan.survey doberski@uvt.nl
  • 37. Appendix* Balanced-repeated replication (BRR), Fay's ρ = 0.5 lavaan.survey doberski@uvt.nl
  • 38. Appendix* Structural parameter estimates and standard errors taking nonnormality into account Estimate Std.err Z-value P(>|z|) Regressions: neg.selfconcept ~ neg.efficacy -0.021 0.032 -0.657 0.511 ESCS -0.062 0.018 -3.381 0.001 male -0.389 0.027 -14.607 0.000 neg.efficacy ~ neg.selfcncpt 0.568 0.046 12.354 0.000 school.type 0.530 0.022 24.331 0.000 ESCS -0.252 0.016 -16.199 0.000 male -0.360 0.031 -11.668 0.000 math ~ neg.selfcncpt -0.179 0.015 -11.644 0.000 neg.efficacy -0.239 0.015 -16.413 0.000 school.type -0.606 0.019 -32.154 0.000 ESCS 0.312 0.014 22.511 0.000 male 0.009 0.024 0.375 0.708 lavaan.survey doberski@uvt.nl
  • 39. Appendix* Model fit taking nonnormality into account Used Total Number of observations 7785 8796 Estimator ML Robust Minimum Function Test Statistic 8088.256 7275.544 Degrees of freedom 158 158 P-value (Chi-square) 0.000 0.000 Scaling correction factor 1.112 for the Satorra-Bentler correction Comparative Fit Index (CFI) 0.921 0.921 Tucker-Lewis Index (TLI) 0.907 0.907 RMSEA 0.080 0.076 P-value RMSEA <= 0.05 0.000 0.000 SRMR 0.056 0.056 lavaan.survey doberski@uvt.nl
  • 40. Appendix* Complex survey SEM: fit measures Number of observations 7785 Estimator ML Robust Minimum Function Test Statistic 8187.514 5642.873 Degrees of freedom 158 158 P-value (Chi-square) 0.000 0.000 Scaling correction factor 1.451 for the Satorra-Bentler correction Comparative Fit Index (CFI) 0.921 0.894 Tucker-Lewis Index (TLI) 0.906 0.875 RMSEA 0.081 0.067 P-value RMSEA <= 0.05 0.000 0.000 SRMR 0.058 0.058 lavaan.survey doberski@uvt.nl