Lecture 5
Survey Research & Design in Psychology
James Neill, 2010
Exploratory Factor Analysis
2
Overview
1. What is factor analysis?
2. Assumptions
3. Steps / Process
4. Examples
5. Summary
3
What is factor analysis?
1. What is factor analysis?
2. Purpose
3. History
4. Types
5. Models
4
Universe is to galaxy
as all variables are to a factor
(a factor is a cluster of variables within the full set, just as a galaxy is a cluster of stars within the universe).
5
Conceptual model of factor analysis
FA uses correlations
among many items
to search for
common clusters.
6
Factor analysis...
• is used to identify clusters of inter-
correlated variables (called
'factors').
• is a family of multivariate statistical
techniques for examining
correlations amongst variables.
• empirically tests theoretical data
structures.
• is commonly used in psychometric
instrument development.
7
Purposes
There are two main applications of
factor analytic techniques:
1. Data reduction: Reduce the
number of variables to a smaller
number of factors.
2. Theory development: Detect
structure in the relationships
between variables, that is, to
classify variables.
8
Purposes: Data reduction
• Simplifies data structure by
revealing a smaller number of
underlying factors
• Helps to eliminate or identify
items for improvement:
• redundant variables
• unclear variables
• irrelevant variables
• Leads to calculating factor scores
9
Purposes: Theory development
• Investigates the underlying
correlational pattern shared by
the variables in order to test
theoretical models e.g., how many
personality factors are there?
• The goal is to address a
theoretical question as opposed
to calculating factor scores.
10
History of factor analysis
• Invented by Charles Spearman (1904)
• Usage hampered by onerousness of
hand calculation
• Since the advent of computers, usage
has thrived, esp. to develop:
– Theory e.g., determining the structure of
personality
– Practice e.g., development of 10,000s+ of
psychological screening & measurement instruments
11
EFA = Exploratory Factor Analysis
• explores & summarises underlying
correlational structure for a data set
CFA = Confirmatory Factor Analysis
• tests the correlational structure of a
data set against a hypothesised
structure and rates the “goodness of fit”
Two main types of FA:
Exploratory vs.
confirmatory factor analysis
12
This (introductory) lecture focuses on
Exploratory Factor Analysis
(recommended for undergraduate level).
However, note that Confirmatory
Factor Analysis (and Structural
Equation Modeling) is generally preferred
but is more advanced and recommended
for graduate level.
This lecture focuses on
exploratory factor analysis
13
Factor 1 Factor 2 Factor 3
Conceptual model - Simple model
• e.g., 12 test items might actually tap
only 3 underlying factors
• Factors consist of relatively
homogeneous variables.
14
Eysenck’s 3 personality factors
Extraversion/
introversion
Neuroticism Psychoticism
talkative
shy sociable
fun
anxious
gloomy
relaxed
tense
unconventional
nurturing harsh loner
e.g., 12 items testing
three underlying dimensions
of personality
15
Question 1
Conceptual model - Simple model
Question 2
Question 3
Question 4
Question 5
Factor 1
Factor 2
Factor 3
Each question loads onto one factor
16
Question 1
Conceptual model - Complex model
Question 2
Question 3
Question 4
Question 5
Factor 1
Factor 2
Factor 3
Questions may load onto more than one factor
17
Conceptual model – Area plot
Correlation between
X1 and X2
A theoretical factor
which is partly
measured by the
common aspects of
X1 and X2
18
Nine factors?
(independent
items)
One factor?
Three factors?
How many factors?
19
Does personality consist of 2, 3, or
5, 16, etc. factors? e.g., the “Big 5”?
• Neuroticism
• Extraversion
• Agreeableness
• Openness
• Conscientiousness
Example: Personality
20
Does intelligence consist of
separate factors, e.g.,
• Verbal
• Mathematical
• Interpersonal, etc.?
...or is it one global factor (g)?
...or is it hierarchically structured?
Example: Intelligence
21
Example: Essential facial features
(Ivancevic, 2003)
22
Six orthogonal factors represent
76.5% of the total variability in facial
recognition (in order of importance):
• upper-lip
• eyebrow-position
• nose-width
• eye-position
• eye/eyebrow-length
• face-width
Example: Essential facial features
(Ivancevic, 2003)
23
Assumptions
1. GIGO
2. Sample size
3. Levels of measurement
4. Normality
5. Linearity
6. Outliers
7. Factorability
24
Garbage In, Garbage Out (GIGO)
• Screen the data
• Use variables that theoretically “go together”
25
Assumption testing:
Sample size
Some guidelines:
• Min.: N > 5 cases per variable
● e.g., 20 variables, should have > 100 cases (1:5)
• Ideal: N > 20 cases per variable
● e.g., 20 variables, ideally have > 400 cases (1:20)
• Total N > 200 preferable
26
Assumption testing:
Sample size
Comrey and Lee (1992):
50 = very poor,
100 = poor,
200 = fair,
300 = good,
500 = very good
1000+ = excellent
27
Assumption testing: Sample size
28
Assumption testing:
Level of measurement
• All variables must be suitable for
correlational analysis, i.e., they
should be ratio/metric data or at
least Likert data with several
interval levels.
29
Assumption testing:
Normality
• FA is robust to violation of
assumptions of normality
• If the variables are normally
distributed then the solution is
enhanced
30
Assumption Testing:
Linearity
• Because FA is based on
correlations between variables, it is
important to check there are linear
relations amongst the variables
(i.e., check scatterplots)
31
Assumption testing:
Outliers
• FA is sensitive to outlying cases
–Bivariate outliers
(e.g., check scatterplots)
–Multivariate outliers
(e.g., Mahalanobis’ distance)
• Identify outliers, then remove or
transform
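As a rough illustration of the multivariate outlier check, here is a minimal Python sketch using Mahalanobis distance; it assumes the item responses are in a pandas DataFrame called df (the function name and the .001 cut-off are illustrative choices, not a prescribed procedure):

```python
# Minimal sketch: flag multivariate outliers via Mahalanobis distance
import numpy as np
import pandas as pd
from scipy.stats import chi2

def mahalanobis_outliers(df: pd.DataFrame, alpha: float = 0.001) -> pd.Series:
    """Flag cases whose squared Mahalanobis distance exceeds the chi-square critical value."""
    X = df.to_numpy(dtype=float)
    centred = X - X.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
    d2 = np.einsum('ij,jk,ik->i', centred, cov_inv, centred)  # squared distance per case
    critical = chi2.ppf(1 - alpha, df=X.shape[1])             # df = number of variables
    return pd.Series(d2 > critical, index=df.index, name='outlier')
```

Flagged cases can then be inspected and removed or transformed before factoring.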
32
• 15 classroom behaviours of high-
school children were rated by
teachers using a 5-point scale.
• Task: Identify groups of variables
(behaviours) that are strongly
inter-related & represent
underlying factors.
Example factor analysis:
Classroom behaviour
33
Classroom behaviour items
1. Cannot concentrate ↔ can concentrate
2. Curious & enquiring ↔ little curiosity
3. Perseveres ↔ lacks perseverance
4. Irritable ↔ even-tempered
5. Easily excited ↔ not easily excited
6. Patient ↔ demanding
7. Easily upset ↔ contented
34
8. Control ↔ no control
9. Relates warmly to others ↔ disruptive
10. Persistent ↔ frustrated
11. Difficult ↔ easy
12. Restless ↔ relaxed
13. Lively ↔ settled
14. Purposeful ↔ aimless
15. Cooperative ↔ disputes
Classroom behaviour items
35
Classroom behaviour items
36
Assumption testing:
Factorability
Check the factorability of the correlation
matrix (i.e., how suitable is the data for factor
analysis?) by one or more of the following
methods:
• Correlation matrix correlations > .3?
• Anti-image matrix diagonals > .5?
• Measures of sampling adequacy (MSAs)?
– Bartlett’s sig.?
– KMO > .5 or .6?
37
Assumption testing:
Factorability (Correlations)
Correlation Matrix

                CONCENTRATES  CURIOUS  PERSEVERES  EVEN-TEMPERED  PLACID
CONCENTRATES        1.000      .717       .751         .554        .429
CURIOUS              .717     1.000       .826         .472        .262
PERSEVERES           .751      .826      1.000         .507        .311
EVEN-TEMPERED        .554      .472       .507        1.000        .610
PLACID               .429      .262       .311         .610       1.000
Are there SOME correlations over .3? If so,
proceed with FA
Takes some effort with a large number of
variables, but accurate
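A small Python sketch of this check, assuming the items are the columns of a pandas DataFrame df (correlations_over_threshold is a made-up helper name):

```python
# Minimal sketch: what share of the off-diagonal correlations exceed .3?
import numpy as np
import pandas as pd

def correlations_over_threshold(df: pd.DataFrame, threshold: float = 0.3) -> float:
    r = df.corr().to_numpy()
    off_diag = r[np.triu_indices_from(r, k=1)]     # upper triangle, diagonal excluded
    return float(np.mean(np.abs(off_diag) > threshold))
```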
38
• Examine the diagonals on the anti-
image correlation matrix
• Consider variables with correlations
less than .5 for exclusion from the
analysis – they lack sufficient
correlation with other variables
• Medium effort, reasonably accurate
Assumption testing:
Factorability: Anti-image
correlation matrix
39
Anti-Image correlation matrix
Make sure to look at the anti-image
CORRELATION matrix
40
• Global diagnostic indicators -
correlation matrix is factorable if:
–Bartlett’s test of sphericity is
significant and/or
–Kaiser-Meyer-Olkin (KMO) measure
of sampling adequacy > .5 or .6
• Quickest method, but least reliable
Assumption testing:
Factorability: Measures of
sampling adequacy
41
Assumption testing:
Factorability
42
Summary:
Measures of sampling adequacy
Draw on one or more of the
following to help determine the
factorability of a correlation matrix:
1. Several correlations > .3?
2. Anti-image matrix diagonals > .5?
3. Bartlett’s test significant?
4. KMO > .5 to .6?
(depends on whose rule of thumb)
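These checks can be scripted; a minimal sketch, assuming the third-party factor_analyzer package is installed and df holds the item responses (factorability_report is a hypothetical helper):

```python
# Minimal sketch: Bartlett's test, overall KMO, and per-item MSAs
import pandas as pd
from factor_analyzer.factor_analyzer import calculate_bartlett_sphericity, calculate_kmo

def factorability_report(df: pd.DataFrame) -> None:
    chi_square, p_value = calculate_bartlett_sphericity(df)
    kmo_per_item, kmo_total = calculate_kmo(df)
    print(f"Bartlett's test: chi-square = {chi_square:.2f}, p = {p_value:.4f}")
    print(f"Overall KMO = {kmo_total:.2f} (want > .5 or .6)")
    # Per-item MSAs play a similar role to the anti-image correlation diagonals
    print("Items with MSA < .5 (candidates for exclusion):",
          list(df.columns[kmo_per_item < 0.5]))
```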
43
Steps / Process
1. Test assumptions
2. Select type of analysis
3. Determine no. of factors
(Eigen Values, Scree plot, % variance explained)
4. Select items
(check factor loadings to identify which items belong
in which factor; drop items one by one; repeat)
5. Name and define factors
6. Examine correlations amongst factors
7. Analyse internal reliability
8. Compute composite scores
44
Type of EFA:
Extraction method: PC vs. PAF
Two main approaches to EFA:
• Analyses all variance:
Principal Components (PC)
• Analyses shared variance:
Principal Axis Factoring (PAF)
45
Principal components (PC)
• More common
• More practical
• Used to reduce data to a set of factor
scores for use in other analyses
• Analyses all the variance in each
variable
46
Principal axis factoring (PAF)
• Used to uncover the structure of an
underlying set of p original variables
• More theoretical
• Analyses only shared variance
(i.e. leaves out unique variance)
47
Total variance of a variable
[Diagram: the total variance of a variable, contrasting the portion analysed by Principal Components (PC; all variance) with that analysed by Principal Axis Factoring (PAF; shared variance only)]
48
• Often there is little difference in the
solutions for the two procedures.
• If unsure, check your data using
both techniques
• If you get different solutions for the
two methods, try to work out why
and decide on which solution is
more appropriate
PC vs. PAF
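To make the distinction concrete, here is an illustrative numpy-only sketch of the two extraction ideas (not SPSS's exact algorithm); R is assumed to be the item correlation matrix as a numpy array:

```python
# Minimal sketch: PC analyses the full correlation matrix (1s on the diagonal);
# PAF replaces the diagonal with communality estimates and iterates.
import numpy as np

def pc_loadings(R: np.ndarray, n_factors: int) -> np.ndarray:
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1][:n_factors]
    return eigvecs[:, order] * np.sqrt(eigvals[order])

def paf_loadings(R: np.ndarray, n_factors: int, n_iter: int = 50) -> np.ndarray:
    h2 = 1 - 1 / np.diag(np.linalg.inv(R))        # initial communalities: squared multiple correlations
    for _ in range(n_iter):
        R_reduced = R.copy()
        np.fill_diagonal(R_reduced, h2)            # analyse shared variance only
        eigvals, eigvecs = np.linalg.eigh(R_reduced)
        order = np.argsort(eigvals)[::-1][:n_factors]
        loadings = eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0, None))
        h2 = (loadings ** 2).sum(axis=1)           # update communalities from the loadings
    return loadings
```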
49
Communalities
• Each variable has a communality =
– the proportion of its variance
explained by the extracted factors
– sum of the squared loadings for the
variable on each of the factors
• Ranges between 0 and 1
• If the communality for a variable is low
(e.g., < .5), consider extracting more
factors or removing the variable
50
• High communalities (> .5):
Extracted factors explain most of
the variance in the variables being
analysed
• Low communalities (< .5): A
variable has considerable variance
unexplained by the extracted factors
–May then need to extract MORE
factors to explain the variance
Communalities
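In code terms, given a loadings matrix (items by factors), the communalities are just the row sums of squared loadings; a minimal sketch (the function name is illustrative):

```python
import numpy as np

def communalities(loadings: np.ndarray) -> np.ndarray:
    """Communality of each variable = sum of its squared loadings across the extracted factors."""
    return (loadings ** 2).sum(axis=1)

# e.g., variables with communality < .5 are poorly explained by the current factors:
# poorly_explained = np.where(communalities(loading_matrix) < 0.5)[0]
```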
51
Communalities - 2
52
Explained variance
• A good factor solution is one that
explains the most variance with
the fewest factors
• Realistically, happy with 50-75%
of the variance explained
53
Explained variance
3 factors explain 73.5%
of the variance in the
items
54
Eigen values
• Each factor has an eigen value
• Indicates overall strength of
relationship between a factor and the
variables
• Equals the sum of squared loadings (factor-variable correlations) for that factor
• Successive EVs have lower values
• Rule of thumb: Eigen values over 1
are ‘stable’ (Kaiser's criterion)
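A minimal Python sketch of an eigenvalue summary, assuming the items are in a pandas DataFrame df; dividing each eigenvalue by the number of variables gives the % of variance it accounts for:

```python
# Minimal sketch: eigenvalues, % variance explained, and Kaiser's criterion
import numpy as np
import pandas as pd

def eigenvalue_summary(df: pd.DataFrame) -> pd.DataFrame:
    R = df.corr().to_numpy()
    eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]          # descending eigenvalues
    pct = eigvals / len(eigvals) * 100                      # each EV as % of total variance
    return pd.DataFrame(
        {'eigenvalue': eigvals,
         '% variance': pct,
         'cumulative %': pct.cumsum(),
         'EV > 1 (Kaiser)': eigvals > 1},
        index=pd.RangeIndex(1, len(eigvals) + 1, name='component'))
```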
55
Explained variance
The eigen values
ranged between .16 and
9.35. Two factors
satisfied Kaiser's
criterion (EVs > 1) but
the third EV is .93 and
appears to be a useful
factor.
56
Scree plot
• A line graph of Eigen Values
• Depicts amount of variance
explained by each factor
• Cut-off: Look for where additional
factors fail to add appreciably to the
cumulative explained variance
• 1st factor explains the most variance
• Last factor explains the least amount
of variance
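A scree plot is straightforward to produce; a minimal matplotlib sketch, assuming eigvals is the descending vector of eigenvalues (e.g., from the summary above):

```python
# Minimal sketch: scree plot with Kaiser's criterion marked
import numpy as np
import matplotlib.pyplot as plt

def scree_plot(eigvals: np.ndarray) -> None:
    components = np.arange(1, len(eigvals) + 1)
    plt.plot(components, eigvals, marker='o')
    plt.axhline(1.0, linestyle='--', label="Kaiser's criterion (EV = 1)")
    plt.xlabel('Component number')
    plt.ylabel('Eigenvalue')
    plt.title('Scree plot')
    plt.legend()
    plt.show()
```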
57
Scree plot
58
Scree plot
[Scree plot: eigenvalues (y-axis, 0 to 10) plotted against component number (x-axis, 1 to 24)]
59
Scree plot
60
How many factors?
A subjective process ... Seek to
explain maximum variance using
fewest factors, considering:
1. Theory – what is predicted/expected?
2. Eigen Values > 1? (Kaiser’s criterion)
3. Scree Plot – where does it drop off?
4. Interpretability of last factor?
5. Try several different solutions?
(consider FA type, rotation, # of factors)
6. Factors must be able to be meaningfully
interpreted & make theoretical sense?
61
How many factors?
• Aim for 50-75% of variance
explained with 1/4 to 1/3 as many
factors as variables/items.
• Stop extracting factors when they no
longer represent useful/meaningful
clusters of variables
• Keep checking/clarifying the meaning
of each factor – make sure you are
reading the full wording of each item.
62
• Factor loadings (FLs) indicate
relative importance of each item to
each factor.
–In the initial solution, each factor tries
“selfishly” to grab maximum
unexplained variance.
–All variables will tend to load strongly
on the 1st factor
Initial solution:
Unrotated factor structure
63
Initial solution -
Unrotated factor structure
• Factors are
weighted
combinations of
variables
• A factor matrix
shows variables
in rows and
factors in
columns
64
1st factor extracted:
• Best possible line of best fit through
the original variables
• Seeks to explain the lion's share of all
variance
• A single factor, best summary of the
variance in the whole set of items
Initial solution -
Unrotated factor structure
65
• Each subsequent factor tries to
explain the maximum possible amount
of remaining unexplained variance.
–Second factor is orthogonal to first
factor - seeks to maximise its own
eigen value (i.e., tries to gobble up as
much of the remaining unexplained
variance as possible)
Initial solution -
Unrotated factor structure
66
Vectors (Lines of best fit)
67
Initial solution:
Unrotated factor structure
• Seldom see a simple unrotated factor
structure
• Many variables load on 2 or more factors
• Some variables may not load highly on
any factors (check: low communality)
• Until the FLs are rotated, they are difficult
to interpret.
• Rotation of the FL matrix helps to find a
more interpretable factor structure.
68
Two basic types of
factor rotation
Orthogonal
(Varimax)
Oblique
(Oblimin)
69
Two basic types of
factor rotation
1. Orthogonal
minimises factor covariation,
produces factors which are
uncorrelated
2. Oblique (e.g., Oblimin)
allows factors to covary, allows
correlations between factors
70
Orthogonal rotation
71
Why rotate a factor loading
matrix?
• After rotation, the vectors (lines
of best fit) are rearranged to
optimally go through clusters of
shared variance
• Then the FLs and the factor they
represent can be more readily
interpreted
72
Why rotate a factor loading
matrix?
• A rotated factor structure is simpler
& more easily interpretable
–each variable loads strongly on only
one factor
–each factor shows at least 3 strong
loadings
–all loadings are either strong or weak,
no intermediate loadings
73
Orthogonal vs. oblique rotations
• Consider purpose of factor analysis
• If in doubt, try both
• Consider interpretability
• Look at correlations between
factors in oblique solution
– if >.3 then go with oblique rotation
(>10% shared variance between
factors)
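A minimal sketch of this decision, assuming the third-party factor_analyzer package (its FactorAnalyzer class and phi_ attribute, which holds the factor correlation matrix after an oblique rotation); compare_rotations is a made-up helper:

```python
# Minimal sketch: fit varimax and oblimin, then check the factor correlations
import numpy as np
import pandas as pd
from factor_analyzer import FactorAnalyzer

def compare_rotations(df: pd.DataFrame, n_factors: int = 3) -> FactorAnalyzer:
    varimax = FactorAnalyzer(n_factors=n_factors, rotation='varimax')
    varimax.fit(df)
    oblimin = FactorAnalyzer(n_factors=n_factors, rotation='oblimin')
    oblimin.fit(df)
    phi = oblimin.phi_                               # factor correlation matrix (oblique only)
    off_diag = phi[np.triu_indices_from(phi, k=1)]
    if np.any(np.abs(off_diag) > 0.3):               # > .3, i.e. roughly > 10% shared variance
        print("Factors correlate > .3: report the oblique (oblimin) solution")
        return oblimin
    print("Factors essentially uncorrelated: the simpler orthogonal (varimax) solution will do")
    return varimax
```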
74
Interpretability
• It is dangerous to be driven by
factor loadings only – think
carefully - be guided by theory
and common sense in selecting
factor structure.
• You must be able to understand
and interpret a factor if you’re
going to extract it.
75
Interpretability
• However, watch out for ‘seeing
what you want to see’ when
evidence might suggest a
different, better solution.
• There may be more than one
good solution! e.g., in personality
–2 factor model
–5 factor model
–16 factor model
76
Factor loadings & item selection
A factor structure is most
interpretable when:
1. Each variable loads strongly (> +.40) on
only one factor
2. Each factor shows 3 or more strong
loadings; more loadings = greater reliability
3. Most loadings are either high or low, few
intermediate values.
4. These elements give a ‘simple’ factor
structure.
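These criteria are easy to apply programmatically; a minimal sketch, assuming loadings is a pandas DataFrame (items as rows, factors as columns) and using the .40/.30 rules of thumb above (flag_items is a made-up helper):

```python
# Minimal sketch: apply the 'simple structure' rules of thumb to a loadings matrix
import pandas as pd

def flag_items(loadings: pd.DataFrame, main_min: float = 0.4, cross_max: float = 0.3) -> pd.DataFrame:
    abs_l = loadings.abs()
    main = abs_l.max(axis=1)                                  # strongest loading per item
    main_factor = abs_l.idxmax(axis=1)                        # the factor it belongs to
    cross = abs_l.apply(lambda row: row.drop(row.idxmax()).max(), axis=1)  # next-strongest loading
    return pd.DataFrame({'factor': main_factor,
                         'main_loading': main,
                         'max_cross_loading': cross,
                         'keep': (main >= main_min) & (cross <= cross_max)})
```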
77
Example: Rotated factor structure
(Varimax rotation)
Rotated Component Matrix (a)

                        Component
                       1      2      3
PERSEVERES           .861   .158   .288
CURIOUS              .858          .310
PURPOSEFUL ACTIVITY  .806   .279   .325
CONCENTRATES         .778   .373   .237
SUSTAINED ATTENTION  .770   .376   .312
PLACID                      .863   .203
CALM                 .259   .843   .223
RELAXED              .422   .756   .295
COMPLIANT            .234   .648   .526
SELF-CONTROLLED      .398   .593   .510
RELATES-WARMLY       .328   .155   .797
CONTENTED            .268   .286   .748
COOPERATIVE          .362   .258   .724
EVEN-TEMPERED        .240   .530   .662
COMMUNICATIVE        .405   .396   .622

Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.
(a) Rotation converged in 6 iterations. Blank cells: loadings not shown in the output.
78
Pattern Matrix (a)

Oblimin-rotated solution for the same 15 items; the three components were
labelled Task Orientation, Sociability, and Settledness.
Each item's largest loading, grouped by component:
• Sociability: RELATES-WARMLY (.920), CONTENTED (.845), COOPERATIVE (.784),
EVEN-TEMPERED (.682), COMMUNICATIVE (.596)
• Task Orientation: PERSEVERES (-.938), CURIOUS (-.933), PURPOSEFUL ACTIVITY (-.839),
CONCENTRATES (-.831), SUSTAINED ATTENTION (-.788)
• Settledness: PLACID (-.902), CALM (-.841), RELAXED (-.686), COMPLIANT (-.521),
SELF-CONTROLLED (-.433)

Extraction Method: Principal Component Analysis.
Rotation Method: Oblimin with Kaiser Normalization.
(a) Rotation converged in 13 iterations.
79
3-d plot
80
• Bare min. = 2
• Recommended min. = 3
• Max. = unlimited
• More items:
→ ↑ reliability
→ ↑ 'roundedness'
→ Law of diminishing returns
• Typically = 4 to 10 is reasonable
How many items per factor?
81
How do I eliminate items?
A subjective process; consider:
1. Size of main loading (min = .4)
2. Size of cross loadings (max = .3?)
3. Meaning of item (face validity)
4. Contribution it makes to the factor
5. Eliminate 1 variable at a time, then re-
run, before deciding which/if any items to
eliminate next
6. Number of items already in the factor
82
Factor loadings & item selection
Comrey & Lee (1992):
loadings > .70 - excellent
> .63 - very good
> .55 - good
> .45 - fair
> .32 - poor
83
Factor loadings & item selection
Cut-off for acceptable loadings:
• Look for gap in loadings
(e.g., .8, .7, .6, .3, .2)
• Choose cut-off because factors
can be interpreted above but not
below cut-off
84
Other considerations:
Normality of items
Check the item descriptives,
e.g., if two items have similar factor
loadings and reliability, prefer the
item with the least skew and kurtosis.
The more normally distributed the
item scores, the better the
distribution of the composite
scores.
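A minimal pandas sketch of this check (distribution_check is a made-up helper; pandas reports excess kurtosis, so values near 0 indicate an approximately normal shape):

```python
# Minimal sketch: compare skew and kurtosis across candidate items
import pandas as pd

def distribution_check(df: pd.DataFrame) -> pd.DataFrame:
    return pd.DataFrame({'skew': df.skew(), 'kurtosis': df.kurtosis()}).round(2)

# Other things being equal, prefer the item whose skew and kurtosis are closest to zero.
```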
85
Factor analysis in practice
• To find a good solution, consider
each combination of:
–PC-varimax
–PC-oblimin
–PAF-varimax
–PAF-oblimin
• Apply the above methods to a
range of possible/likely factors, e.g.,
for 2, 3, 4, 5, 6, and 7 factors
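A minimal sketch of this "try each combination" strategy, again assuming the third-party factor_analyzer package and a DataFrame df of item responses; note that the package's extraction options ('principal', 'minres', 'ml') are common-factor methods, so this only approximates the SPSS PC vs PAF comparison described earlier (candidate_solutions is a made-up helper):

```python
# Minimal sketch: fit a grid of extraction method x rotation x number of factors
from factor_analyzer import FactorAnalyzer

def candidate_solutions(df, factor_range=range(2, 8)):
    solutions = {}
    for method in ('principal', 'minres'):
        for rotation in ('varimax', 'oblimin'):
            for k in factor_range:
                fa = FactorAnalyzer(n_factors=k, rotation=rotation, method=method)
                fa.fit(df)
                cumulative = fa.get_factor_variance()[2][-1]   # cumulative proportion of variance
                solutions[(method, rotation, k)] = (fa, cumulative)
    return solutions
```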
86
• Eliminate poor items one at a time,
retesting the possible solutions
• Check factor structure across sub-
groups (e.g., gender) if there is
sufficient data
• You will probably come up with a
different solution from someone
else!
• Check/consider reliability analysis
Factor analysis in practice
87
Example: Condom use
• The Condom Use Self-Efficacy
Scale (CUSES) was administered to
447 multicultural college students.
• PC FA with a varimax rotation.
• Three distinct factors were
extracted:
– Appropriation
– Sexually Transmitted Diseases
– Partners' Disapproval
88
Factor loadings & item selection
Factor 1: Appropriation (FL = factor loading)
.56 I feel confident I could gracefully remove and dispose of a condom after sexual intercourse
.61 I feel confident I could remember to carry a condom with me should I need one
.65 I feel confident I could purchase condoms without feeling embarrassed
.75 I feel confident in my ability to put a condom on myself or my partner
89
Factor loadings & item selection
Factor 2: STDs (FL = factor loading)
.80 I would not feel confident suggesting using condoms with a new partner because I would be afraid he or she would think I thought they had a sexually transmitted disease
.86 I would not feel confident suggesting using condoms with a new partner because I would be afraid he or she would think I have a sexually transmitted disease
.72 I would not feel confident suggesting using condoms with a new partner because I would be afraid he or she would think I've had a past homosexual experience
90
Factor loadings & item selection
Factor 3: Partner's reaction (FL = factor loading)
.58 If my partner and I were to try to use a condom and did not succeed, I would feel embarrassed to try to use one again (e.g. not being able to unroll condom, putting it on backwards or awkwardness)
.65 If I were unsure of my partner's feelings about using condoms I would not suggest using one
.73 If I were to suggest using a condom to a partner, I would feel afraid that he or she would reject me
91
Summary
1. Introduction
2. Assumptions
3. Steps/Process
92
Introduction: Summary
• Factor analysis is a family of
multivariate correlational data
analysis methods for summarising
clusters of covariance.
• FA summarises correlations
amongst items.
• The common clusters (called
factors) are summary indicators of
underlying fuzzy constructs.
93
Assumptions: Summary
• Sample size
– 5+ cases per variable (ideally
20+ cases per variable)
– N > 200
• Bivariate & multivariate outliers
• Factorability of correlation matrix
(Measures of Sampling Adequacy)
• Normality enhances the solution
94
Summary: Steps / Process
1. Test assumptions
2. Select type of analysis
3. Determine no. of factors
(Eigen Values, Scree plot, % variance explained)
4. Select items
(check factor loadings to identify which items belong
in which factor; drop items one by one; repeat)
5. Name and define factors
6. Examine correlations amongst factors
7. Analyse internal reliability
8. Compute composite scores
Next
lecture
95
Summary:
Types of FA
• PC: Data reduction
–uses all variance
• PAF: Theoretical data exploration
–uses shared variance
• Try both ways
–Are solutions different? Why?
96
Summary: Rotation
• Orthogonal (varimax)
– perpendicular vectors
• Oblique (oblimin)
– angled vectors
• Try both ways
– Are solutions different? Why?
97
No. of factors to extract?
• Inspect EVs
– look for EVs > 1 or a sudden drop
(inspect the scree plot)
• % of variance explained
– aim for 50 to 75%
• Interpretability / theory
Summary: Factor extraction
98
1. Comrey, A. L., & Lee, H. B. (1992). A first course in factor analysis (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum.
2. Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological Methods, 4(3), 272-299.
3. Ivancevic, V., Kaine, A. K., McLindin, B. A., & Sunde, J. (2003). Factor analysis of essential facial features. In Proceedings of the 25th International Conference on Information Technology Interfaces (ITI) (pp. 187-191). Cavtat, Croatia.
4. Tabachnick, B. G., & Fidell, L. S. (2001). Principal components and factor analysis. In Using multivariate statistics (4th ed., pp. 582-633). Needham Heights, MA: Allyn & Bacon.
References
99
Open Office Impress
● This presentation was made using
Open Office Impress.
● Free and open source software.
● http://www.openoffice.org/product/impress.html
Editor's notes
1. 7126/6667 Survey Research & Design in Psychology, Semester 1, 2010, University of Canberra, ACT, Australia. James T. Neill. Home page: http://ucspace.canberra.edu.au/display/7126 Lecture page: http://en.wikiversity.org/wiki/Exploratory_factor_analysis Notes page: http://en.wikiversity.org/wiki/Survey_research_methods_and_design_in_psychology Image source: http://www.flickr.com/photos/jimkster/832582262/ By jimkster: http://www.flickr.com/photos/jimkster License: CC-by-A 2.0 Description: This lecture is accompanied by a computer-lab based tutorial, the notes for which are available here: http://ucspace.canberra.edu.au/display/7126/Tutorial+-+Psychometrics Introduces and explains the use of exploratory factor analysis, particularly for the purposes of psychometric instrument development.
  2. Image source: http://commons.wikimedia.org/wiki/File:Information_icon4.svg License: Public domain
  3. Image source: http://commons.wikimedia.org/wiki/File:Information_icon4.svg License: Public domain
  4. Image source: http://www.flickr.com/photos/jimkster/832582262/ By jimkster: http://www.flickr.com/photos/jimkster License: CC-by-A 2.0
  5. Image source: James Neill, Creative Commons Attribution 3.0, http://commons.wikimedia.org/wiki/Image:FactorAnalysis_ConceptualModel_DotsRings.png Factor analysis can be conceived of as a method for examining a matrix of correlations in search of clusters of highly correlated variables. Exploratory factor analysis is a tool to help a researcher ‘throw a hoop’ around clusters of related items (i.e., items that seem to share a central underlying theme), to distinguish between clusters, and to identify and eliminate irrelevant or indistinct (overlapping) items.
  6. See also http://www.statsoft.com/textbook/stfacan.html A major purpose of factor analysis is data reduction, i.e., to reduce complexity in the data, by identifying underlying (latent) clusters of association. Commonly used in psychometric instrument development.
  7. FA helps to eliminate: redundant variables (e.g., items which are highly correlated are unnecessary), unclear variables (e.g., items which don’t load cleanly on a single factor), and irrelevant variables (e.g., variables which have low loadings)
  9. “Introduced and established by Pearson in 1901 and Spearman three years thereafter, factor analysis is a process by which large clusters and grouping of data are replaced and represented by factors in the equation. As variables are reduced to factors, relationships between the factors begin to define the relationships in the variables they represent (Goldberg & Digman, 1994). In the early stages of the process' development, there was little widespread use due largely in part to the immense amount of hand calculations required to determine accurate results, often spanning periods of several months. Later on a mathematical foundation would be developed aiding in the process and contributing to the later popularity of the methodology. In present day, the power of super computers makes the use of factor analysis a simplistic process compared to the 1900's when only the devoted researchers could use it to accurately attain results (Goldberg & Digman, 1994).” From http://www.personalityresearch.org/papers/fehringer.html Terminology was introduced by Thurstone (1931) - http://www.statsoft.com/textbook/stfacan.html
  10. CFA has become the professional standard for testing and developing theoretical factor structures, but exploratory factor analysis still has its place and is an important multivariate statistical tool. We will only be focusing on EFA, a more traditional method which is gradually being used less as CFA becomes more common. CFA is more recent and more powerful, and is usually covered at PhD level.
  12. Arrows represent single items (or variables) on a test/survey/questionnaire. The circles represent factors, i.e., the shared variance amongst different sets of related items.
  16. Image source: http://www.siu.edu/~epse1/pohlmann/factglos/ This diagram represents the variance for a single factor. It also shows the variance for two items which correlate (r12) and which form part of the overall factor.
  17. Image sources: http://www.creative-wisdom.com/teaching/assessment/subscales.html
  20. Image source: http://people.cs.uct.ac.za/~dburford/masters/SATNAC/2000/index.html Image license: Unknown
  21. Ivancevic, V., Kaine, A. K., McLindin, B. A., & Sunde, J. (2003). Factor analysis of essential facial features. In Proceedings of the 25th International Conference on Information Technology Interfaces (ITI) (pp. 187-191), 16-19 June 2003, Cavtat, Croatia. http://hdl.handle.net/1947/2580 Summary: We present the results of an exploratory factor analysis to determine the minimum number of facial features required for recognition. In the introduction, we provide an overview of research we have done in the area of the evaluation of facial recognition systems. In the next section, we provide a description of the facial recognition process (including signal processing operations such as face capture, normalisation, feature extraction, database comparison, decision and action). The following section provides a description of the variables. We follow with our main result, where we have used a sample of 80 face images from a test database. We have preselected 20 normalized facial features and extracted six essential orthogonal factors, representing 76.5% of the total facial feature variability. They are (in order of importance): upper-lip, eyebrow-position, nose-width, eye-position, eye/eyebrow-length and face-width. http://ieeexplore.ieee.org/Xplore/login.jsp?url=/iel5/8683/27508/01225343.pdf
  22. Image source: http://commons.wikimedia.org/wiki/File:Information_icon4.svg License: Public domain
  23. Image sources: ClipArt (Unknown) Garbage In -> Garbage Out i.e., Put Crap (i.e., data with no logical relationship) into Factor Analysis -> Get Crap Out of Factor Analysis i.e., check your assumptions
  24. Image source: Table 7 Fabrigar et al (1999).
  25. Example: Someone who works full-time, studies full-time, averages an HD, and has 2 children.
  26. Based on Francis Section 5.6, which is based on the Victorian Quality Schools Project. Francis, G. (2007). Introduction to SPSS for Windows: v. 15.0 and 14.0 with Notes for Studentware (5th ed.). Sydney: Pearson Education.
  27. Image source: http://www.acer.edu.au/research/programs/documents/RoweITUC_2004Paper.pdf This is from the questionnaire used in Francis 5.6, pp. 157-164.
  28. Image source: Francis 5.6, pp. 157-164?
  30. Image source: Francis 5.6, pp. 157-164 Be careful to examine the anti-image correlation matrix (not the anti-image covariance matrix)
  31. Large KMO values indicate that a factor analysis of the variables is a good idea. Should be > .5 for a satisfactory factor analysis
  32. Image source: This is from Francis 5.6, pp. 157-164
  34. Image source: http://commons.wikimedia.org/wiki/File:Information_icon4.svg License: Public domain See also http://en.wikiversity.org/wiki/Exploratory_factor_analysis/Data_analysis_tutorial
  38. Image source: http://www.siu.edu/~epse1/pohlmann/factglos/
  42. Image source: Francis 5.6, pp. 157-164
  43. Image: Francis 5.6, pp. 157-164
  44. Image: Francis 5.6, pp. 157-164
  45. If all the EVs are added together, the total will equal the number of items. In other words, each item contributes a variance of 1 to be explained.
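A quick numeric check of this property (a sketch with simulated data; nothing here comes from the lecture's dataset):

```python
# The eigenvalues of a correlation matrix sum to the number of variables,
# because each standardised item contributes exactly 1 unit of variance.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(500, 9))            # 500 simulated cases, 9 items
corr = np.corrcoef(data, rowvar=False)

eigenvalues = np.linalg.eigvalsh(corr)
print(round(eigenvalues.sum(), 6))          # prints 9.0 (= number of items)
```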
  46. Image: Francis 5.6, pp. 157-164 The y-axis shows the variance (eigenvalue) for each factor. Look for the “elbow” in the curve - the point where additional factors don’t add much to the explained variance.
  47. Image source: James Neill, Creative Commons Attribution 2.5 Australia, http://creativecommons.org/licenses/by/2.5/au/
  50. The max. no. of possible factors = no. of variables. It's important to refer back to the original wording of questionnaire items – get the item labels to show on the SPSS output.
  51. Image source: http://www.siu.edu/~epse1/pohlmann/factglos/
  52. Image source (left): http://www.xs-stock.co.uk/shopimages/products/normal/Kerplunk.jpg Image source (right): http://mayhem-chaos.net/photoblog/images/super_sized_kerplunk.jpg
  53. Orthogonal - vectors (lines of best fit, or factors) are perpendicular. Oblimin - vector angles are permitted to be acute, i.e., correlation is allowed between factors.
  54. There is also: Quartimax - maximises the variance of loadings on each variable; Equamax - a compromise between quartimax and varimax.
  55. Image source: Unknown
  56. Image source: Francis 5.6, pp. 157-164
  57. Image source: Francis 5.6, pp. 157-164 ‘Compliant’ (6) and ‘Self-controlled’ (7) load onto 2 factors - run analysis without these two variables.
  58. Image source: (Based on?) Francis 5.6, pp. 157-164
  61. Tip: The rotated factor loading results can be printed out and color-coded with a highlighter for careful cross-comparison
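A software equivalent of the highlighter tip is to blank out loadings below a cut-off so the factor structure stands out. A minimal, self-contained pandas sketch (the .40 cut-off, the item names, and the loading values are all hypothetical):

```python
# Sketch: show only salient rotated loadings (|loading| > .40) for easy scanning.
import pandas as pd

loadings = pd.DataFrame(
    {"Factor 1": [0.78, 0.71, 0.12, 0.05],
     "Factor 2": [0.10, 0.22, 0.69, 0.66]},
    index=["item1", "item2", "item3", "item4"],   # hypothetical items
)

salient = loadings.where(loadings.abs() > 0.40, other="")
print(salient)
```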
  62. Factor analysis of the Condom Use Self-Efficacy Scale among multicultural college students Thomas W. Barkley, Jr and Jason L. Burns (2000) http://her.oxfordjournals.org/cgi/content/full/15/4/485
  63. Image source: http://commons.wikimedia.org/wiki/File:Information_icon4.svg License: Public domain
  64. Image source: http://commons.wikimedia.org/wiki/File:Information_icon4.svg License: Public domain See also http://en.wikiversity.org/wiki/Exploratory_factor_analysis/Data_analysis_tutorial