1. 1
Introduction to applied statistics
& applied statistical methods
Prof. Dr. Chang Zhu
Overview
• Reliability analysis
• Factor analysis
2. 2
Validity and Reliability
• The principles of validity and reliability are
fundamental cornerstones of the scientific
method.
Reliability
• The degree of consistency between two
measures of the same thing (Mehrens and
Lehman, 1987).
• The measure of how stable, dependable,
trustworthy, and consistent a test is in
measuring the same thing each time
(Worthen et al., 1993)
3. 3
Reliability
• Intrinsic motivation
• Extrinsic motivation
Reliability
• In science, theoretical constructs are often unobservable
things.
• Even when things are observable, measurement error
often makes it necessary to calculate “summary”
variables.
• Reliability analysis tests whether the different
measurements are reliable/consistent
4. 4
Reliability
Reliability is used to measure the extent to
which an item, scale, or instrument will
yield the same score when administered in
different times, locations, or populations,
when the two administrations do not differ
in relevant variables.
Reliability analysis
• Reliability analysis allows you to study the
properties of measurement scales and the items
that make them up.
• Test the extent to which the items in your
questionnaire are related to each other
• Cronbach’s alpha is the most commonly used
measure of reliability.
• In SPSS: choose Analyze > Scale > Reliability
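As a rough illustration of what SPSS computes here, the sketch below implements the standard formula α = k/(k−1) · (1 − Σ item variances / variance of the total score). The function name and the six hypothetical respondents are assumptions made for the example, not part of the course material.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: one list of scores per item, with respondents in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Total scale score of each respondent (sum across items).
    totals = [sum(scores) for scores in zip(*items)]
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical answers of 6 respondents to 3 items on a 1-5 scale.
q1 = [4, 5, 3, 2, 4, 5]
q2 = [4, 4, 3, 2, 5, 5]
q3 = [5, 5, 2, 1, 4, 4]

alpha = cronbach_alpha([q1, q2, q3])
print(f"Cronbach's alpha = {alpha:.3f}")
```

Items that move together give a total-score variance much larger than the sum of the item variances, which pushes α towards 1.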
6. 6
Reliability analysis results
• The commonly accepted minimum value of α is .7
Brief report:
• The reliability of the scale of xxx was
satisfactory (Cronbach’s alpha=xx).
Or
• The reliability of the scale of xxx was not
satisfactory (Cronbach’s alpha=xx).
Factor Analysis: constructs
• Constructs are usually defined as unobservable latent
variables.
• Example: the construct of teaching effectiveness. Several
variables are used to allow the measurement of such construct
(usually several scale items are used) because the construct
may include several dimensions.
• Unlike variables directly measured such as speed, height,
weight, etc., some variables such as egoism, creativity,
happiness, satisfaction, learning conceptions, learning styles,
teaching styles, and self-regulation are not single measurable
entities.
8. 8
Factor Analysis
• A major goal of factor analysis is to represent
relationships among sets of variables parsimoniously
yet keeping factors meaningful.
• A good factor solution is both simple and
interpretable.
• When factors can be interpreted, new insights are
possible.
Understanding Factor Analysis
• Factor analysis is commonly used in:
–Data reduction
–Scale development
–The evaluation of the psychometric
quality of a measure, and
–The assessment of the dimensionality
of a set of variables.
9. 9
An example: a questionnaire of 30 items
Five factors are identified for the 30-item questionnaire
10. 10
Application of Factor Analysis
• Examine three common applications of factor analysis:
– Defining indicators of constructs (1)
– Defining dimensions for an existing measure (2)
– Selecting items or scales to be included in a measure (3)
Application of Factor Analysis (1)
Defining indicators of constructs:
Ideally 4 or more measures should be chosen to
represent each construct of interest.
The choice of measures should, as much as possible,
be guided by theory, previous research, and logic.
13. 13
Factor analysis
Step 1: The Correlation Matrix
– Generate a correlation matrix for all variables
– Identify variables not related to other variables
– If the correlations between variables are small, it is unlikely
that they share common factors (variables must be related
to each other for the factor model to be appropriate).
– Think of correlations in absolute value.
– Correlation coefficients greater than 0.3 in absolute value
are indicative of acceptable correlations.
– Examine visually the appropriateness of the factor model.
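The screening described in Step 1 can be sketched in a few lines: build the correlation matrix and flag any variable whose correlation with every other variable stays at or below .3 in absolute value. The helper names and the small data set (including the deliberately unrelated item q4) are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_matrix(columns):
    """Nested dict r[a][b] with the correlation between variables a and b."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


def weakly_related(columns, threshold=0.3):
    """Variables whose |r| with every other variable is at or below threshold."""
    r = correlation_matrix(columns)
    return [a for a in r
            if all(abs(r[a][b]) <= threshold for b in r if b != a)]


# Hypothetical responses; q4 is constructed to be unrelated to the rest.
data = {
    "q1": [4, 5, 3, 2, 4, 5],
    "q2": [4, 4, 3, 2, 5, 5],
    "q3": [5, 5, 2, 1, 4, 4],
    "q4": [3, 2, 3, 2, 3, 2],
}

flagged = weakly_related(data)
print("Candidates for removal:", flagged)
```

A flagged variable is only a candidate for removal; the decision should still be checked against theory.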
Factor analysis
Step 1: The Correlation Matrix
In SPSS:
• The Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy
should be greater than .5 to be acceptable.
• Bartlett’s test of sphericity should be significant, indicating that
the variables are sufficiently correlated for factor analysis (the
correlation matrix is not an identity matrix).
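Both statistics can be computed directly from a correlation matrix. The sketch below uses the textbook formulas: Bartlett's χ² = −(n − 1 − (2p + 5)/6) · ln|R| with p(p − 1)/2 degrees of freedom, and KMO as the share of squared correlations relative to squared correlations plus squared partial correlations. It assumes NumPy is installed. The function names and the 3 × 3 demo matrix are hypothetical, and the results are not guaranteed to match SPSS to the last digit.

```python
import numpy as np


def bartlett_sphericity(R, n):
    """Chi-square statistic and degrees of freedom of Bartlett's test.

    R: p x p correlation matrix, n: number of respondents.
    """
    R = np.asarray(R, dtype=float)
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)  # log of the determinant of R
    chi2 = -(n - 1 - (2 * p + 5) / 6) * logdet
    df = p * (p - 1) // 2
    return chi2, df


def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    R = np.asarray(R, dtype=float)
    inv = np.linalg.inv(R)
    # Partial correlations follow from the inverse correlation matrix.
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale
    off_diagonal = ~np.eye(R.shape[0], dtype=bool)
    r2 = np.sum(R[off_diagonal] ** 2)
    a2 = np.sum(partial[off_diagonal] ** 2)
    return r2 / (r2 + a2)


# Hypothetical correlation matrix for three items, 200 respondents.
R_demo = np.array([[1.0, 0.6, 0.5],
                   [0.6, 1.0, 0.4],
                   [0.5, 0.4, 1.0]])
chi2_demo, df_demo = bartlett_sphericity(R_demo, n=200)
kmo_demo = kmo(R_demo)
print(f"chi2({df_demo}) = {chi2_demo:.2f}, KMO = {kmo_demo:.2f}")
```

With 10 items the degrees of freedom are 10 · 9 / 2 = 45, which is the df shown in the KMO and Bartlett's Test output of the practice example.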
14. 14
The primary objective of this stage is to determine
the factors.
Initial decisions can be made here about the number
of factors underlying a set of measured variables.
Estimates of initial factors are obtained using
Principal components analysis.
The principal components analysis is the most
commonly used extraction method.
Factor analysis
Step 2: Factor extraction
• In principal components analysis, linear combinations of the
observed variables are formed.
• The 1st principal component is the combination that accounts
for the largest amount of variance in the sample (1st extracted
factor).
• The 2nd principal component accounts for the next largest
amount of variance and is uncorrelated with the first (2nd
extracted factor).
• Successive components explain progressively smaller portions
of the total sample variance, and all are uncorrelated with each
other.
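The properties listed above can be checked numerically. The following sketch extracts principal components from the correlation matrix of simulated data (200 hypothetical respondents, 4 variables built from 2 underlying sources) using NumPy's eigendecomposition. The data and variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: variables 1-2 share one source, variables 3-4 another.
source = rng.normal(size=(200, 2))
X = np.column_stack([
    source[:, 0] + 0.5 * rng.normal(size=200),
    source[:, 0] + 0.5 * rng.normal(size=200),
    source[:, 1] + 0.5 * rng.normal(size=200),
    source[:, 1] + 0.5 * rng.normal(size=200),
])

# Standardize so that every variable has mean 0 and variance 1.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(Z, rowvar=False)

# Eigendecomposition of the correlation matrix; eigh returns ascending order.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Component scores are linear combinations of the standardized variables.
scores = Z @ eigvecs

print("Eigenvalues (variance per component):", np.round(eigvals, 3))
print("Share of total variance:", np.round(eigvals / eigvals.sum(), 3))
```

The covariance matrix of `scores` is diagonal: each component's variance equals its eigenvalue, and the components are uncorrelated with each other.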
Factor analysis
Step 2: Factor extraction
15. 15
• To decide how many factors are needed to represent
the data, we use 2 statistical criteria:
– Eigenvalues, and
– The scree plot
Factor analysis
Step 2: Factor extraction
• The number of factors is usually
determined by retaining only
factors with eigenvalues
greater than 1.
• Factors with a variance (eigenvalue) less than 1
are no better than a single variable,
since each standardized variable
has a variance of 1.
Total Variance Explained

           Initial Eigenvalues                  Extraction Sums of Squared Loadings
Component  Total   % of Variance  Cumulative %  Total   % of Variance  Cumulative %
1          3.046   30.465         30.465        3.046   30.465         30.465
2          1.801   18.011         48.476        1.801   18.011         48.476
3          1.009   10.091         58.566        1.009   10.091         58.566
4          .934    9.336          67.902
5          .840    8.404          76.307
6          .711    7.107          83.414
7          .574    5.737          89.151
8          .440    4.396          93.547
9          .337    3.368          96.915
10         .308    3.085          100.000
Extraction Method: Principal Component Analysis.
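The columns of the Total Variance Explained table follow from the eigenvalues alone. Because the 10 variables are standardized, the total variance is 10, so each eigenvalue divided by 10 gives its share of variance. The short sketch below reproduces the percentages and applies the eigenvalue-greater-than-1 rule to the eigenvalues printed in the table; small differences in the last digit come from rounding in the printed values.

```python
from itertools import accumulate

# Initial eigenvalues as printed in the Total Variance Explained table.
eigenvalues = [3.046, 1.801, 1.009, 0.934, 0.840,
               0.711, 0.574, 0.440, 0.337, 0.308]

n_variables = len(eigenvalues)  # total variance of 10 standardized variables
percent = [100 * ev / n_variables for ev in eigenvalues]
cumulative = list(accumulate(percent))

# Kaiser's criterion: keep components with an eigenvalue above 1.
retained = [i for i, ev in enumerate(eigenvalues, start=1) if ev > 1]

for i, (ev, pct, cum) in enumerate(zip(eigenvalues, percent, cumulative), start=1):
    print(f"{i:>2}  {ev:6.3f}  {pct:7.3f}  {cum:8.3f}")
print("Retained components:", retained)
```

Three components pass the criterion, and together they account for roughly 58.6% of the variance, as in the table.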
Factor analysis
Step 2: Factor extraction
16. 16
• The examination of the Scree plot provides a
visual of the total variance associated with
each factor.
• The steep slope shows the large factors.
• The gradual trailing off (scree) shows the
remaining factors, which usually have
eigenvalues lower than 1.
• In choosing the number of factors, in
addition to the statistical criteria, one should
make initial decisions based on conceptual
and theoretical grounds.
• At this stage, the decision about the number
of factors is not final.
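Where no plotting tool is at hand, a scree plot can be approximated in plain text. The sketch below draws one bar per component from the eigenvalues of the Total Variance Explained table and lists the drop between successive eigenvalues. The function name and bar width are arbitrary choices for this illustration.

```python
# Initial eigenvalues from the Total Variance Explained table.
eigenvalues = [3.046, 1.801, 1.009, 0.934, 0.840,
               0.711, 0.574, 0.440, 0.337, 0.308]


def scree_lines(eigs, width=40):
    """One text bar per component, scaled to the largest eigenvalue."""
    top = max(eigs)
    lines = []
    for i, ev in enumerate(eigs, start=1):
        bar = "#" * round(width * ev / top)
        note = "  (eigenvalue below 1)" if ev < 1 else ""
        lines.append(f"{i:>2} {ev:5.3f} {bar}{note}")
    return lines


# Size of the drop from each component to the next one.
drops = [a - b for a, b in zip(eigenvalues, eigenvalues[1:])]

for line in scree_lines(eigenvalues):
    print(line)
print("Drops between successive eigenvalues:", [round(d, 3) for d in drops])
```

The two largest drops occur after components 1 and 2; from component 3 onward the bars shrink only gradually, which is the trailing "scree" described above.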
Factor analysis
Step 2: Factor extraction
Component Matrix(a)

      Component
   1      2      3    Item
 .771  -.271   .121   I discussed my frustrations and feelings with person(s) in school
 .545   .530   .264   I tried to develop a step-by-step plan of action to remedy the problems
 .580  -.311   .265   I expressed my emotions to my family and close friends
 .398   .356  -.374   I read, attended workshops, or sought some other educational approach to correct the problem
 .436   .441  -.368   I tried to be emotionally honest with myself about the problems
 .705  -.362   .117   I sought advice from others on how I should solve the problems
 .594   .184  -.537   I explored the emotions caused by the problems
 .074   .640   .443   I took direct action to try to correct the problems
 .752  -.351   .081   I told someone I could trust about how I felt about the problems
 .225   .576   .272   I put aside other activities so that I could work to solve the problems
Extraction Method: Principal Component Analysis.
a. 3 components extracted.
Component matrix using principal component analysis
Factor analysis
Step 2: Factor extraction
18. 18
• 4th Step: Making final decisions
– The final decision about the number of factors to choose is the
number of factors for the rotated solution that is most interpretable.
– To identify factors, group variables that have large loadings for the
same factor.
– Plots of loadings provide a visual for variable clusters.
– Interpret factors according to the meaning of the variables
• This decision should be guided by:
– A priori conceptual beliefs about the number of factors from past
research or theory
– Eigenvalues computed in step 2.
– The relative interpretability of rotated solutions computed in step 3.
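Grouping variables by their large loadings can be sketched as follows. For lack of a rotated solution in these slides, the example uses the unrotated Component Matrix shown for Step 2, which is an assumption made purely for illustration; in practice the same logic is applied to the rotated loadings. The cutoff of .4 and the function name are likewise illustrative choices.

```python
# Unrotated loadings from the Component Matrix (items in table order).
LOADINGS = [
    (0.771, -0.271, 0.121),
    (0.545, 0.530, 0.264),
    (0.580, -0.311, 0.265),
    (0.398, 0.356, -0.374),
    (0.436, 0.441, -0.368),
    (0.705, -0.362, 0.117),
    (0.594, 0.184, -0.537),
    (0.074, 0.640, 0.443),
    (0.752, -0.351, 0.081),
    (0.225, 0.576, 0.272),
]


def assign_items(loadings, cutoff=0.4):
    """For each item: (factor with the largest |loading|, all salient factors).

    The primary factor is None when no loading reaches the cutoff.
    """
    result = []
    for row in loadings:
        salient = [j + 1 for j, value in enumerate(row) if abs(value) >= cutoff]
        if salient:
            primary = max(range(len(row)), key=lambda j: abs(row[j])) + 1
        else:
            primary = None
        result.append((primary, salient))
    return result


assignments = assign_items(LOADINGS)
for item_number, (primary, salient) in enumerate(assignments, start=1):
    cross = " (cross-loading)" if len(salient) > 1 else ""
    print(f"Item {item_number:>2}: factor {primary}, salient on {salient}{cross}")
```

Several items load on more than one component and one item reaches the cutoff on none, which illustrates why the final decision is based on the rotated solution and on interpretability rather than on the unrotated matrix.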
Factor analysis
Step 4: Making final decisions
Factor Analysis
• What can be included in factor analysis?
19. 19
Practice
Practice: conduct factor and
reliability analyses
• A researcher has generated a new questionnaire which
is designed to measure happiness. The questionnaire
that she has generated has 10 items on it and she has
collected responses from 200 respondents.
• The questionnaire is measured on a five point scale
where 1 = strongly disagree and 5 = strongly agree.
• The data file is named Happy_measure.sav
• This example is taken from:
http://wps.pearsoned.co.uk/ema_uk_he_dancey_statsmath_4/84/21627/5536653.cw/content/index.html
20. 20
In SPSS: Factor analysis
• Analyze > Dimension Reduction > Factor
• Move all the variables to the Variables list
In SPSS: Descriptives options
• Select all the options in the Descriptives dialog box
21. 21
In SPSS: Extraction method
• Method: Principal components
• Analyze: Correlation matrix
• Display: Scree plot
• Extract: Eigenvalues greater than 1
In SPSS: Rotation method
• Choose Varimax as the rotation method
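To give an idea of what an orthogonal varimax rotation does, the sketch below applies a common SVD-based varimax algorithm to the unrotated three-component matrix shown earlier for Step 2. It assumes NumPy is installed and skips the Kaiser normalization step that SPSS reports, so its numbers need not match SPSS output. The function name and defaults are illustrative.

```python
import numpy as np

# Unrotated loadings from the Component Matrix (10 items, 3 components).
LOADINGS = np.array([
    [0.771, -0.271, 0.121],
    [0.545, 0.530, 0.264],
    [0.580, -0.311, 0.265],
    [0.398, 0.356, -0.374],
    [0.436, 0.441, -0.368],
    [0.705, -0.362, 0.117],
    [0.594, 0.184, -0.537],
    [0.074, 0.640, 0.443],
    [0.752, -0.351, 0.081],
    [0.225, 0.576, 0.272],
])


def varimax(loadings, max_iter=100, tol=1e-8):
    """Raw varimax rotation (no Kaiser normalization).

    Returns the rotated loadings and the orthogonal rotation matrix.
    """
    L = np.asarray(loadings, dtype=float)
    n_items, n_factors = L.shape
    T = np.eye(n_factors)
    previous = 0.0
    for _ in range(max_iter):
        rotated = L @ T
        # Gradient of the varimax criterion with respect to the rotation.
        column_ss = np.sum(rotated ** 2, axis=0)
        gradient = L.T @ (rotated ** 3 - rotated * column_ss / n_items)
        u, s, vt = np.linalg.svd(gradient)
        T = u @ vt  # nearest orthogonal matrix to the gradient
        current = s.sum()
        if previous != 0 and current / previous < 1 + tol:
            break
        previous = current
    return L @ T, T


rotated, rotation = varimax(LOADINGS)
print(np.round(rotated, 3))
```

Because the rotation matrix is orthogonal, each item's communality (the sum of its squared loadings) is unchanged; only the distribution of the loadings across components changes, which is what makes the rotated solution easier to interpret.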
22. 22
In SPSS: Factor scores
• Choose Anderson-Rubin as the method of calculating factor scores
In SPSS: Options
• Choose Exclude cases listwise for missing values
• Suppress small coefficients: absolute value below .4
23. 23
Preliminary analysis
• The first table we should look at is labeled
KMO and Bartlett’s Test. The KMO value is
.79 (above .5) and Bartlett’s test is
significant (p < .001), which indicates that
the sample is adequate for factor analysis.
KMO and Bartlett's Test
Kaiser-Meyer-Olkin Measure of Sampling Adequacy        .790
Bartlett's Test of Sphericity    Approx. Chi-Square    819.746
                                 df                    45
                                 Sig.                  .000
How many factors to extract?
• Eigenvalues
• Scree plot
Total Variance Explained

           Initial Eigenvalues                  Extraction Sums of Squared Loadings  Rotation Sums of Squared Loadings
Component  Total   % of Variance  Cumulative %  Total   % of Variance  Cumulative %  Total   % of Variance  Cumulative %
1          3.186   31.862         31.862        3.186   31.862         31.862        3.170   31.699         31.699
2          2.928   29.279         61.140        2.928   29.279         61.140        2.944   29.442         61.140
3          .757    7.569          68.710
4          .658    6.583          75.293
5          .637    6.369          81.662
6          .522    5.220          86.882
7          .429    4.290          91.171
8          .380    3.801          94.973
9          .316    3.155          98.128
10         .187    1.872          100.000
Extraction Method: Principal Component Analysis.
24. 24
Interpretation
• Examine the underlying theme
Rotated Component Matrix (loadings below .4 not shown)

  Component
   1      2     Item
 .892          Q8_I want to go out and party
 .837          Q7_I want to contact friends & family
 .779          Q9_The people at work inspire me
 .754          Q2_I have lots of friends
 .694          Q3_I love meeting people
        .825   Q6_I have a lot to look forward to
        .802   Q10_I feel excited at the start of each day
        .801   Q4_I feel full of energy
        .748   Q1_I feel enthusiastic
        .647   Q5_I have lots of interesting things to do
In SPSS: Reliability analysis
Based on the factor analysis, we have 2 factors
extracted or 2 sub-scales and the respective items as
below:
• Sub-scale 1 (sociability): Q2, 3, 7, 8, and 9
• Sub-scale 2 (positive feeling): Q1, 4, 5, 6, 10
We will calculate the Cronbach’s α for sub-scale 1 first.
25. 25
In SPSS: Reliability analysis
• Analyze > Scale > Reliability
• Move the variables Q2, 3, 7, 8, and 9 to the Items list
• In the output, the table Reliability Statistics tells us that
the internal consistency of the 5 items is measured with
α = .851 (which is high).
Reporting the results
• Description of the analysis
• Table of factor loadings
(practical guideline page 8 and 9)
26. 26
Reporting the results
• A principal component analysis (PCA) was conducted on the 10
items with orthogonal rotation (varimax). The Kaiser-Meyer-Olkin
measure verified the sampling adequacy for the analysis: KMO = .79
which is good according to Field (2009). All KMO values for
individual items are well above the acceptable limit of .50 (Field,
2009). Bartlett’s test of sphericity, χ² (45) = 819.746, p < .001,
indicated that correlations between items were sufficiently large for
PCA. Two components had eigenvalues over Kaiser’s criterion of 1
and in combination explained 61.14% of the variance. The scree plot
also supports a two-factor structure. Table 1 shows the factor
loadings after rotation. The items that cluster on the same factors
suggest that factor 1 represents sociability and factor 2 positive
feeling.
Analysis description
Assignment
• Reliability analysis
• Factor analysis