Reliable, Valid & Generalizable: Multi-Item Scales

Reliability, validity,
generalizability and the use of
multi-item scales
Edward Shiu (Dept of Marketing)
edward.shiu@strath.ac.uk
Reliable? Valid?
Generalizable?

How to use a questionnaire from
published work
• Appendix with items
• Methodology section

Existing multi-item scales
• Used by many
• Reliability and validity may be known
• Good starting block
• Basis to compare / contrast results

Development of a Multi-item Scale
(Doing it the HARD way!! See Malhotra & Birks, 2007)
1. Develop Theory
2. Generate Initial Pool of Items: Theory, Secondary Data, and
Qualitative Research
3. Select a Reduced Set of Items Based on Qualitative Judgment
4. Collect Data from a Large Pretest Sample
5. Statistical Analysis
6. Develop Purified Scale
7. Collect More Data from a Different Sample
8. Evaluate Scale Reliability, Validity, and Generalizability
9. Final Scale

Example of Scale Development
• See Richins & Dawson (1992) “A Consumer
Values Orientation for Materialism and its
Measurement: Scale Development and
Validation,” Journal of Consumer Research, 19
(December), 303-316.
• Materialism scale (7 items)
– Marketing Scales Handbook (Vol IV) p. 352.
1. It is important to me to have really nice things.
2. I would like to be rich enough to buy anything I want.
3. I'd be happier if I could afford to buy more things.
4. ......
• Note, published scales not always perfect!!!

Scale Evaluation
(See Malhotra & Birks, 2007)
[Figure: scale evaluation tree]
• Reliability: test/retest, alternative forms, internal consistency
• Validity: content, criterion, construct (convergent, discriminant, nomological)
• Generalizability

Reliability & Validity
• Reliability - the extent to which a measuring
procedure yields consistent results on
repeated administrations of the scale
• Validity - the degree to which a measuring
procedure accurately reflects, assesses
or captures the specific concept that the
researcher is attempting to measure
Reliable ≠ Valid

Reliability
• Internal consistency reliability
DO THE ITEMS IN THE SCALE GEL WELL TOGETHER?
• Split-half reliability: the items on the scale are divided
into two halves and the resulting half scores are
correlated
• Cronbach alpha (α)
– average of all possible 'split-half' correlation coefficients resulting
from different ways of splitting the scale items
– value normally varies from 0 to 1 (it can turn negative when items
correlate negatively with each other)
– α < 0.6 indicates unsatisfactory internal consistency reliability
(see Malhotra & Birks, 2007, p.358)
– Note: alpha tends to increase with an increase in the number of
items in scale
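
As a rough illustration of how α can be computed by hand, here is a minimal sketch. It uses the usual variance-based formula, α = k/(k−1) × (1 − sum of item variances / variance of total score), rather than enumerating every split half. The function name `cronbach_alpha` and the response data are made up for illustration and are not taken from COCB.sav.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (all the same length)."""
    k = len(rows[0])                                   # number of items in the scale
    item_var_sum = sum(variance(col) for col in zip(*rows))  # sum of item variances
    total_var = variance([sum(r) for r in rows])       # variance of the summed score
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical 5-point responses: 6 respondents x 4 items (illustration only)
responses = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 5, 4, 5],
    [3, 3, 3, 4],
    [1, 2, 1, 2],
    [4, 4, 5, 4],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

The sample variance (n − 1 denominator) is used throughout; the ratio is the same with the population variance as long as it is applied consistently to items and total.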

• test-retest reliability
– identical scale items administered at two different
times to same set of respondents
– assess (via correlation) if respondents give similar
answers
• alternative-forms reliability
– two equivalent forms of the scale are constructed
– same respondents are measured at two different
times, with a different form being used each time
– assess (via correlation) if respondents give similar
answers
– Note. Hardly ever practical
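
Both test-retest and alternative-forms reliability come down to correlating two sets of scores from the same respondents. A minimal sketch, assuming each respondent has one composite score per administration; the helper `pearson_r` and the scores are hypothetical:

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical composite scale scores for the same 6 respondents
time1 = [17, 9, 19, 13, 6, 17]   # first administration
time2 = [16, 10, 18, 14, 7, 15]  # second administration (or alternative form)
r = pearson_r(time1, time2)
print(round(r, 3))
```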

Construct Validity
• Construct validity is evidenced if we can establish
– convergent validity, discriminant validity and nomological validity
• Convergent validity: extent to which scale correlates
positively with other measures of the same construct
• Discriminant validity: extent to which scale does not
correlate with other conceptually distinct constructs
• Nomological validity: extent to which scale correlates in
theoretically predicted ways with other distinct but
related constructs.
• Also read Malhotra & Birks, 2007, pp. 358-359 on
– content (or face) validity, criterion (concurrent & predictive)
validity

Generalizability
• Refers to the extent to which you can generalise
from your specific observations to beyond your
limited study, situation, items used,
method of administration, context.....
• Hardly ever possible!!!

Fun time
• Now onto the data (COCB.sav) !!!!!!
• Read my forthcoming JBR article for
background on COCB and the scale
• 1st SPSS and Cronbach alpha
• Next, Amos and CFA
• Followed by Excel to calculate
composite/construct reliability and AVE, as
well as establish discriminant validity

Cronbach alpha (α)
• SPSS (Analyze…Scale…Reliability Analysis)
• α < 0.6 indicates unsatisfactory internal
consistency reliability (see Malhotra &
Birks, 2007, p.358)
• α > 0.7 indicates satisfactory internal
consistency reliability (Nunnally &
Bernstein, 1994)
Ref: Nunnally JC & Bernstein IH. (1994) Psychometric
Theory. New York: McGraw-Hill.

SPSS output for α
Alpha value for dimension Credibility = 0.894 > 0.7 hence satisfactory

SPSS further output for α
• We note that the alpha value for the Credibility
dimension would increase (from 0.894
to 0.902) if item cred4 is removed.
• However, unless the improvement is dramatic
AND there are separate reasons (e.g. similar
findings from other studies), we should
leave the item as part of the dimension.
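
The "alpha if item deleted" figures can be reproduced by recomputing α with each item left out in turn. A minimal sketch using the variance-based alpha formula; the function names, item names and responses are hypothetical (item4 is deliberately made noisy) and are not the COCB data:

```python
from statistics import variance

def cronbach_alpha(rows):
    """Variance-based Cronbach alpha; rows = one list of item scores per respondent."""
    k = len(rows[0])
    item_var_sum = sum(variance(col) for col in zip(*rows))
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)

def alpha_if_item_deleted(rows, names):
    """Recompute alpha once per item, each time leaving that item out."""
    out = {}
    for j, name in enumerate(names):
        reduced = [r[:j] + r[j + 1:] for r in rows]  # drop column j
        out[name] = cronbach_alpha(reduced)
    return out

# Hypothetical responses: 6 respondents x 4 items (illustration only)
names = ["item1", "item2", "item3", "item4"]
responses = [
    [4, 5, 4, 2],
    [2, 2, 3, 4],
    [5, 5, 4, 3],
    [3, 3, 3, 1],
    [1, 2, 1, 3],
    [4, 4, 5, 5],
]
overall = cronbach_alpha(responses)
if_deleted = alpha_if_item_deleted(responses, names)
for name, value in if_deleted.items():
    print(name, round(value, 3), "(overall:", round(overall, 3), ")")
```

In this made-up data the gain from dropping item4 is large; with a small gain, the statistics alone would not justify removing an item.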

Limitations for Cronbach alpha
• We should employ multiple measures of
reliability (Cronbach alpha, composite/construct
reliability CR & Average Variance Extracted
AVE)
– Alpha and CR values are often very similar,
but AVEs can differ much more from alpha
values
– AVEs are also used to assess construct
discriminant validity

Composite/Construct Reliability
• CR = $\frac{\left(\sum_i \lambda_i\right)^2}{\left(\sum_i \lambda_i\right)^2 + \sum_i \varepsilon_i}$,
where $\lambda_i$ is the standardized loading of indicator $i$ and
$\varepsilon_i$ is its measurement error
• AVE = Average Variance Extracted = Variance Extracted
= $\frac{\sum_i \lambda_i^2}{\sum_i \lambda_i^2 + \sum_i \varepsilon_i}$
• Note: Recommended thresholds: CR > 0.6 & AVE > 0.5,
then construct internal consistency is evidenced (Fornell
& Larcker, 1981).
Ref: Fornell, Claes and David G. Larcker (1981). “Evaluating Structural
Equation Models with Unobservable Variables and Measurement
Error,” Journal of Marketing Research, 18(1, February): 39-50.
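
A minimal sketch of the CR and AVE calculation (the same arithmetic can be done in Excel). It assumes a completely standardized solution, so each indicator's measurement error is taken as 1 minus its squared loading. The function names and the loadings are hypothetical, not Amos output.

```python
def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of errors)."""
    errors = [1 - l ** 2 for l in loadings]  # assumption: error = 1 - loading^2
    s = sum(loadings)
    return s ** 2 / (s ** 2 + sum(errors))

def average_variance_extracted(loadings):
    """AVE = sum of squared loadings / (sum of squared loadings + sum of errors)."""
    squared = [l ** 2 for l in loadings]
    errors = [1 - v for v in squared]        # same assumption as above
    return sum(squared) / (sum(squared) + sum(errors))

# Hypothetical standardized loadings for one construct with 4 indicators
loadings = [0.8, 0.7, 0.9, 0.6]
cr = composite_reliability(loadings)
ave = average_variance_extracted(loadings)
print(round(cr, 3), round(ave, 3))
print("CR > 0.6:", cr > 0.6, "| AVE > 0.5:", ave > 0.5)
```

Under this assumption the AVE denominator equals the number of indicators, so AVE is simply the mean squared loading.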

Discriminant validity
• Discriminant validity is assessed by comparing
the shared variance (squared correlation)
between each pair of constructs against the
minimum of the AVEs for these two constructs.
• If, for every possible pair of constructs, the
shared variance observed is lower than the
minimum of their AVEs, then discriminant validity
is evidenced (Fornell and Larcker, 1981).
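
The pairwise comparison can be sketched as follows. The construct labels A, B, C, the AVEs and the inter-construct correlations are all made up for illustration, and `discriminant_validity` is a hypothetical helper name.

```python
from itertools import combinations

def discriminant_validity(ave, corr):
    """For each pair of constructs, check shared variance < min of their AVEs."""
    results = {}
    for a, b in combinations(sorted(ave), 2):
        shared = corr[frozenset((a, b))] ** 2   # squared correlation
        results[(a, b)] = shared < min(ave[a], ave[b])
    return results

# Hypothetical AVEs and inter-construct correlations (illustration only)
ave = {"A": 0.62, "B": 0.55, "C": 0.71}
corr = {
    frozenset(("A", "B")): 0.60,
    frozenset(("A", "C")): 0.45,
    frozenset(("B", "C")): 0.80,
}
checks = discriminant_validity(ave, corr)
for pair, ok in checks.items():
    print(pair, "OK" if ok else "shared variance too high")
print("Discriminant validity evidenced:", all(checks.values()))
```

Here the pair (B, C) fails: the shared variance of 0.64 exceeds the smaller AVE of 0.55, so discriminant validity would not be evidenced for this made-up set.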

Amos (Analysis of Moment Structures)
[Figure: Amos path diagram of the CFA model, five latent factors with their indicators and error terms]
• Comm: comm1, comm2
• Bene: bene1, bene2, bene3
• Cred: cred1, cred2, cred3, cred4
• COCB: ave_SSI, ave_POC, ave_Voice, ave_wom, ave_BAoSF, ave_DoRA, ave_Flex, ave_PiFA
• Loyalty: loy1, loy2, loy3
Legend
• Rectangles = observed variables
• Ellipses = unobserved variables
• loy1; loy2; loy3; comm1; comm2; …; cred1; …; bene1; …; ave_PiFA = SPSS variables
• e1 to e24 = error variances = uniqueness
• Loyalty; Comm; Cred; Bene; COCB = latent factors = unobserved factors

CFA and goodness of fit
• See Hair et al.'s book
• E.g.,
• The CFA resulted in an acceptable overall fit
(GFI=.90, CFI=.94, TLI=.92, RMSEA=.068, and
χ2=524.64, df=160, p<.001). All indicators load
significantly (p<.001) and substantively
(standardized coef >.5) on to their respective
constructs; thus providing evidence of
convergent validity.

Refs
• Baumgartner H, Homburg C. (1996). “Applications of structural
equation modeling in marketing and consumer research: a review,”
International Journal of Research in Marketing,13(2):139–61.
• Churchill, Gilbert A., Jr. (1979). “A Paradigm for Developing Better
Measures of Marketing Constructs,” Journal of Marketing Research,
16(1, February): 64-73.
• Fornell, Claes and David G. Larcker (1981). “Evaluating Structural
Equation Models with Unobservable Variables and Measurement
Error,” Journal of Marketing Research, 18(1, February): 39-50.
• Hair, Joseph F., Jr., Rolph E. Anderson, Ronald L. Tatham, and
William C. Black (1998), Multivariate Data Analysis. 5th ed.
Englewood Cliffs, NJ: Prentice Hall.
• Nunnally JC & Bernstein IH. (1994) Psychometric Theory. New York:
McGraw-Hill.
