This document discusses research design and measurement concepts related to data collection forms. It begins with learning outcomes, which focus on measurement scales, concepts, attitudes, and questionnaire design. It then covers determining what to measure based on research questions, operationalizing concepts, and different levels of measurement scales including nominal, ordinal, interval, and ratio. The document also discusses techniques for measuring attitudes, such as ranking, rating, sorting, and choice. Specific scales are described like Likert scales, semantic differentials, and category scales. Guidelines are provided for selecting a measurement scale based on objectives and properties of the data.
MBA2216 Week 07-08: Measurement and Data Collection Forms
1. Research Design: Measurement & Data Collection Forms
MBA2216 BUSINESS RESEARCH PROJECT
by
Stephen Ong
Visiting Fellow, Birmingham City University, UK
4. LEARNING OUTCOMES
After this lecture, you should be able to:
1. Determine what needs to be measured to address a research question or hypothesis
2. Distinguish levels of scale measurement
3. Know how to form an index or composite measure
4. List the three criteria for good measurement
5. Perform a basic assessment of scale reliability and validity
5. LEARNING OUTCOMES (cont’d)
6. Describe how business researchers think of attitudes
7. Identify basic approaches to measuring attitudes
8. Discuss the use of rating scales for measuring attitudes
9. Represent a latent construct by constructing a summated scale
10. Summarize ways to measure attitudes with ranking and sorting techniques
11. Discuss major issues involved in the selection of a measurement scale
6. LEARNING OUTCOMES (cont’d)
12. Explain the significance of decisions about questionnaire design and wording
13. Define alternatives for wording open-ended and fixed-alternative questions
14. Summarize guidelines for questions that avoid mistakes in questionnaire design
15. Describe how the proper sequence of questions may improve a questionnaire
16. Discuss how to design a questionnaire layout
17. Describe criteria for pretesting and revising a questionnaire and for adapting it to global markets
8. WHAT DO I MEASURE?
Before the measurement process can be defined, researchers have to decide exactly what needs to be measured.
The decision statement, corresponding research questions, and research hypotheses can be used to decide what concepts need to be measured.
Measurement is the process of describing some property of a phenomenon of interest, usually by assigning numbers in a reliable and valid way.
When numbers are used, the researcher must have a rule for assigning a number to an observation in a way that provides an accurate description.
All measurement, particularly in the social sciences, contains error.
9. WHAT DO I MEASURE? (cont’d)
Concepts
A researcher has to know what to measure before knowing how to measure something.
A concept is a generalized idea that represents something of meaning.
Concepts such as age, sex, education, and number of children are relatively concrete properties and present few problems in either definition or measurement.
Concepts such as brand loyalty, corporate culture, and so on are more abstract and are more difficult to both define and measure.
10. WHAT DO I MEASURE? (cont’d)
Operational Definitions
Researchers measure concepts through a process known as operationalization, which involves identifying scales that correspond to variance in the concept.
Scales provide a range of values that correspond to different values in the concept being measured.
Scales provide correspondence rules indicating that a certain value on a scale corresponds to some true value of a concept, hopefully in a truthful way.
11. WHAT DO I MEASURE? (cont’d)
Operational Definitions (cont’d)
Variables
Researchers use variance in concepts to make diagnoses.
Variables capture different concept values.
Scales capture variance in concepts and, as such, provide the researcher’s variables.
For practical purposes, once a research project is underway, there is little difference between a concept and a variable.
12. WHAT DO I MEASURE? (cont’d)
Operational Definitions (cont’d)
Constructs
Sometimes a single variable cannot capture a concept alone.
Using multiple variables to measure one concept can often provide a more complete account of the concept than any single variable could.
A construct is the term used for a concept that is measured with multiple variables.
Constructs can be very helpful in operationalizing a concept.
13. EXHIBIT 13.3 Susceptibility to Interpersonal Influence: An Operational Definition
19. Levels of Measurement
Nominal: classification
Ordinal: classification, order
20. Ordinal Scales
• Characteristics of nominal scale
• Order
• Implies greater than or less than
21. Levels of Measurement
Nominal: classification
Ordinal: classification, order
Interval: classification, order, distance
22. Interval Scales
• Characteristics of nominal and ordinal scales
• Equality of interval: equal distance between numbers
25. Levels of Scale Measurement
The level of scale measurement is important because it determines the mathematical comparisons that are allowed.
The four levels of scale measurement are nominal, ordinal, interval, and ratio.
26. Levels of Scale Measurement (cont’d)
Nominal
Assigns a value to an object for identification or classification purposes.
The most elementary level of measurement.
Ordinal
Ranking scales that allow things to be arranged based on how much of some concept they possess.
Have nominal properties.
27. Levels of Scale Measurement (cont’d)
Interval
Capture information about differences in quantities of a concept.
Have both nominal and ordinal properties.
Ratio
The highest form of measurement.
Have all the properties of interval scales with the additional attribute of representing absolute quantities, anchored by an absolute zero.
28. EXHIBIT 13.4 Nominal, Ordinal, Interval, and Ratio Scales Provide Different Information
29. EXHIBIT 13.5 Facts About the Four Levels of Scales
30. Measurements are Relative
“Any measurement must take into account the position of the observer. There is no such thing as measurement absolute, there is only measurement relative.”
— Jeanette Winterson, journalist and author
32. Nature of Attitudes
Cognitive: I think oatmeal is healthier than corn flakes for breakfast.
Affective: I hate corn flakes.
Behavioural: I intend to eat more oatmeal for breakfast.
35. Selecting a Measurement Scale
Research objectives
Data properties
Response types
Number of dimensions
Forced or unforced choices
Balanced or unbalanced
Number of scale points
Rater errors
38. Balanced or Unbalanced
How good an actress is Angelina Jolie?
Balanced: Very bad / Bad / Neither good nor bad / Good / Very good
Unbalanced: Poor / Fair / Good / Very good / Excellent
39. Forced or Unforced Choices
How good an actress is Angelina Jolie?
Forced: Very bad / Bad / Neither good nor bad / Good / Very good
Unforced: Very bad / Bad / Neither good nor bad / Good / Very good / No opinion / Don’t know
40. Number of Scale Points
How good an actress is Angelina Jolie?
Five points: Very bad / Bad / Neither good nor bad / Good / Very good
Seven points: Very bad / Somewhat bad / A little bad / Neither good nor bad / A little good / Somewhat good / Very good
41. Rater Errors
Errors of central tendency and errors of leniency can be reduced by:
• Adjusting the strength of descriptive adjectives
• Spacing intermediate descriptive phrases farther apart
• Providing smaller differences in meaning between terms near the ends of the scale
• Using more scale points
44. ATTITUDES AS HYPOTHETICAL CONSTRUCTS
Attitude
An enduring disposition to consistently respond in a given manner to various aspects of the world.
Components of attitudes:
Affective component: the feelings or emotions toward an object.
Cognitive component: knowledge and beliefs about an object.
Behavioural component: predisposition to action; intentions; behavioural expectations.
45. Techniques for Measuring Attitudes
Ranking
Requiring the respondent to rank order objects in overall performance on the basis of a characteristic or stimulus.
Rating
Asking the respondent to estimate the magnitude of a characteristic, or quality, that an object possesses by indicating on a scale where he or she would rate the object.
46. Techniques for Measuring Attitudes (cont’d)
Sorting
Presenting the respondent with several concepts typed on cards and requiring the respondent to arrange the cards into a number of piles or otherwise classify the concepts.
Choice
Asking a respondent to choose one alternative from among several; it is assumed that the chosen alternative is preferred over the others.
47. Attitude Rating Scales
Simple Attitude Scale
Requires that an individual agree or disagree with a statement or respond to a single question.
This type of self-rating scale classifies respondents into one of two categories (e.g., yes or no).
Example:
THE PRESIDENT SHOULD RUN FOR RE-ELECTION
_______ AGREE ______ DISAGREE
49. Attitude Rating Scales (cont’d)
Category Scale
A more sensitive measure than a simple scale in that it can have more than two response categories.
Question construction is an extremely important factor in increasing the usefulness of these scales.
Example:
How important were the following in your decision to visit San Diego? (check one for each item)
Response options for each item: VERY IMPORTANT / SOMEWHAT IMPORTANT / NOT TOO IMPORTANT
Items: Climate; Cost of travel; Family oriented; Educational/historical aspects; Familiarity with area
53. Likert Scale
The Internet is superior to traditional libraries for
comprehensive searches.
Strongly disagree
Disagree
Neither agree nor disagree
Agree
Strongly agree
54. Attitude Rating Scales (cont’d)
Likert Scale
A popular means for measuring attitudes.
Respondents indicate their own attitudes by checking how strongly they agree or disagree with statements.
Typical response alternatives: “strongly agree,” “agree,” “uncertain,” “disagree,” and “strongly disagree.”
Example:
It is more fun to play a tough, competitive tennis match than to play an easy one.
___Strongly Agree ___Agree ___Not Sure ___Disagree ___Strongly Disagree
55. EXHIBIT 14.2 Likert Scale Items for Measuring Attitudes toward Patients’ Interaction with a Physician’s Service Staff
62. Other Scale Types (cont’d)
Image Profile
A graphic representation of semantic differential data for competing brands, products, or stores to highlight comparisons.
Because the data are assumed to be interval, either the arithmetic mean or the median will be used to compare the profile of one product, brand, or store with that of a competing product, brand, or store.
63. EXHIBIT 14.4 Image Profiles of Commuter Airlines versus Major Airlines
65. Attitude Rating Scales (cont’d)
Numerical Scales
Scales that have numbers as response options, rather than “semantic space” or verbal descriptions, to identify categories (response positions).
In practice, researchers have found that a scale with numerical labels for intermediate points is as effective a measure as the true semantic differential.
Example:
Now that you’ve had your automobile for about one year, please tell us how satisfied you are with your Ford Taurus.
Extremely Dissatisfied 1 2 3 4 5 6 7 Extremely Satisfied
66. Multiple Rating List Scales
“Please indicate how important or unimportant each service characteristic is:”
IMPORTANT UNIMPORTANT
Fast, reliable repair 7 6 5 4 3 2 1
Service at my location 7 6 5 4 3 2 1
Maintenance by manufacturer 7 6 5 4 3 2 1
Knowledgeable technicians 7 6 5 4 3 2 1
Notification of upgrades 7 6 5 4 3 2 1
Service contract after warranty 7 6 5 4 3 2 1
68. Other Scale Types (cont’d)
Stapel Scale
Uses a single adjective as a substitute for the semantic differential when it is difficult to create pairs of bipolar adjectives.
Tends to be easier to conduct and administer than a semantic differential scale.
69. EXHIBIT 14.5 A Stapel Scale for Measuring a Store’s Image
71. Other Scale Types (cont’d)
Constant-Sum Scale
Respondents are asked to divide a constant sum to indicate the relative importance of attributes.
Respondents often sort cards, but the task may also be a rating task (e.g., indicating brand preference).
Example:
Divide 100 points among the following brands according to your preference for each brand:
Brand A _________
Brand B _________
Brand C _________
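Constant-sum responses are worth validating before analysis, since respondents frequently allocate points that do not add up to the fixed total. A minimal sketch of such a check (the brand names and function name are illustrative, not from the source):

```python
def valid_constant_sum(allocations, total=100):
    """Check that a constant-sum response exhausts the fixed total
    with no negative allocations."""
    points = list(allocations.values())
    return all(p >= 0 for p in points) and sum(points) == total

# Hypothetical responses for Brands A, B, and C
print(valid_constant_sum({"Brand A": 50, "Brand B": 30, "Brand C": 20}))  # True
print(valid_constant_sum({"Brand A": 60, "Brand B": 60, "Brand C": 0}))   # False: sums to 120
```

Responses that fail the check are typically either rescaled proportionally or treated as unusable.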
74. EXHIBIT 14.8 Graphic Rating Scale with Picture Response Categories Stressing Visual Communication
75. Other Scale Types (cont’d)
Graphic Rating Scale
A measure of attitude that allows respondents to rate an object by choosing any point along a graphic continuum.
Advantage: allows the researcher to choose any interval desired for scoring purposes.
Disadvantage: there are no standard answers.
79. Ranking
An ordinal scale may be developed by asking respondents to rank order (from most preferred to least preferred) a set of objects or attributes.
• Paired comparisons
• Sorting
81. Paired Comparison
A measurement technique that involves presenting the respondent with two objects and asking the respondent to pick the preferred object; more than two objects may be presented, but comparisons are made in pairs.
Number of comparisons = n(n − 1)/2
Example:
I would like to know your overall opinion of two brands of adhesive bandages. They are MedBand and Super-Aid. Overall, which of these two brands—MedBand or Super-Aid—do you think is the better one? Or are both the same?
MedBand is better _____
Super-Aid is better _____
They are the same _____
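The comparison count grows quadratically, which is why paired comparison becomes burdensome beyond roughly ten objects. A quick sketch of the formula (brand list is illustrative):

```python
from itertools import combinations

def num_comparisons(n):
    """Number of distinct pairs among n objects: n(n-1)/2."""
    return n * (n - 1) // 2

# Cross-check the formula against an explicit enumeration of pairs.
brands = ["MedBand", "Super-Aid", "Brand C", "Brand D", "Brand E"]
assert num_comparisons(len(brands)) == len(list(combinations(brands, 2)))

print(num_comparisons(5))   # 10 comparisons for 5 objects
print(num_comparisons(10))  # 45 comparisons for 10 objects
```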
85. Sorting
Requires that respondents indicate their attitudes or beliefs by arranging items on the basis of perceived similarity or some other attribute.
Example:
Here is a sheet that lists several airlines. Next to the name of each airline is a pocket. Here are ten cards. I would like you to put these cards in the pockets next to the airlines you would prefer to fly on your next trip. Assume that all of the airlines fly to wherever you would choose to travel. You can put as many cards as you want next to an airline, or you can put no cards next to an airline.
Cards
American Airlines _____
Delta Airlines _____
United Airlines _____
Southwest Airlines _____
Northwest Airlines _____
86. Example: MindWriter Scaling
Likert Scale
The problem that prompted service/repair was resolved.
1 Strongly Disagree / 2 Disagree / 3 Neither Agree nor Disagree / 4 Agree / 5 Strongly Agree
Numerical Scale (MindWriter’s Favourite)
To what extent are you satisfied that the problem that prompted service/repair was resolved?
Very Dissatisfied 1 2 3 4 5 Very Satisfied
Hybrid Expectation Scale
Resolution of the problem that prompted service/repair.
1 Met Few Expectations / 2 Met Some Expectations / 3 Met Most Expectations / 4 Met All Expectations / 5 Exceeded Expectations
87. Ideal Scalogram Pattern
Item:   2    4    1    3    Participant Score
        X    X    X    X    4
        __   X    X    X    3
        __   __   X    X    2
        __   __   __   X    1
        __   __   __   __   0
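In the ideal (Guttman) pattern above, endorsing a harder item implies endorsing every easier one, so with items ordered hardest-first each ideal row is a run of blanks followed by a run of X's, and the score is simply the number of endorsements. A sketch of a conformity check, treating X as 1 and blank as 0 (function names are illustrative):

```python
def is_ideal_pattern(row):
    """With items ordered hardest to easiest, an ideal Guttman row never
    has an endorsement (1) followed by a non-endorsement (0)."""
    return list(row) == sorted(row)

def scale_score(row):
    # In the ideal pattern, the score is just the number of endorsements.
    return sum(row)

print(is_ideal_pattern([0, 1, 1, 1]), scale_score([0, 1, 1, 1]))  # True 3
print(is_ideal_pattern([1, 0, 1, 1]))  # False: harder item passed, easier one failed
```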
88. Measuring Behavioural Intention
Behavioural Component
The behavioural expectations (expected future actions) of an individual toward an attitudinal object.
Example:
How likely is it that you will purchase a Honda Fit?
I definitely will buy
I probably will buy
I might buy
I probably will not buy
I definitely will not buy
89. Measuring Behavioural Intention (cont’d)
Behavioural Differential
A rating scale instrument similar to a semantic differential, developed to measure the behavioural intentions of subjects toward future actions.
A description of the object to be judged is placed at the top of a sheet, and the subjects indicate their behavioural intentions toward this object on a series of scales.
Example:
A 25-year-old woman sales representative
Would ___ : ___ : ___ : ___ : ___ : ___ : ___ : Would Not
ask this person for advice.
90. Mathematical and Statistical Analysis of Scales
Although you can put numbers into formulas and perform calculations with almost any numbers, the researcher has to know the meaning behind the numbers before useful conclusions can be drawn (e.g., averaging the numbers used to identify school buses is meaningless).
91. Mathematical and Statistical Analysis of Scales (cont’d)
Discrete Measures
Discrete measures are those that take on only one of a finite number of values.
Most often used to represent a classificatory variable; they do not represent intensity, only membership.
Common discrete scales include any yes–no response, matching, colour choice, or practically all scales that involve selecting from a small number of categories.
Nominal and ordinal scales are discrete measures.
The central tendency of discrete measures is best captured by the mode (i.e., the most frequent value).
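The school-bus example can be made concrete: averaging nominal codes produces a number with no meaning, while the mode remains interpretable. A small illustration with hypothetical data:

```python
from statistics import mean, mode

# Nominal codes identifying bus routes -- the numbers are labels, not quantities.
bus_routes = [7, 7, 12, 31, 7, 12]

print(mean(bus_routes))  # ~12.67: arithmetically valid, but meaningless for labels
print(mode(bus_routes))  # 7: the most frequent route, a meaningful central tendency
```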
92. Mathematical and Statistical Analysis of Scales (cont’d)
Continuous Measures
Continuous measures are those assigning values anywhere along some scale range in a place that corresponds to the intensity of some concept.
Ratio measures are continuous measures.
Strictly speaking, interval scales are not necessarily continuous.
e.g., a Likert item ranging from 1 = strongly disagree to 5 = strongly agree is a discrete scale, because only the values 1, 2, 3, 4, or 5 can be assigned.
93. Index Measures
Attributes
Single characteristics or fundamental features that pertain to an object, person, or issue.
Index Measures
Assign a value based on how much of the concept being measured is associated with an observation.
Indexes are often formed by putting several variables together.
Composite Measures
Assign a value to an observation based on a mathematical derivation of multiple variables.
94. Computing Scale Values
Summated Scale
A scale created by simply summing (adding together) the responses to each item making up the composite measure.
Reverse Coding
Means that the value assigned for a response is treated oppositely from the other items.
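These two ideas fit together naturally: on a 1–5 Likert item, reverse coding maps a response r to (max + min − r), and the summated score is the sum of the (possibly reversed) item responses. A minimal sketch, with hypothetical items (item indices and names are illustrative):

```python
def reverse_code(response, low=1, high=5):
    """Flip a response on a low..high scale, e.g. 5 -> 1, 4 -> 2."""
    return high + low - response

def summated_score(responses, reversed_items=frozenset()):
    """Sum item responses, reverse-coding the negatively worded items."""
    return sum(
        reverse_code(r) if i in reversed_items else r
        for i, r in enumerate(responses)
    )

print(reverse_code(5))                                # 1
# Three hypothetical Likert items; item index 2 is negatively worded.
print(summated_score([5, 4, 2], reversed_items={2}))  # 5 + 4 + reverse(2)=4 -> 13
```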
96. Three Criteria for Good Measurement
Good measurement requires reliability, validity, and sensitivity.
97. Reliability
Reliability is an indicator of a measure’s internal consistency.
A measure is reliable when different attempts at measuring something converge on the same result.
When the measuring process provides reproducible results, the measuring instrument is reliable.
Internal Consistency
Represents a measure’s homogeneity, or the extent to which each indicator of a concept converges on some common meaning.
Measured by correlating scores on subsets of items making up a scale.
98. Internal Consistency
Split-half Method
Assessing internal consistency by checking the results of one half of a set of scaled items against the results from the other half.
The two scale halves should correlate highly.
They should also produce similar scores.
99. Internal Consistency (cont’d)
Coefficient alpha (α)
The most commonly applied estimate of a multiple-item scale’s reliability.
Represents the average of all possible split-half reliabilities for a construct.
The coefficient demonstrates whether or not the different items converge.
Ranges in value from 0 (no consistency) to 1 (complete consistency).
Generally, higher values of coefficient α indicate a more reliable scale.
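Coefficient α can be computed directly from a respondents-by-items matrix using the standard formula α = k/(k − 1) · (1 − Σ item variances / total-score variance), where k is the number of items. A minimal sketch with hypothetical data:

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a respondents-by-items matrix (list of lists)."""
    k = len(item_scores[0])                                   # number of items
    item_variances = [pvariance(item) for item in zip(*item_scores)]
    total_variance = pvariance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Perfectly consistent items: every respondent answers all items identically,
# so alpha should be 1 (complete consistency).
perfect = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
print(round(cronbach_alpha(perfect), 6))  # 1.0
```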
100. Test-Retest Reliability
Test-retest Method
Administering the same scale or measure to the same respondents at two separate points in time to test for stability.
Represents a measure’s repeatability.
Problems:
The pre-measure, or first measure, may sensitize the respondents and subsequently influence the results of the second measure.
Time effects may produce changes in attitude or other maturation of the subjects.
101. Validity
Good measures should be both precise (i.e., reliable) and accurate (i.e., valid).
Validity is the accuracy of a measure, or the extent to which a score truthfully represents a concept.
Does a scale measure what it was intended to measure?
When a measure lacks validity, any conclusions based on that measure are also likely to be faulty.
102. Validity: Face, Content …
Establishing Validity
The four basic approaches to establishing validity are face validity, content validity, criterion validity, and construct validity.
Face validity refers to the subjective agreement among professionals that a scale logically reflects the concept being measured.
Content validity refers to the degree to which a measure covers the domain of interest.
103. Validity: Criterion …
Criterion validity addresses the question: “Does my measure correlate with measures of similar concepts or known quantities?”
May be classified as either concurrent validity or predictive validity, depending on the time sequence in which the new measurement scale and the criterion measure are correlated.
If the measures are taken at the same time: concurrent validity.
If the measures are taken at different times: predictive validity.
104. Validity: Construct …
Construct validity exists when a measure reliably measures and truthfully represents a unique concept; it consists of several components:
Face and content validity
Convergent validity – another way of expressing internal consistency; highly reliable scales contain convergent validity.
Criterion validity
Discriminant validity – represents how unique or distinct a measure is; a scale should not correlate too highly (i.e., above .75) with a measure of a different construct.
106. Sensitivity
A measurement instrument’s ability to accurately measure variability in stimuli or responses.
Generally increased by adding more response points or adding scale items.
107. Selecting a Measurement Scale
Some Practical Questions:
Is a ranking, sorting, rating, or choice technique best?
Should a monadic or a comparative scale be used?
What type of category labels, if any, will be used for the rating scale?
How many scale categories or response positions are needed to accurately measure an attitude?
Should a balanced or unbalanced rating scale be chosen?
Should a scale that forces a choice among predetermined options be used?
Should a single measure or an index measure be used?
108. Selecting a Measurement Scale (cont’d)
Monadic Rating Scale
Asks about a single concept in isolation; the respondent is not given a specific frame of reference.
Example:
Now that you’ve had your automobile for about 1 year, please tell us how satisfied you are with its engine power and pickup.
109. Selecting a Measurement Scale (cont’d)
Comparative Rating Scale
Asks respondents to rate a concept in comparison with a benchmark explicitly used as a frame of reference.
Example:
Please indicate how the amount of authority in your present position compares with the amount of authority that would be ideal for this position.
TOO MUCH   ABOUT RIGHT   TOO LITTLE
110. Selecting a Measurement Scale (cont’d)
What Type of Category Labels, If Any?
Verbal labels for response categories help respondents better understand the response positions.
The maturity and educational levels of the respondents will influence the labeling decision.
How Many Scale Categories or Response Positions?
Five to eight points are optimal for sensitivity.
The researcher must determine the number of positions that is best for the specific project.
111. Selecting a Measurement Scale (cont’d)
Balanced Rating Scale
A fixed-alternative rating scale with an equal number of positive and negative categories; a neutral point or point of indifference is at the center of the scale.
Example:
112. Selecting a Measurement Scale (cont’d)
Unbalanced Rating Scale
A fixed-alternative rating scale that has more response categories at one end than the other, resulting in an unequal number of positive and negative categories.
Example:
113. Selecting a Measurement Scale (cont’d)
Forced-choice Rating Scale
A fixed-alternative rating scale that requires respondents to choose one of the fixed alternatives.
Non-forced Choice Scale
A fixed-alternative rating scale that provides a “no opinion” category or that allows respondents to indicate that they cannot say which alternative is their choice.
114. Selecting a Measurement Scale (cont’d)
Factors affecting the choice of a single measure versus an index measure:
The complexity of the issue to be investigated.
The number of dimensions the issue contains.
Whether individual attributes of the stimulus are part of a holistic attitude or are seen as separate items.
The researcher’s conceptual (problem) definition will be helpful in making this choice.
117. Strategic Concerns in Instrument Design
What type of scale is needed?
What communication approach will be used?
Should the questions be structured?
Should the questioning be disguised?
118. Technology Affects Questionnaire Development
Survey software such as WebSurveyor can be used to write an instrument, helping researchers:
• Write questionnaires more quickly
• Create visually driven instruments
• Eliminate manual data entry
• Save time in data analysis
119. Disguising Study Objectives
Whether disguise is necessary depends on the type of information sought:
• Willingly shared, conscious-level information
• Reluctantly shared, conscious-level information
• Knowable, limited-conscious-level information
• Subconscious-level information
120. Dummy Table for American Eating Habits
Use of Convenience Foods, by Age
Columns: Always Use / Use Frequently / Use Sometimes / Rarely Use / Never Use
Rows (Age): 18-24 / 25-34 / 35-44 / 55-64 / 65+
123. Engagement = Convenience
“Participants are becoming more and more aware of the value of their time. The key to maintaining a quality dialogue with them is to make it really convenient for them to engage, whenever and wherever they want.”
— Tom Anderson, managing partner, Anderson Analytics
124. Question Content
Should this question be asked?
Is the question of proper scope and coverage?
Can the participant adequately answer this question as asked?
Will the participant willingly answer this question as asked?
129. Multiple Choice Response Strategy
Which one of the following factors was most influential in your decision to attend Metro U?
Good academic standing
Specific program of study desired
Enjoyable campus life
Many friends from home
High quality of faculty
130. Checklist Response Strategy
Which of the following factors influenced your decision to enroll in Metro U? (Check all that apply.)
Tuition cost
Specific program of study desired
Parents’ preferences
Opinion of brother or sister
Many friends from home attend
High quality of faculty
131. Rating Response Strategy
Rate each factor as Strongly influential, Somewhat influential, or Not at all influential:
Good academic reputation
Enjoyable campus life
Many friends
High quality faculty
Semester calendar
132. Ranking
Please rank-order your top three factors from the following list based on their influence in encouraging you to apply to Metro U. Use 1 to indicate the most encouraging factor, 2 the next most encouraging factor, etc.
_____ Opportunity to play collegiate sports
_____ Closeness to home
_____ Enjoyable campus life
_____ Good academic reputation
_____ High quality of faculty
133. Summary of Scale Types
Rating Scales
Type | Restrictions | Scale Items | Data Type
Simple Category Scale | Needs mutually exclusive choices | One or more | Nominal
Multiple Choice Single-Response Scale | Needs mutually exclusive choices; may use exhaustive list or ‘other’ | Many | Nominal
Multiple Choice Multiple-Response Scale (checklist) | Needs mutually exclusive choices; needs exhaustive list or ‘other’ | Many | Nominal
Likert Scale | Needs definitive positive or negative statements with which to agree/disagree | One or more | Ordinal
Likert-type Scale | Needs definitive positive or negative statements with which to agree/disagree | One or more | Ordinal
134. Summary of Scale Types (cont’d)
Rating Scales
Type | Restrictions | Scale Items | Data Type
Numerical Scale | Needs concepts with standardized meanings; needs number anchors for the scale end-points | One or many | Ordinal or Interval
Multiple Rating List Scale | Needs words that are opposites to anchor the end-points on the verbal scale | Up to 10 | Ordinal
Fixed Sum Scale | Participant needs ability to calculate total to some fixed number, often 100 | Two or more | Interval or Ratio
135. Summary of Scale Types

Rating Scales

Type | Restrictions | Scale Items | Data Type
Stapel Scale | Needs verbal labels that are operationally defined or standard | One or more | Ordinal or Interval
Graphic Rating Scale | Needs visual images that can be interpreted as positive or negative anchors; score is a measurement of graphical space from one anchor | One or more | Ordinal (Interval, or Ratio)
136. Summary of Scale Types

Ranking Scales

Type | Restrictions | Scale Items | Data Type
Paired Comparison Scale | Number is controlled by participant’s stamina and interest | Up to 10 | Ordinal
Forced Ranking Scale | Needs mutually exclusive choices | Up to 10 | Ordinal or Interval
Comparative Scale | Can use verbal or graphical scale | Up to 10 | Ordinal
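The scale-type summaries above pair each scale with the level of data it produces, and the speaker notes later in this document tie each level to the statistics it supports (mode for nominal, median for ordinal, mean for interval and ratio). A minimal sketch of that mapping; the dictionary and function names here are illustrative, not from any statistics package:

```python
# Illustrative lookup: the strongest defensible descriptive statistics
# for each level of measurement, per the scale-type summary above.
SCALE_STATS = {
    "nominal":  {"central_tendency": "mode",   "dispersion": None},
    "ordinal":  {"central_tendency": "median", "dispersion": "percentile/quartile"},
    "interval": {"central_tendency": "mean",   "dispersion": "standard deviation"},
    "ratio":    {"central_tendency": "mean",   "dispersion": "standard deviation"},
}

def permissible_stat(scale_type: str) -> str:
    """Return the strongest defensible measure of central tendency."""
    return SCALE_STATS[scale_type.lower()]["central_tendency"]
```

Such a lookup is handy when a survey codebook records each question's measurement level and the analysis step must choose summary statistics automatically.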
140. Sources of Questions
Handbook of Marketing Scales
The Gallup Poll Cumulative Index
Measures of Personality and Social-Psychological Attitudes
Measures of Political Attitudes
Index to International Public Opinion
Sourcebook of Harris National Surveys
Marketing Scales Handbook
American Social Attitudes Data Sourcebook
142. Guidelines for Question Sequencing
Interesting topics early
Simple topics early
Sensitive questions later
Classification questions later
Transition between topics
Reference changes limited
143. Illustrating the Funnel Approach
How do you think this country is getting along in its relations with other countries?
How do you think we are doing in our relations with Iran?
Do you think we ought to be dealing with Iran differently than we are now?
(If yes) What should we be doing differently?
Some people say we should get tougher with Iran and others think we are too tough as it is; how do you feel about it?
148. Questionnaire Quality and Design: Basic Considerations
Questionnaire design is one of the most
critical stages in the survey research
process.
A questionnaire (survey) is only as good as
the questions it asks—ask a bad question,
get bad results.
Composing a good questionnaire appears
easy, but it is usually the result of long,
painstaking work.
The questions must meet the basic criteria of
relevance and accuracy.
149. Decisions in Questionnaire Design
1. What should be asked?
2. How should questions be phrased?
3. In what sequence should the questions
be arranged?
4. What questionnaire layout will best serve
the research objectives?
5. How should the questionnaire be
pretested? Does the questionnaire need
to be revised?
150. What Should Be Asked?
Questionnaire Relevancy
All information collected should address a
research question in helping the decision maker
in solving the current business problem.
Questionnaire Accuracy
Increasing the reliability and validity of
respondent information requires that:
Questionnaires should use simple, understandable,
unbiased, unambiguous, and nonirritating words.
Questionnaire design should facilitate recall and
motivate respondents to cooperate.
Question wording and sequencing should avoid confusion and biased answers.
151. Wording Questions
Open-ended Response Questions
Pose some problem and ask respondents to answer in
their own words.
Advantages:
Are most beneficial in exploratory research, especially
when the range of responses is not known.
May reveal unanticipated reactions toward the product.
Are good first questions because they allow respondents to
warm up to the questioning process.
Disadvantages:
High cost of administering open-ended response questions.
The possibility that interviewer bias will influence the
answer.
Bias introduced by articulate individuals’ longer answers.
152. Wording Questions (cont’d)
Fixed-alternative Questions
Questions in which respondents are given
specific, limited-alternative responses and asked
to choose the one closest to their own viewpoint.
Advantages:
Require less interviewer skill
Take less time to answer
Are easier for the respondent to answer
Provide comparability of answers
Disadvantages:
Lack of range in the response alternatives
Tendency of respondents to choose a convenient alternative
153. Types of Fixed-Alternative Questions
Simple-dichotomy (dichotomous) Question
Requires the respondent to choose one of two alternatives
(e.g., yes or no).
Determinant-choice Question
Requires the respondent to choose one response from
among multiple alternatives (e.g., A, B, or C).
Frequency-determination Question
Asks for an answer about general frequency of occurrence
(e.g., often, occasionally, or never).
Checklist Question
Allows the respondent to provide multiple answers to a
single question by checking off items.
154. Phrasing Questions for Self-Administered, Telephone, and Personal Interview Surveys
Influences on Question Phrasing:
The means of data collection—telephone
interview, personal interview, self-
administered questionnaire—will influence
the question format and question phrasing.
Questions for mail, Internet, and telephone
surveys must be less complex than those used in
personal interviews.
Questionnaires for telephone and personal
interviews should be written in a conversational
style.
155. EXHIBIT 15.1 Reducing Question Complexity by Providing Fewer Responses for Telephone Interviews
156. Guidelines for Constructing Questions
Avoid complexity: Simpler language is better.
Avoid leading and loaded questions.
Avoid ambiguity: Be as specific as possible.
Avoid double-barreled items.
Avoid making assumptions.
Avoid burdensome questions that may tax the
respondent’s memory.
Make certain questions generate variance.
157. What Is the Best Question Sequence?
Order bias
Bias caused by the influence of earlier questions
in a questionnaire or by an answer’s position in a
set of answers.
Funnel technique
Asking general questions before specific
questions in order to obtain unbiased responses.
Filter question
A question that screens out respondents who are
not qualified to answer a second question.
Pivot question
A filter question used to determine which version
of a second question will be asked.
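Filter and pivot questions, as defined above, amount to simple branching logic in an interview script. A minimal sketch; the question texts and the `ask` callback (supplied by hypothetical survey software) are invented for illustration:

```python
# Sketch of a filter question driving skip logic, then a pivot question
# choosing which version of the follow-up to ask. All wording is invented.
def run_filter(ask):
    """`ask(question)` returns the respondent's answer as a string."""
    owns_car = ask("Do you own a car? (yes/no)").strip().lower()
    if owns_car != "yes":
        return None  # filter: screen out respondents not qualified to answer
    # Pivot: this answer determines which version of the next question is asked.
    newer = ask("Is your car less than five years old? (yes/no)").strip().lower()
    if newer == "yes":
        return ask("How satisfied are you with its warranty coverage?")
    return ask("How satisfied are you with its maintenance costs?")
```

In a real instrument the same pattern generalizes: each filter prunes the path, and each pivot selects among prepared question variants.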
158. EXHIBIT 15.2 Flow of Questions to Determine the Level of Prompting Required to Stimulate Recall
159. What Is the Best Layout?
Traditional Questionnaires
Multiple-grid question
Several similar questions arranged in a grid
format.
The title of a questionnaire should be
phrased carefully:
To capture the respondent’s interest, underline
the importance of the research
Emphasize the interesting nature of the study
Appeal to the respondent’s ego
Emphasize the confidential nature of the study
To avoid biasing the respondent in the way a leading question might
160. EXHIBIT 15.3 Layout of a Page from a Telephone Questionnaire
161. EXHIBIT 15.4 Telephone Questionnaire with Skip Questions
164. Internet Questionnaires
Graphical User Interface (GUI) Software
The researcher can control the background,
colours, fonts, and other features displayed
on the screen so as to create an attractive and
easy-to-use interface between the user and
the Internet survey.
Layout Issues
Paging layout - going from screen to screen.
Scrolling layout – entire questionnaire
appears on one page and respondent has the
ability to scroll down.
165. Internet Questionnaire Layout
Push Button
A small outlined area, such as a rectangle or an
arrow, that the respondent clicks on to select an
option or perform a function, such as submit.
Status Bar
A visual indicator that tells the respondent what
portion of the survey he or she has completed.
Radio Button
A circular icon, resembling a button, that
activates one response choice and deactivates
others when a respondent clicks on it.
166. Internet Questionnaire Layout (cont’d)
Drop-down Box
A space saving device that reveals responses when they
are needed but otherwise hides them from view.
Check Boxes
Small graphic boxes, next to an answer, that a respondent
clicks on to choose an answer; typically, a check mark or an
X appears in the box when the respondent clicks on it.
Open-ended Boxes
Boxes where respondents can type in their own answers to
open-ended questions.
Pop-up Boxes
Boxes that appear at selected points and contain
information or instructions for respondents.
167. EXHIBIT 15.7 Question in an Online Screening Survey for Joining a Consumer Panel
168. EXHIBIT 15.8 Alternative Ways of Displaying Internet Questions
169. Internet Questionnaire Layout (cont’d)
Software That Makes Questionnaires
Interactive
Variable piping software
Allows variables to be inserted into an Internet
questionnaire as a respondent is completing it.
Error trapping software
Controls the flow of an Internet questionnaire.
Forced answering software
Prevents respondents from continuing with an
Internet questionnaire if they fail to answer a
question.
Interactive help desk
A live, real-time support feature that solves
problems or answers questions respondents may
encounter in completing the questionnaire.
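Forced answering, described above, is essentially a validation loop: the questionnaire refuses to advance until a required question is answered. A minimal sketch with an invented function name and a hypothetical `get_response` callback standing in for the survey front end:

```python
# Sketch of "forced answering": re-prompt until the respondent gives a
# non-blank answer, or give up after a few tries. Names are illustrative.
def force_answer(prompt, get_response, max_tries=3):
    """Return the first non-blank response; raise if none is given."""
    for _ in range(max_tries):
        answer = get_response(prompt).strip()
        if answer:
            return answer
    raise ValueError("respondent failed to answer a required question")
```

Error-trapping software follows the same idea but validates format (for example, that a numeric field contains a number) rather than mere presence.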
170. Pretesting and Revising Questionnaires
Pretesting Process
Seeks to determine whether respondents
have any difficulty understanding the
questionnaire and whether there are any
ambiguous or biased questions.
Preliminary Tabulation
A tabulation of the results of a pretest to
help determine whether the questionnaire
will meet the objectives of the research.
171. Designing Questionnaires for Global Markets
Back Translation
Taking a questionnaire that has previously
been translated into another language and
having a second, independent translator
translate it back to the original language.
A questionnaire developed in one country
may be difficult to translate because
equivalent language concepts do not exist or
because of differences in idiom and
vernacular.
172. Further Reading
Cooper, D.R. and Schindler, P.S. (2011) Business Research Methods, 11th edn, McGraw-Hill.
Zikmund, W.G., Babin, B.J., Carr, J.C. and Griffin, M. (2010) Business Research Methods, 8th edn, South-Western.
Saunders, M., Lewis, P. and Thornhill, A. (2012) Research Methods for Business Students, 6th edn, Prentice Hall.
Saunders, M. and Lewis, P. (2012) Doing Research in Business & Management, FT Prentice Hall.
Editor's Notes
Exhibit 6-1 illustrates design in the research process and highlights the topics covered by the term research design. Subsequent chapters will provide more detailed coverage of the research design topics.
Exhibit 11-4: While Exhibit 11-3 summarized the characteristics of all the measurement scales, Exhibit 11-4, shown in the slide, illustrates the process of deciding which type of data is appropriate for one’s research needs.
Measurement in research consists of assigning numbers to empirical events, objects or properties, or activities in compliance with a set of rules. This slide illustrates the three-part process of measurement. Text uses an example of auto show attendance. A mapping rule is a scheme for assigning numbers to aspects of an empirical event.
Exhibit 11-1. The goal of measurement – of assigning numbers to empirical events in compliance with a set of rules – is to provide the highest-quality, lowest-error data for testing hypotheses, estimation or prediction, or description. The object of measurement is a concept, the symbols we attach to bundles of meaning that we hold and share with others. Higher-level concepts, constructs, are for specialized scientific explanatory purposes that are not directly observable and for thinking about and communicating abstractions. Concepts and constructs are used at theoretical levels while variables are used at the empirical level. Variables accept numerals or values for the purpose of testing and measurement. An operational definition defines a variable in terms of specific measurement and testing criteria. These are further reviewed in Exhibit 11-2 on page 341 of the text.
Students will be building their measurement questions from different types of scales. They need to know the difference in order to choose the appropriate type. Each scale type has its own characteristics.
This is a good time to ask students to develop a question they could ask that would provide only classification of the person answering it. Classification means that numbers are used to group or sort responses. Consider asking students if a number of anything is always an indication of ratio data. For example, what if we ask people how many cookies they eat a day? What if a business calls itself the “number 1” pizza in town? These questions lead up to the next slide. Does the fact that James wears 23 mean he shoots better or plays better defense than the player donning jersey number 18? In measuring, one devises some mapping rule and then translates the observation of property indicants using this rule. Mapping rules have four characteristics and these are named in the slide. Classification means that numbers are used to group or sort responses. Order means that the numbers are ordered. One number is greater than, less than, or equal to another number. Distance means that differences between numbers can be measured. Origin means that the number series has a unique origin indicated by the number zero. Combinations of these characteristics provide four widely used classifications of measurement scales: nominal, ordinal, interval, and ratio.
Nominal scales collect information on a variable that can be grouped into categories that are mutually exclusive and collectively exhaustive. For example, symphony patrons could be classified by whether or not they had attended prior performances. The counting of members in each group is the only possible arithmetic operation when a nominal scale is employed. If we use numerical symbols within our mapping rule to identify categories, these numbers are recognized as labels only and have no quantitative value. Nominal scales are the least powerful of the four data types. They suggest no order or distance relationship and have no arithmetic origin. The researcher is restricted to use of the mode as a measure of central tendency. The mode is the most frequently occurring value. There is no generally used measure of dispersion for nominal scales. Dispersion describes how scores cluster or scatter in a distribution. Even though LeBron James wears #23, it doesn’t mean that he is better player than #24 or a worse player than #22. The number has no meaning other than identifying James for someone who doesn’t follow the Cavs.
Order means that the numbers are ordered. One number is greater than, less than, or equal to another number. You can ask students to develop a question that allows them to order the responses as well as group them. This is the perfect place to talk about the possible confusion that may exist when people order objects but the order may be the only consistent criteria. For instance, if two people tell them that Pizza Hut is better than Papa Johns, they are not necessarily thinking precisely the same. One could really favor Pizza Hut and never considering eating another Papa John’s pizza, which another could consider them almost interchangeable with only a slight preference for Pizza Hut. This discussion is a perfect lead in to the ever confusing ‘terror alert’ scale (shown on the next slide)…or the ‘weather warning’ system used in some states to keep drivers off the roads during poor weather. Students can probably come up with numerous other ordinal scales used in their environment.
Ordinal data require conformity to a logical postulate, which states: If a is greater than b , and b is greater than c , then a is greater than c . Rankings are examples of ordinal scales. Attitude and preference scales are also ordinal. The appropriate measure of central tendency is the median. The median is the midpoint of a distribution. A percentile or quartile reveals the dispersion. Nonparametric tests should be used with nominal and ordinal data. This is due to their simplicity, statistical power, and lack of requirements to accept the assumptions of parametric testing.
Researchers treat many attitude scales as interval (this will be illustrated in the next chapter). When a scale is interval and the data are relatively symmetric with one mode, one can use the arithmetic mean as the measure of central tendency. The standard deviation is the measure of dispersion. The product-moment correlation, t-tests, F-tests, and other parametric tests are the statistical procedures of choice for interval data.
Examples: weight, height, number of children. Ratio data represent the actual amounts of a variable. In business research, there are many examples such as monetary values, population counts, distances, return rates, and amounts of time. All statistical techniques mentioned up to this point are usable with ratio scales. Geometric and harmonic means are measures of central tendency, and coefficients of variation may also be calculated. Higher levels of measurement generally yield more information and are appropriate for more powerful statistical procedures.
This note relates to the effort it takes to develop a good measurement scale, and that the emphasis is always on helping the manager make a better decision—actionable data.
Exhibit 12-1 Exhibit 12-1 illustrates where scaling fits into the research process.
An attitude is a learned, stable predisposition to respond to oneself, other persons, objects, or issues in a consistently favorable or unfavorable way. Attitudes can be expressed or based cognitively, affectively, and behaviorally. A example for each is provided in the slide. Business researchers treat attitudes as hypothetical constructs because of their complexity and the fact that they are inferred from the measurement data, not actually observed.
Several factors have an effect on the applicability of attitudinal research for business. Specific attitudes are better predictors of behavior than general ones. Strong attitudes are better predictors of behavior than weak attitudes composed of little intensity or topic interest. Direct experiences with the attitude object produce behavior more reliably. Cognitive-based attitudes influence behaviors better than affective-based attitudes. Affective-based attitudes are often better predictors of consumption behaviors. Using multiple measurements of attitude or several behavioral assessments across time and environments improve prediction. The influence of reference groups and the individual’s inclination to conform to these influences improves the attitude-behavior linkage.
Attitude scaling is the process of assessing an attitudinal disposition using a number that represents a person’s score on an attitudinal continuum ranging from an extremely favorable disposition to an extremely unfavorable one. Scaling is the procedure for the assignment of numbers to a property of objects in order to impart some of the characteristics of numbers to the properties in question. Selecting and constructing a measurement scale requires the consideration of several factors that influence the reliability, validity, and practicality of the scale. These factors are listed in the slide. Researchers face two types of scaling objectives: 1) to measure characteristics of the participants who participate in the study, and 2) to use participants as judges of the objects or indicants presented to them. Measurement scales fall into one of four general response types: rating, ranking, categorization, and sorting. These are discussed further on the following slide. Decisions about the choice of measurement scales are often made with regard to the data properties generated by each scale: nominal, ordinal, interval, and ratio. Measurement scales are either unidimensional or multidimensional, balanced or unbalanced, forced or unforced. These characteristics are discussed further, as is the issue of number of scale points and rater errors.
A rating scale is used when participants score an object or indicant without making a direct comparison to another object or attitude. For example, they may be asked to evaluate the styling of a new car on a 7-point rating scale. Ranking scale constrain the study participant to making comparisons and determining order among two or more properties or objects. Participants may be asked to choose which one of a pair of cars has more attractive styling. A choice scale requires that participants choose one alternative over another. They could also be asked to rank-order the importance of comfort, ergonomics, performance, and price for the target vehicle. Categorization asks participants to put themselves or property indicants in groups or categories. Sorting requires that participants sort card into piles using criteria established by the researcher. The cards might contain photos or images or verbal statements of product features such as various descriptors of the car’s performance.
With a unidimensional scale, one seeks to measure only one attribute of the participant or object. One measure of an actor’s star power is his or her ability to “carry” a movie. It is a single dimension. A multidimensional scale recognizes that an object might be better described with several dimensions. The actor’s star power variable might be better expressed by three distinct dimensions - ticket sales for the last three movies, speed of attracting financial resources, and column-inch/amount of TV coverage of the last three movies.
A balanced rating scale has an equal number of categories above and below the midpoint. Scales can be balanced with or without a midpoint option. An unbalanced rating scale has an unequal number of favorable and unfavorable response choices.
An unforced-choice rating scale provides participants with an opportunity to express no opinion when they are unable to make a choice among the alternatives offered. A forced-choice scale requires that participants select one of the offered alternatives.
What is the ideal number of points for a rating scale? A scale should be appropriate for its purpose. For a scale to be useful, it should match the stimulus presented and extract information proportionate to the complexity of the attitude object, concept, or construct. For example, a product that requires little effort or thought to purchase can be measured with a simple scale (perhaps a 3-point scale). When the product is complex, a scale with 5 to 11 points should be considered. As the number of scale points increases, the reliability of the measure increases. In some studies, scales with 11 points may produce more valid results than 3-, 5-, or 7-point scales. Some constructs require greater measurement sensitivity and the opportunity to extract more variance, which additional scale points provide. A larger number of scale points is needed to produce accuracy when using single-dimension versus multiple-dimension scales.
Some raters are reluctant to give extreme judgments and this fact accounts for the error of central tendency. Participants may also be “easy raters” or “hard raters,” making what is called error of leniency. Suggestions for addressing these tendencies are provided in the slide.
A primacy effect is one that occurs when respondents tend to choose the answer that they saw first. When respondents choose the answer seen most recently, the recency effect has occurred. These problems can be avoided by randomizing the order in which responses are presented.
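Randomizing the order in which responses are presented, as the note suggests, is straightforward to implement. A sketch using Python’s standard `random` module; the `seed` parameter is an added convenience (not from the source) so that a given respondent’s order can be reproduced:

```python
import random

# Sketch: shuffle the presentation order of response choices per respondent
# to wash out primacy (first-seen) and recency (last-seen) effects.
def randomized_choices(choices, seed=None):
    rng = random.Random(seed)   # independent RNG; seed makes order reproducible
    shuffled = choices[:]       # copy so the master list is not mutated
    rng.shuffle(shuffled)
    return shuffled
```

Seeding by respondent ID means the randomized order can be reconstructed later when mapping recorded answers back to the original choice list.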
The halo effect is the systematic bias that the rater introduces by carrying over a generalized impression of the subject from one rating to another. For instance, a teacher may expect that a student who did well on the first exam to do well on the second. Ways of counteracting the halo effect are listed in the slide.
This scale is also called a dichotomous scale. It offers two mutually exclusive response choices. In the example shown in the slide, the response choices are yes and no, but they could be other choices, such as agree and disagree.
When there are multiple options for the rater but only one answer is sought, the multiple-choice, single-response scale is appropriate. The other response may be omitted when exhaustiveness of categories is not critical or there is no possibility for an other response. This scale produces nominal data.
This scale is a variation of the last and is called a checklist. It allows the rater to select one or several alternatives. The cumulative feature of this scale can be beneficial when a complete picture of the participant’s choice is desired, but it may also present a problem for reporting when research sponsors expect the responses to sum to 100 percent. This scale generates nominal data.
The Likert scale was developed by Rensis Likert and is the most frequently used variation of the summated rating scale. Summated rating scales consist of statements that express either a favorable or unfavorable attitude toward the object of interest. The participant is asked to agree or disagree with each statement. Each response is given a numerical score to reflect its degree of attitudinal favorableness, and the scores may be summed to measure the participant’s overall attitude. Likert-like scales may use 7 or 9 scale points. They are quick and easy to construct. The scale produces interval data. Originally, creating a Likert scale involved a procedure known as item analysis. Item analysis assesses each item based on how well it discriminates between those people whose total score is high and those whose total score is low. It involves calculating the mean scores for each scale item among the low scorers and the high scorers. The mean scores for the high-score and low-score groups are then tested for statistical significance by computing t values. After finding the t values for each statement, the statements are rank-ordered, and those statements with the highest t values are selected. Researchers have found that a larger number of items for each attitude object improves the reliability of the scale.
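The summated scoring described in this note can be sketched in a few lines. The item keys, and the common reverse-coding convention for negatively worded statements (recoded score = points + 1 − score on a 5-point scale), are illustrative assumptions, not from the source:

```python
# Sketch of summated (Likert) scoring: reverse-code negatively worded
# items, then sum all item scores to get the overall attitude score.
def summated_score(responses, reverse_items=(), points=5):
    """responses: mapping of item key -> score on a 1..points scale."""
    total = 0
    for item, score in responses.items():
        if item in reverse_items:
            score = points + 1 - score  # e.g. 2 on a 5-point scale becomes 4
        total += score
    return total
```

Item analysis would then compare mean item scores between high-total and low-total groups, keeping the items that discriminate best.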
From Exhibit 12-3: The semantic differential scale measures the psychological meanings of an attitude object using bipolar adjectives. Researchers use this scale for studies of brand and institutional image, employee morale, safety, financial soundness, trust, etc. The method consists of a set of bipolar rating scales, usually with 7 points, by which one or more participants rate one or more concepts on each scale item. The scale is based on the proposition that an object can have several dimensions of connotative meaning. The meanings are located in multidimensional property space, called semantic space. The semantic differential scale is efficient and easy for securing attitudes from a large sample. Attitudes may be measured in both direction and intensity. The total set of responses provides a comprehensive picture of the meaning of an object and a measure of the person doing the rating. It is standardized and produces interval data. Exhibit 12-7 provides basic instructions for constructing an SD scale.
The steps in constructing a semantic differential scale are provided in Exhibit 12-7.
In Exhibit 12-8, we see a scale used by a consulting firm to help a movie production company evaluate actors for the leading role of a risky film venture. The selection of concepts is driven by the characteristics they believe the actor must possess to produce box office financial targets. To analyze the results, the set of values for each component (evaluation, potency, and activity) is averaged.
In Exhibit 12-9, the data are plotted on a snake diagram. Here the adjective pairs are reordered so evaluation, potency, and activity descriptors are grouped together, with the ideal factor reflected by the left side of the scale. Profiles of the three actor candidates may be compared to each other and to the ideal.
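Averaging the ratings within each connotative component, as these notes describe for the snake-diagram analysis, can be sketched as follows. The adjective-pair-to-component mapping is invented for illustration and does not reproduce the actual exhibit:

```python
# Hypothetical mapping of bipolar adjective pairs to Osgood's three
# connotative components. Scores are on a 7-point semantic differential.
COMPONENT_OF = {
    "good-bad": "evaluation", "valuable-worthless": "evaluation",
    "strong-weak": "potency",
    "active-passive": "activity", "fast-slow": "activity",
}

def component_means(ratings):
    """Average one rater's scores within each component for one object."""
    sums, counts = {}, {}
    for pair, score in ratings.items():
        comp = COMPONENT_OF[pair]
        sums[comp] = sums.get(comp, 0) + score
        counts[comp] = counts.get(comp, 0) + 1
    return {comp: sums[comp] / counts[comp] for comp in sums}
```

The per-component means are what get plotted for each candidate on the snake diagram and compared against the ideal profile.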
From Exhibit 12-3: Numerical scales have equal intervals that separate their numeric scale points. The verbal anchors serve as the labels for the extreme points. Numerical scales are often 5-point scales but may have 7 or 10 points. The participants write a number from the scale next to each item. Numerical scales produce either ordinal or interval data.
From Exhibit 12-3: A multiple rating scale is similar to the numerical scale but differs in two ways: it accepts a circled response from the rater, and the layout facilitates visualization of the results. The advantage is that a mental map of the participant’s evaluations is evident to both the rater and the researcher. This scale produces interval data.
From Exhibit 12-3: The Stapel scale is used as an alternative to the semantic differential, especially when it is difficult to find bipolar adjectives that match the investigative question. In the example, there are three attributes of corporate image. The scale is composed of the word identifying the image dimension and a set of 10 response categories for each of the three attributes. Stapel scales produce interval data.
From Exhibit 12-3: The constant-sum scale helps researchers to discover proportions. The participant allocates points to more than one attribute or property indicant, such that they total a constant sum, usually 100 or 10. Participant precision and patience suffer when too many stimuli are proportioned and summed. A participant’s ability to add may also be taxed. Its advantages are its compatibility with percentages and the fact that alternatives that are perceived to be equal can be so scored. This scale produces interval data.
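Because the allocations must total the constant sum, responses are typically validated before analysis. A minimal sketch, with hypothetical attribute names:

```python
# Sketch: validating a constant-sum response (hypothetical attributes).
# The points a participant allocates must total the constant sum --
# here 100 -- before the response is accepted for analysis.
def valid_constant_sum(allocations, total=100):
    return sum(allocations.values()) == total

response = {"price": 40, "quality": 35, "service": 25}   # illustrative
ok = valid_constant_sum(response)                        # sums to 100
bad = valid_constant_sum({"price": 50, "quality": 30})   # sums to 80
```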
From Exhibit 12-3: The graphic rating scale was originally created to enable researchers to discern fine differences. Theoretically, an infinite number of ratings is possible if participants are sophisticated enough to differentiate and record them. They are instructed to mark their response at any point along a continuum. Usually, the score is a measure of length from either endpoint. The results are treated as interval data. The difficulty is in coding and analysis. Graphic rating scales use pictures, icons, or other visuals to communicate with the rater and represent a variety of data types. Graphic scales are often used with children.
From Exhibit 12-3: In ranking scales, the participant directly compares two or more objects and makes choices among them. The participant may be asked to select one as the best or most preferred.
From Exhibit 12-10: Using the paired-comparison scale, the participant can express attitudes unambiguously by choosing between two objects. The number of judgments required in a paired comparison is n(n-1)/2, where n is the number of stimuli or objects to be judged. Paired comparisons run the risk that participants will tire to the point that they give ill-considered answers or refuse to continue. Paired comparisons provide ordinal data.
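The n(n-1)/2 formula makes the fatigue risk concrete: the judgment count grows quadratically with the number of objects.

```python
# Number of paired-comparison judgments for n objects: n(n-1)/2.
# The count grows quadratically, which is why participant fatigue
# becomes a risk as the stimulus set grows.
def paired_comparisons(n):
    return n * (n - 1) // 2

counts = {n: paired_comparisons(n) for n in (3, 5, 10)}
# 3 objects -> 3 judgments, 5 -> 10, 10 -> 45
```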
From Exhibit 12-10: The forced ranking scale lists attributes that are ranked relative to each other. This method is faster than paired comparisons and is usually easier and more motivating to the participant. With five items, it takes ten paired comparisons to complete the task, but the simple forced ranking of five is easier. A drawback of this scale is the limited number of stimuli (usually no more than 7) that can be handled by the participant. This scale produces ordinal data.
From Exhibit 12-10: When using a comparative scale, the participant compares an object against a standard. The comparative scale is ideal for such comparisons if the participants are familiar with the standard. Some researchers treat the data produced by comparative scales as interval data since the scoring reflects an interval between the standard and what is being compared, but the text recommends treating the data as ordinal unless the linearity of the variables in question can be supported.
Q-sorts require sorting of a deck of cards into piles that represent points along a continuum. The participant groups the cards based on his or her response to the concept written on the card. Researchers using Q-sort resolve three special problems: item selection, structured or unstructured choices in sorting, and data analysis. The basic Q-sort procedure involves the selection of a set of verbal statements, phrases, single words, or photos related to the concept being studied. For statistical stability, the number of cards should not be less than 60, and, for convenience, not more than 120. After the cards are created, they are shuffled, and the participant is instructed to sort the cards into a set of piles (usually 7 to 11), each pile representing a point on the judgment continuum. The left-most pile holds the concept statements the participant judges “most valuable,” “favorable,” or “agreeable.” The right-most pile contains the least favorable cards. In the case of a structured sort, the distribution of cards allowed in each pile is predetermined. With an unstructured sort, only the number of piles will be determined. The purpose of sorting is to get a conceptual representation of the sorter’s attitude toward the attitude object and to compare the relationships between people.
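A structured sort can be checked mechanically. The pile distribution below is a hypothetical example for 60 cards across 9 piles, not a prescribed one:

```python
# Sketch of a structured Q-sort check (hypothetical distribution):
# in a structured sort, the number of cards allowed in each pile is
# fixed in advance, often approximating a normal distribution.
required = [2, 4, 8, 10, 12, 10, 8, 4, 2]   # 9 piles, 60 cards total

def valid_structured_sort(pile_counts, required=required):
    """pile_counts: cards the participant placed in each pile,
    ordered from most favorable (left) to least favorable (right)."""
    return pile_counts == required

ok = valid_structured_sort([2, 4, 8, 10, 12, 10, 8, 4, 2])   # conforms
bad = valid_structured_sort([5, 5, 5, 5, 20, 5, 5, 5, 5])    # does not
```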
Exhibit 12-12: There is never just one correct way to ask a question. The MindWriter Close-Up gives you the opportunity to discuss why MindWriter chose the scales they did. In the Close-Up, Jason and Myra are conversing with the general manager of MindWriter about the necessity of testing their measurement questions. T Henry & Associates has developed the three scales shown in the exhibit in the slide. They also debated the wording of the anchors. This would be a good place to discuss the MindWriter scale exercise from the vignette and the Close-Up.
Exhibit 12-14: With a cumulative scale, a participant’s agreement with one extreme scale item endorses all other items that take a less extreme position. A pioneering scale of this type was the scalogram. Scalogram analysis is a procedure for determining whether a set of items forms a unidimensional scale. A scale is unidimensional if the responses fall into a pattern in which endorsement of the item reflecting the extreme position results in endorsing all items that are less extreme. The scalogram and similar procedures for discovering underlying structure are useful for assessing attitudes and behaviors that are highly structured, such as social distance, organizational hierarchies, and evolutionary product stages.
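The cumulative pattern can be checked programmatically. This is a minimal sketch: with items ordered from least to most extreme, a perfect Guttman pattern is a run of endorsements (1s) followed by non-endorsements (0s).

```python
# Sketch: testing whether a response row fits a cumulative (Guttman)
# pattern. Items are ordered least-extreme first; in a perfect
# scalogram, endorsing an item implies endorsing every less extreme
# item, so each row is a run of 1s followed by 0s.
def is_cumulative(row):
    """row: 1/0 endorsements ordered from least to most extreme item."""
    seen_zero = False
    for r in row:
        if r == 0:
            seen_zero = True
        elif seen_zero:      # a 1 after a 0 breaks the pattern
            return False
    return True

scalable = is_cumulative([1, 1, 1, 0])       # fits the pattern
error = is_cumulative([1, 0, 1, 0])          # violates unidimensionality
```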
Exhibit 13-1 is a suggested flowchart for instrument design. The procedures followed in developing an instrument vary from study to study, but the flowchart suggests three phases. Each phase is discussed in this chapter.
Exhibit 13-2: By this stage in a research project, the process of moving from the general management dilemma to specific measurement questions has traveled through the first three question levels: Management question – the dilemma, stated in question form, that the manager needs resolved; Research question(s) – the fact-based translation of the question the researcher must answer to contribute to the solution of the management question; Investigative questions – specific questions the researcher must answer to provide sufficient detail and coverage of the research question (within this level, there may be several questions as the researcher moves from the general to the specific); Measurement questions – questions participants must answer if the researcher is to gather the needed information and resolve the management question. Once the researcher understands the connection between the investigative questions and the potential measurement questions, a strategy for the survey is the next logical step.
Exhibit 13-3: Researchers are concerned with adequate coverage of the topic and with securing the information in its most usable form. A good way to test how well the study plan meets those needs is to develop “dummy” tables that display the data one expects to secure. For example, we might be interested to know whether age influences the use of convenience foods. The dummy table would match the age ranges of participants with the degree to which they use convenience foods. The preliminary analysis plan serves as a check on whether the planned measurement questions meet the data needs of the research questions. This also helps the researcher determine the type of scale needed for each question.
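A dummy table can be laid out before any data exist. The sketch below uses hypothetical age ranges and usage levels to show the idea: empty cells that the planned measurement questions must be able to fill.

```python
# Sketch of a "dummy" cross-tabulation (all labels hypothetical):
# age ranges against degree of convenience-food use, drafted before
# data collection to confirm the measurement questions will fill it.
rows = ["18-24", "25-34", "35-44", "45+"]          # age ranges
cols = ["Never", "Occasionally", "Frequently"]     # usage levels
dummy_table = {age: {use: None for use in cols} for age in rows}
# Each None marks a cell the planned questions must be able to
# populate; an unfillable cell signals a missing measurement question.
cells = sum(len(v) for v in dummy_table.values())  # 12 cells to fill
```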
In Phase 2 (Exhibit 13-4), you generate specific measurement questions considering subject content, the wording of each question, and each response strategy. The order, type, and wording of the measurement questions, the introduction, the instructions, the transitions, and the closure in a quality communication instrument should accomplish the following: encourage each participant to provide accurate responses; encourage each participant to provide an adequate amount of information; discourage each participant from refusing to answer specific questions; discourage each participant from early discontinuation of participation; and leave the participant with a positive attitude about survey participation.
Questionnaires can range from those that have a great deal of structure to those that are unstructured. They contain three categories of measurement questions. Administrative questions identify the participant, interviewer, interview location, and conditions. These questions are rarely asked of the participant but are necessary for studying patterns within the data and identifying possible error sources. Classification questions usually cover sociological-demographic variables that allow participants’ answers to be grouped so that patterns are revealed and can be studied. These questions usually appear at the end of a survey. Target questions address the investigative questions of a specific study. These are grouped by topic in the survey. Target questions may be structured or unstructured.
This note underscores the effort it takes to develop a good measurement scale, and the point that the emphasis is always on helping the manager make a better decision with actionable data.
These are the seven activities the researcher must accomplish to make an experiment a success. In the first step, the researcher is challenged to select variables that are the best operational definitions of the original concepts, determine how many variables to test, and select or design appropriate measures for the chosen variables. The selection of measures for testing requires a thorough review of the available literature and instruments. In an experiment, participants experience a manipulation of the independent variable, called the experimental treatment. The treatment levels are the arbitrary or natural groups the researcher makes within the independent variable. A control group can provide a base level for comparison. A control group is a group of participants that is measured but not exposed to the independent variable being studied. Environmental control means holding the physical environment of the experiment constant. When participants do not know if they are receiving the experimental treatment, they are said to be blind. When neither the participant nor the researcher knows, the experiment is said to be double-blind. The design is then selected. Several designs are discussed on the next several slides. The participants selected for the experiment should be representative of the population to which the researcher wishes to generalize the study’s results. Random assignment is required to make the groups as comparable as possible.
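The random-assignment step can be sketched briefly. The participant IDs below are hypothetical; the point is that group membership is decided by chance, not by any participant characteristic.

```python
# Sketch: random assignment of participants to treatment and control
# groups (hypothetical participant IDs) -- the step that makes the
# groups as comparable as possible before the treatment is applied.
import random

def randomly_assign(participants, seed=None):
    rng = random.Random(seed)       # seed only for reproducibility
    shuffled = participants[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]   # (treatment, control)

treatment, control = randomly_assign(list(range(1, 21)), seed=42)
# Each group receives 10 of the 20 participants, chosen at random
```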
Survey questions should be revised until they satisfy these criteria: Is the question stated in terms of a shared vocabulary? Does the question contain vocabulary with a single meaning? Does the question contain unsupported or misleading assumptions? Does the question contain biased wording? Is the question correctly personalized? Are adequate alternatives presented within the question?
In choosing response options in questions, researchers must consider these factors.
Free-response questions, also known as open-ended questions, ask the participant a question and either the interviewer pauses for the answer or the participant records his or her ideas in his or her own words in the space provided on a questionnaire. Survey researchers try to reduce the number of these questions as they are difficult to interpret and are costly to analyze.
Dichotomous questions are measurement questions that offer two mutually exclusive and exhaustive alternatives.
Multiple-choice questions are appropriate when there are more than two alternatives or where we seek gradations of preference, interest, or agreement. A problem can occur with this question type when one or more responses have not been anticipated. A second problem occurs when the list of choices is not exhaustive. Participants may also feel that they have multiple answers while the question allows for only one response; in this case, the response choices are not mutually exclusive. The order in which choices are given can also cause bias. Order bias with non-numeric response categories often leads the participant to choose the first alternative (primacy effect) or the last alternative (recency effect) over the middle ones. Primacy effects dominate in visual surveys, while recency effects dominate in oral surveys. Multiple-choice questions usually generate nominal data.
The checklist question is a question that poses numerous alternatives and encourages multiple unordered responses. Checklists are efficient and provide nominal data.
When relative order of the alternatives is important, the ranking question is ideal. The ranking question is a measurement question that asks the participant to compare and order two or more objects or properties using a numeric scale. It is always best to have participants rank only those elements with which they are familiar. For this reason, ranking questions might follow a checklist question which identifies the objects of familiarity. Avoid asking participants to rank more than seven items. Ranking generates ordinal data.
Exhibit 13-7 summarizes some important considerations in choosing between the various response strategies. While all of the response strategies are available for use in Web questionnaires, there are slightly different layout options for response in Web surveys.
Exhibit 13-6, starting on the next slide, illustrates these layout options.
Exhibit 13-6 (1 of 3)
Exhibit 13-6 (2 of 3)
Exhibit 13-6 (3 of 3)
Exhibit 13-8 provides sources of questions from books and web sites. This slide highlights many of the books listed in the Exhibit.
As depicted in Exhibit 13-9, instrument design is a multistep process. Develop the participant-screening process along with the introduction. A screen question is a question to qualify the participant’s knowledge about the target questions of interest or experience necessary to participate. Arrange the measurement question sequence: identify groups of target questions by topic; establish a logical sequence for the question groups and questions within groups; and develop transitions between these question groups. Prepare and insert instructions, including termination instructions, skip directions, and probes. Create and insert a conclusion, including a survey disposition statement. Pretest specific questions and the instrument as a whole.
The question process must quickly awaken interest and motivate the participant to participate in the interview. More interesting topical target questions should come early. Classification questions that are not used as filters or screens should come at the end of the survey. The participant should not be confronted by early requests for information that might be considered personal or ego-threatening. Buffer questions are neutral measurement questions designed to establish rapport with the participant. These can be used prior to sensitive questions. The questioning process should begin with simple items and then move to the more complex, as well as move from general items to the more specific. Place taxing and challenging questions later in the questioning process. Changes in the frame of reference should be small and should be clearly pointed out. Use transition statements between different topics of the target question set. An example of a transition is provided in Exhibit 13-10.
The procedure of moving from general to more specific questions is sometimes called the funnel approach. The objectives of this procedure are to learn the participant’s frame of reference and to extract a full range of desired information while limiting the distortion effect of earlier questions on later ones.
Sometimes the content of one question assumes other questions have been asked and answered. This is a branched question. In web surveys, branching allows respondents to avoid “skip patterns”. Based on their answers, respondents are branched to the appropriate section of the survey. Web surveys also allow for piping. Piping takes the answer from one question and reuses it in a later question. For instance, if in one question a respondent named Diet Coke as his or her favorite beverage, a later question might ask, “Which is your favorite characteristic of Diet Coke?”
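Branching and piping amount to simple conditional logic. The question wording and numbering below are hypothetical, echoing the Diet Coke example:

```python
# Sketch of branching and piping logic (hypothetical questions):
# the answer to one question routes the respondent past irrelevant
# sections (branching) and is reused in later wording (piping).
def next_question(favorite_beverage):
    if favorite_beverage is None:
        # Branch: respondents with no stated favorite skip the follow-up
        return "Q5: How often do you buy beverages?"
    # Pipe: insert the earlier answer into the follow-up's wording
    return f"Q4: Which is your favorite characteristic of {favorite_beverage}?"

q = next_question("Diet Coke")
# -> "Q4: Which is your favorite characteristic of Diet Coke?"
```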
Exhibit 13-10 illustrates the components of a communication instrument. Instructions to the interviewer or participant attempt to ensure that all participants are treated equally. Two principles form the foundation for good instructions: clarity and courtesy. Instruction language needs to be unfailingly simple and polite. Instruction topics include those for 1) terminating an unqualified participant, 2) terminating a discontinued interview, 3) moving between questions on an instrument, and 4) disposing of a completed questionnaire. The role of the conclusion is to leave the participant with the impression that his or her involvement has been valuable.
Now is a great time to evaluate the MindWriter instrument that has been discussed in several vignettes and Close-Ups. It is contained in this chapter’s Close-Up and in Exhibit 13-12.
There is no substitute for a thorough understanding of question wording, question content, and question sequencing issues. However, the researcher can do several things to help improve survey results. These are listed in the slide. Most information can be secured by direct undisguised questioning if rapport has been developed. The assurance of confidentiality can also increase participant motivation. You can redesign the questioning process to improve the quality of answers by modifying the administrative process and the response strategy. When drafting the original question, try developing positive, negative, and neutral versions of each type of question. Minimize nonresponses to particular questions by recognizing the sensitivity of certain topics. The final step toward improving survey results is pretesting, the assessment of questions and instruments before the start of a study. Pretesting can allow one to 1) discover ways to increase participant interest, 2) increase the likelihood that participants will remain engaged, 3) discover question content, wording, and sequencing problems, 4) discover target question groups where researcher training is needed, and 5) explore ways to improve the overall quality of survey data. Urge your students to review Appendix 13a, especially Exhibit 13a-5, Restructuring Questions, for some insight into overcoming problems.