Google’s Mario Callegaro explains how paradata works – the data that shows not just what your survey respondents said but how they answered the question, opening up new avenues for researchers to interrogate and analyse the survey data they receive.
1. What paradata can tell you about the quality of web surveys?
Mario Callegaro Ph.D.
Senior Survey Research Scientist
User Insights team, Brand Studio
Google London
Qualtrics Converge Europe, London April 26, 2017
3. How do we know if a question works?
How do we know if a question measures what it is intended to measure?
How do we know if respondents understand the question and can appropriately respond to it?
5. Taxonomy of paradata types
Paradata for web surveys can be classified into the following groups:
1. Direct paradata
• Contact-info
• Device-type paradata
• Questionnaire navigation paradata
2. Indirect paradata
• E.g. eye tracking, video recording, behavioral coding
7. Direct paradata: Contact info
• Outcomes of an email invitation
• Access to the questionnaire introduction page
• Last question answered before breakoff
8. Survey breakoffs by question
[Figure: survey breakoffs by question; vertical axis runs from 75 to 100. Annotated question: "Permission asked to use school records (grades) for research purposes".]
(Sakshaug & Crawford, 2010). Data courtesy of Sakshaug.
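The last question answered before breakoff is enough to build a breakoff-by-question curve of this kind. A minimal sketch in Python, assuming a hypothetical list of question ids in presentation order and, for each respondent who broke off, the id of the last question answered. The function name and the sample numbers are illustrative and are not taken from Sakshaug & Crawford (2010).

```python
from collections import Counter

def retention_by_question(question_order, last_answered, n_started):
    """Percent of starters who answered each question.

    question_order: question ids in the order they were presented.
    last_answered:  one entry per respondent who broke off, holding the id of
                    the last question answered (None if none was answered).
    n_started:      number of respondents who opened the questionnaire.
    """
    dropped_after = Counter(last_answered)
    remaining = n_started - dropped_after[None]
    retention = {}
    for q in question_order:
        retention[q] = 100.0 * remaining / n_started
        # Respondents whose last answer was q never answer the next question.
        remaining -= dropped_after[q]
    return retention

# Hypothetical data: 20 starters, 4 of whom broke off.
curve = retention_by_question(
    ["Q1", "Q2", "Q3", "Q4"],
    ["Q1", "Q2", "Q2", "Q2"],
    n_started=20,
)
print(curve)
```

A sharp drop after one question, as after Q2 in the made-up data, points at the following question as the one to review.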
10. Direct paradata: Device type
• User-agent string
• Screen resolution
• Browser window size
• JavaScript and Flash active
• IP address (mostly considered Personally Identifiable Information)
• GPS coordinates (mostly considered Personally Identifiable Information)
• Cookies
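As an illustration of how the user-agent string can be turned into a device-type variable, here is a rough Python sketch. The substring rules and the function name `classify_device` are assumptions made for this example, not a method from the talk; user-agent strings vary widely and can be spoofed, so any such classification should be treated as approximate.

```python
def classify_device(user_agent: str) -> str:
    """Crude device classification from a user-agent string (heuristic only)."""
    ua = user_agent.lower()
    # iPads and Android devices that do not announce "Mobile" are treated as tablets.
    if "ipad" in ua or ("android" in ua and "mobile" not in ua):
        return "tablet"
    if "iphone" in ua or "windows phone" in ua or ("android" in ua and "mobile" in ua):
        return "smartphone"
    return "desktop"

# Illustrative user-agent strings.
samples = {
    "iphone": "Mozilla/5.0 (iPhone; CPU iPhone OS 10_3 like Mac OS X) "
              "AppleWebKit/603.1.30 (KHTML, like Gecko) Version/10.0 "
              "Mobile/14E277 Safari/602.1",
    "ipad": "Mozilla/5.0 (iPad; CPU OS 10_3 like Mac OS X) "
            "AppleWebKit/603.1.30 (KHTML, like Gecko) Version/10.0 "
            "Mobile/14E277 Safari/602.1",
    "windows": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
               "(KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36",
}
device_types = {name: classify_device(ua) for name, ua in samples.items()}
print(device_types)
```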
11. Device type: GPS coordinates example
Dayton, J. & H. Driscoll: The Next CAPI Evolution - Completing Web Surveys on Cell-Enabled iPads. AAPOR 2011
12. Device type: GPS coordinates example (cont.)
Dayton, J. & H. Driscoll: The Next CAPI Evolution - Completing Web Surveys on Cell-Enabled iPads. AAPOR 2011
14. Direct paradata: Questionnaire navigation 1
Mouse clicks and mouse coordinates
Mouse clicks and their positions can be captured with JavaScript. Excessive mouse movements can be a sign of problems with the question.
Change of answers
A change of answer is an indicator of potential confusion with a question and can be used to improve questionnaire design.
Typing and keystrokes
Typing and keystrokes can create an audit trail for each survey and can be used to detect unusual behavior on both the respondent side and the interviewer side.
15. Questionnaire navigation paradata example
lXNtoilre7_2|1|M677|13|1320#
M548|174|830#
M160|101|1750#
M366|192|550#
M728|4|7690#
M489|247|610#
C493|229|3301#
R110|1#
C493|280|4301#
R110|3#
C493|345|3901#
R110|5#
C521|399|3801#
SU521|399|60|undefined#|
Stieger and Reips (2010, p. 1490)
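A trace like this can be parsed into events and summarised, for example by counting answer changes per control. The slide does not document the record layout, so the reading used in this Python sketch is an assumption: records are separated by `#`, fields by `|`, the first two fields form a session/page header, `M` and `C` records carry x, y and a time value for mouse movements and clicks, `R` records carry a radio-button id and the selected value, and `SU` marks the submit. The function names are hypothetical.

```python
from collections import defaultdict

# The trace from the slide, joined into one string.
TRACE = (
    "lXNtoilre7_2|1|M677|13|1320#M548|174|830#M160|101|1750#M366|192|550#"
    "M728|4|7690#M489|247|610#C493|229|3301#R110|1#C493|280|4301#R110|3#"
    "C493|345|3901#R110|5#C521|399|3801#SU521|399|60|undefined#|"
)

def parse_trace(trace, header_fields=2):
    """Split a trace into (header, events); the layout is assumed, see lead-in."""
    records = [r.strip("|") for r in trace.split("#")]
    records = [r for r in records if r]
    first = records[0].split("|")
    header, records[0] = first[:header_fields], "|".join(first[header_fields:])
    events = []
    for record in records:
        head, *rest = record.split("|")
        kind = head.rstrip("0123456789")  # "M677" -> "M", "SU521" -> "SU"
        events.append((kind, [head[len(kind):]] + rest))
    return header, events

def answer_changes(events):
    """Count how often the selected value changed for each radio-button id."""
    selections = defaultdict(list)
    for kind, fields in events:
        if kind == "R":
            selections[fields[0]].append(fields[1])
    return {
        control: sum(a != b for a, b in zip(values, values[1:]))
        for control, values in selections.items()
    }

header, events = parse_trace(TRACE)
changes = answer_changes(events)
print(header, len(events), changes)
```

Under these assumptions the respondent selected values 1, 3 and 5 in turn on control 110 before submitting, that is, two answer changes on a single item.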
17. Fully labeled vs. polar point vs. polar point with numbers vs. answer box
Stern (2008, p. 384)
18. Fully labeled vs. polar point vs. polar point with numbers vs. answer box
[Figure: bar chart of mean ratings on a 1 to 5 axis for four formats: fully labeled, polar point, polar point with numbers, answer box. Bar labels as extracted: 2, 2, 2, 3.]
Stern (2008) & Christian (2003)
19. Fully labeled vs. polar point vs. polar point with numbers vs. answer box
[Figure: bar chart of the % of reciprocal changes on a 0 to 10 axis for four formats: fully labeled, polar point, polar point with numbers, answer box. Bar labels as extracted: 2, 7, 6, 8.]
Stern (2008)
21. Direct paradata: Questionnaire navigation 2
Order of answering
On a page with multiple questions, the order of answering is an indicator of how the respondent reads the questions.
Movements across the questionnaire (forward/backward)
If the questionnaire allows going backward, or going forward by skipping questions, unusual movements are a symptom of issues with the questionnaire or the respondent.
Scrolling
The amount of scrolling depends on the screen size of the device used and on the size of the browser window used by the respondent.
22. Time latency paradata
Time spent per question/screen
Time latency is the most published topic in paradata research.
Studies focus on these major themes:
• Attitude strength
• Response uncertainty
• Question wording
• Response error (e.g. speeding)
• Satisficing / Optimizing
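One common use of time latency paradata is flagging possible speeders. The Python sketch below assumes a hypothetical mapping from respondent id to seconds spent on a screen and flags anyone faster than a fraction of the median time. The 0.3 fraction is an arbitrary illustrative cutoff, not a recommendation from the talk or the literature it cites.

```python
from statistics import median

def flag_speeders(seconds_by_respondent, fraction=0.3):
    """Return ids of respondents faster than `fraction` times the median time."""
    cutoff = fraction * median(seconds_by_respondent.values())
    return sorted(r for r, s in seconds_by_respondent.items() if s < cutoff)

# Hypothetical per-screen timings in seconds.
times = {"r1": 12.0, "r2": 2.0, "r3": 10.0, "r4": 11.0, "r5": 9.5}
speeders = flag_speeders(times)
print(speeders)
```

A relative cutoff like this adapts to question length, but a flagged case is only a prompt for closer inspection, since fast answers can also reflect strong attitudes or familiarity with the topic.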
23. Order of response categories: positive vs. negative orientation
POSITIVE
How accessible have your instructors been both in and outside of class?
Very accessible
Somewhat accessible
Neutral
Somewhat inaccessible
Very inaccessible
Don’t know
NEGATIVE
How accessible have your instructors been both in and outside of class?
Very inaccessible
Somewhat inaccessible
Neutral
Somewhat accessible
Very accessible
Don’t know
Christian, Parsons & Dillman (2009)
24. Positive vs. negative orientation
[Figure: results in % (axis 0 to 50), positive order vs. negative order.]
Christian, Parsons & Dillman (2009)
25. Positive vs. negative orientation
[Figure: time spent answering the question (axis 0 to 2.4), positive order vs. negative order.]
Christian, Parsons & Dillman (2009)
26. Privacy and ethical issues in collecting paradata
Should we tell respondents we are collecting paradata?
What happens when we tell respondents we are collecting paradata and ask permission to use them?
• 59.5% agreed in the LISS Dutch panel (across experimental manipulations)
• 65.6% agreed in the Knowledge Networks U.S. panel (across experimental manipulations)
• 69.3% agreed in a U.S. volunteer non-probability panel (across experimental manipulations)
(Couper and Singer, 2013; studies done using vignettes)
28. Conclusions on paradata
• The amount of paradata that can be collected grows as technological capabilities grow
• Although paradata can be collected “easily” and at a low cost, we should not underestimate the cost of managing and analysing paradata (Nicolaas, 2011)
• Paradata should not replace other ways of pretesting the questionnaire because it does not answer all the research questions
• Paradata analysis is another tool to use in assessing the quality of a survey and in making improvements to the questionnaire and the entire online survey experience
29. References on Paradata for web surveys
Callegaro, M. (2013). Paradata in web surveys (Chapter 11). In F. Kreuter (Ed.), Improving surveys with paradata: Analytic use of process information (pp. 261–279). Hoboken, NJ: Wiley. PDF available at http://research.google.com/pubs/MarioCallegaro.html
Callegaro, Lozar Manfreda & Vehovar (2015). Web survey methodology. London: Sage.