Katie Sproule and Chiara Kovarik (IFPRI) present on using cognitive testing and vignettes in the Women's Empowerment in Agriculture Index (WEAI), for a Gender Methods Seminar, Dec. 12, 2014.
Audio recording of the presentation available here: http://bit.ly/1zG14dI
Cognitive testing and vignettes in the WEAI - IFPRI Gender Methods Seminar
1. Lessons from WEAI fieldwork in Uganda and Bangladesh
Presented by Katie Sproule & Chiara Kovarik (Senior Research Assistants, PHND)
Gender Methods Seminar, IFPRI
December 12, 2014
2. Why use cognitive testing and vignettes in the WEAI?
Introduction to cognitive testing
Introduction to vignettes
Applying these tools to the WEAI
The fieldwork
The results
Lessons learned
3. • The WEAI was developed by IFPRI, OPHI,
and USAID in 2012 to measure women’s
levels of empowerment and inclusion in
the agricultural sector
• It was initially designed to be a monitoring
and evaluation tool for USAID’s Feed the
Future (FTF) programming in the 19 FTF
countries
• It is composed of 2 sub-indexes: the five
domains of empowerment (5DE) and the
Gender Parity Index (GPI)
• The 5 domains are: Production, Resources,
Income, Leadership, and Time
4. After the 2012-2013 baselines, it became obvious that the WEAI
needed to undergo some revisions and streamlining
Key indicators were identified as problematic
Decision was made to develop a second version of the WEAI
Cognitive testing was conducted to ensure that the questions
were capturing the various dimensions of empowerment and also
to ensure that the index remained standardized across countries
Vignettes were included to see if they would be a better way of
getting at issues of autonomy in decision-making
5. • Cognitive testing is a qualitative method that is paired with a (quantitative) survey
• The purpose of cognitive testing is to systematically identify and analyze sources of
response error in surveys, and to use that information to improve the quality and
accuracy of survey instruments (Johnson, 2013)
• Cognitive testing can be especially important for new/revised instruments, or those
that will be used in multiple country contexts (Johnson, 2013)
• Generally conducted as a pre-test before full field work begins
• Cognitive testing helps identify the stage in the cognitive process where response
error occurs
6. Cognitive stages (Source: Johnson, 2013)
1. Comprehension
Definition: Respondent interprets the question
Problem: Respondent does not understand
Causes: Unknown terms, ambiguous concepts, long and overly complex
2. Retrieval
Definition: Respondent searches memory for relevant information
Problem: Respondent does not remember/does not know
Causes: Recall difficulty, questions assume respondent has information
3. Judgment
Definition: Respondent evaluates and/or estimates response
Problem: Respondent does not want to tell, can’t tell
Causes: Biased or sensitive, estimation difficulty
4. Response
Definition: Respondent provides information in the format requested
Problem: Respondent can’t respond in the format requested
Causes: Incomplete response options, multiple responses necessary
Breakdown can occur in ANY of the four stages
7. How satisfied are you with your available time for leisure activities? Please give your
opinion on a scale of 1 to 10. 1 means you are not satisfied and 10 means you are very satisfied. If you
are neither satisfied nor dissatisfied this would be in the middle or 5 on the scale.
Breakdown in comprehension:
Respondent may not
understand the concept of
“leisure”, or may understand
it differently from the
researcher
The concept of
“satisfaction” is ambiguous
and subjective
Breakdown in response:
Respondent may have never
answered a question in this
format. While questions with
ranking scales are familiar to
Western audiences, they may
not be to everyone
Response error!!
8. Back to our example question: How satisfied are you with your leisure time? Please
rank on a scale of 1-10, with 1 being completely unsatisfied and 10 being
completely satisfied
Some follow up cognitive testing questions might be:
1. Can you tell me in your own words what “leisure” means?
2. What does it mean to you to be “satisfied”?
3. What recall period did you use in your response? Were you thinking about your
leisure time in the past week? The past month?
4. Did you find this question difficult? If so, why?
5. Do you think others would find this question difficult? If so, why?
10. Conduct surveys with 10-15 respondents per language group
Sampling should be done to maximize variance among respondents
At least two rounds of cognitive testing should be conducted
Enumerators need to be appropriately trained in cognitive interviewing
Audio-record the interviews
2 enumerators should be present for each individual interview
There is a large degree of flexibility in designing a cognitive test, which will depend
on the survey and the context of the testing (structured script vs. fully or partially
improvised; concurrent vs. retrospective; think-aloud vs. probing)
11. What are vignettes?
• Research method where respondents respond to a set of stories describing different
scenarios related to the topic for a hypothetical person/household
• The vignette provides enough context and information for participants to have an
understanding of the scenario being depicted, but needs to be vague in ways that
compel participants to ‘fill in’ detail
• Reveals perceptions and values, as well as social norms in the community
• Allows researchers to get at topics that might otherwise be challenging to ask about
• Can be used as an ice breaker, a way to close the interview, a stand-alone technique
or part of a multi-method approach
12. ENUMERATOR: This set of questions is very important. I am going to
give you some reasons why you act as you do in the aspects of
household life I just mentioned. You might have several reasons for
doing what you do and there is no right or wrong answer. Please tell
me how true it would be to say:
[If household does not engage in that particular activity, enter 98 and
proceed to next activity.]
G5.03: My actions in [ASPECT] are partly because I will get in trouble with someone if I act differently.
[READ OPTIONS: Always True, Somewhat True, Not Very True, or Never True]
G5.04: Regarding [ASPECT] I do what I do so others don’t think poorly of me.
[READ OPTIONS: Always True, Somewhat True, Not Very True, or Never True]
G5.05: Regarding [ASPECT] I do what I do because I personally think it is the right thing to do.
[READ OPTIONS: Always True, Somewhat True, Not Very True, or Never True]
A Getting inputs for agricultural production
B The types of crops to grow for agricultural production
C Taking crops to the market (or not)
D Livestock raising
G5.03/G5.04/G5.05: Motivation for activity
Never true …………………………………..1
Not very true …………………………………..2
Somewhat true …………………………………..3
Always true …………………………………..4
Household does not engage in activity/Decision not made……………98
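As a rough illustration of how the codes above might be handled at data entry, here is a minimal Python sketch. The code values (1-4 and 98) come from the slide; the function names `code_response` and `valid_scale_values` and the dictionary name are hypothetical and are not part of the WEAI instrument or any WEAI software.

```python
# Response codes as listed on the slide for G5.03/G5.04/G5.05.
MOTIVATION_CODES = {
    "never true": 1,
    "not very true": 2,
    "somewhat true": 3,
    "always true": 4,
}
# Skip code: household does not engage in activity / decision not made.
NOT_APPLICABLE = 98


def code_response(answer, engages_in_activity=True):
    """Return the numeric code for a spoken answer (hypothetical helper)."""
    if not engages_in_activity:
        return NOT_APPLICABLE
    return MOTIVATION_CODES[answer.strip().lower()]


def valid_scale_values(codes):
    """Drop the 98 skip code so it is never treated as a point on the 1-4 scale."""
    return [c for c in codes if c != NOT_APPLICABLE]


# One respondent's answers to G5.03 across activities A-D (made-up example values).
answers = [
    code_response("Always True"),
    code_response("Not very true"),
    code_response("", engages_in_activity=False),
    code_response("Somewhat true"),
]
print(answers)
print(valid_scale_values(answers))
```

Keeping the skip code separate from the scale values matters because 98 is a flag, not a level of agreement; averaging it with the 1-4 responses would distort any summary.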
13. “Now I am going to read you some stories about different farmers and their situations regarding different agricultural
activities. This question format is different from the rest so take your time in answering. For each I will then ask you how
much you are like or not like each of these people. We would like to know if you are completely different from them, similar
to them or somewhere in between. There are no right or wrong answers to these questions.”
A. The types of crops to grow for agricultural production
G4.A1
Story: “[PERSON’S NAME] can’t grow other types of crops here for agricultural production. These are the only things that grow here.”
Question: To what extent does [PERSON’S NAME]’s story describe your situation?
Response:
Completely different …………………………………..1
Not very similar …………………………………..2
Quite similar …………………………………..3
Describes my situation too …………………………………..4
Don’t know …………………………………..97
G4.A2
Story: “[PERSON’S NAME] is a farmer and grows [INSERT LOCAL CROPS] because her spouse, or another person or group in her community tell her she must raise these crops. She does what they tell her to do.”
Question: Whatever crops you grow for your production, are you like [PERSON’S NAME], doing what you are told by others to do?
Response:
Completely different …………………………………..1
Not very similar …………………………………..2
Quite similar …………………………………..3
Describes my situation too …………………………………..4
Don’t know …………………………………..97
14. 1. Culturally and contextually appropriate
2. Focus
• Can make some parts more detailed or direct their attention to it (Braun & Clarke 2013)
• Vignettes should focus on “mundane occurrences” rather than disastrous events (Finch 1987, Hughes 1998)
3. Complexity
• Stay away from overly complex vignettes with too many characters (Braun & Clarke 2013)
• Ensure that the vignette is tapping a single one-dimensional concept (King 2014)
4. Ambiguity
Can intentionally make certain parts vague to explore assumptions (Braun & Clarke 2013)
5. Single vignette vs. staged vignettes
Presenting character or plot development in “stages” (Braun & Clarke 2013)
6. Number
Generally use 5-7 vignettes per concept to be measured (King 2014)
15. Example storyline on bargaining power: “Hope is a cassava farmer in a nearby
village. She has her own small plot that she works on, though her husband owns it.
When it comes time to bring her cassava to market, her husband demands that she give
him at least 80 percent of whatever she earns.”
Example of a staged vignette:
First, have the respondent answer a question relating to the first stage of the story (e.g.,
“What should Hope do?”)
Then, build off the first stage of the story: “Hope decides to give her husband half of what she
earns and keep the other half for herself. After some time, her husband finds out that she has
been keeping half of the money for her own purposes and he becomes angry. He threatens to
beat Hope and kick her out of their home. What should Hope do in this situation?”
16. “Puja wants to visit her parents, who live in another village 20 kilometers away, over a road
that is potholed and hard to travel, especially in the rainy season. She wants to go to the
village to care for her elderly mother, Sadia, and to bring her maize and sweet potato to
cook. Her husband, Sumit, will only allow her to go if she has finished her housework, which
consists of cooking and washing the laundry, and if she is accompanied by a male relative.
Her brother, Hasan, will sometimes come to the village to accompany her home to their
parents. How much power does Puja have to travel when and where she wants? Response
categories: a lot; some; a little; none.”
1. Too complex! Too many unnecessary examples.
2. Too many characters – hard to keep everyone straight.
3. Not contextually appropriate – the name Puja indicates a South Asian context, so
maize and sweet potato are not appropriate crops.
17. 1. Open or close-ended question
Asking the respondent his/her thoughts on the vignette. Or giving response options (use an
even number of categories)
2. If close-ended, what kinds of response categories?
Response categories relating to the hypothetical situation vs. relating to the respondent:
Responses relate to how character in situation should or would act or relate to how
respondent should or would act if he/she were in the same situation (WEAI 2.0)
3. Anchoring questions: “a technique designed to ameliorate problems that occur when
different groups of respondents understand and use ordinal response categories” (King & Wand
2006)
4. Using “should” versus “would”: When asking about how a character might react you may
want to get at moral aspects of the situation or the pragmatic (Braun & Clarke 2013)
Enumerators need to be well trained and comfortable with the technique
19. Sites: Bangladesh & Uganda
Sample size: 120 interviews in Uganda and 70 interviews in Bangladesh
Sample composition: 2/3 women, 1/3 men; women from dual-headed (DHH) and
female-headed (FHH) households; various age ranges
Questionnaire: A series of ~100 questions was developed based on
Johnson et al.’s (2013) paper on cognitively testing the original WEAI in Haiti
Photo credit: Chiara Kovarik
20. Cognitive testing revealed issues with
the following areas:
Distinction between different concepts
Time frame and recall issues
Abstract terms or concepts
Discrepancies between identifying
something as challenging versus saying
others would find it challenging
Photo credit: Katie Sproule
21. Original survey question: “Did you yourself
participate in [ACTIVITY] in the past 12 months (that
is, during the last [one/two] cropping seasons)?”
Cognitive question: “What timeframe did you include
in your response?”
Problem: 35% of respondents in Uganda either could
not come up with the recall period used or referred to
a timeframe other than 12 months
Modified survey question: “Did you yourself
participate in [ACTIVITY] in the past 12 months (that
is, during the last [one/two] cropping seasons), from
[PRESENT MONTH] last year to [PRESENT MONTH]
this year?”
Results of modification: Timeframe recall errors
dropped to just 6% in Uganda
Photo credit: Katie Sproule
22. Original survey question: In the original WEAI, time use was collected using a 24-hour recall
module. For WEAI 2.0, a one-week recall was proposed and tested as an alternative.
Cognitive question: For each version of the module, we asked respondents, “How well do you
remember the specific activities you were doing during the past week/24 hours?” (response options:
remember very well or do not remember very well) and “In general, do your activities vary from
day to day or remain the same?”
Results:
Time Module Comparison
                            Uganda              Bangladesh
                            Round 1   Round 2   Round 1   Round 2
24-hour recall difficulty   13.5%     0%        3.2%      4.8%
7-day recall difficulty     32.7%     6.3%      12.9%     21.1%
Activities vary daily       59%       59%       50%       66%
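To make the comparison in the table easier to read, the following minimal Python sketch computes the percentage-point gap between the 7-day and 24-hour recall difficulty rates for each country and round. The rates are copied from the table; the variable and function names (`recall_difficulty`, `recall_gap`) are illustrative assumptions, not part of the presenters' analysis.

```python
# Share of respondents (percent) reporting recall difficulty, from the table above.
recall_difficulty = {
    ("Uganda", 1): {"24-hour": 13.5, "7-day": 32.7},
    ("Uganda", 2): {"24-hour": 0.0, "7-day": 6.3},
    ("Bangladesh", 1): {"24-hour": 3.2, "7-day": 12.9},
    ("Bangladesh", 2): {"24-hour": 4.8, "7-day": 21.1},
}


def recall_gap(rates):
    """Percentage-point gap between 7-day and 24-hour recall difficulty."""
    return round(rates["7-day"] - rates["24-hour"], 1)


gaps = {key: recall_gap(rates) for key, rates in recall_difficulty.items()}

for (country, round_no), gap in gaps.items():
    print(f"{country}, round {round_no}: 7-day recall harder by {gap} points")
```

In every country and round the gap is positive, meaning that more respondents reported difficulty with the 7-day recall than with the 24-hour recall.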
23. Original survey question: “Do you feel comfortable
speaking up in public about any issue that is important
to you, your family or your community?”
Cognitive questions: “Did you find this question
difficult?” “What does the word issue mean to you?”
Problem: In Uganda, the word “issue” translates to
problem or challenge and thus has a negative
connotation. Only 15.4% of respondents cited a definition for
“issue” that included both positive and negative topics.
Modified survey question: Therefore, “any issue” was
changed to read “anything”.
Results: With this small change in wording, 62.5% of
respondents gave a neutral definition of “issues” in the subsequent round of testing.
Credit: LEAD Africa
Location: Uganda
24. Original survey question: original WEAI autonomy section (as seen previously). Statements like “My actions in
[ASPECT] are partly because I will get in trouble with someone if I act differently.” Replaced with vignettes.
Cognitive question: “Did you find this question difficult?” and “Do you think others would find this question
difficult?”
Problem: Large discrepancies in percentage of respondents who found the question difficult themselves versus
how difficult they thought others would find it. In Uganda, between 7-14% said they found the questions
difficult, versus 29-60% saying they thought others would find the questions difficult. In Bangladesh, very few
respondents noted these questions as being difficult to answer but between 29-39% said they thought others
would find the question difficult.
Modification: Better training of enumerators
Results of modification: In Uganda, the rates dropped dramatically for the second round of cognitive
interviews with just one respondent (3.1%) reporting difficulty and only 3.1-12.5% of respondents saying others
would have difficulty. In Bangladesh respondents again did not find questions difficult and the number of
respondents reporting others would find it difficult dropped.
We attribute the reduction in comprehension issues in large part to greater familiarity and comfort in telling
the stories among the enumerators, which in turn translated to a clearer understanding and perceived
easiness of the questions.
25. Pros from our experience:
Vignettes are fun and new!
Cons from our experience:
Some respondents found it challenging to understand the concept of a hypothetical situation
It was challenging for both enumerators and respondents to grasp what part of the story they
were trying to relate to
Other thoughts:
Ambiguous results
Responses were often much longer and more descriptive than anticipated, even when posed in
a close-ended manner (Bangladesh). This may have been due to not understanding the
question.
Vignettes made an ideal candidate to cognitively test
What we might do differently next time
Vignettes as part of qualitative work?
26. Lessons learned and ideas for future
research
• Vignettes and cognitive testing
are not for every questionnaire
• They take extra time, resources,
and enumerator training
• Cognitive testing was valuable in that it allowed us to understand
what is wrong with a question in a very specific way, rather than just
knowing the question is poor and should be changed; it answers
how it should be changed
• Cognitive testing is not necessarily a stand alone technique; there
were areas where we are unsure what to make of the results (i.e.
effectiveness of vignettes vs traditional autonomy questions)
• While doing multiple iterations of testing may not always be feasible,
doing either a single iteration or a more extended pre-test could be
beneficial to survey designers (e.g. Haiti WEAI cognitive testing)
• It was especially important to cognitively test the WEAI, because it is
administered in 19 countries; similar testing should be considered
with other large multi-country surveys
27. Hopkins, D.J., King, G. (2010). Improving Anchoring Vignettes: Designing Surveys to
Correct Interpersonal Incomparability. Public Opinion Quarterly. pp. 1-22.
Johnson, K. (2014). “Cognitive Pretesting of Cross-nationally Comparable Survey
Instruments in a Developing Country Context Seminar.” International Food Policy
Research Institute. Washington, DC. 9 May 2014.
King, G. (2009). Anchoring Vignettes FAQs and Examples.
http://gking.harvard.edu/vign/eg/ [Accessed November 6, 2014].
Wand, J. (2007). Credible Comparisons Using Interpersonally Incomparable Data:
Ranking Self-Evaluations Relative to Anchoring Vignettes or Other Common Survey
Questions. Available at http://wand.standford.edu.
Willis, G. B. (2005). Cognitive Interviewing: A Tool for Improving Questionnaire
Design. Sage, Thousand Oaks, CA.
28. Thank you!
Any questions?
Contact Katie Sproule (k.Sproule@cgiar.org)
or Chiara Kovarik (c.kovarik@cgiar.org)
-Point out that cognitive testing and vignettes don’t necessarily go together; they just happen to be two methods used to deal with challenges that arose with the WEAI
-Point out that while this is a gender methods seminar, these methods can be widely used in surveys
After the 2012-2013 baselines, it became obvious that the WEAI needed to undergo some revisions and streamlining
Key indicators that were identified as problematic were: time use, autonomy in decision making, group membership, and miscellaneous questions on asset ownership and production decisions
Decision was made to develop a second version of the WEAI (WEAI 2.0)
Vignettes were included to see if they would be a better way of getting at issues of autonomy in decision making
Cognitive testing was conducted to ensure that the questions were capturing the various dimensions of empowerment, as intended by the research team, and also to ensure that the index remained standardized despite being implemented in various country contexts
Cognitive testing is a qualitative method that is paired with a (quantitative) survey
The purpose of cognitive testing is to systematically identify and analyze sources of response error in surveys, and to use that information to improve the quality and accuracy of survey instruments (Johnson, 2013)
Basically you’re checking to see whether the question you’re asking is generating the intended information
Cognitive testing can be especially important for new/revised instruments, or those that will be used in multiple country contexts (Johnson, 2013)
This is in theory the process someone goes through when asked a question
-You can go through this process with each survey question you identify to have one or more cognitive breakdown issues. You don’t have to ask this many questions, these are just some examples that will help you to get at whether any comprehension, retrieval, judgment or response issues exist. For the WEAI, while we tested the entire revised version of the instrument, we did have more probing questions for modules that had been identified as problematic during the first round.
-Example page from the WEAI cognitive test
-1st part is observations for enumerators to fill out on difficulties respondent had with the questionnaire module just administered
-2nd part is questions asked to the respondent about the module they just completed
The way we did it used a semi-retrospective technique, so after each module we asked the cognitive questions; the other options would be to ask the cognitive questions immediately after the survey question (we thought that would be disruptive to the interview) or you can wait until the end of the survey to do the cognitive interview, but we were afraid people would forget by that point
Other things of note here, some questions are coded responses, others are left with blank space for the enumerator to write (verbatim) what the respondent said; these comments are helpful when you are trying to figure out how to change a question (why was this question difficult? Word they didn’t understand? Too long? Etc.)
Cognitive testing should be done with 10-15 respondents per language group
Doing more than 15 interviews leads to diminishing marginal returns
Sampling should be done to maximize variance among respondents
Have young/old, men/women, educated/non-educated; you want to see if there are problems specific to any of the sub-groups within your sample
Ideally, at least two rounds of cognitive testing should be conducted
Between each round, you should make time for revisions of the instrument and clarifications to enumerators
Enumerators need to be appropriately trained in cognitive testing
Generally speaking, the more experienced the enumerator the better, especially with qualitative methods; but with good training any good enumerator can do this
If possible, audio-record the interviews
This is helpful if you need to go back to further understand the nuance of a problem; however, transcribing and analyzing this additional data obviously requires additional resources
Ideally 2 enumerators should be present for each individual interview
1 to ask the questions and the second to take notes and observe the behavior of the individual
There is a large degree of flexibility in designing a cognitive test, which will depend on the survey and the context of the testing
For instance, you can ask standardized or improvised questions; you can ask concurrently after each question, or wait until the end of the module or the end of the survey; and you can use a think-aloud technique, asking respondents to reveal their thought process in answering the question, or a probing technique, in which the enumerator guides the respondent to reveal certain information about their cognitive and response process. You can have a completely structured script, as we did, or allow the enumerators to fully or partially improvise (this requires more skill, ideally a good qualitative interviewer).
This segues nicely into talking about vignettes, which, from our experience with the WEAI, are very important to cognitively test; testing can help with refinement of this type of questioning
Vignettes can be described as carefully contrived stories about individuals and situations which make reference to important points in the study of perceptions, beliefs and attitudes (Hughes 1998)
NOTE: we did not develop these vignettes. These were suggested by a colleague at OPHI.
Can be open- or close-ended questions
Does not have to have standardized response codes – could be open-ended.
Should have an even number of response code options (if not people will tend to the middle/neutral option) (Alkire)
There are numerous ways to design an anchoring vignette (research by Hopkins and King tests these different methods)
1) self assessment, then hypothetical assessment
2) hypothetical assessment, then self assessment
3) combined, as we did: more efficient because there are fewer questions (though research should be done to test the validity against other methods)
Sample size consisted of 120 interviews in Uganda and 70 interviews in Bangladesh
Split in half for each round, then within each, about 2/3 of the interviews were with women and the other 1/3 with men; women were selected from both dual headed and female headed households and at various age ranges
A series of roughly 100 questions were developed based off Johnson et al.’s (2013) paper on cognitively testing the original WEAI in Haiti
Each cognitive question corresponded with a particular survey question
Used a semi-retrospective standardized probing technique: After each module of the WEAI 2.0, the set of cognitive questions that corresponded with those WEAI questions were asked.
Done so that WEAI questions were as fresh in respondent’s minds as possible when they answered follow-up questions
Testing concluded with a set of cognitive questions on the WEAI survey process in general
We’re going to go through each of these in more depth but broadly speaking, cognitive testing revealed issues with the following areas:
Distinction between different concepts –have versus own
Time frame and recall issues – 24 hr vs 7 day; can people recall a 1 year period
Abstract terms or concepts – issues, autonomy in decision making
Discrepancies between identifying something as challenging versus saying others would find it challenging – sometimes people would say they didn’t think the question was difficult but that others would
The original survey question in the productive decision making module asks: “Did you yourself participate in [ACTIVITY-food crop farming, livestock raising, etc] in the past 12 months (that is, during the last [one/two] cropping seasons)?”
When asked how much time respondents included in their response, answers in Uganda ranged anywhere from 3-12 months; 35% of respondents either could not come up with the recall period they used or referred to a timeframe other than 12 months.
Modification of the question to, “Did you yourself participate in [ACTIVITY] in the past 12 months (that is, during the last [one/two] cropping seasons), from [PRESENT MONTH] last year to [PRESENT MONTH] this year?”, resulted in much less recall error with only 6% of respondents in Uganda stating periods of less than one year.
Interestingly, this question was not nearly as challenging in the Bangladesh pre-test.
In the original WEAI, time use was collected using a 24-hour recall module
For WEAI 2.0, a one-week recall was proposed and tested as an alternative.
For each version of the module, we asked respondents, “How well do you remember the specific activities you were doing during the past week/24 hours?”
In Uganda, roughly 59% of respondents in both rounds said their activities change from day to day and 16.1% of these respondents cite seasonality specifically as a reason for their change in daily activity. In Bangladesh, 50% of respondents in the first pre-test and 66% of respondents in the second pre-test said their schedule changes day to day, which supports the finding in Uganda that respondents’ daily schedules often change and that developing a method for accurately capturing this variability is key.
Original survey question: We knew that the speaking in public module was sensitive in nature but thought that respondents understood the question, “Do you feel comfortable speaking up in public about any issue that is important to you, your family or your community?”
Problem: However, we learned in Uganda that the word “issue” translates to problem or challenge in local languages and thus has a negative connotation. Not surprisingly, only 15.4% of respondents cited a definition for “issue” that included both positive and negative topics.
Modified survey question: Therefore, any issue was changed to read anything.
Results: With this small change in wording, 62.5% of respondents gave a neutral definition of “issues” during the subsequent round of testing.
New autonomy section tested
Problems identified with the old autonomy section included:
A sensitive topic
Hard to get at via standardized survey questions
Many problems implementing in various countries – challenging both for enumerators and for respondents
Feedback from teams on the vignettes was mixed. The Uganda team preferred the original questions at first but then became more comfortable with the vignettes, while the Bangladesh team preferred the vignettes. However, they noted that even with the original autonomy questions they were doing some “storytelling”, so it was not that different of a technique for them (Nice thing about this is that they are standardized stories)
Problems that arose with the vignettes:
Some respondents found it challenging to understand the concept of a hypothetical situation (e.g., they said they did not know the person in the story)
It was challenging for both enumerators and respondents to grasp what part of the story they were trying to relate to.
Other thoughts:
Responses were often much longer and more descriptive than anticipated. This provided lots of interesting information, but may not be what you are looking for if you are trying to get a quick module.
Cognitive testing was, for the most part, extremely revealing and beneficial to do in the context of the WEAI
It allows you to understand what is wrong with a question in a very specific way, rather than just knowing the question is poor and should be changed; it answers how it should be changed
Cognitive testing is not necessarily a stand alone technique; there were areas where we are unsure what to make of the results; primarily the autonomy vignettes; analysis of the full data will help us to make this decision
While doing multiple iterations of testing may not always be feasible, doing either a single iteration or a more extended pre-test could be beneficial to survey designers (e.g. Haiti WEAI cognitive testing)
It was especially important to cognitively test the WEAI, because it is administered in 19 countries; similar testing should be considered with other large multi-country surveys