Most data science approaches to studying the needs of platform workers centre on mining social media. In this brief presentation at an Alan Turing Institute workshop, I argue that epidemiological data sets and large social science surveys can shed light on aspects of platform workers' experience that are not disclosed on public forums.
Epidemiology versus Data Collection Bias - Studying the Needs of Platform Workers
1. Workshop: Using Data Science to Study Platform Workers
Epidemiology vs Data Collection Bias
Maria Wolters
Reader in Design Informatics
University of Edinburgh /
Turing
@mariawolters
2. Will the Workers Tell You Who They Are?
Any textual information on forums and social media that
is sufficiently public for researchers to scrape
can be seen by agents who can use it against workers.
We need to supplement textual data with survey data
to understand workers’ situation and health needs
quantitatively.
3. Epidemiology
❖ Who is getting ill, where, and when?
❖ incidence: number of new cases
❖ prevalence: number of cases in the population at any
one time
❖ What are the symptoms?
❖ validity: do our instruments accurately detect
symptoms?
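The definitions above can be sketched as small Python helpers. This is a minimal illustration, not a full epidemiological toolkit: the function names are hypothetical, and reading "validity" as sensitivity and specificity against a reference diagnosis is an assumption, since it is only one common way of quantifying how well an instrument detects symptoms.

```python
def incidence_rate(new_cases: int, person_years: float, per: int = 100_000) -> float:
    """New cases per `per` person-years at risk."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return new_cases * per / person_years


def prevalence(existing_cases: int, population: int, per: int = 1_000) -> float:
    """Cases present in the population at one time, per `per` people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return existing_cases * per / population


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of people who have the symptom and are detected by the instrument."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of people without the symptom whom the instrument correctly clears."""
    return true_neg / (true_neg + false_pos)
```

With made-up numbers, `incidence_rate(64, 200_000)` gives 32 new cases per 100,000 person-years and `prevalence(8, 2_000)` gives 4 cases per 1,000 people.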
4. Example: Psychoses
❖ Who gets ill, when, and where?
❖ In England, 32 new cases of psychotic disorders per 100,000 people aged
16-64 per year.
❖ More common in Black and minority ethnic people, and (before age 45) in men.
❖ 4 in 1,000 people have, or have had, a psychotic episode in a given year
❖ What are the symptoms? What is the illness?
❖ Assessed by standardised research criteria, e.g. Schedules for Clinical
Assessment in Neuropsychiatry [SCAN]
❖ Only non-organic psychoses, i.e. psychoses that are not induced by physical
health problems, are included
Kirkbride, J. B., Errazuriz, A., Croudace, T. J., Morgan, C., Jackson, D., Boydell, J., … Jones, P. B. (2012). Incidence of Schizophrenia and Other Psychoses in England, 1950–2009: A Systematic Review and Meta-Analyses. PLoS ONE, 7(3), e31660. http://doi.org/10.1371/journal.pone.0031660
Kirkbride, J. B., Errazuriz, A., Croudace, T. J., Morgan, C., Jackson, D., et al. (in press). Systematic Review of the Incidence and Prevalence of Schizophrenia and Other Psychoses in England, 1950–2009: Executive Summary. London: Department of Health Policy Research Programme.
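To make the rates on this slide concrete, the sketch below applies them to a workforce of a given size. The workforce figure of 50,000 is purely hypothetical, and the calculation assumes that platform workers mirror the general population of England aged 16-64, which may well not hold.

```python
# Rates quoted on the slide (Kirkbride et al., 2012).
INCIDENCE_PER_100K = 32        # new cases per 100,000 people aged 16-64 per year
ANNUAL_PREVALENCE_PER_1K = 4   # people per 1,000 with a psychotic episode in a year


def expected_cases(population: int, rate: float, per: int) -> float:
    """Expected number of cases if `rate` per `per` people applies to `population`."""
    return population * rate / per


# Hypothetical workforce size, chosen only for illustration.
workforce = 50_000

new_cases_per_year = expected_cases(workforce, INCIDENCE_PER_100K, 100_000)
affected_in_year = expected_cases(workforce, ANNUAL_PREVALENCE_PER_1K, 1_000)

print(f"Expected new cases per year: {new_cases_per_year:.0f}")
print(f"Expected people affected in a year: {affected_in_year:.0f}")
```

Under these assumptions, a workforce of 50,000 would include roughly 16 new cases per year and about 200 people affected in a given year, few of whom are likely to say so on a public forum.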
5. Disclosure Advice
“In assessing whom to tell, there are a number of useful questions
that you can ask beforehand about the person, such as
• Are they likely to be sympathetic or hostile?
• Will they be supportive in the future?
• Are they likely to talk to anyone else about it?
• Are they likely to use the information against you?
You may also consider a couple of important issues, such as:
• Is the person likely to find out anyway?
• If you do not tell them will they be able to trust you on other things?
• Will not telling them make it more difficult to relate to them
in the future?”
https://www.livingwithschizophreniauk.org/advice-sheets/disclosure-telling-other-people-about-your-schizophrenia-2/
6. So, What Struggles Will They Tell Us About?
❖ common, non-stigmatised experiences vs. experiences
that are due to a stigmatised condition
❖ stigma varies depending on the platform where issues
are discussed
❖ Forum for people with schizophrenia vs. Quora or
Twitter
7. What Can We Use?
❖ Epidemiological surveys
❖ Social science surveys
❖ Longitudinal studies
… and of course the published literature …
8. Disadvantages
❖ very high-level metrics
❖ no qualitative data
❖ do not tell us whether respondents actually are platform
workers
9. Advantages
❖ Well documented
❖ More likely to accurately assess prevalence
❖ Properly anonymised, privacy protection
❖ Quite a few surveys include economic, social, and
health data
10. Where Can We Get Surveys?
The UK Data Archive is a wonderfully rich source of data
❖ English Longitudinal Study of Ageing
❖ Health Surveys
❖ Understanding Society
11. How Can We Benefit from Surveys?
❖ Define search queries that surface relevant struggles
❖ Develop a critical perspective on textual analysis
findings
❖ Define sampling strategy to understand different types
of workers
❖ Systematically investigate barriers to inclusion
Wolters, M. K., Hanson, V. L., & Moore, J. D. (2011). Leveraging large data sets for user requirements analysis. In Proceedings of the 13th International ACM Conference on Computers and Accessibility (pp. 67-74). ACM.
Inclusive Design Toolkit, University of Cambridge
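One way to develop a critical perspective on textual analysis findings is to compare how often a condition is disclosed in scraped text with how often survey prevalence suggests it should occur. The sketch below is a rough illustration only: the function names and all numbers are hypothetical, and it assumes the forum authors resemble the surveyed population.

```python
def expected_disclosures(n_authors: int, prevalence_per_1k: float) -> float:
    """Authors we would expect to have the condition, given survey prevalence."""
    return n_authors * prevalence_per_1k / 1_000


def disclosure_ratio(observed: int, n_authors: int, prevalence_per_1k: float) -> float:
    """Observed disclosures divided by the number expected from survey prevalence."""
    expected = expected_disclosures(n_authors, prevalence_per_1k)
    if expected <= 0:
        raise ValueError("expected number of cases must be positive")
    return observed / expected


# Hypothetical scrape: 10,000 distinct authors, 6 of whom mention the condition,
# compared against a survey prevalence of 4 per 1,000.
ratio = disclosure_ratio(observed=6, n_authors=10_000, prevalence_per_1k=4)
print(f"Observed / expected disclosures: {ratio:.2f}")
```

A ratio well below 1 is consistent with under-disclosure, but it could also reflect differences between forum users and survey respondents, or mentions that the text analysis failed to detect.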
12. Summary
❖ Large-scale surveys of prevalence and incidence of
health conditions can help highlight aspects of platform
workers’ lives that they won’t tell us about on social
media
❖ Check your bias - disclosure and stigma
❖ Contact: Maria Wolters maria.wolters@ed.ac.uk
@mariawolters