Louise Bezuidenhout, Institute for Science, Innovation and Society, Oxford:
Projects such as the CODATA-RDA School for Research Data Science highlight the need for building capacity in research data skills around the world. Indeed, without these key skills it is likely that many disciplines and communities will continue to miss out on the benefits of a growing pool of open data resources online. Educating researchers in data skills is thus fundamental to maximizing the benefits of Open Science, but it is also an opportunity to shape the future by educating for responsible data science.
This talk will examine the ethics/Open Science component of the CODATA-RDA school and highlight how the commitment to responsible research underpins all areas of instruction. It will also discuss some of the difficulties of educating for data ethics and responsible practice in a field that is multi-disciplinary and multi-national. Finally, the talk will cover the practice-oriented, modular approach to ethics that has been developed in the CODATA-RDA school to specifically address these challenges.
Louise Bezuidenhout - OpenCon Oxford, 1st Dec 2017
1. Embedding Openness in Practice
Lessons from the CODATA/RDA School for Research Data Science
LOUISE BEZUIDENHOUT
INSTITUTE FOR SCIENCE, INNOVATION AND SOCIETY, UNIVERSITY OF OXFORD
2. Educating for Responsible Data Scientists
• Evolution of data-centric science needs specialist data scientists
• Competence in discipline (type of data) and meta-discipline (tools for data usage)
• Key understandings of tools and structures supporting data-centric sciences
• Responsible data scientists thus:
• Understand ethical issues relating to their discipline
• Scrutinize the development of data infrastructures
• Highlight ethical issues with the application of data tools
Key for the future of the Open Science movement
Monitor potential injustice in the evolution of an open science landscape
3. The Challenge …
1. Deciding on content
• Little consensus on what an “ethics of data science” is
• Need to make content relevant to scientists from a wide range of disciplinary backgrounds
2. The challenges of teaching ethics
• Aiming for awareness, consensus, or internalization
• Translating ethics teaching into in situ daily research practices
3. CODATA/RDA SRDS-specific
• Multidisciplinary – different ethical concerns
• Many data types and sources
• Different cultural and legal backgrounds
4. The Challenge …
4. Attitudes to ethics:
• Ethics happens once during an REC review
• I don’t need ethics – I don’t work with humans/animals
• I didn’t collect the data, so ethics is not my problem
• I’m not one of the bad guys …
• Ethics is what other people worry about
• Making the transition from “it’s a nice idea” to “I can see how it works in practice”
5. The Aim …
• To make ethics an integral and value-adding component of the CODATA/RDA SRDS
• To make students aware of the key concepts driving Open Data and Responsible Research and Innovation (RRI) movements
• To initiate discussion about responsibilities
• To enable students to make the transition from openness in theory to openness in practice
• To encourage students to integrate openness into all aspects of their research
6. Stopping the Compartmentalization of Openness and RRI
Data management
Online presence
Responsible/ethical research
• Hands on – practical
• Bottom-up ethics
• Avoid “stand-alone” courses
7. Teaching an Ethics of Openness
Practice-Oriented Data Ethics
Lecture on Open Science
Lecture on using the RRI toolkit
Exercises associating ethics with learnt tools
• Evolution of OS movement
• Benefits of OS
• Key ethical concepts:
• justice, responsibility, beneficence
• How does ethics fit into the broader scheme of RRI?
• Ethics, gender equality, governance, OA, public engagement, science education
• What does an RRI research programme look like?
• Introduce rri-tools.eu
What ethical conundrums come up that are relevant to ALL data scientists?
9. R question 2
The Association for Computing Machinery Code of Conduct details a number of ethical duties that professionals have with regard to the public. Please choose the three you think are most important.
http://www.acm.org/about/se-code#full
11. Lessons From CODATA/RDA SRDS
• Modular works well
• Ethics “prompts” associated with modular skill teaching integrated ethics into daily research activities
• Important to follow up theoretical ethics lectures with practical tasks: students need to see how key concepts of openness translate into ALL aspects of daily research
• Transitioning from theory to practice is scary
• RRI toolkit enabled students to think beyond “retro-fitting” openness to projects
• Need to assist students to see how ethics, regulations, and expectations impact on daily research practices
• Eliminating the “it’s not me” in ethics discussions
• Stop students from thinking that ethics doesn’t apply to them (didn’t create data, not human data, no animal work, etc.)
• Expand horizons: drill down to the ethical implications of the “nitty gritty” of daily research
• Ethical research is something anyone can do
• Highlight flexibility, contextuality, diversity: ethics is not something that is “set in stone”
• Foster enthusiasm: students are more receptive when they feel they can contribute
• As data experts, students need to recognize they are in the best position to safeguard science
12. Thank you
Special thanks to:
• Sarah Jones (HATII) and Gail Clement (Caltech)
• Hugh Shanahan and Rob Quick
• CODATA and RDA
Editor's notes
As Hugh has been talking about, the way that science is being conducted is changing rapidly.
Our ability to generate and process data is increasing exponentially, and the advent of fields such as genomics has given rise to data-centric science.
All this has led to a critical need for specialist data scientists to navigate this increasingly complicated landscape
Being a data scientist is challenging, however, and usually requires competence both in a discipline (to understand the types of data and analyses necessary) and in the meta-discipline of data science (so as to have expertise in the tools for data usage)
Data scientists thus represent a key element not only of future science, but particularly the future of responsible science
In addition to understanding and addressing the ethical issues relating to their discipline and discipline-specific data, data scientists are able, through their expertise in tools, to scrutinize the development of data infrastructures and to highlight ethical issues with the application of data tools
They are therefore key stakeholders in the future of the Open Science movement, because they can monitor potential injustice in the evolution of an open science landscape: how we design data sharing platforms, what algorithms are used to process data, and how some data are selected preferentially over others
So, while data scientists play a key role in the future of Open and responsible Science, educating them about their role is not straightforward
Teaching data scientists ethics has a number of different problems:
1. Deciding on content
Little consensus on what an “ethics of data science” is – is it just the normal issues relating to data sharing, or does it include the infraethics of infrastructures?
How can content be both specific to disciplinary backgrounds and also broadly pertinent?
2. The challenges of teaching ethics
First and foremost is the perennial question of what is aimed for in ethics teaching – are you just aiming for awareness of key issues, do you expect consensus, or should ethics training facilitate the internalization of key norms?
How does one help students to translate ethics teaching into in situ daily research practices – as it is often very difficult to see how abstract principles play out in daily work
3. CODATA/RDA SRDS-specific
In a way, the data school encapsulates some of these problems, as the group was both multidisciplinary and from different cultural and legal backgrounds
They were therefore working with many data types and sources, and were governed by a wide range of different legislation and expectations
A further challenge relates to general attitudes to ethics in the science community.
As we have seen an increasingly bureaucratic attitude to ethics – through RECs and legislation – students often compartmentalize ethics out of their daily research
I’ve had many conversations in which scientists will say:
Ethics happens once during an REC review
I don’t need ethics – I don’t work with humans/animals
I didn’t collect the data, so ethics is not my problem
I’m not one of the bad guys …
Ethics is what other people worry about
So, the challenge was to assist students to make the transition from “it’s a nice idea” to “I can see how it works in practice”
Setting up the ethics component of the CODATA/RDA SRDS thus faced a number of challenges
We wanted to make ethics an integral and value-adding component of the CODATA/RDA SRDS and thus needed to navigate some of the challenges of teaching data science ethics
Importantly, we were committed to doing more than just preparing students to field REC ethics reviews. We wanted:
To make students aware of the key concepts driving Open Data and Responsible Research and Innovation (RRI) movements
To initiate discussion about responsibilities
To enable students to make the transition from openness in theory to openness in practice
To encourage students to integrate openness into all aspects of their research
3 interlinking areas
We tried very hard to make sure that all areas were hands on – practical
This gave a kind of bottom-up ethics where responsible/ethical practice was closely linked to the tools that the students were learning for use in daily research
All three areas were driven by the ethical principles underpinning the Open Science movement – openness, responsibility, beneficence and justice
The first was openness and data management
FAIR
Practical tools
The second was managing openness and online presence
Author carpentry
To track and monitor their work online, but also to ensure that it had the maximum impact
The third was responsible/ethical research
In teaching the ethics component of the course we made a couple of strategic decisions
First, due to the wide range of disciplinary backgrounds, we did not engage specifically in the more traditional aspects of data sharing such as privacy and ownership
We felt that these would be covered in disciplinary-specific discussions
Instead, we focused on the ethics of the tools of data science
Specifically relating to the tools used to build infrastructures and to design and roll-out data analysis tools
And relating to the nitty gritty of coding and programming
This involved a three-part approach: two more formal lectures and one integrated stream
To develop the exercises associating ethics with learnt tools we decided to exploit the modular teaching format of the school
As Hugh explained, the school was constructed using the data carpentry structure of modular and discrete teaching packages
In each module students were introduced to a specific tool for data science – such as
As the use of each tool raises ethical questions, it was possible to construct a series of small ethics exercises that were associated with each module
Thus, for . . . we asked the question . . .
This enabled students to see directly how ethics operates at the micro level in research, and how the use of research tools requires a commitment to responsibility
Some made use of online surveying tools
Some involved mind-mapping, and others involved open questions
The results of each exercise were put up on boards for the class to look at
Good responses – interest in openness and ethics in daily research