This document summarizes a presentation about establishing an ethics framework for predictive analytics using student data in higher education. It discusses how technology has enabled more data collection and predictive modeling of student behavior. However, few guidelines exist for these practices. The presentation advocates developing an ethics framework that safeguards student privacy, promotes transparency, considers unintended consequences, and involves consultation. It also examines existing principles and discusses challenges like opaque predictive models that work against students' interests. The presenter argues universities should internalize norms of respecting trust and serving students, not just avoiding legal issues.
Cyber Summit 2016: Establishing an Ethics Framework for Predictive Analytics in Higher Education
1. Establishing an Ethics
Framework for Predictive
Analytics in Higher Education
Cyber Summit 2016, Banff
Stephen Childs, Institutional Analyst
October 27, 2016
2. Disclaimer
The content of this presentation represents my views only,
and not those of my employer, the University of Calgary.
I am not qualified to accurately describe University of Calgary
policy in the areas discussed in this talk.
Please contact the University if you have policy questions.
4. Big Data, Big Problems
Advancing technology
—Better data collection
—Handle more data
—Apply algorithms to data
We know more about our students
Can make predictions about their behavior
Very few guidelines about this practice
5. Solutions
Develop an ethics framework around student data.
Build on existing guidelines.
Build on the norms of service to students
Do this now while these practices are new.
12. Student Data
Application
Student Information System
LMS
Unicard
Surveys
Residence
Facilities
Awarding Degrees
Grades
USRI
IT usage
Others…
13. Student Data
Students can opt out of some data collection, but not all
Students give us their data because they trust us
We need to deserve that trust!
—Respect student privacy
—Transparency about how data is used
—Accountability
—Consultation
—Consider the Consequences
16. Transparency and Accountability
Internalizing norms is not enough!
How Universities use data should be known
—We aren’t corporations with competitive secrets
—We need to set up ways to report and share
We need to be able to describe what happened!
Log events
Version control your software
Develop reporting methods
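As a minimal sketch of the "log events" and "version control" points above: each use of student data could be written to an audit log as one structured line that also records which version of the model code was run. The function names, field names, and logger name below are hypothetical and chosen only for illustration; they are not taken from the talk.

```python
import json
import logging
from datetime import datetime, timezone

# Hypothetical logger name; attach a handler to send it to a file or service.
audit_log = logging.getLogger("student_data_audit")


def make_audit_record(user, action, dataset, model_version, when=None):
    """Build one JSON line describing who did what, to which data, with which code."""
    when = when or datetime.now(timezone.utc)
    record = {
        "timestamp": when.isoformat(),
        "user": user,
        "action": action,
        "dataset": dataset,
        # e.g. a version-control commit hash, so a result can be reproduced later
        "model_version": model_version,
    }
    return json.dumps(record, sort_keys=True)


def log_event(user, action, dataset, model_version):
    """Write the audit record to the log and return it for reporting."""
    line = make_audit_record(user, action, dataset, model_version)
    audit_log.info(line)
    return line
```

Because each line is plain JSON, the same log can feed whatever reporting method the institution settles on.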
20. Best Practices using Predictive Analytics
Have to carefully present information to students
—Present a positive outlook
—Don’t personalize it – talk about a group of similar
students.
The factors in the model may be less deterministic than
unobserved factors.
Difference between causality and correlation.
Beware the self-fulfilling prophecy
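One way to follow the "don't personalize it" advice is to report the outcome rate for a group of similar students rather than a score for one individual. The sketch below assumes records are plain dictionaries; the keys, function names, and message wording are hypothetical.

```python
def group_outcome_rates(records, group_key, outcome_key):
    """Return {group: share of students in that group with the outcome}."""
    totals = {}
    for row in records:
        count, hits = totals.get(row[group_key], (0, 0))
        totals[row[group_key]] = (count + 1, hits + (1 if row[outcome_key] else 0))
    return {group: hits / count for group, (count, hits) in totals.items()}


def group_message(group, rate):
    """Phrase the result positively and about the group, not the individual."""
    return (
        f"{rate:.0%} of students with a similar profile ({group}) "
        "completed their first year."
    )
```

A rate like this is a correlation observed in past cohorts, not a cause, so it says nothing certain about any one student.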
22. Weapons of Math Destruction
Three factors make a model a WMD:
—Is the participant aware of the model? Is the model
opaque or invisible?
—Does the model work against the participant’s interest? Is
it unfair? Does it create feedback loops?
—Can the model scale?
23. Student Data Principles
http://studentdataprinciples.org/
Purpose and use of student data
Timely access to data
Data should not replace professional judgement.
Data governance, security, breach notification
24. Student Data Pledge
http://www.edtechmagazine.com/k12/article/2015/03/protect-personal-student-information-pair-organizations-recommends-commitment
Don’t sell student data, use data to target ads, or profile
students for non-educational purposes
Don’t collect more information or retain information longer
than necessary.
Do disclose how, what and why
25. uCalgary Data Rules
Freedom of Information and Protection of Privacy Act (1999)
—Students must be able to correct own info
—University must provide own info upon confirmation of ID
Categories of Data Confidentiality
Research Ethics Boards
—Data collection for University operations does not
generally fall under REB jurisdiction.
26. Financial Modelers’ Manifesto
https://www.wilmott.com/financial-modelers-manifesto/
Emanuel Derman and Paul Wilmott – January 7, 2009
The Modelers’ Hippocratic Oath
— I will remember that I didn’t make the world, and it doesn’t satisfy my
equations.
— Though I will use models boldly to estimate value, I will not be overly
impressed by mathematics.
— I will never sacrifice reality for elegance without explaining why I have
done so.
— Nor will I give the people who use my model false comfort about its
accuracy. Instead, I will make explicit its assumptions and oversights.
— I understand that my work may have enormous effects on society and
the economy, many of them beyond my comprehension.
27. Responsible Use of Student Data in Higher Education
http://gsd.su.domains/
Opportunity to understand student learning and enhance
educational attainment.
New questions about the ethical collection, use, and sharing
of information.
Commitments to honor the integrity, discretion, and humanity
of students.
Improve practice in light of accumulating information and
knowledge.
28. Maciej Cegłowski
https://pinboard.in and @pinboard
http://idlewords.com/talks/
Two talks on Data in particular:
—http://idlewords.com/talks/deep_fried_data.htm
—http://idlewords.com/talks/haunted_by_data.htm
29. Basic Framework
Safeguard Student Privacy
—Vendors; Monetizing Data
Strong internal norms around data
Consider and Measure Outcomes
Work with Data Owners and Stewards
Responsibility to Educate
Consult with Students and Stakeholders
Data should have a clear purpose
30. Next Steps
Write down your norms/expectations for working with
Student data
Set up a discussion with your co-workers about it.
Seek out others who perform a similar role and discuss it.
Discuss with the Student Data Steward at your institution.
Send me your comments!
31. Continue the Conversation
Follow me on twitter: @sechilds
Stephen.Childs@ucalgary.ca or sechilds@gmail.com
https://oia.ucalgary.ca/Contact
https://www.meetup.com/PyData-Calgary/
Editor's Notes
Higher Education Institutions have a lot of data.
We have a lot of data relating to our students.
Asymmetric power and information relationship
Research Ethics Boards – don’t typically cover this kind of work.
How do we know if our models are BAD and if they are harming students?
Talk Objectives:
Understand why responsible use of student data matters
Give you resources and language to talk about it
Start a conversation about student data
Analyst and Researcher - MA Economics from WLU
Handle Student Data and Build models of student behavior
Parts of my job are becoming easier
Better data science tools – mention PyData Calgary
Not much guidance
Office of Institutional Analysis
Reports to Vice-Provost (Planning & Resource Allocation)
…Who reports to the Provost & VP (Academic)
…Who reports to the President
We don’t own the institution’s data – but we become experts in it. (Particularly student data)
Create reports for: SLT, Internal stakeholders, Government, Public
Fact Book, USRI (instruction ratings), University Rankings, U-15
Institutional research is great – work with data, wide variety of tasks
50th Anniversary Year
Eyes High
Energizing Eyes High
Better serve students and community
Central focus of the University
Attend for various reasons: job, skills, education -- University is one of the best paths to the life they want. Students pay the University to act as a gatekeeper
Demographic changes, Different support needs
Asymmetrical Power and Information
Enter the system as soon as they apply
Privacy is just the starting point
Employees should only have access to data they need
Security – need to mention @SwiftOnSecurity
De-Identify Data where necessary
Access to data is a hard problem
Too restrictive and nothing gets done
Too open and problems result
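A minimal sketch of the de-identification note above, assuming records are dictionaries with a `student_id` field: direct identifiers are dropped and the ID is replaced with a keyed hash, so analysts can still link records without seeing the real ID. The field names and function names are hypothetical, and this is pseudonymization only; it does not by itself prevent re-identification from the remaining fields.

```python
import hashlib
import hmac


def pseudonymize(student_id: str, secret_key: bytes) -> str:
    """Replace an ID with a keyed HMAC-SHA256 digest (stable for a given key)."""
    return hmac.new(secret_key, student_id.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(record: dict, secret_key: bytes, drop=("name", "email")) -> dict:
    """Return a copy without direct identifiers and with a pseudonymous ID."""
    out = {k: v for k, v in record.items() if k not in drop}
    out["student_id"] = pseudonymize(str(record["student_id"]), secret_key)
    return out
```

The key would be held by the data steward, not the analysts, which is one way to give employees only the access they need.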
Employees need to internalize norms about data
Highly specific to institutional culture
Need employees to use their own judgment
Version control - Software Carpentry
Are we hearing the students about how we use their data?
Do they care about this?
How can we consult with students and get a useful result?
Worth reading - but don't watch the TV movie.
Technology used in the business world
Statistics, Econometrics, Machine Learning
Universities have experts on all three.
Expertise starting to move into admin
Bad use of aggregate statistics can lead to bad policy.
Bad use of individual data can lead to bad individual outcomes
Individual outcomes can be systematically bad
https://ischool.syr.edu/infospace/2013/11/13/using-predictive-analytics-to-understand-your-business/
https://ischool.syr.edu/infospace/wp-content/files/2013/12/crystal-ball-e1385997891512.jpg
Talk about examples of WMD at this point: prison recidivism models used in sentencing,