SlideShare a Scribd company logo
1 of 51
Download to read offline
WHAT CAN
MACHINE
LEARNING DO
FOR OPEN
EDUCATION?
Geoff Gordon
CMU Machine Learning
ggordon@cs.cmu.edu
Civilization advances by
extending the number of
important operations which
we can perform without
thinking about them.
—Alfred North Whitehead, 1911
Geoff Gordon—OCWC—April 2014
CONTRIBUTION OF ML
Machine learning can help us understand how students learn
4
Geoff Gordon—OCWC—April 2014
CONTRIBUTION OF ML
Machine learning can help us understand how students learn
‣ Not just any ML, but latent variable (“hidden feature”) discovery
4
Geoff Gordon—OCWC—April 2014
CONTRIBUTION OF ML
Machine learning can help us understand how students learn
‣ Not just any ML, but latent variable (“hidden feature”) discovery
‣ Not just any latent variables, but highly structured ones
4
Geoff Gordon—OCWC—April 2014
WHY BOTHER?
Student feedback
‣ what does the student know?
‣ what are common causes for the mistake the student just made?
Instructor feedback
‣ what do the students know?
‣ what skills does this course content address?
‣ what skills doesn’t this course content address?
Evaluation
‣ help design rubrics for (peer, instructor) grading
‣ cluster submissions by similar approach, skill level, …
Etc…
5
Geoff Gordon—OCWC—April 2014
GOAL: UNDERSTAND HOW STUDENTS
LEARN SOMETHING
6
10m
4m
5m
10m
3x + 4 = x + 10
John took Joe for a ride on his boat.
___ boat was blue with a red stripe.
A/The/[]
Geoff Gordon—OCWC—April 2014
GOAL: UNDERSTAND HOW STUDENTS
LEARN SOMETHING
6
10m
4m
5m
10m
3x + 4 = x + 10
John took Joe for a ride on his boat.
___ boat was blue with a red stripe.
70m2
x = 3
The
A/The/[]
Geoff Gordon—OCWC—April 2014
EX: GEOMETRY TUTOR
7
http://www.carnegielearning.com/
Geoff Gordon—OCWC—April 2014
STEP-LEVEL DATA
Record right/wrong, timing, use of hints, …
8
one step = fill in a box
Geoff Gordon—OCWC—April 2014
SIMPLEST MODEL: RASCH /
1-PARAMETER ITEM RESPONSE THEORY
9
ln
✓
pt
1 pt
◆
= ✓it
+ jt
θ = student mean (knowledge level)

β = item mean (easy/difficult)
it = student ID jt = step ID
pt = P(correct answer)
Geoff Gordon—OCWC—April 2014
SIMPLEST MODEL: RASCH /
1-PARAMETER ITEM RESPONSE THEORYStudents
Steps in tutor Each entry: does student i get step j right?
1
10
θ
β
predict 1 if θi+βj > 0
Geoff Gordon—OCWC—April 2014
STRUCTURE: SIMILARITY AMONG STEPS
Learn a “step map”: each point = 1 step
Steps over here are more
similar to each other…
… than to steps over here
Geoff Gordon—OCWC—April 2014
STRUCTURE: SIMILARITY AMONG STEPS
Learn a “step map”: each point = 1 step
Steps over here are more
similar to each other…
… than to steps over here
hic sunt
dracones
Geoff Gordon—OCWC—April 2014
HOW PRINCIPAL COMPONENTS ANAYSIS
GOT FAMOUS
Y1
Y2
Y3
.

.

.

Yn
Users
Movies
Each entry: how many stars does user i give
to movie j?
4
12
Geoff Gordon—OCWC—April 2014
RESULT OF FACTORING
u1
u2
u3
.

.

.

un
v1
…
vk
Users
MoviesBasis weights
Basisvectors
Low-d basis = latent variables

!
Basis vectors represent latent properties of
movies, e.g.,“is a comedy”
13
Geoff Gordon—OCWC—April 2014
IN OUR CASE (STUDENT-STEP DATA)
U1
U2
U3
.
.
.
UN
V1
…
VK
Students
Stepsbasis weights
basisvectors
Basis vectors are candidate “eigenskills”
Weights are students’ knowledge
levels
14
Geoff Gordon—OCWC—April 2014
DOES IT WORK?
15
steps about pentagons
steps about circles
other steps.
Learned features let us
predict held-out data better
than chance
(ρ = .3, p < 0.0001)
step map
Geoff Gordon—OCWC—April 2014
DOES IT WORK?
15
steps about pentagons
steps about circles
other steps.
Learned features let us
predict held-out data better
than chance
(ρ = .3, p < 0.0001)
Yes, sort of …
step map
Geoff Gordon—OCWC—April 2014
STRUCTURE: PRACTICE MAKES PERFECT
PCA ignores step order — clearly wrong…
Add model of student learning to PCA
‣ based on “additive factor model” [Draney et al., 1995]
16
Geoff Gordon—OCWC—April 2014
STRUCTURE: PRACTICE MAKES PERFECT
PCA ignores step order — clearly wrong…
Add model of student learning to PCA
‣ based on “additive factor model” [Draney et al., 1995]
Result: predictions of held-out data get slightly better
‣ ρ = .45 (p < 0.01 vs. plain PCA)
Step map still looks the same
Meh…
17
Geoff Gordon—OCWC—April 2014
WHAT WE REALLY WANT
To be understandable to us humans, latents need to be sparse and
binary (‘is about circles’,‘requires subtracting areas’)
Can’t do this fully automatically from this small data set (only 59
students, 370 steps)
Challenge: can we discover sparse, binary, understandable latents
automatically from MOOC-scale data?
18
Geoff Gordon—OCWC—April 2014
“KC HYPOTHESIS”
Knowledge comes in atomic units (“KCs”)
Each KC is learned independently (no transfer)
‣ transfer among steps mediated by common KCs
‣ or prerequisite structure (can’t learn algebra w/o knowing arithmetic)
Each student has a (latent, scalar) proficiency level for each KC
‣ learn/forget = transition to a higher/lower proficiency level
Learning a KC happens only through exposure to that KC
‣ problem, worked example, lecture, real life, …
19
step 17: {A, B}

step 23: {A, C}
[Koedinger, Corbett, Perfetti. Cognitive Science, 2012]
Geoff Gordon—OCWC—April 2014
CONSEQUENCES OF KC HYPOTHESIS
Mistakes are at KC level: select wrong KC; apply right KC to wrong
data; mistake in application of KC
‣ identifying the KC at fault makes it easier to give student feedback
If we can accurately
‣ determine list of KCs
‣ label instructional activities by KCs
…then we immediately know the quality/coverage of our content
20
Geoff Gordon—OCWC—April 2014
COMPOSE-BY-ADDITION
21
[Stamper & Koedinger,AIED 2011]
Geoff Gordon—OCWC—April 2014
COMPOSE-BY-ADDITION
22
[Stamper&Koedinger,AIED2011]
Geoff Gordon—OCWC—April 2014
WHY ARE SOME COMPOSE-BY-ADDITION
STEPS HARDER?
23
compose by
addition
Geoff Gordon—OCWC—April 2014
WHY ARE SOME COMPOSE-BY-ADDITION
STEPS HARDER?
24
hard
easy
medium
compose by
addition
Geoff Gordon—OCWC—April 2014
WHY ARE SOME COMPOSE-BY-ADDITION
STEPS HARDER?
24
hard
easy
medium
compose by
addition
Geoff Gordon—OCWC—April 2014
HYPOTHESIS: DIFFERENCE IS IN HOW
MUCH PLANNING IS NEEDED
25
plan to
compose
subtract
compose by
addition
Geoff Gordon—OCWC—April 2014
KC DISCOVERY
26
t [4]. Other problems were “unscaffolded” and did not start with such
hus students had to pose these subgoals themselves. Indeed the blips for
y-addition (seen in the learning curve in Figure 2) do correspond with a
ency of these unscaffolded problems.
[Stamper&Koedinger,AIED2011]
Geoff Gordon—OCWC—April 2014
USE DATA-DRIVEN MODEL TO
REDESIGN TUTOR
New skill bars for planning skills
‣ skill bars are a tutor interface to
show students where they are in
acquiring skills
Sequence for gentle slope
‣ adaptive fading of scaffolding
New problems that focus on planning
‣ next slide…
27
Combine areas
Enter given values
Find regular area
Plan to combine areas
Combine areas
Subtract
Enter given values
Find regular area
Geoff Gordon—OCWC—April 2014
NEW PROBLEMS: ISOLATE PRACTICE ON
PLANNING STEP
Decompose complex problem into simpler ones
28
Geoff Gordon—OCWC—April 2014
NEW PROBLEMS: ISOLATE PRACTICE ON
PLANNING STEP
Decompose complex problem into simpler ones
28
Geoff Gordon—OCWC—April 2014
RESULTS
More efficient: 25% less student time
‣ instructional time by step type
Better learning of planning skills
‣ post-test %correct by item type
29
428 K.R. Koedinger et al
(a)
Fig. 4. Students using the rede
28 minutes) while actually spe
learned these decomposition sk
tion problems
5 Discussion and C
Following our past demon
discovered from data [8; 1
model to redesign an adapt
ports the hypothesis. In p
reached mastery (as demon
.
esigned tutor reached master
ending more time on the criti
kills as demonstrated by bett
Conclusion
nstrations that better cogn
1], we have tested the h
tive tutor yields better stu
particular, we found stud
nstrated within the tutor a
428 K.R. Koedinger et al
(a)
.
(b)
time:minutes%correct
[Stamper & Koedinger,AIED 2011]
Geoff Gordon—OCWC—April 2014
MORE STRUCTURE: WHAT’S IN A KC?
So far, each KC is just present or absent in a student or problem
Nothing to distinguish algebra KCs from ESL KCs
What’s going on under the hood as a student solves a problem?
30
Geoff Gordon—OCWC—April 2014
RULE-BASED COGNITIVE MODEL
3(2x – 5) = 9
6x – 15 = 9 2x – 5 = 3 6x – 5 = 9
IF GOAL IS SOLVE A(BX+C) = D
THEN REWRITE AS ABX + AC = D
IF GOAL IS SOLVE A(BX+C) = D
THEN REWRITE AS ABX + C = D
IF GOAL IS SOLVE A(BX+C) = D
THEN REWRITE AS BX+C = D/AKCs
bug
31
What does it look like inside the student’s brain?
‣ … maybe a rule-based system
‣ … in which case KCs might correspond to rules
‣ :- president of US is Obama
‣ constant C on LHS of equation E :- move C to RHS of E
Geoff Gordon—OCWC—April 2014
RULE-BASED SYSTEM
Aka production system:
‣ declarative knowledge held in working memory
‣ production rules match declarative knowledge
‣ and act on WM or external world
Much cognitive modeling work endorses this claim explicitly or implicitly
‣ ACT-R, SimStudent, Russell & Norvig, …
!
But two problems: uncertainty handling, representation learning
‣ here’s where more ML research can help!
32
“I see 3x+5 = 8”
“if LHS has constant C…”
“… then subtract C from both sides”
Geoff Gordon—OCWC—April 2014
PROBLEM 1: UNCERTAINTY
A DAY IN THE LIFE OF A RAT
33
Trial Bell? Light? Food?
1 × ✓ ✓
2 ✓ × ×
3 × ✓ ×
4 × ✓ ✓
… … … …
Geoff Gordon—OCWC—April 2014
RAT AS BAYESIAN
Priors over: how many trial types, sparsity of connections, reliability of
connections, …
(This is a common architecture for medical diagnosis systems)
34
bell light food …
1 2Trial types
Observables
…
Geoff Gordon—OCWC—April 2014
QUIZ: ARE YOU SMARTER THAN A RAT?
35
Trial Bell? Light? Food?
1 × ✓ ✓
2 × ✓ ×
3 × ✓ ✓
4 × ✓ ✓
… … … …
100 × ✓ ✓
Geoff Gordon—OCWC—April 2014
QUIZ: ARE YOU SMARTER THAN A RAT?
35
Trial Bell? Light? Food?
1 × ✓ ✓
2 × ✓ ×
3 × ✓ ✓
4 × ✓ ✓
… … … …
100 × ✓ ✓
101 ✓ ✓ ×
Geoff Gordon—OCWC—April 2014
AND THE RAT SAYS…
Both right! With more light-bell trials, evidence increases for a
separate trial type.
36
Effect name
2nd-order
conditioning
Conditioned
inhibition
light-food trials many many
bell-light trials few many
test: bell predicts food? ↑ ↓
Geoff Gordon—OCWC—April 2014
BAYESIAN RULE LEARNING IN CLASSICAL
CONDITIONING
Only fully Bayesian inference/learning
captured both effects
[Courville, Daw, Gordon,Touretzky, NIPS 2003]
Few bell-light trials, 1 trial type:
(bell, light, food) all associated
More trials: (bell, light, no food) v.
(light, food, no bell)
37
Number of bell-light trials
0 10 20 30 40 50 60
0
0.2
0.4
0.6
0.8
1
Number of A−X trials
P(US | A, D )
P(US | X, D )
(a) Second-order Cond.
P(food | light)
P(food | bell)
Geoff Gordon—OCWC—April 2014
PROBLEM 2: REPRESENTATION LEARNING
38
Flaw with
“KC = rule”:
Many bugs
come from
weak
features
Geoff Gordon—OCWC—April 2014
REPRESENTATION LEARNING
Some student errors come from failure to correctly interpret
(internally represent) a problem
As student sees more and more examples like 3x + 5 = 8, gets better
and better “language model” to explain them (build internal
representation)
—> some KCs must correspond to features of the improved language
model
39
Geoff Gordon—OCWC—April 2014
EXPERIMENT
Present algebra examples to a machine learning system
As part of learning, induce a language model (an unsupervised
probabilistic context free grammar) for algebra equations
Make output of language model (grammar nonterminals, e.g.,
SignedNumber) available as features of each example
Use these features in simulated problem-solving to discover KCs
[Li, Cohen, Koedinger, Matsuda, 2010]
Geoff Gordon—OCWC—April 2014
NEW COGNITIVE MODELS ARE
MORE ACCURATE
41
[Li, Cohen, Koedinger, Matsuda, 2010]
Geoff Gordon—OCWC—April 2014
OPEN RESEARCH QUESTION
Can we build a new generation of rule-based system that has
‣ rich uncertainty handling
‣ integrated representation learning
… and use it to help us model student learning?
42
Geoff Gordon—OCWC—April 2014
SUMMARY
A key contribution of machine learning to education will be to help
understand the educational content we’re creating and delivering
Essential idea: ML models of structured latent variables
Specifically, build and test hypotheses about the knowledge,
procedures and representations students use to solve problems
‣ latents = KCs, rules, representations, strategies, …
Need to link uncertainty handling (traditional domain of ML) to new,
harder situations encountered in understanding student knowledge
Exciting time for research in ML and education!
43

More Related Content

Similar to Dr. Geoffrey J. Gordon: What can machine learning do for open education?

Teacher-Scholar Forum - Just in Time Teaching - feb 2013 - jeff loats
Teacher-Scholar Forum - Just in Time Teaching - feb 2013 - jeff loatsTeacher-Scholar Forum - Just in Time Teaching - feb 2013 - jeff loats
Teacher-Scholar Forum - Just in Time Teaching - feb 2013 - jeff loats
Jeff Loats
 
Cognitive Modeling & Intelligent Tutors
Cognitive Modeling & Intelligent TutorsCognitive Modeling & Intelligent Tutors
Cognitive Modeling & Intelligent Tutors
Cody Ray
 
Unit 1 Webinar
Unit 1 WebinarUnit 1 Webinar
Unit 1 Webinar
jwalts
 

Similar to Dr. Geoffrey J. Gordon: What can machine learning do for open education? (20)

Teacher-Scholar Forum - Just in Time Teaching - feb 2013 - jeff loats
Teacher-Scholar Forum - Just in Time Teaching - feb 2013 - jeff loatsTeacher-Scholar Forum - Just in Time Teaching - feb 2013 - jeff loats
Teacher-Scholar Forum - Just in Time Teaching - feb 2013 - jeff loats
 
UMR - My ongoing projects with Technology - Rochester - 2015
UMR - My ongoing projects with Technology - Rochester - 2015 UMR - My ongoing projects with Technology - Rochester - 2015
UMR - My ongoing projects with Technology - Rochester - 2015
 
Learning Analytics Dashboards
Learning Analytics DashboardsLearning Analytics Dashboards
Learning Analytics Dashboards
 
Lecture 1: Deep Learning Fundamentals - Full Stack Deep Learning - Spring 2021
Lecture 1: Deep Learning Fundamentals - Full Stack Deep Learning - Spring 2021Lecture 1: Deep Learning Fundamentals - Full Stack Deep Learning - Spring 2021
Lecture 1: Deep Learning Fundamentals - Full Stack Deep Learning - Spring 2021
 
Cognitive Modeling & Intelligent Tutors
Cognitive Modeling & Intelligent TutorsCognitive Modeling & Intelligent Tutors
Cognitive Modeling & Intelligent Tutors
 
Automating Feedback & Assessment in WebLab
Automating Feedback & Assessment in WebLabAutomating Feedback & Assessment in WebLab
Automating Feedback & Assessment in WebLab
 
Data Science Reinvents Learning?
Data Science Reinvents Learning?Data Science Reinvents Learning?
Data Science Reinvents Learning?
 
Key Terms PBL by CAE CI Winter 14 Educators
Key Terms PBL by CAE CI Winter 14 EducatorsKey Terms PBL by CAE CI Winter 14 Educators
Key Terms PBL by CAE CI Winter 14 Educators
 
Bulkley valley.leadership.dec2011
Bulkley valley.leadership.dec2011Bulkley valley.leadership.dec2011
Bulkley valley.leadership.dec2011
 
The College Classroom Fa15 Meeting 2: Developing Expertise
The College Classroom Fa15 Meeting 2: Developing ExpertiseThe College Classroom Fa15 Meeting 2: Developing Expertise
The College Classroom Fa15 Meeting 2: Developing Expertise
 
Explainer Videos
Explainer VideosExplainer Videos
Explainer Videos
 
Just in time teaching a 21st century brain-based technique - jeff loats - l...
Just in time teaching   a 21st century brain-based technique - jeff loats - l...Just in time teaching   a 21st century brain-based technique - jeff loats - l...
Just in time teaching a 21st century brain-based technique - jeff loats - l...
 
Chounta@paws
Chounta@pawsChounta@paws
Chounta@paws
 
Why today's students need a 3D education
Why today's students need a 3D educationWhy today's students need a 3D education
Why today's students need a 3D education
 
BPM Cluster Meeting 2014
BPM Cluster Meeting 2014BPM Cluster Meeting 2014
BPM Cluster Meeting 2014
 
Narayana School olympiad Program ppt.pptx
Narayana School olympiad Program ppt.pptxNarayana School olympiad Program ppt.pptx
Narayana School olympiad Program ppt.pptx
 
Unit 1 Webinar
Unit 1 WebinarUnit 1 Webinar
Unit 1 Webinar
 
The College Classroom Week 3: Developing Expertise through Deliberate Practice
The College Classroom Week 3: Developing Expertise through Deliberate PracticeThe College Classroom Week 3: Developing Expertise through Deliberate Practice
The College Classroom Week 3: Developing Expertise through Deliberate Practice
 
Controlled Assessment vs IGCSE
Controlled Assessment vs IGCSEControlled Assessment vs IGCSE
Controlled Assessment vs IGCSE
 
ALASI15 Writing Analytics Workshop
ALASI15 Writing Analytics WorkshopALASI15 Writing Analytics Workshop
ALASI15 Writing Analytics Workshop
 

More from The Open Education Consortium

OER strategies and best practices as success factors in Open Access initiativ...
OER strategies and best practices as success factors in Open Access initiativ...OER strategies and best practices as success factors in Open Access initiativ...
OER strategies and best practices as success factors in Open Access initiativ...
The Open Education Consortium
 
Web based learning - research and innovation in translation learning resource...
Web based learning - research and innovation in translation learning resource...Web based learning - research and innovation in translation learning resource...
Web based learning - research and innovation in translation learning resource...
The Open Education Consortium
 
Free of Open? Investigating Intellectual Property Right and Openness for OER ...
Free of Open? Investigating Intellectual Property Right and Openness for OER ...Free of Open? Investigating Intellectual Property Right and Openness for OER ...
Free of Open? Investigating Intellectual Property Right and Openness for OER ...
The Open Education Consortium
 

More from The Open Education Consortium (20)

Moving Beyond OER: USNH
Moving Beyond OER: USNHMoving Beyond OER: USNH
Moving Beyond OER: USNH
 
Universities for the Future (OE Global 2015)
Universities for the Future (OE Global 2015) Universities for the Future (OE Global 2015)
Universities for the Future (OE Global 2015)
 
Training Entrepreneurship Through OEP - the Start up Model (OE Global 2015)
Training Entrepreneurship Through OEP - the Start up Model (OE Global 2015)Training Entrepreneurship Through OEP - the Start up Model (OE Global 2015)
Training Entrepreneurship Through OEP - the Start up Model (OE Global 2015)
 
Qualitative Investigation of Faculty Usage of OERs in Wachington Community an...
Qualitative Investigation of Faculty Usage of OERs in Wachington Community an...Qualitative Investigation of Faculty Usage of OERs in Wachington Community an...
Qualitative Investigation of Faculty Usage of OERs in Wachington Community an...
 
Open license for MOOCs - Martijn Ouwehand (OE global 2015)
Open license for MOOCs - Martijn Ouwehand (OE global 2015) Open license for MOOCs - Martijn Ouwehand (OE global 2015)
Open license for MOOCs - Martijn Ouwehand (OE global 2015)
 
OE Global 2015 - Paradis
OE Global 2015 -  ParadisOE Global 2015 -  Paradis
OE Global 2015 - Paradis
 
Inclusive design (oe global action lab)
Inclusive design (oe global action lab)Inclusive design (oe global action lab)
Inclusive design (oe global action lab)
 
OE global2015 Pre-conference workshop
OE global2015 Pre-conference workshopOE global2015 Pre-conference workshop
OE global2015 Pre-conference workshop
 
Open Education 101 (OE Global 2015 Pre-conference workshop)
Open Education 101 (OE Global 2015 Pre-conference workshop)Open Education 101 (OE Global 2015 Pre-conference workshop)
Open Education 101 (OE Global 2015 Pre-conference workshop)
 
OER Roadmap (OE Global Pre-conference workshop)
OER Roadmap (OE Global Pre-conference workshop)OER Roadmap (OE Global Pre-conference workshop)
OER Roadmap (OE Global Pre-conference workshop)
 
Collaborating across borders: OER use and open educational practices within t...
Collaborating across borders: OER use and open educational practices within t...Collaborating across borders: OER use and open educational practices within t...
Collaborating across borders: OER use and open educational practices within t...
 
OER strategies and best practices as success factors in Open Access initiativ...
OER strategies and best practices as success factors in Open Access initiativ...OER strategies and best practices as success factors in Open Access initiativ...
OER strategies and best practices as success factors in Open Access initiativ...
 
The Maturing Open Educational Resources (OER) Ecosystem: Partners, Expansion,...
The Maturing Open Educational Resources (OER) Ecosystem: Partners, Expansion,...The Maturing Open Educational Resources (OER) Ecosystem: Partners, Expansion,...
The Maturing Open Educational Resources (OER) Ecosystem: Partners, Expansion,...
 
Teachers time is valuable (OE global2015)
Teachers time is valuable (OE global2015) Teachers time is valuable (OE global2015)
Teachers time is valuable (OE global2015)
 
Open Assembly (Global 2015 Action Lab)
Open Assembly (Global 2015 Action Lab)Open Assembly (Global 2015 Action Lab)
Open Assembly (Global 2015 Action Lab)
 
Atieno Adala
Atieno Adala Atieno Adala
Atieno Adala
 
OER strategies and best practices as success factors in Open Access initiativ...
OER strategies and best practices as success factors in Open Access initiativ...OER strategies and best practices as success factors in Open Access initiativ...
OER strategies and best practices as success factors in Open Access initiativ...
 
Web based learning - research and innovation in translation learning resource...
Web based learning - research and innovation in translation learning resource...Web based learning - research and innovation in translation learning resource...
Web based learning - research and innovation in translation learning resource...
 
Free of Open? Investigating Intellectual Property Right and Openness for OER ...
Free of Open? Investigating Intellectual Property Right and Openness for OER ...Free of Open? Investigating Intellectual Property Right and Openness for OER ...
Free of Open? Investigating Intellectual Property Right and Openness for OER ...
 
Openlib03
Openlib03Openlib03
Openlib03
 

Recently uploaded

Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
ciinovamais
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
heathfieldcps1
 

Recently uploaded (20)

Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
Third Battle of Panipat detailed notes.pptx
Third Battle of Panipat detailed notes.pptxThird Battle of Panipat detailed notes.pptx
Third Battle of Panipat detailed notes.pptx
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptxSKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
PROCESS RECORDING FORMAT.docx
PROCESS      RECORDING        FORMAT.docxPROCESS      RECORDING        FORMAT.docx
PROCESS RECORDING FORMAT.docx
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docx
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
Spatium Project Simulation student brief
Spatium Project Simulation student briefSpatium Project Simulation student brief
Spatium Project Simulation student brief
 
Dyslexia AI Workshop for Slideshare.pptx
Dyslexia AI Workshop for Slideshare.pptxDyslexia AI Workshop for Slideshare.pptx
Dyslexia AI Workshop for Slideshare.pptx
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 

Dr. Geoffrey J. Gordon: What can machine learning do for open education?

  • 1. WHAT CAN MACHINE LEARNING DO FOR OPEN EDUCATION? Geoff Gordon CMU Machine Learning ggordon@cs.cmu.edu
  • 2. Civilization advances by extending the number of important operations which we can perform without thinking about them. —Alfred North Whitehead, 1911
  • 3.
  • 4. Geoff Gordon—OCWC—April 2014 CONTRIBUTION OF ML Machine learning can help us understand how students learn 4
  • 5. Geoff Gordon—OCWC—April 2014 CONTRIBUTION OF ML Machine learning can help us understand how students learn ‣ Not just any ML, but latent variable (“hidden feature”) discovery 4
  • 6. Geoff Gordon—OCWC—April 2014 CONTRIBUTION OF ML Machine learning can help us understand how students learn ‣ Not just any ML, but latent variable (“hidden feature”) discovery ‣ Not just any latent variables, but highly structured ones 4
  • 7. Geoff Gordon—OCWC—April 2014 WHY BOTHER? Student feedback ‣ what does the student know? ‣ what are common causes for the mistake the student just made? Instructor feedback ‣ what do the students know? ‣ what skills does this course content address? ‣ what skills doesn’t this course content address? Evaluation ‣ help design rubrics for (peer, instructor) grading ‣ cluster submissions by similar approach, skill level, … Etc… 5
  • 8. Geoff Gordon—OCWC—April 2014 GOAL: UNDERSTAND HOW STUDENTS LEARN SOMETHING 6 10m 4m 5m 10m 3x + 4 = x + 10 John took Joe for a ride on his boat. ___ boat was blue with a red stripe. A/The/[]
  • 9. Geoff Gordon—OCWC—April 2014 GOAL: UNDERSTAND HOW STUDENTS LEARN SOMETHING 6 10m 4m 5m 10m 3x + 4 = x + 10 John took Joe for a ride on his boat. ___ boat was blue with a red stripe. 70m2 x = 3 The A/The/[]
  • 10. Geoff Gordon—OCWC—April 2014 EX: GEOMETRY TUTOR 7 http://www.carnegielearning.com/
  • 11. Geoff Gordon—OCWC—April 2014 STEP-LEVEL DATA Record right/wrong, timing, use of hints, … 8 one step = fill in a box
  • 12. Geoff Gordon—OCWC—April 2014 SIMPLEST MODEL: RASCH / 1-PARAMETER ITEM RESPONSE THEORY 9 ln ✓ pt 1 pt ◆ = ✓it + jt θ = student mean (knowledge level) β = item mean (easy/difficult) it = student ID jt = step ID pt = P(correct answer)
  • 13. Geoff Gordon—OCWC—April 2014 SIMPLEST MODEL: RASCH / 1-PARAMETER ITEM RESPONSE THEORYStudents Steps in tutor Each entry: does student i get step j right? 1 10 θ β predict 1 if θi+βj > 0
  • 14. Geoff Gordon—OCWC—April 2014 STRUCTURE: SIMILARITY AMONG STEPS Learn a “step map”: each point = 1 step Steps over here are more similar to each other… … than to steps over here
  • 15. Geoff Gordon—OCWC—April 2014 STRUCTURE: SIMILARITY AMONG STEPS Learn a “step map”: each point = 1 step Steps over here are more similar to each other… … than to steps over here hic sunt dracones
  • 16. Geoff Gordon—OCWC—April 2014 HOW PRINCIPAL COMPONENTS ANAYSIS GOT FAMOUS Y1 Y2 Y3 . . . Yn Users Movies Each entry: how many stars does user i give to movie j? 4 12
  • 17. Geoff Gordon—OCWC—April 2014 RESULT OF FACTORING u1 u2 u3 . . . un v1 … vk Users MoviesBasis weights Basisvectors Low-d basis = latent variables ! Basis vectors represent latent properties of movies, e.g.,“is a comedy” 13
  • 18. Geoff Gordon—OCWC—April 2014 IN OUR CASE (STUDENT-STEP DATA) U1 U2 U3 . . . UN V1 … VK Students Stepsbasis weights basisvectors Basis vectors are candidate “eigenskills” Weights are students’ knowledge levels 14
  • 19. Geoff Gordon—OCWC—April 2014 DOES IT WORK? 15 steps about pentagons steps about circles other steps. Learned features let us predict held-out data better than chance (ρ = .3, p < 0.0001) step map
  • 20. Geoff Gordon—OCWC—April 2014 DOES IT WORK? 15 steps about pentagons steps about circles other steps. Learned features let us predict held-out data better than chance (ρ = .3, p < 0.0001) Yes, sort of … step map
  • 21. Geoff Gordon—OCWC—April 2014 STRUCTURE: PRACTICE MAKES PERFECT PCA ignores step order — clearly wrong… Add model of student learning to PCA ‣ based on “additive factor model” [Draney et al., 1995] 16
  • 22. Geoff Gordon—OCWC—April 2014 STRUCTURE: PRACTICE MAKES PERFECT PCA ignores step order — clearly wrong… Add model of student learning to PCA ‣ based on “additive factor model” [Draney et al., 1995] Result: predictions of held-out data get slightly better ‣ ρ = .45 (p < 0.01 vs. plain PCA) Step map still looks the same Meh… 17
  • 23. Geoff Gordon—OCWC—April 2014 WHAT WE REALLY WANT To be understandable to us humans, latents need to be sparse and binary (‘is about circles’,‘requires subtracting areas’) Can’t do this fully automatically from this small data set (only 59 students, 370 steps) Challenge: can we discover sparse, binary, understandable latents automatically from MOOC-scale data? 18
  • 24. Geoff Gordon—OCWC—April 2014 “KC HYPOTHESIS” Knowledge comes in atomic units (“KCs”) Each KC is learned independently (no transfer) ‣ transfer among steps mediated by common KCs ‣ or prerequisite structure (can’t learn algebra w/o knowing arithmetic) Each student has a (latent, scalar) proficiency level for each KC ‣ learn/forget = transition to a higher/lower proficiency level Learning a KC happens only through exposure to that KC ‣ problem, worked example, lecture, real life, … 19 step 17: {A, B} step 23: {A, C} [Koedinger, Corbett, Perfetti. Cognitive Science, 2012]
  • 25. Geoff Gordon—OCWC—April 2014 CONSEQUENCES OF KC HYPOTHESIS Mistakes are at KC level: select wrong KC; apply right KC to wrong data; mistake in application of KC ‣ identifying the KC at fault makes it easier to give student feedback If we can accurately ‣ determine list of KCs ‣ label instructional activities by KCs …then we immediately know the quality/coverage of our content 20
  • 28. Geoff Gordon—OCWC—April 2014 WHY ARE SOME COMPOSE-BY-ADDITION STEPS HARDER? 23 compose by addition
  • 29. Geoff Gordon—OCWC—April 2014 WHY ARE SOME COMPOSE-BY-ADDITION STEPS HARDER? 24 hard easy medium compose by addition
  • 30. Geoff Gordon—OCWC—April 2014 WHY ARE SOME COMPOSE-BY-ADDITION STEPS HARDER? 24 hard easy medium compose by addition
  • 31. Geoff Gordon—OCWC—April 2014 HYPOTHESIS: DIFFERENCE IS IN HOW MUCH PLANNING IS NEEDED 25 plan to compose subtract compose by addition
  • 32. Geoff Gordon—OCWC—April 2014 KC DISCOVERY 26 t [4]. Other problems were “unscaffolded” and did not start with such hus students had to pose these subgoals themselves. Indeed the blips for y-addition (seen in the learning curve in Figure 2) do correspond with a ency of these unscaffolded problems. [Stamper&Koedinger,AIED2011]
  • 33. Geoff Gordon—OCWC—April 2014 USE DATA-DRIVEN MODEL TO REDESIGN TUTOR New skill bars for planning skills ‣ skill bars are a tutor interface to show students where they are in acquiring skills Sequence for gentle slope ‣ adaptive fading of scaffolding New problems that focus on planning ‣ next slide… 27 Combine areas Enter given values Find regular area Plan to combine areas Combine areas Subtract Enter given values Find regular area
  • 34. Geoff Gordon—OCWC—April 2014 NEW PROBLEMS: ISOLATE PRACTICE ON PLANNING STEP Decompose complex problem into simpler ones 28
  • 35. Geoff Gordon—OCWC—April 2014 NEW PROBLEMS: ISOLATE PRACTICE ON PLANNING STEP Decompose complex problem into simpler ones 28
  • 36. Geoff Gordon—OCWC—April 2014 RESULTS More efficient: 25% less student time ‣ instructional time by step type Better learning of planning skills ‣ post-test %correct by item type 29 428 K.R. Koedinger et al (a) Fig. 4. Students using the rede 28 minutes) while actually spe learned these decomposition sk tion problems 5 Discussion and C Following our past demon discovered from data [8; 1 model to redesign an adapt ports the hypothesis. In p reached mastery (as demon . esigned tutor reached master ending more time on the criti kills as demonstrated by bett Conclusion nstrations that better cogn 1], we have tested the h tive tutor yields better stu particular, we found stud nstrated within the tutor a 428 K.R. Koedinger et al (a) . (b) time:minutes%correct [Stamper & Koedinger,AIED 2011]
  • 37. Geoff Gordon—OCWC—April 2014 MORE STRUCTURE: WHAT’S IN A KC? So far, each KC is just present or absent in a student or problem Nothing to distinguish algebra KCs from ESL KCs What’s going on under the hood as a student solves a problem? 30
  • 38. Geoff Gordon—OCWC—April 2014 RULE-BASED COGNITIVE MODEL 3(2x – 5) = 9 6x – 15 = 9 2x – 5 = 3 6x – 5 = 9 IF GOAL IS SOLVE A(BX+C) = D THEN REWRITE AS ABX + AC = D IF GOAL IS SOLVE A(BX+C) = D THEN REWRITE AS ABX + C = D IF GOAL IS SOLVE A(BX+C) = D THEN REWRITE AS BX+C = D/AKCs bug 31 What does it look like inside the student’s brain? ‣ … maybe a rule-based system ‣ … in which case KCs might correspond to rules ‣ :- president of US is Obama ‣ constant C on LHS of equation E :- move C to RHS of E
  • 39. Geoff Gordon—OCWC—April 2014 RULE-BASED SYSTEM Aka production system: ‣ declarative knowledge held in working memory ‣ production rules match declarative knowledge ‣ and act on WM or external world Much cognitive modeling work endorses this claim explicitly or implicitly ‣ ACT-R, SimStudent, Russell & Norvig, … ! But two problems: uncertainty handling, representation learning ‣ here’s where more ML research can help! 32 “I see 3x+5 = 8” “if LHS has constant C…” “… then subtract C from both sides”
  • 40. Geoff Gordon—OCWC—April 2014 PROBLEM 1: UNCERTAINTY A DAY IN THE LIFE OF A RAT 33 Trial Bell? Light? Food? 1 × ✓ ✓ 2 ✓ × × 3 × ✓ × 4 × ✓ ✓ … … … …
  • 41. Geoff Gordon—OCWC—April 2014 RAT AS BAYESIAN Priors over: how many trial types, sparsity of connections, reliability of connections, … (This is a common architecture for medical diagnosis systems) 34 bell light food … 1 2Trial types Observables …
  • 42. Geoff Gordon—OCWC—April 2014 QUIZ: ARE YOU SMARTER THAN A RAT? 35 Trial Bell? Light? Food? 1 × ✓ ✓ 2 × ✓ × 3 × ✓ ✓ 4 × ✓ ✓ … … … … 100 × ✓ ✓
  • 43. Geoff Gordon—OCWC—April 2014 QUIZ: ARE YOU SMARTER THAN A RAT? 35 Trial Bell? Light? Food? 1 × ✓ ✓ 2 × ✓ × 3 × ✓ ✓ 4 × ✓ ✓ … … … … 100 × ✓ ✓ 101 ✓ ✓ ×
  • 44. Geoff Gordon—OCWC—April 2014 AND THE RAT SAYS… Both right! With more light-bell trials, evidence increases for a separate trial type. 36 Effect name 2nd-order conditioning Conditioned inhibition light-food trials many many bell-light trials few many test: bell predicts food? ↑ ↓
  • 45. Geoff Gordon—OCWC—April 2014 BAYESIAN RULE LEARNING IN CLASSICAL CONDITIONING Only fully Bayesian inference/learning captured both effects [Courville, Daw, Gordon,Touretzky, NIPS 2003] Few bell-light trials, 1 trial type: (bell, light, food) all associated More trials: (bell, light, no food) v. (light, food, no bell) 37 Number of bell-light trials 0 10 20 30 40 50 60 0 0.2 0.4 0.6 0.8 1 Number of A−X trials P(US | A, D ) P(US | X, D ) (a) Second-order Cond. P(food | light) P(food | bell)
  • 46. Geoff Gordon—OCWC—April 2014 PROBLEM 2: REPRESENTATION LEARNING 38 Flaw with “KC = rule”: Many bugs come from weak features
  • 47. Geoff Gordon—OCWC—April 2014 REPRESENTATION LEARNING Some student errors come from failure to correctly interpret (internally represent) a problem As student sees more and more examples like 3x + 5 = 8, gets better and better “language model” to explain them (build internal representation) —> some KCs must correspond to features of the improved language model 39
  • 48. Geoff Gordon—OCWC—April 2014 EXPERIMENT Present algebra examples to a machine learning system As part of learning, induce a language model (an unsupervised probabilistic context free grammar) for algebra equations Make output of language model (grammar nonterminals, e.g., SignedNumber) available as features of each example Use these features in simulated problem-solving to discover KCs [Li, Cohen, Koedinger, Matsuda, 2010]
  • 49. Geoff Gordon—OCWC—April 2014 NEW COGNITIVE MODELS ARE MORE ACCURATE 41 [Li, Cohen, Koedinger, Matsuda, 2010]
  • 50. Geoff Gordon—OCWC—April 2014 OPEN RESEARCH QUESTION Can we build a new generation of rule-based system that has ‣ rich uncertainty handling ‣ integrated representation learning … and use it to help us model student learning? 42
  • 51. Geoff Gordon—OCWC—April 2014 SUMMARY A key contribution of machine learning to education will be to help understand the educational content we’re creating and delivering Essential idea: ML models of structured latent variables Specifically, build and test hypotheses about the knowledge, procedures and representations students use to solve problems ‣ latents = KCs, rules, representations, strategies, … Need to link uncertainty handling (traditional domain of ML) to new, harder situations encountered in understanding student knowledge Exciting time for research in ML and education! 43