The document summarizes several community presentations on learning analytics topics:
- Perry Samson presented on measuring the pros and cons of blended courses.
- Heather Newman discussed sentiment analysis of student evaluations and the impact of peer feedback/grades on TA feedback/grades.
- Steve Lonn presented on the U-M Learning Analytics Architecture dataset including what it contains and how to access it.
- Additional presentations covered predicting vocabulary learning, applying social comparison theory to MOOCs, and scaling MOOC discourse analysis.
2. ● Perry Samson - Measuring the Pros and Cons of a Blended Course
● Heather Newman - Sentiment Analysis of Student Evaluations, and (separately) the Impact of Peer Feedback/Grades on TA Feedback/Grades
● Steve Lonn - The U-M Learning Analytics Architecture (LARC) Dataset: What is it, How to Access it, and How it Enables LA Research
● SungJin Nam - Predicting Short- and Long-Term Vocabulary Learning via Semantic Features of Partial Word Knowledge
● Heeryung Choi - Social Comparison Theory as Applied to MOOC Student Writing: Constructs for Opinion and Ability
● Phoebe (Hui) Liang - Scale MOOC Discourse Analysis with In Situ Coding
8. Heather Newman (newmanh@umich.edu)
● A quick summary of two papers in submission
● Sentiment Analysis of Student Evaluations
○ Exploring the use of sentiment analysis for reviewing student evaluations of teaching for themes and satisfaction
● The Impact of Peer Feedback/Grades on TA Feedback/Grades
○ How access to peer feedback and grades does/does not affect TA grades and feedback for that assignment.
9. The U-M Learning Analytics Architecture (LARC) Dataset: What is it, How to Access it, and How it Enables LA Research
Steven Lonn, PhD
Office of Enrollment Management
@stevelonn
10. Acknowledgements
• Andy Cameron
• Ann Rodgers
• Gus Evrard
• Ben Koester
• Chris Teplovs
• Chris Brooks
• Daniela Morar
• Glenn Auerbach
• Kris Steinhoff
• Lisa Emery
• Paul Courant
• Peter Hurley
• Preet Mandair
• Rob Wilke
• Tim McKay
• Tracy Pattok
• Bill Gehring
• Cassandra Callaghan
• Stephanie Riegle
• Patricia Marts
• Mark Umbricht
• Vijay Thiruvengadam
• Cindy Shindledecker
• Many, many others...
11. The Challenge
Reduce the time spent on non-value-added activities in order to increase time available to be spent on value-added activities.
12. Learning Analytics ARChitecture
• Guiding principles:
– The 80% solution
– Researcher is the central persona
– Continued user-driven enhancement
13. Structure
• Student Info Table
• Student Term Info Table
• Student Course Info Table
• Student Term Transfer Table
• PII Table (SQL Only)
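For flat-file exports, the tables above can be combined into a single analysis dataset. Below is a minimal pandas sketch of that join; file and column names are illustrative assumptions, not the actual LARC schema, and the PII table (SQL only) is omitted.

```python
import pandas as pd

# Hypothetical sketch of combining LARC's tables with pandas.
# File and column names are assumptions, not the actual LARC schema.
student = pd.read_csv("student_info.csv")        # one row per student
term = pd.read_csv("student_term_info.csv")      # one row per student-term
course = pd.read_csv("student_course_info.csv")  # one row per student-course

# Build a student-course analysis table keyed on (student_id, term_code).
df = (course
      .merge(term, on=["student_id", "term_code"], how="left")
      .merge(student, on="student_id", how="left"))
print(df.head())
```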
19. Example Types of Analyses
• Student-level
• Course-level
• Institution-level
20. Predicting Short- and Long-Term Vocabulary Learning via Semantic Features of Partial Word Knowledge
SungJin Nam (sjnam@umich.edu)
● Multidimensionality of a word’s meaning
○ It is NOT all-or-nothing
○ Knowledge is acquired incrementally
● Fine-grained word representations in an intelligent tutoring system
○ Interpretable measures of a student’s progress
○ To create more personalized and engaging content
● Our goals
○ Combining insights from cognitive psychology (Osgood, 1957) with a natural language processing tool (Mikolov et al., 2013)
○ Predicting short- and long-term learning in a vocabulary learning system
21. Research questions and experiment
● Research questions
○ RQ1: Can semantic similarity scores from Word2Vec be used to predict students’ short-term and long-term learning?
○ RQ2: Compared to using regular Word2Vec scores, how does the model using Osgood’s semantic scales as features perform for predicting short-term and long-term learning?
● DSCoVAR (Frishkoff et al., 2016)
○ Dynamic Support of Contextual Vocabulary Acquisition for Reading
○ Teach students how to infer the meaning of an unknown word from a sentence
○ 280 middle school students (6-8th graders)
○ 60 English words (SAT level)
○ Pre-test - Training - Post-test
● Learning measures
○ Short-term learning (immediate post-test)
○ Long-term retention (delayed post-test)
22. Predictive features and models
● “The Measurement of Meaning”
○ Word2Vec scores (W2V) [RQ1]: cosine similarity score of numeric vector word representations
○ Osgood scores (OSG) [RQ2]: introducing anchor terms for more interpretable W2V scores
● Derived features
○ Dist, Resp, C-hull
● Mixed-effects logistic regression
○ Item- and subject-level variances
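A minimal sketch of how the two feature families could be computed with gensim, assuming a pretrained word2vec model; the anchor-term pairs used to approximate Osgood's scales here are illustrative assumptions, not the study's actual choices.

```python
from gensim.models import KeyedVectors

# Assumes a pretrained word2vec model, e.g., the GoogleNews vectors
# (Mikolov et al., 2013); the DSCoVAR feature pipeline itself is not
# public, so the details below are illustrative.
wv = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)

def w2v_score(response_word: str, target_word: str) -> float:
    """W2V feature (RQ1): cosine similarity between a student's
    response word and the target vocabulary word."""
    return float(wv.similarity(response_word, target_word))

# Osgood's (1957) semantic-differential dimensions, approximated with
# anchor word pairs; the specific anchors here are an assumption.
OSGOOD_ANCHORS = {
    "evaluation": ("good", "bad"),
    "potency": ("strong", "weak"),
    "activity": ("active", "passive"),
}

def osgood_scores(word: str) -> dict:
    """OSG features (RQ2): locate a word on each Osgood scale by
    differencing its similarity to the two anchor terms."""
    return {dim: w2v_score(word, pos) - w2v_score(word, neg)
            for dim, (pos, neg) in OSGOOD_ANCHORS.items()}
```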
23. Results and Conclusion
● OSG model is similarly effective
○ W2V model was more effective in predicting short-term learning
○ OSG model was marginally better in the long-term learning case
● Some Osgood scales were more predictive than others
● Future work
○ Better scaling design
○ Generalizability of the method
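The mixed-effects logistic regression with item- and subject-level variances named on slide 22 could be fit along these lines; a sketch using statsmodels' Bayesian mixed GLM, with hypothetical column names (the authors' actual estimation tooling is not stated).

```python
import pandas as pd
from statsmodels.genmod.bayes_mixed_glm import BinomialBayesMixedGLM

# Hypothetical long-format trial data: one row per student-item response,
# with a binary outcome and a W2V similarity feature. Column names are
# assumptions, not the actual DSCoVAR variable names.
df = pd.read_csv("dscovar_trials.csv")

model = BinomialBayesMixedGLM.from_formula(
    "correct ~ w2v",                 # fixed effect: the semantic feature
    {"item": "0 + C(item)",          # random intercepts per target word
     "subject": "0 + C(subject)"},   # random intercepts per student
    df)
result = model.fit_vb()              # variational Bayes estimation
print(result.summary())
```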
24. Social Comparison Theory as Applied to MOOC Student Writing: Constructs for Opinion and Ability
Heeryung Choi (heeryung@umich.edu)
● Peer feedback in MOOCs
○ Challenges of peer assessment lie in interpersonal variables such as how assessors and assessees perceive each other and themselves (Kulkarni et al., 2013; Suen, 2014; van Gennip, Segers, & Tillema, 2010)
○ Studies on how influential diversity attributes are on other learners’ learning behavior in peer feedback activities...
■ Can deepen our understanding of learners in MOOCs (Kulkarni et al., 2013; Suen, 2014; van Gennip, Segers, & Tillema, 2010).
■ Can facilitate a potentially more beneficial student matching framework
● Social Comparison Theory
○ In the absence of objective criteria, people will choose other people to compare with themselves on… (Festinger, 1954)
■ Ability (performance, socioeconomic status (SES), gender, …)
■ Opinion
25. Social Comparison Theory as Applied to MOOC Student Writing: Constructs for Opinion and Ability
RQ. Is there evidence of social comparison during peer feedback activities in MOOCs?
● H1. Learners from high SES countries will construct responses that are longer and more linguistically complex, with fewer positive and more negative features, when responding to cases of disagreement with peers, regardless of the country disclosure of the peer.
● H2. Learners from high SES countries will write shorter, less positive, less negative, and more linguistically complex responses when responding to lower SES peers.
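H1 and H2 are stated in terms of response length, positivity/negativity, and linguistic complexity. A minimal sketch of extracting such features from a peer-feedback response, assuming NLTK's VADER sentiment scorer and mean word length as a crude complexity proxy; the paper's actual operationalization may differ.

```python
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)
sia = SentimentIntensityAnalyzer()

def response_features(text: str) -> dict:
    """Length, sentiment, and a crude complexity proxy for a single
    peer-feedback response."""
    words = text.split()
    sentiment = sia.polarity_scores(text)
    return {
        "length": len(words),
        "positive": sentiment["pos"],
        "negative": sentiment["neg"],
        # mean word length as a rough stand-in for linguistic complexity
        "complexity": sum(len(w) for w in words) / max(len(words), 1),
    }

print(response_features("I respectfully disagree; the argument overlooks key evidence."))
```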
29. Scale MOOC Discourse Analysis with In Situ Coding - Phoebe Liang
Introduction:
In the field of learning analytics and learning science, researchers typically hire human coders to read and annotate text data for discourse analysis. Through such analysis, researchers can study the cognitive behaviors and peer interactions in online learning environments and create innovative methods to personalize education at scale.
To help enable and engage these researchers, we created a tool named Innotate, a Chrome browser extension which anyone can use with minimal technical background.
30. Scale MOOC Discourse Analysis with In Situ Coding
To demonstrate how researchers can use this tool to replicate their studies and to show the variety of annotation possibilities, several experiments have been designed as follows:
Experiment 1: This experiment intends to replicate the work Bringing Order to Chaos in MOOC Discussion Forums with Content-Related Thread Identification by Wise et al. (2016) to test the generalizability of the original modeling methodology.
Experiment 2: This experiment intends to replicate the work Towards automated content analysis of discussion transcripts: a cognitive presence case by Kovanović et al. (2016) to evaluate and validate the cross-course generalizability of the results.
Experiment 3: This experiment intends to demonstrate how researchers can use the Chrome extension to build a model to predict the best answer corresponding to the question in the starting post within a thread.
31. Scale MOOC Discourse Analysis with In Situ Coding - Phoebe Liang
In experiment 2, the researchers focused on the problem of coding discussion transcripts for the levels of cognitive presence, one of the three constructs in the Community of Inquiry (CoI) model of distance education.
Experiment 2 and Results Comparison
● Course
○ Original study: a master’s-level, research-intensive online course in software engineering at a Canadian open public university
○ Replication study: Introduction to Data Science in Python on Coursera
● Coding Methodology
○ Original study: two experienced coders manually coded 1,747 messages
○ Replication study: three trained coders annotated 500 messages for replication purposes; the final label is taken by majority vote
● Model and Accuracy
○ Original study: 70.3% classification accuracy using random forest with LIWC and Coh-Metrix features in addition to context features
○ Replication study: 71.86% classification accuracy with 10-fold cross-validation, using SMOTE to create synthetic data for the imbalanced classes