Chi-Un Lei "Text Mining and Educational Discourse"
1. Text Mining and Educational Discourse
Dr. Chi-Un Lei, Dept. of Electrical and Electronic Eng.
LASI-HK 2014
(Adapted from LASI workshop 2014)
2. Words from the Speaker
“The key insight communicated through this
workshop is that …
If we can understand the connection between socio-
psychological processes and language by means of
the social signals encoded in them, we can
structure computational models of language
interactions more effectively.”
--- Carolyn Penstein Rosé
3. Outline
Theoretical: Connection between discourse and
learning
From rich but implicit constructs to explicit features
that capture the essence for machine learning
Hands-on: Machine learning for text extraction
and classification
[Diagram: Automatic Analysis of Conversation, Conversational Interventions, Positive Learning Outcomes]
6. Transactivity
Building on an idea expressed earlier in a
conversation
Using a reasoning statement
“We don't want tmax to be at 570 both for the material and [the Environment]”
“Well, for power and efficiency, we want a high tmax, but environmentally, we want a lower one.”
9. System of Engagement
Showing openness to the existence of other
perspectives
Examples
Nuclear is a good choice
I consider nuclear to be a good choice
There’s no denying that nuclear is a superior choice
Is nuclear a good choice?
12. What is machine learning?
Automatically or semi-automatically
Inducing concepts (i.e., rules) from data
Finding patterns in data (for humans and computers)
Explaining data
Making predictions
[Diagram: Data, Learning Algorithm, Model; New Data, Classification Engine, Prediction]
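The picture of data, learning algorithm, model, and prediction can be made concrete with a small sketch. The slides do not name a particular algorithm, so the multinomial Naive Bayes classifier below is a stand-in chosen for brevity. The training sentences, the labels `farm` and `energy`, and the function names `train` and `predict` are all made up for illustration.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """Learning algorithm: turn labeled data into a model (word counts per label)."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return {"label_counts": label_counts,
            "word_counts": word_counts,
            "vocabulary": vocabulary}

def predict(model, text):
    """Classification engine: score new data against the model, return the best label."""
    total_examples = sum(model["label_counts"].values())
    vocab_size = len(model["vocabulary"])
    best_label, best_score = None, float("-inf")
    for label, count in model["label_counts"].items():
        score = math.log(count / total_examples)  # log prior
        words_in_label = sum(model["word_counts"][label].values())
        for word in text.lower().split():
            # Add-one (Laplace) smoothing so unseen words do not zero out the score.
            seen = model["word_counts"][label][word]
            score += math.log((seen + 1) / (words_in_label + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy training data, invented for this sketch.
training_data = [
    ("cows make cheese", "farm"),
    ("hamsters eat seeds", "farm"),
    ("nuclear is a good choice", "energy"),
    ("we want a high tmax for power", "energy"),
]
model = train(training_data)
print(predict(model, "cows eat seeds"))
print(predict(model, "nuclear power is a choice"))
```

With only four training sentences the model has very little to go on, which is the GIGO point in miniature: the prediction is only as good as the data and features behind it.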
14. Keep this picture in mind…
Machine learning isn’t magic
But it can be useful for identifying meaningful
patterns in your data when used properly
Proper use requires insight into your data
Otherwise, GIGO (Garbage In Garbage Out)
Think like a computer!
15. Machine Learning for Text Mining
Basic features: “Bag of Words”
Represent text as a vector where each position
corresponds to a term
Vocabulary (vector positions, in order): Cheese, Cows, Eat, Hamsters, Make, Seeds
• Cows make cheese. (110010)
• Cheese make cows. (110010)
• Hamsters eat seeds. (001101)
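The three example vectors can be reproduced with a short sketch. It assumes a binary (present/absent) encoding and an alphabetically sorted vocabulary, which matches the six terms listed on the slide. The function names `tokenize`, `build_vocabulary`, and `bag_of_words` are illustrative, not part of any tool mentioned in the workshop.

```python
import string

def tokenize(text):
    """Lowercase the text, strip punctuation, and split on whitespace."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return cleaned.split()

def build_vocabulary(documents):
    """Sorted list of distinct terms; a term's index is its vector position."""
    return sorted({term for doc in documents for term in tokenize(doc)})

def bag_of_words(text, vocabulary):
    """Binary vector: 1 if the vocabulary term occurs in the text, else 0."""
    terms = set(tokenize(text))
    return [1 if term in terms else 0 for term in vocabulary]

documents = ["Cows make cheese.", "Cheese make cows.", "Hamsters eat seeds."]
vocabulary = build_vocabulary(documents)
vectors = ["".join(str(bit) for bit in bag_of_words(doc, vocabulary))
           for doc in documents]

for doc, vector in zip(documents, vectors):
    print(doc, vector)
```

The first two sentences receive the same vector even though they mean different things: a bag of words records which terms occur, not their order.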
16. Basic Types of Features
Unigram
Single words (e.g. prefer, sandwich)
Bigram
Pairs of words next to each other (e.g. eat bread)
Simple lexical patterns
e.g. “common denominator” versus “common multiple”
Punctuation
“You think the answer is 9?” vs. “You think the answer is 9.”
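A minimal sketch of these feature types, using the slide's own examples. The helper names (`tokens`, `unigrams`, `bigrams`, `punctuation_features`) and the underscore used to join bigrams are assumptions made for illustration.

```python
import re

def tokens(text):
    """Lowercased word tokens; punctuation is dropped at this stage."""
    return re.findall(r"[a-z0-9']+", text.lower())

def unigrams(text):
    """Single words."""
    return tokens(text)

def bigrams(text):
    """Pairs of adjacent words, joined with an underscore."""
    words = tokens(text)
    return [f"{first}_{second}" for first, second in zip(words, words[1:])]

def punctuation_features(text):
    """Flags for the sentence-final punctuation mark."""
    stripped = text.strip()
    return {
        "ends_with_question_mark": stripped.endswith("?"),
        "ends_with_period": stripped.endswith("."),
    }

question = "You think the answer is 9?"
statement = "You think the answer is 9."

print(unigrams(question))
print(bigrams("common denominator"), bigrams("common multiple"))
print(punctuation_features(question), punctuation_features(statement))
```

The question and the statement share exactly the same unigrams and bigrams, so only the punctuation feature tells them apart. Likewise, the unigram "common" alone cannot separate "common denominator" from "common multiple", but the bigram can.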
17. Part of Speech (POS) Tagging
POS bigrams capture syntactic or stylistic
information
e.g. “the answer which is …” vs “which is the answer”
Pairs of POS (Part-of-Speech) tags next to each other
DT_NN: “Determiner”_“Noun, singular or mass”
NNP_NNP: “Proper noun, singular”_“Proper noun, singular”
Examples
JJR: Adjective, comparative
http://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html
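A small sketch of POS bigram extraction for the two phrases from this slide. The tags below are assigned by hand using the Penn Treebank tag set, as an assumption for illustration. In practice they would come from an automatic tagger. The function name `pos_bigrams` is hypothetical.

```python
def pos_bigrams(tagged_tokens):
    """Join each pair of adjacent POS tags with an underscore, e.g. DT_NN."""
    tags = [tag for _, tag in tagged_tokens]
    return [f"{first}_{second}" for first, second in zip(tags, tags[1:])]

# Hand-tagged (word, tag) pairs; the tags are assumed, not tagger output.
phrase_a = [("the", "DT"), ("answer", "NN"), ("which", "WDT"), ("is", "VBZ")]
phrase_b = [("which", "WDT"), ("is", "VBZ"), ("the", "DT"), ("answer", "NN")]

print(pos_bigrams(phrase_a))
print(pos_bigrams(phrase_b))
```

Both phrases contain the same four words, so their unigram features are identical. The POS bigrams differ (NN_WDT appears only in the first, VBZ_DT only in the second), which is the syntactic information the slide refers to.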
18. Feature Space Customizations
Machine learning algorithms look for features
that are good predictors, NOT features that
are necessarily meaningful
Look for approximations
e.g. Don’t need to do a complete syntactic
analysis for questions
Look for question marks
Look for wh-terms that occur immediately before an
auxiliary verb --- Combined features
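The question heuristic above can be approximated without any syntactic analysis. In the sketch below, the word lists `WH_TERMS` and `AUXILIARIES` are short, incomplete lists chosen for illustration, and the function name `question_features` is hypothetical.

```python
import re

# Illustrative, incomplete word lists (assumptions for this sketch).
WH_TERMS = {"what", "who", "whom", "whose", "which", "when", "where", "why", "how"}
AUXILIARIES = {"is", "are", "was", "were", "do", "does", "did", "can", "could",
               "will", "would", "should", "has", "have", "had", "may", "might", "must"}

def question_features(text):
    """Cheap question cues: a question mark, or a wh-term directly before an auxiliary."""
    words = re.findall(r"[a-z']+", text.lower())
    has_question_mark = "?" in text
    wh_before_aux = any(first in WH_TERMS and second in AUXILIARIES
                        for first, second in zip(words, words[1:]))
    return {
        "has_question_mark": has_question_mark,
        "wh_before_aux": wh_before_aux,
        # Combined feature: either cue is enough to flag a likely question.
        "looks_like_question": has_question_mark or wh_before_aux,
    }

print(question_features("What is the common denominator"))
print(question_features("You think the answer is 9?"))
print(question_features("The answer is 9."))
```

This is an approximation, as the slide suggests: a relative clause such as "the answer which is ..." also triggers the wh-term cue, so the feature is a useful predictor, not a reliable parse.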
19. LightSide
Easy UI
Feature Extraction
Model Building /
Machine Learning
Error Analysis
Data Structuring
Free/open-source for
adoption and extension
22. Recap …
“The key insight communicated through this
workshop is that …
If we can understand the connection between socio-
psychological processes and language by means of
the social signals encoded in them, we can
structure computational models of language
interactions more effectively.”
--- Carolyn Penstein Rosé
23. Examples of Part of Speech Tagging
1. CC Coordinating conjunction
2. CD Cardinal number
3. DT Determiner
4. EX Existential there
5. FW Foreign word
6. IN Preposition/subordinating conjunction
7. JJ Adjective
8. JJR Adjective, comparative
9. JJS Adjective, superlative
10. LS List item marker
11. MD Modal
12. NN Noun, singular or mass
13. NNS Noun, plural
14. NNP Proper noun, singular
15. NNPS Proper noun, plural
16. PDT Predeterminer
17. POS Possessive ending
18. PRP Personal pronoun
19. PRP$ Possessive pronoun
20. RB Adverb
21. RBR Adverb, comparative
22. RBS Adverb, superlative
24. Examples of Part of Speech Tagging
23. RP Particle
24. SYM Symbol
25. TO to
26. UH Interjection
27. VB Verb, base form
28. VBD Verb, past tense
29. VBG Verb, gerund/present participle
30. VBN Verb, past participle
31. VBP Verb, non-3rd ps. sing. present
32. VBZ Verb, 3rd ps. sing. present
33. WDT wh-determiner
34. WP wh-pronoun
35. WP$ Possessive wh-pronoun
36. WRB wh-adverb
25. Findings
Transactivity (Berkowitz & Gibbs, 1983)
Moderating effect on learning (Joshi & Rosé, 2007; Russell, 2005;
Kruger & Tomasello, 1986; Teasley, 1995)
Moderating effect on knowledge sharing in working groups (Gweon et
al., 2011)
Engagement (Martin & White, 2005)
Correlational analysis: Strong correlation between displayed openness
of group members and articulation of reasoning (R = .72) (Dyke et al.,
in press)
Intervention study: Causal effect on propensity to articulate ideas in
group chats (effect size .6 standard deviations) (Kumar et al., 2011)
Mediating effect of idea contribution on learning in scientific inquiry
(Wang et al., 2011)