Pedagogic application of regular expressions

John Blake
Japan Advanced Institute of Science and Technology
/\bbetween\W+(?:\w+\W+){1,2}?to\b/gi;

Overview
Introduction
• Probabilistic parsing
• Rule-based pattern matching
• Regular expressions
Pedagogic applications
• Modality detector
• Error detector
• Other: tagged corpora, pronunciation of “ed”

Probabilistic parsing
• Dynamic algorithms
• Machine learning
• Training sets
(e.g. Stanford POS parser)
Extremely powerful, but requires significant knowledge of
computational linguistics and a huge time investment, so…

Rule-based pattern matching
True/false statements
1. There is a man on your left. T / F
   If true, a man is on your left. Stop.
   If false, proceed to 2.
2. There is a woman on your left. T / F
   If true, there is a woman on your left. Stop.
   If false, there is nobody on your left. Stop.

Decision-tree algorithm
There is a man on your left.
  Yes. STOP
  No. There is a woman on your left.
    Yes. STOP
    No. There is nobody on your left. STOP
Assumptions:
1. Only adults are present
2. There is no third gender
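The decision tree above can be sketched as two nested tests. This is only an illustration: the function name and the shape of the `observation` object are assumptions, not part of the slides.

```javascript
// Hypothetical helper: the two true/false questions as sequential tests.
// `observation` is an assumed input shape, e.g. { manOnLeft: false, womanOnLeft: true }.
function whoIsOnYourLeft(observation) {
  if (observation.manOnLeft) {
    return "There is a man on your left."; // question 1 is true: stop
  }
  if (observation.womanOnLeft) {
    return "There is a woman on your left."; // question 2 is true: stop
  }
  return "There is nobody on your left."; // both false: stop
}

console.log(whoIsOnYourLeft({ manOnLeft: false, womanOnLeft: true }));
```

The final `return` is only correct under the two stated assumptions, which is why the slide lists them.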

Regular expressions (regexp|regex)
There is a man. /\bman\b/;
There is a woman. /\bwoman\b/;
The discrete words “man” and “woman” will
be identified, generating a “true” result.
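A minimal sketch of how the word boundaries keep the two patterns apart. The third sentence ("manual") is an added test case, not from the slides.

```javascript
const manPattern = /\bman\b/;
const womanPattern = /\bwoman\b/;

// The first two sentences come from the slide; the third is an extra check.
const sentences = ["There is a man.", "There is a woman.", "There is a manual."];

// \b requires a word boundary on both sides, so "man" inside "woman"
// or "manual" is not treated as the discrete word "man".
const results = sentences.map((sentence) => ({
  sentence,
  man: manPattern.test(sentence),
  woman: womanPattern.test(sentence),
}));

console.log(results);
```

Without `\b`, the pattern `/man/` would also return true for "woman" and "manual".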

Regular expressions (Regex)
e.g. /\bmaybe\b/gi;
\ – escape (from normal characters)
i – case insensitive
\b – word boundary
g – global (find every match, not only the first)
1. I think that maybe he can understand. T/F
2. He may be able to understand T/F
3. Maybe, he can understand. T/F
4. Maybelline is a company name. T/F
5. Maybe, he said maybe. T/F
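The five true/false items can be checked by counting matches, as in this short sketch (the helper name `countMatches` is an assumption for illustration):

```javascript
const maybePattern = /\bmaybe\b/gi;

const sentences = [
  "I think that maybe he can understand.",
  "He may be able to understand",
  "Maybe, he can understand.",
  "Maybelline is a company name.",
  "Maybe, he said maybe.",
];

// String.prototype.match with the g flag returns every match, or null if none.
function countMatches(text) {
  const matches = text.match(maybePattern);
  return matches === null ? 0 : matches.length;
}

const counts = sentences.map(countMatches);
console.log(counts); // [1, 0, 1, 0, 2]
```

Sentence 2 fails because "may be" is two words, and sentence 4 fails because no word boundary follows "Maybe" in "Maybelline".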

Pedagogic applications
Modality detector
Online error detectors
- Common error detector (Morrall, 2000-14)
- Corpus-based error detector (Blake, 2012-15)
Other applications
- Annotation highlighter
- Ideas for pronunciation, grammar and vocab

Situation
App. 1
Students: graduate students, researchers
Aim: write research articles
Problems: lack of familiarity with the genre, lack of language, lack of content

Tentative language & approximation
Type              Examples
Modal verbs       may, might, would, can
Lexical verbs     seem, appear, suggest
Modal adverbs     perhaps, probably, possibly
Modal adjectives  probable, possible, uncertain
Modal nouns       assumption, claim, possibility
#                 Approximation
49%               almost a half, nearly 50%, less than 1 in 2

Material mismatch
Students from different faculties studying
tentative language (hedging) and
approximation in academic writing use
generic materials prepared by the teacher.

Lack of face validity
Some students do not want to “waste
time” dealing with materials not
appropriate to their major. They expect
materials tailored to their exact needs.

Solution: Modality detector
Individualized instruction
• Student selects appropriate text
• Student inputs relevant text
• Regex identifies hedges & approximation
• Execute command labels & highlights
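A minimal sketch of the identify-then-label steps, assuming the word lists from the tentative-language table. The bracket notation and all names here are illustrative; the actual detector may work differently.

```javascript
// Word lists copied from the tentative-language table; the lookup structure
// and the [word|type] output format are assumptions for illustration.
const hedgeTypes = {
  "modal verb": ["may", "might", "would", "can"],
  "lexical verb": ["seem", "appear", "suggest"],
  "modal adverb": ["perhaps", "probably", "possibly"],
  "modal adjective": ["probable", "possible", "uncertain"],
  "modal noun": ["assumption", "claim", "possibility"],
};

// Map each word to its type so a match can be labelled.
const typeOf = new Map();
for (const [type, words] of Object.entries(hedgeTypes)) {
  for (const word of words) typeOf.set(word, type);
}

// One pattern with all words as alternatives, bounded by \b on both sides.
const hedgePattern = new RegExp(`\\b(?:${[...typeOf.keys()].join("|")})\\b`, "gi");

// Replace each hedge with a labelled version of itself.
function labelHedges(text) {
  return text.replace(
    hedgePattern,
    (match) => `[${match}|${typeOf.get(match.toLowerCase())}]`
  );
}

console.log(labelHedges("These results suggest that the effect may be possible."));
// These results [suggest|lexical verb] that the effect [may|modal verb] be [possible|modal adjective].
```

This sketch matches base forms only, so inflected forms such as "suggests" or "claims" would need extra alternatives.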

Warning: False positives
More complex regexes reduce false positives
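As a hedged illustration of this trade-off, the simple pattern for "may" also flags the month, while a slightly more complex one skips "May" when a number follows. The refinement and the sample sentence are assumptions, not taken from the detector itself.

```javascript
// Simple pattern: flags every discrete "may", including the month.
const simpleMay = /\bmay\b/gi;

// Refined pattern (illustrative): the negative lookahead rejects "may"
// when whitespace and a digit follow, as in a date.
const refinedMay = /\bmay\b(?!\s+\d)/gi;

const text = "The survey ran in May 2014, and the results may vary.";

console.log(text.match(simpleMay)); // [ 'May', 'may' ]
console.log(text.match(refinedMay)); // [ 'may' ]
```

The refined pattern still cannot catch every case (for example "in May, the survey ran"), so some false positives remain.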

Piles of unmarked homework
Responding to written work takes too
much time, and is repetitive since many
students make the same surface-level
mistakes.
App. 2

No time to respond
Teachers are expected to:
• Identify the location of errors
• Explain the errors (if necessary)
• Correct the errors (if necessary)
All of this takes a lot of time.

Solution: Error detector
Identification
• Student inputs own work
• Regex identifies expected errors
Explanation
• Execute command selects and displays prepared explanation
Correction
• Student corrects work and submits improved version
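The identification and explanation steps could be sketched as a list of rules, each pairing a regex with a prepared explanation. Both rules below are hypothetical examples and are not taken from either detector cited in these slides.

```javascript
// Hypothetical rules: a pattern for an expected error plus its prepared explanation.
const rules = [
  {
    pattern: /\b\w+n't\b/gi, // contractions such as "don't" (straight apostrophe only)
    explanation: "Avoid contractions in academic writing; write the full form.",
  },
  {
    pattern: /\bin order to\b/gi,
    explanation: "'to' is usually enough; 'in order to' adds words.",
  },
];

// Identification: collect every match. Explanation: attach the prepared text.
function detectErrors(text) {
  const findings = [];
  for (const { pattern, explanation } of rules) {
    for (const match of text.matchAll(pattern)) {
      findings.push({ found: match[0], index: match.index, explanation });
    }
  }
  // Report findings in the order they appear in the student's text.
  return findings.sort((a, b) => a.index - b.index);
}

console.log(detectErrors("We don't know why. In order to test this, we ran a survey."));
```

The correction step stays with the student: the tool only locates and explains.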

Error classification
Type         Description
Accuracy     factual and language errors
Brevity      too many words
Clarity      vague or ambiguous terms
Objectivity  emotive language
Formality    abbreviations, contractions, & informal terms
An ethnographic survey of the literature on writing scientific research articles
revealed five key criteria (Blake & Blake, 2015)

Specific example
Error
• One of the + singular noun
Regex
• /\bone of the\b/gi;
Execute
• Check that the phrase “one of the” is followed by a plural noun
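A minimal sketch of this example: the regex only locates the phrase, and the student is prompted to check what follows. Showing the next word is an added convenience, and the function name is an assumption.

```javascript
const oneOfThe = /\bone of the\b/gi;

// Flag every occurrence and show the next word so the student can inspect it.
function flagOneOfThe(text) {
  return [...text.matchAll(oneOfThe)].map((match) => {
    const rest = text.slice(match.index + match[0].length);
    const next = rest.match(/^\W*(\w+)/); // first word after the phrase, if any
    return {
      phrase: match[0],
      nextWord: next === null ? "" : next[1],
      prompt: "Check that the phrase is followed by a plural noun.",
    };
  });
}

console.log(flagOneOfThe("One of the reason is cost. This is one of the main problems."));
```

In the second sentence the next word is the adjective "main", so the regex alone cannot decide whether the noun is plural. The student has to read the sentence and judge.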

False positives are harnessed in the learning
process: they force students to engage with the text.

Difficult-to-read tags
Introduction Purpose Method Results Discussion
<segment features='problem;introduction;rhetorical_moves' state='active'>We
address the problem of model-based object recognition.</segment> <segment
features='purpose;rhetorical_moves' state='active'>Our aim is to localize and
recognize road vehicles from monocular images or videos in calibrated traffic
scenes.</segment> <segment features='method;rhetorical_moves' state='active'>A
3-D deformable vehicle model with 12 shape parameters is set up as prior
information, and its pose is determined by three parameters, which are its position
on the ground plane and its orientation about the vertical axis under ground-plane
constraints.</segment> <segment features='purpose;rhetorical_moves'
state='active'>An efficient local gradient-based method is proposed to evaluate the
fitness between the projection of the vehicle model and image data, which is
combined into a novel evolutionary computing framework to estimate the 12 shape
parameters and three pose parameters by iterative evolution.</segment> <segment
features='background;introduction;rhetorical_moves' state='active'>The recovery of
pose parameters achieves vehicle localization, whereas the shape parameters are
used for vehicle recognition.</segment> <segment
features='method;rhetorical_moves' state='active'>Numerous experiments are
App. 3

Easy-to-read tags
http://www.jaist.ac.jp/~johnb/Movehighlighter.html
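One way to turn the segment tags into something easier to read is to extract each segment with a regex and prefix it with its rhetorical move. This is a sketch under assumptions: the attribute layout is taken from the sample above, the bracketed labels are illustrative, and the linked highlighter may be implemented differently.

```javascript
// Captures the features attribute and the text of each closed <segment> element.
const segmentPattern = /<segment features='([^']*)' state='[^']*'>([\s\S]*?)<\/segment>/g;

// Move names from the legend: Introduction Purpose Method Results Discussion.
const moves = ["introduction", "purpose", "method", "results", "discussion"];

function toReadable(tagged) {
  return [...tagged.matchAll(segmentPattern)].map(([, features, text]) => {
    const names = features.split(";");
    // Use the first legend move listed among the features, else "other".
    const move = moves.find((m) => names.includes(m)) || "other";
    return `[${move.toUpperCase()}] ${text.replace(/\s+/g, " ").trim()}`;
  });
}

const sample =
  "<segment features='problem;introduction;rhetorical_moves' state='active'>We " +
  "address the problem of model-based object recognition.</segment> <segment " +
  "features='purpose;rhetorical_moves' state='active'>Our aim is to localize and " +
  "recognize road vehicles from monocular images or videos in calibrated traffic " +
  "scenes.</segment>";

console.log(toReadable(sample).join("\n"));
```

A highlighter for the browser would wrap each segment in a coloured element instead of adding a text label, but the matching step would be the same.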

Ideas for you and your students
Pronunciation:
• Regular “ed”: /t/, /d/, /ɪd/
• th [voiced or voiceless]
Grammar:
• Tenses: e.g. perfect continuous: been + ing
• Quantifiers: [U] much, little; [C] many, few; [U/C] lots of, a lot of
Vocabulary:
• Colours: red, blue → crimson red, cobalt blue
• Body parts: hand, eyes, leg → hand out, eye up, leg it
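As one worked idea, the "been + ing" pattern for perfect continuous tenses might look like the sketch below. The auxiliaries has, have and had are an added assumption; the slide only gives "been + ing".

```javascript
// An auxiliary, then "been", then a word ending in -ing.
const perfectContinuous = /\b(?:has|have|had)\s+been\s+\w+ing\b/gi;

function findPerfectContinuous(text) {
  return text.match(perfectContinuous) || []; // empty array when nothing matches
}

console.log(
  findPerfectContinuous("She has been studying since noon. They had been waiting. He has been ill.")
);
// [ 'has been studying', 'had been waiting' ]
```

Adjectives ending in -ing ("has been interesting") would also match, which gives students something to check.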

Pronunciation of “th”
Pron  Feature              Potential regex
/ð/   Voiced initial th    /\btha(n|t|)\b/gi;
                           /\bthe(\b|ir|m|re|se|y)\b/gi;
                           /\bthis\b/gi;
                           /\btho(se|ugh|)\b/gi;
                           /\bthus\b/gi;
/θ/   Voiceless initial th /\bth/gi;
/t/   th pronounced as t   /\b(thomas|thames|thyme)\b/gi;
Pronunciation of “th” can be predicted by the rule that for function words
the initial th is pronounced as a voiced sound.
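The table can be sketched as a single-word classifier that tests the /t/ exceptions first, then the voiced function words, and treats any other initial "th" as voiceless. The combined patterns and the function name are assumptions built from the table.

```javascript
// Exceptions where th is pronounced as /t/.
const tSound = /^(?:thomas|thames|thyme)\b/i;

// Function words with voiced initial th, combined from the patterns in the table.
const voiced = /^(?:tha(?:n|t|)|the(?:\b|ir|m|re|se|y)|this|tho(?:se|ugh|)|thus)\b/i;

// Classify a single word; returns null when the word does not start with "th".
function classifyTh(word) {
  if (!/^th/i.test(word)) return null;
  if (tSound.test(word)) return "/t/";
  if (voiced.test(word)) return "/ð/";
  return "/θ/";
}

for (const word of ["the", "think", "Thomas", "though", "thought"]) {
  console.log(word, classifyTh(word));
}
```

Function words missing from the table's patterns (for example "then") would be classified as voiceless here, so the word list would need extending before classroom use.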

References
Blake, J. (2012, November 28-30). Corpus-based academic written error
detector. Conference proceedings of the 20th International Conference on
Computers in Education. Nanyang Technological University, Singapore.
Blake, X. and Blake, J. (2015, January 29-31). Academic literacy: Mentor and
mentee perspectives. Poster presented at 35th International Conference of
ThaiTESOL, Bangkok, Thailand.
Morrall, A. (2000-2014). Common Error Detector. [Online tool]
http://www2.elc.polyu.edu.hk/cill/errordetector.htm

Any questions, comments or
suggestions?
johnb@jaist.ac.jp
