1. Idiom Token Classification using Sentential Distributed Semantics
Giancarlo D. Salton, Robert J. Ross, John D. Kelleher
Applied Intelligence Research Centre
School of Computing
NLP Dublin Meetup
2. Outline
Idioms
Distributed Representations
“Per-expression” classification
“General” classification
Conclusions
Future Work on Idiom Token Classification
Idiom Classification on Machine Translation Pipeline
4. Idioms
Idioms are multiword expressions (MWEs)
Their meaning is non-compositional
There is no linguistic agreement on the set of characteristics defining idioms
5. Idiomatic and Literal Usages
Literally...
Actually...
How do we distinguish between a literal and an idiomatic usage?
Idiom token classification
6. Previous Work
Previous work used “per-expression” models
– a different set of features for each expression
– in general, these features are not reusable
– i.e., a model is trained for each particular expression
In our opinion, the state of the art is Peng et al. (2014)
– also “per-expression” classification
– topic models
– up to 5 paragraphs of context!
8. General Classifiers?
Can we find a common set of features?
Can we train a general classifier?
hold+horses vs. break+ice vs. spill+beans
10. Distributed Representations of Words
Word2vec (Mikolov et al., 2013)
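Word2vec is only name-checked on this slide, but as a toy illustration of the idea, the sketch below trains a skip-gram model with gensim on a tiny hypothetical corpus; the corpus and hyperparameters are placeholders, not the ones used in this work.

    # Minimal word2vec sketch using gensim on a toy, hypothetical corpus.
    from gensim.models import Word2Vec

    corpus = [
        ["he", "decided", "to", "spill", "the", "beans", "about", "the", "plan"],
        ["she", "spilled", "coffee", "beans", "all", "over", "the", "floor"],
        # ... more tokenised sentences would go here ...
    ]

    # Skip-gram model (sg=1), 100-dimensional vectors, small context window.
    model = Word2Vec(corpus, vector_size=100, window=5, min_count=1, sg=1)

    # Words used in similar contexts end up close together in vector space.
    print(model.wv.most_similar("beans", topn=5))

With a realistic corpus, the nearest neighbours of a word reflect its distributional semantics, which is the property the rest of the talk builds on.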
15. Skip-thought Vectors (or Sent2Vec)
(Kiros et al., 2015)
Encoder/Decoder Framework
– Encoder learns to encode information about the context of an input sentence
Distributed representations = features!
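The “distributed representations = features” point can be sketched as follows; here encode() is a stand-in for whatever sentence encoder is used (e.g. a skip-thought model) and is not a real library call, and the sentences and labels are illustrative only.

    import numpy as np
    from sklearn.svm import SVC

    def encode(sentence: str) -> np.ndarray:
        """Placeholder sentence encoder; in practice this would be the
        skip-thought (sent2vec) encoder producing a fixed-size vector."""
        rng = np.random.default_rng(abs(hash(sentence)) % (2**32))
        return rng.normal(size=300)

    sentences = [
        "He finally spilled the beans about the merger.",      # idiomatic
        "She spilled the beans all over the kitchen floor.",   # literal
        "They broke the ice with a few jokes.",                # idiomatic
        "The ship broke the ice as it entered the harbour.",   # literal
    ]
    labels = [1, 0, 1, 0]  # 1 = idiomatic, 0 = literal

    # Each sentence vector is the feature vector for one usage.
    X = np.stack([encode(s) for s in sentences])
    clf = SVC(kernel="linear").fit(X, labels)
    print(clf.predict(X))

The point is that the sentence vector itself is the feature representation, so no hand-crafted, expression-specific features are required.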
16. Distributed Representations vs Idioms
Distributed representations cluster words (word2vec) or sentences (sent2vec) with similar semantics
– empirical results have shown this
Idiomatic vs. literal usages
– idiomatic usages should also sit in a different part of the space than literal usages (at least when considering the same expression)
18. “Per-expression” settings
Following the baseline evaluation (Peng et al., 2014)
4 expressions from the VNC-Tokens dataset:
– blow+whistle, lose+head, make+scene and take+heart
Balanced training sets
Imbalanced test sets
19. “Per-expression” classifiers
K-Nearest Neighbours
– 2, 3, 5 and 10 neighbours
Support Vector Machines
– Linear SVM: linear kernel and grid search for best parameters
– Grid SVM: grid search for best kernel/parameters
– SGD SVM: linear kernel trained with Stochastic Gradient Descent
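A rough sketch of this classifier line-up in scikit-learn is shown below; the feature matrix X (sentence vectors), the labels y, and the parameter grids are placeholders for illustration, not the settings used in the work.

    # Sketch of the classifier line-up in scikit-learn; X holds sentence
    # vectors (one row per usage) and y the idiomatic/literal labels.
    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.svm import SVC
    from sklearn.linear_model import SGDClassifier
    from sklearn.model_selection import GridSearchCV

    rng = np.random.default_rng(0)
    X = rng.normal(size=(40, 100))   # placeholder sentence vectors
    y = np.repeat([0, 1], 20)        # 1 = idiomatic, 0 = literal (placeholder)

    # K-NN with the neighbour counts listed on the slide.
    knns = {k: KNeighborsClassifier(n_neighbors=k).fit(X, y) for k in (2, 3, 5, 10)}

    # "Linear SVM": linear kernel, grid search over C (grid is illustrative).
    linear_svm = GridSearchCV(SVC(kernel="linear"), {"C": [0.1, 1, 10]}).fit(X, y)

    # "Grid SVM": grid search over both kernel and parameters.
    grid_svm = GridSearchCV(
        SVC(),
        {"kernel": ["linear", "rbf"], "C": [0.1, 1, 10], "gamma": ["scale", 0.01]},
    ).fit(X, y)

    # "SGD SVM": a linear SVM (hinge loss) trained with stochastic gradient descent.
    sgd_svm = SGDClassifier(loss="hinge").fit(X, y)

In the per-expression setting, one such model is trained separately for each expression's training set.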
24. “Per-expression” evaluation
No single model performed best for all expressions
SVMs consistently outperformed K-NNs
Peng et al. (2014) features may capture a different set of dimensions
A combination with the baseline model may result in a stronger classifier
26. “General classifier” settings
Simulation of expected behaviour on real data
27 expressions from the “balanced” part of the VNC-Tokens dataset
Imbalanced training set
Imbalanced test set
27. “General classifier” classifiers
SVMs only
– Linear SVM: linear kernel and grid search for best parameters
– Grid SVM: grid search for best kernel/parameters
– SGD SVM: linear kernel trained with Stochastic Gradient Descent
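Read as one model trained over all expressions pooled together (rather than one model per expression), the “general classifier” setup might look roughly like the sketch below; the data, the parameter grid, and the per-expression breakdown are placeholders for illustration.

    # Sketch of a "general" classifier: one SVM over all expressions pooled.
    import numpy as np
    from sklearn.svm import SVC
    from sklearn.model_selection import GridSearchCV
    from sklearn.metrics import precision_recall_fscore_support

    # Placeholders: sentence vectors, labels (1 = idiomatic, 0 = literal) and
    # the expression each test usage belongs to (e.g. "blow+whistle").
    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(200, 100))
    y_train = np.tile([0, 1], 100)
    X_test = rng.normal(size=(60, 100))
    y_test = np.tile([0, 1], 30)
    test_expr = np.array(["blow+whistle"] * 30 + ["take+heart"] * 30)

    # A single "Grid SVM" trained on the pooled (in the real setting, imbalanced) data.
    clf = GridSearchCV(SVC(), {"kernel": ["linear", "rbf"], "C": [0.1, 1, 10]})
    clf.fit(X_train, y_train)

    # Report precision/recall/F1 per expression, as in the results table.
    pred = clf.predict(X_test)
    for expr in np.unique(test_expr):
        mask = test_expr == expr
        p, r, f1, _ = precision_recall_fscore_support(
            y_test[mask], pred[mask], average="binary", zero_division=0
        )
        print(f"{expr}: P={p:.2f} R={r:.2f} F1={f1:.2f}")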
28. “General classifier” results

                  Linear SVM           Grid SVM             SGD SVM
Expression        Pr.   Rec.  F1       Pr.   Rec.  F1       Pr.   Rec.  F1
blow+whistle      0.84  0.67  0.75     0.84  0.68  0.75     0.67  0.59  0.63
lose+head         0.78  0.66  0.72     0.75  0.64  0.69     0.75  0.67  0.71
make+scene        0.92  0.84  0.88     0.92  0.81  0.86     0.78  0.81  0.79
take+heart        0.94  0.79  0.86     0.94  0.80  0.86     0.86  0.80  0.83
Total             0.84  0.80  0.83     0.84  0.80  0.83     0.79  0.79  0.78
29. “General classifier” evaluation
Expected behaviour in the “real world”
– considers the imbalances of real data
2 classifiers had high performance
– same overall precision, recall and F1
– deviations occurred across individual expressions
Performance is still not consistent over all classifiers and across expressions
30. PCA Analysis of Distributed Representations on the “General” Classifier
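The figure for this slide is not recoverable from the transcript, but a hedged sketch of this kind of analysis is shown below: it projects (placeholder) sentence vectors to two dimensions with scikit-learn's PCA and colours them by idiomatic vs. literal label.

    # Sketch of a PCA projection of sentence vectors, coloured by usage label.
    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)
    vectors = rng.normal(size=(100, 4800))   # placeholder sentence vectors
    labels = np.repeat([0, 1], 50)           # 1 = idiomatic, 0 = literal

    points = PCA(n_components=2).fit_transform(vectors)

    for label, name in [(0, "literal"), (1, "idiomatic")]:
        mask = labels == label
        plt.scatter(points[mask, 0], points[mask, 1], label=name, alpha=0.6)
    plt.legend()
    plt.title("PCA of sentence representations")
    plt.show()

If idiomatic and literal usages really do occupy different parts of the space, they should form visibly separate clusters in such a projection.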
32. Conclusions
Our approach needs fewer resources to achieve roughly the same performance
SVMs generally perform better than K-NNs
A “general classifier” is feasible
“Per-expression” classification does achieve better results in some cases
34. Future Work on Idiom Token Classification
Apply to languages other than English
Apply to other datasets
– e.g., the IDX Corpus
What are the main sources of error for the “general classifier”?
– A better understanding of the representations is needed
36–43. Idiom Token Classification on Machine Translation Pipeline
(Salton et al., 2014b)
44. References
Ryan Kiros, Yukun Zhu, Ruslan R Salakhutdinov, Richard Zemel, Raquel Urtasun, Antonio Torralba, and Sanja Fidler. 2015. Skip-thought vectors. In Advances in Neural Information Processing Systems 28, pages 3276–3284.
Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems 26, pages 3111–3119.
Jing Peng, Anna Feldman, and Ekaterina Vylomova. 2014. Classifying idiomatic and literal expressions using topic models and intensity of emotions. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 2019–2027, October.
Giancarlo D. Salton, Robert J. Ross, and John D. Kelleher. 2014a. An Empirical Study of the Impact of Idioms on Phrase Based Statistical Machine Translation of English to Brazilian-Portuguese. In Third Workshop on Hybrid Approaches to Translation (HyTra), pages 36–41.
Giancarlo D. Salton, Robert J. Ross, and John D. Kelleher. 2014b. Evaluation of a substitution method for idiom transformation in statistical machine translation. In The 10th Workshop on Multiword Expressions (MWE 2014), pages 38–42.
45. Thank you!
Giancarlo D. Salton would like to thank CAPES (“Coordenação de Aperfeiçoamento de Pessoal de Nível Superior”) for his Science Without Borders scholarship, proc. n. 9050-13-2.