AINL 2016: Kravchenko

Solution for workshop
AINL FRUCT:
Artificial Intelligence and Natural
Language Conference
10-12 NOVEMBER 2016
SAINT-PETERSBURG
http://ainlconf.ru/

Paraphrase Detection using
Semantic Similarity Algorithms
Dmitry Kravchenko
Ben-Gurion University of the Negev

Tasks description
Input:
2 files with lists of sentence pairs in Russian, in XML format:
a) training set
b) test set
Output:
Task 1:
The algorithm should classify each pair into one of three classes:
Non-paraphrase, Near-paraphrase, Precise-paraphrase
Task 2:
The algorithm should classify each pair into one of two classes:
Non-paraphrase, Paraphrase

Algorithm Data-Flow
[Diagram] Components shown:
● Input
● Substitution of acronyms using online dictionary: wiktionary.org
● Translation engines: Google, Yandex, Microsoft
● Similarity toolkits: SEMILAR Toolkit, DKPro Similarity, Python difflib, NLTK WordNet, Swoogle, BLEU algorithms
● Gradient Boosting Classifier
● Output
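The acronym substitution step could look roughly like the sketch below. The slides name wiktionary.org as the source of expansions but give no lookup code, so the `ACRONYMS` table and the `expand_acronyms` function are hypothetical stand-ins, and grammatical case agreement of the expansion is ignored.

```python
import re

# Hypothetical in-memory table standing in for a wiktionary.org lookup
# (assumption: the real system fetched expansions online).
ACRONYMS = {
    "США": "Соединённые Штаты Америки",
    "ООН": "Организация Объединённых Наций",
}


def expand_acronyms(sentence, table=ACRONYMS):
    """Replace every whole-word token found in `table` with its expansion."""
    def repl(match):
        word = match.group(0)
        return table.get(word, word)

    # \w+ matches Unicode word characters, so Cyrillic tokens are covered.
    return re.sub(r"\w+", repl, sentence)
```

Tokens that are not in the table pass through unchanged, so the step is safe to apply to every sentence before translation.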

Classification algorithm
● Gradient Boosting Classifier
● Task 1:
– Feature vector which contains 77 features:
● 18 features: 6 SEMILAR toolkit scores * 3 translation engines
● 39 features: 13 DKPro Similarity toolkit scores * 3 translation engines
● 3 features: 1 Python difflib similarity score * 3 translation engines
● 6 features: 2 sentence similarity scores (Yuhua Li, David McLean, et al.) * 3 translation engines
● 3 features: 1 Swoogle comparator score * 3 translation engines
● 8 features: 8 BLEU scores on the source sentences (in Russian)
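The BLEU features are computed directly on the Russian sentences. The slides do not define the "def" and "lin" variants, so the sketch below shows only the common building block of BLEU, the clipped (modified) n-gram precision for n = 1..4. The function names are illustrative, and whitespace tokenization is an assumption.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, reference, n):
    """Share of candidate n-grams that also occur in the reference.

    Each n-gram is credited at most as many times as it appears in the
    reference (the "clipping" used by BLEU).
    """
    cand = Counter(ngrams(candidate.split(), n))
    ref = Counter(ngrams(reference.split(), n))
    total = sum(cand.values())
    if total == 0:  # candidate shorter than n tokens
        return 0.0
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / total


def bleu_like_features(sentence_a, sentence_b):
    """Four precision scores, one per n-gram order 1..4."""
    return [clipped_precision(sentence_a, sentence_b, n) for n in range(1, 5)]
```

Full BLEU additionally combines these precisions with a brevity penalty; that part is left out here.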

Classification algorithm
● Task 2:
– Feature vector which contains 69 features:
● 18 features: 6 SEMILAR toolkit scores * 3 translation engines
● 39 features: 13 DKPro Similarity toolkit scores * 3 translation engines
● 3 features: 1 Python difflib similarity score * 3 translation engines
● 6 features: 2 sentence similarity scores (Yuhua Li, David McLean, et al.) * 3 translation engines
● 3 features: 1 Swoogle comparator score * 3 translation engines
● (without BLEU scores)
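The feature counts for both tasks follow from the per-toolkit score counts multiplied by the three translation engines. A small sketch that lays out the vector and checks the arithmetic; the feature names and dictionary keys are made up for illustration.

```python
ENGINES = ("google", "yandex", "microsoft")

# Number of scores each toolkit yields per translation engine (from the slides).
PER_ENGINE_SCORES = {
    "semilar": 6,
    "dkpro": 13,
    "difflib": 1,
    "nltk_wordnet": 2,
    "swoogle": 1,
}

# BLEU scores are computed on the Russian source, so no engine multiplier.
BLEU_SCORES = 8


def feature_names(include_bleu):
    """Hypothetical column names for the feature vector."""
    names = [
        f"{toolkit}_{i}_{engine}"
        for toolkit, count in PER_ENGINE_SCORES.items()
        for i in range(count)
        for engine in ENGINES
    ]
    if include_bleu:
        names += [f"bleu_{i}" for i in range(BLEU_SCORES)]
    return names


task1 = feature_names(include_bleu=True)   # (6 + 13 + 1 + 2 + 1) * 3 + 8
task2 = feature_names(include_bleu=False)  # (6 + 13 + 1 + 2 + 1) * 3
```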

6 scores of SEMILAR toolkit
● greedyComparerWNLin
● optimumComparerLSATasa
● dependencyComparerWnLeskTanim
● cmComparer
● bleuComparer
● lsaComparer

greedyComparerWNLin
This score comes from a sentence-to-sentence similarity method
that greedily aligns words between the two given sentences. The
word-to-word similarity used for the alignment is the WordNet-based
measure proposed by Lin in 1998 in the article “An information-theoretic
definition of similarity”.
Please refer to:
A Comparison of Greedy and Optimal Assessment of Natural
Language Student Input Using Word-to-Word Similarity Metrics
http://www.aclweb.org/website/old_anthology/W/W12/W12-20.pdf#page=175
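A minimal sketch of greedy word alignment, assuming the word-to-word similarity is passed in as a function. `exact_match` is a toy stand-in for the WordNet Lin measure, and averaging the two directions is one common normalization that may differ from what SEMILAR does internally.

```python
def exact_match(word_a, word_b):
    """Toy word similarity (stand-in for WordNet Lin): 1.0 on equality."""
    return 1.0 if word_a.lower() == word_b.lower() else 0.0


def greedy_similarity(tokens_a, tokens_b, word_sim=exact_match):
    """Greedy alignment: each word takes its best match on the other side."""
    if not tokens_a or not tokens_b:
        return 0.0

    def directed(src, dst):
        # Every source word independently picks its most similar target
        # word, so one target word may be chosen several times.
        return sum(max(word_sim(w, v) for v in dst) for w in src) / len(src)

    # Average both directions so the score is symmetric.
    return (directed(tokens_a, tokens_b) + directed(tokens_b, tokens_a)) / 2
```

For example, `greedy_similarity(["the", "cat", "sat"], ["the", "cat", "slept"])` matches two of three words in each direction.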

optimumComparerLSATasa
Similar to greedyComparerWNLin, but the words are
aligned optimally (as in the job assignment problem), and
the word-to-word similarity method is LSA-based.
Article name is: Latent Semantic Analysis Models on
Wikipedia and TASA
http://deeptutor2.memphis.edu/Semilar-Web/public/downloads/LSA-Models-LREC014/LSAModelsOnWikipediaAndTASADanEtAl-LREC014.pdf
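Optimal alignment treats word matching as an assignment problem: each word may be paired with at most one word of the other sentence, and the pairing with the highest total similarity is chosen. The sketch below finds it by brute force, which is only practical for short sentences (production code would use the Hungarian algorithm). `exact_match` is again a toy stand-in for the LSA word similarity, and the normalization by total length is an assumption.

```python
from itertools import permutations


def exact_match(word_a, word_b):
    """Toy word similarity (stand-in for an LSA-based measure)."""
    return 1.0 if word_a.lower() == word_b.lower() else 0.0


def optimal_similarity(tokens_a, tokens_b, word_sim=exact_match):
    """Best one-to-one word alignment, found by exhaustive search."""
    if not tokens_a or not tokens_b:
        return 0.0
    shorter, longer = sorted((tokens_a, tokens_b), key=len)
    # Try every ordered choice of len(shorter) distinct positions in `longer`.
    best = max(
        sum(word_sim(w, v) for w, v in zip(shorter, chosen))
        for chosen in permutations(longer, len(shorter))
    )
    # Assumed normalization: twice the matched weight over total length.
    return 2 * best / (len(tokens_a) + len(tokens_b))
```

Unlike greedy alignment, a word cannot be reused: for `["a", "a"]` against `["a", "b"]` only one `"a"` can be matched.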

dependencyComparerWnLeskTanim
Please see:
● https://www.aaai.org/ocs/index.php/FLAIRS/2009/paper/viewFile/55/298
The word-to-word similarity method used is the
WordNet-based Lesk-Tanim measure.

cmComparer
Method proposed by Corley and Mihalcea
(described in the article: SEMILAR: The Semantic Similarity Toolkit).

lsaComparer
LSA-based word representations are summed up
for each sentence, and the similarity is
calculated from the resultant vectors.
● The resultant-vector method is described in
the article: NeRoSim: A System for Measuring
and Interpreting Semantic Textual Similarity
http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval030.pdf
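The resultant-vector idea can be sketched in a few lines: add up the word vectors of each sentence and compare the two sums. The two-dimensional `TOY_VECTORS` are invented for illustration (real LSA vectors have hundreds of dimensions), and cosine is assumed as the similarity function.

```python
import math

# Invented toy vectors; a real system would load an LSA model.
TOY_VECTORS = {
    "cat": [1.0, 0.0],
    "dog": [0.8, 0.6],
    "car": [0.0, 1.0],
}


def resultant_vector(tokens, vectors=TOY_VECTORS, dim=2):
    """Element-wise sum of the vectors of all known tokens."""
    total = [0.0] * dim
    for token in tokens:
        vec = vectors.get(token)
        if vec is None:  # skip out-of-vocabulary words
            continue
        for i, x in enumerate(vec):
            total[i] += x
    return total


def cosine(u, v):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def lsa_sentence_similarity(tokens_a, tokens_b):
    return cosine(resultant_vector(tokens_a), resultant_vector(tokens_b))
```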

Word-to-word Similarity score
Article: NeRoSim: A System for Measuring and Interpreting Semantic Textual Similarity

13 scores of DKPro Similarity toolkit
● CosineSimilarity
● ExactStringMatchComparator
● GreedyStringTiling 2-gram, GreedyStringTiling 4-gram
● JaroSecondStringComparator
● JaroWinklerSecondStringComparator
● normalized LevenshteinComparator
● LongestCommonSubsequenceNormComparator
● SubstringMatchComparator
● WordNGramContainmentMeasure
● WordNGramJaccardMeasure 2-gram, WordNGramJaccardMeasure 3-gram, WordNGramJaccardMeasure 4-gram
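DKPro Similarity is a Java library, so the following is not its code. It is a Python sketch of two of the ideas named in the list, word n-gram Jaccard overlap and a length-normalized Levenshtein score, to show what kind of value each feature carries. The normalization by the longer string is an assumption and may differ from DKPro's.

```python
def word_ngram_jaccard(sentence_a, sentence_b, n):
    """Jaccard overlap of the sets of word n-grams of two sentences."""
    tokens_a, tokens_b = sentence_a.split(), sentence_b.split()
    grams_a = {tuple(tokens_a[i:i + n]) for i in range(len(tokens_a) - n + 1)}
    grams_b = {tuple(tokens_b[i:i + n]) for i in range(len(tokens_b) - n + 1)}
    union = grams_a | grams_b
    if not union:
        return 0.0
    return len(grams_a & grams_b) / len(union)


def levenshtein(a, b):
    """Edit distance with unit-cost insert, delete, and substitute."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        cur = [i]
        for j, char_b in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,                      # deletion
                cur[j - 1] + 1,                   # insertion
                prev[j - 1] + (char_a != char_b)  # substitution or match
            ))
        prev = cur
    return prev[-1]


def normalized_levenshtein_similarity(a, b):
    """1.0 for identical strings, 0.0 when every character differs."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest
```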

The remaining four toolkits
● Python difflib comparator
● NLTK WordNet: sentence similarity scores
(Yuhua Li, David McLean, et al.)
● Swoogle comparator
● BLEU scores (computed on the Russian text, no English translation needed):
bleu def 1-gram, bleu def 2-gram, bleu def 3-gram, bleu def 4-gram,
bleu lin 1-gram, bleu lin 2-gram, bleu lin 3-gram, bleu lin 4-gram
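The difflib score is the simplest of these features. A sketch using the standard library's `SequenceMatcher`; the slides do not say whether characters or tokens were compared, so character-level comparison is an assumption here, and `difflib_similarity` is an illustrative name.

```python
from difflib import SequenceMatcher


def difflib_similarity(sentence_a, sentence_b):
    """Ratio in [0, 1]: 2 * matched characters / total characters."""
    return SequenceMatcher(None, sentence_a, sentence_b).ratio()
```

For `"abcd"` and `"bcde"` the matching block `"bcd"` has 3 characters out of 8 in total, giving 2 * 3 / 8 = 0.75.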

Results on Test Set
Task                   Accuracy   F1 macro   Place
First Task Standard    0.5695     0.5437     4 out of 11
Second Task Standard   0.7153     0.7853     6 out of 10
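For reference, the two reported metrics can be computed as in the sketch below. This is the textbook definition of accuracy and macro-averaged F1; the workshop's official scorer may differ in details such as which label set it averages over.

```python
def accuracy(y_true, y_pred):
    """Fraction of pairs whose predicted class equals the gold class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_macro(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes that occur."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the class
        # has no true or predicted members.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```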

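What impact did each toolkit have?
[Bar chart: Accuracy and F1 macro for each toolkit: SEMILAR, DKPro Similarity, Swoogle, NLTK WordNet, Python difflib. Vertical axis from 66.00 to 82.00. Data labels in extraction order: 80.13, 79.52, 78.94, 78.76, 75.92 and 77.02, 75.78, 75.03, 75.02, 71.36; the assignment of labels to series and toolkits is not recoverable from the extracted text.]
5-fold cross validation results on the Training Set, Second Task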
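A 5-fold cross validation splits the training set into five parts and uses each part once as held-out data. The sketch below only produces the index splits; whether the original experiments shuffled or stratified the data is not stated, so plain contiguous folds are an assumption, and `k_fold_indices` is an illustrative name.

```python
def k_fold_indices(n_samples, k=5):
    """Return k (train_indices, test_indices) pairs over range(n_samples)."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    folds, start = [], 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds
```

The reported score is then the mean of the metric over the five held-out folds.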

Which Translation Engine is Better?
5-fold cross validation results on the Training Set Second Task
Symbols for Toolkit on X axis:
1: SEMILAR 2: DKPro Similarity 3: Python difflib 4: NLTK WordNet 5: Swoogle
6: All 5 Toolkits together

Conclusion
With this algorithm we can detect semantic
similarity not only for Russian, but for
any other language for which translation is
available via translation engines.
