Adversarial Training for
Unsupervised Bilingual Lexicon Induction
Meng Zhang, Yang Liu, Huanbo Luan, Maosong Sun
Tsinghua University, Beijing, China
Overview
• Task input: Separate monolingual embeddings trained on non-parallel data
• Task output: A bilingual lexicon
• Challenge: Zero supervision: Can we link separate monolingual embeddings
without any cross-lingual signal?
• Solution: Formulate as an adversarial game
• Outcome: Successful learning with proper model design and training techniques
Background
[Figure: Spanish words caballo (horse), cerdo (pig), gato (cat) and the English words horse, pig, cat occupy approximately isomorphic positions in their respective embedding spaces.]
Although monolingual word embeddings are trained separately on non-parallel data, the resulting spaces appear approximately isomorphic. A linear transformation can therefore be used to align the two embedding spaces, but previous works typically require seed word translation pairs to supervise its learning.
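As a concrete illustration of the seed-supervised setting described above (not the authors' code), the linear map can be fit by ordinary least squares on seed pairs; the toy embeddings and the hidden rotation below are synthetic stand-ins:

```python
import numpy as np

# Toy sketch: fit a linear map W from seed translation pairs so that W x_src ≈ x_tgt.
rng = np.random.default_rng(0)
d = 4
X_seed = rng.normal(size=(50, d))                        # source embeddings of seed words
R = np.linalg.qr(rng.normal(size=(d, d)))[0]             # a hidden orthogonal alignment
Y_seed = X_seed @ R.T + 0.01 * rng.normal(size=(50, d))  # their target-side translations

# Least squares: find B minimizing ||X_seed @ B - Y_seed||_F; then W = B.T maps x to Wx.
B, *_ = np.linalg.lstsq(X_seed, Y_seed, rcond=None)
W = B.T
residual = np.linalg.norm(X_seed @ W.T - Y_seed)
```

With enough clean seeds, W closely recovers the hidden alignment; the unsupervised approach in this work removes exactly this dependence on seeds.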
Models
[Figure: (a) generator G and discriminator D; (b) generator G, its transpose G⊤, and discriminators D1, D2; (c) generator G, its transpose G⊤, discriminator D, and reconstruction loss LR.]
(a) Model 1 (unidirectional transformation): The generator G is a linear transformation that tries to transform source word embeddings (squares) to make them look like target ones (dots), while the discriminator D tries to classify whether its input is generated by G or is a real sample from the target embedding distribution.
(b) Model 2 (bidirectional transformation): If G transforms the source word embedding space into the target language space, its transpose G⊤ should transform the target language space back to the source.
(c) Model 3 (adversarial autoencoder): After the generator G transforms a source word embedding x into a target language representation Gx, we should be able to reconstruct the source word embedding x by mapping back with G⊤.
• Models 2 and 3 can be seen as relaxations of an orthogonal constraint on G.
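The adversarial game of Model 1 can be sketched in a few lines of numpy. This is a deliberately simplified toy, not the paper's implementation: the discriminator here is a plain logistic regression (the paper uses a neural network), and the 2-D "embeddings" are synthetic, so the rotation is only partially recovered; the point is the alternating discriminator/generator updates.

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

# Synthetic "monolingual" embeddings: the target space is a rotated copy of the source.
d, theta = 2, 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
X = rng.normal(size=(256, d)) + np.array([2.0, 0.0])  # source embeddings
Y = X @ R.T                                           # target embeddings (unpaired)

G = np.eye(d)                          # generator: linear map, source -> target
w, b = 0.1 * rng.normal(size=d), 0.0   # toy discriminator: logistic regression
lr = 0.05

for step in range(500):
    xb = X[rng.integers(0, len(X), 64)]
    yb = Y[rng.integers(0, len(Y), 64)]
    fake = xb @ G.T
    # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
    pr = sigmoid(yb @ w + b)
    pf = sigmoid(fake @ w + b)
    grad_w = ((pr - 1) @ yb + pf @ fake) / 64
    grad_b = np.mean(pr - 1) + np.mean(pf)
    w -= lr * grad_w
    b -= lr * grad_b
    # Generator step: push D(G x) toward 1 by adjusting G.
    pf = sigmoid((xb @ G.T) @ w + b)
    grad_G = np.outer(w, (pf - 1) @ xb) / 64
    G -= lr * grad_G
```

The gradients follow from the standard GAN losses with a logistic discriminator; Models 2 and 3 add the reverse mapping G⊤ and the reconstruction term on top of this loop.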
Training Techniques
Regularizing the discriminator
• All forms of regularization help training.
• Multiplicative Gaussian noise injected into the input is the most effective. On top of that, hidden-layer noise helps slightly.
Model selection
• Sharp drops of the generator loss correspond to good models.
• The reconstruction loss LR and the value of ‖G⊤G − I‖F drop synchronously → good models are indeed close to orthogonality.
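The quantities monitored above are cheap to compute; a minimal sketch (the helper names are ours, not the paper's, and the noise scale is an assumed example value):

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_input(z, sigma=0.5, rng=rng):
    """Multiplicative Gaussian noise injected into the discriminator input."""
    return z * rng.normal(loc=1.0, scale=sigma, size=z.shape)

def orthogonality_gap(G):
    """Frobenius norm ||G^T G - I||_F, monitored for model selection."""
    return np.linalg.norm(G.T @ G - np.eye(G.shape[1]))

def reconstruction_loss(G, X):
    """Model 3's L_R: how well G^T (G x) recovers x, averaged over rows of X."""
    return np.mean(np.linalg.norm(X - (X @ G.T) @ G, axis=1) ** 2)

# For an exactly orthogonal G, both quantities vanish together,
# which is why their synchronous drop signals a good model.
Q = np.linalg.qr(rng.normal(size=(5, 5)))[0]
X = rng.normal(size=(100, 5))
```

A non-orthogonal G (e.g. a scaled matrix) makes both quantities strictly positive, so either can serve as the selection signal.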
Experiments
Chinese-English
method              # seeds   accuracy (%)
MonoGiza w/o emb.      0         0.05
MonoGiza w/ emb.       0         0.09
TM                    50         0.29
TM                   100        21.79
IA                    50        18.71
IA                   100        32.29
Model 1                0        39.25
Model 1 + ortho.       0        28.62
Model 2                0        40.28
Model 3                0        43.31

Table 2: Chinese-English top-1 accuracies of the MonoGiza baseline and our models, along with the translation matrix (TM) and isometric alignment (IA) methods that utilize 50 and 100 seeds.
4.1 Experiments on Chinese-English
Data
For this set of experiments, the data for training word embeddings comes from Wikipedia comparable corpora.5 Following (Vulić and Moens, 2013), we retain only nouns with at least 1,000 occurrences. For the Chinese side, we first use OpenCC6 to normalize characters to be simplified, and then perform Chinese word segmentation and POS tagging with THULAC.7 The preprocessing of the English side involves tokenization, POS tagging, lemmatization, and lowercasing, which we carry out with the NLTK toolkit.8 The statistics of the final training data are given in Table 1, along with the other experimental settings.
As the ground truth bilingual lexicon for evaluation, we use Chinese-English Translation Lexicon Version 3.0 (LDC2002L27).
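The top-1 accuracy protocol used throughout can be sketched as nearest-neighbor retrieval in the target space. The helper below and its toy lexicon are ours, with cosine similarity assumed as the retrieval metric:

```python
import numpy as np

def top1_accuracy(G, src_emb, tgt_emb, lexicon):
    """Fraction of source words whose nearest target neighbor (by cosine)
    is an acceptable translation. lexicon: {src_index: set of tgt_indices}."""
    proj = src_emb @ G.T
    proj = proj / np.linalg.norm(proj, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    nearest = np.argmax(proj @ tgt.T, axis=1)
    hits = sum(nearest[s] in refs for s, refs in lexicon.items())
    return hits / len(lexicon)

# Toy check: with the exact alignment map, every word retrieves itself.
rng = np.random.default_rng(0)
d, n = 4, 10
src = rng.normal(size=(n, d))
R = np.linalg.qr(rng.normal(size=(d, d)))[0]
tgt = src @ R.T
acc = top1_accuracy(R, src, tgt, {i: {i} for i in range(n)})  # -> 1.0
```

Allowing a set of acceptable targets per source word matches the use of a reference lexicon that may list several translations.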
Overall Performance
Table 2 lists the performance of the MonoGiza baseline and our four variants of adversarial training. MonoGiza obtains low performance, likely due to the harsh evaluation protocol (cf. Section 4.4). Providing it with syntactic information can help (Dou and Knight, 2013), but in a low-resource scenario with zero cross-lingual information, parsers are likely to be inaccurate or even unavailable.
5 http://linguatools.org/tools/corpora/wikipedia-comparable-corpora
6 https://github.com/BYVoid/OpenCC
7 http://thulac.thunlp.org
8 http://www.nltk.org
[Table 3: Top-5 English translations proposed by our approach for the Chinese word 城市 (chengshi): city, town, suburb, area, proximity.]
[Figure 4: Top-1 accuracies of our approach compared with isometric alignment (IA) and translation matrix (TM), with the number of seeds varied.]
The unidirectional transformation attains reasonable accuracy, but its training is rather sensitive to initialization. This training difficulty motivates the orthogonal constraint, yet imposing a hard orthogonal constraint hurts performance (cf. Table 2) and is about 20 times slower to train. The last two models, as relaxations of the orthogonal constraint, train stably, and the adversarial autoencoder attains the best performance. We therefore use it in the remaining experiments. Table 3 lists translation examples given by this approach.
Comparison With Seed-Based Methods
In this section, we compare with TM and IA, which require seed word translation pairs, unlike our approach. The seed translation pairs used by TM and IA are removed from the test set in this experiment. We vary the number of seeds for TM and IA.
Figure 4 shows the results for different numbers of seeds.
Comparison with seed-based methods
[Figure: Top-1 accuracy (%) of Ours, IA, and TM as the number of seeds grows from 0 to 1000.]
Spanish-English, Italian-English, Japanese-Chinese, Turkish-English
method                    # seeds   es-en   it-en   ja-zh   tr-en
MonoGiza w/o embeddings      0       0.35    0.30    0.04    0.00
MonoGiza w/ embeddings       0       1.19    0.27    0.23    0.09
TM                          50       1.24    0.76    0.35    0.09
TM                         100      48.61   37.95   26.67   11.15
IA                          50      39.89   27.03   19.04    7.58
IA                         100      60.44   46.52   36.35   17.11
Ours                         0      71.97   58.60   43.02   17.18

Table 4: Top-1 accuracies (%) of the MonoGiza baseline and our approach on Spanish-English, Italian-English, Japanese-Chinese, and Turkish-English. The results for translation matrix (TM) and isometric alignment (IA) using 50 and 100 seeds are also listed.
[Figure 5: Top-1 accuracies of our approach with respect to the input embedding dimension in {20, 50, 100, 200}.]
When the seeds are few, the seed-based methods exhibit clear performance degradation. In this case, we also observe the importance of the orthogonal constraint from the superiority of IA over TM, which supports our introduction of this constraint as we attempt zero supervision. Finally, in line with the finding in (Vulić and Korhonen, 2016), hundreds of seeds are needed for TM to generalize. Only then do seed-based methods catch up with our approach, and the performance difference is marginal even when more seeds are provided.
Effect of Embedding Dimension
4.2 Experiments on Other Language Pairs
Data
We also induce bilingual lexica from Wikipedia comparable corpora for the following language pairs: Spanish-English, Italian-English, Japanese-Chinese, and Turkish-English. For Spanish-English and Italian-English, we choose to use TreeTagger9 for preprocessing, as in (Vulić and Moens, 2013). For the Japanese corpus, we use MeCab10 for word segmentation and POS tagging. For Turkish, we utilize the preprocessing tools (tokenization and POS tagging) provided in the LORELEI Language Packs (Strassel and Tracey, 2016), and its English side is preprocessed with NLTK. Unlike the other language pairs, the frequency cutoff threshold for Turkish-English is 100, as the amount of data is relatively small.
The ground truth bilingual lexica for Spanish-English and Italian-English are obtained from Open Multilingual WordNet11 through NLTK. For Japanese-Chinese, we use an in-house lexicon. For Turkish-English, we build a set of ground truth translation pairs in the same way as we obtain seed word translation pairs from Google Translate, described above.
Results
Large-scale settings
method   # seeds   Wikipedia   Gigaword
TM          50        0.00       0.01
TM         100        4.79       2.07
IA          50        3.25       1.68
IA         100        7.08       4.18
Ours         0        7.92       2.53

Table 5: Top-1 accuracies (%) of our approach to inducing bilingual lexica for Chinese-English from Wikipedia and Gigaword. Also listed are results for translation matrix (TM) and isometric alignment (IA) using 50 and 100 seeds.
The performance on Japanese-Chinese is lower, on a comparable level with Chinese-English (cf. Table 2), and these languages are relatively distantly related. Turkish-English represents a low-resource scenario, and therefore the lexical semantic structure may be insufficiently captured by the embeddings. The agglutinative nature of Turkish can also add to the challenge.
4.3 Large-Scale Settings
We experiment with large-scale Chinese-English data from two sources: the whole Wikipedia dump and Gigaword (LDC2011T13 and LDC2011T07). We also simplify preprocessing by removing the noun restriction and the lemmatization step (cf. preprocessing decisions for the above experiments).
[Table 6: Top-5 accuracies of MonoGiza w/o embeddings, MonoGiza w/ embeddings, (Cao et al., 2016), and Ours on the most frequent words; the figures for the baselines are taken from (Cao et al., 2016).]
The configuration used is the adversarial autoencoder, and we train for 500k mini-batches.
4.4 Comparison With (Cao et al., 2016)
In order to compare with Cao et al. (2016), which uses a different signal to connect the languages, we replicate their French-English setting with our approach.12 There are important differences in the evaluation protocol: top-5 accuracy serves as the evaluation metric, and the ground truth bilingual lexicon is automatically constructed by alignment on a parallel corpus. Such an automatically constructed lexicon is easier than ones built from manual translation pairs; it often lists multiple translations for a word. This lenient …
Conclusion
• Feasible to connect the word embeddings of different languages without any
cross-lingual signal
• Comparable performance with methods that require seeds to train
• Code available at http://thunlp.org/~zm/UBiLexAT/

Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
 
Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015
 
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
 
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
 
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
 
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
 
Hyoung-Gyu Lee - 2015 - NAVER Machine Translation System for WAT 2015
Hyoung-Gyu Lee - 2015 - NAVER Machine Translation System for WAT 2015Hyoung-Gyu Lee - 2015 - NAVER Machine Translation System for WAT 2015
Hyoung-Gyu Lee - 2015 - NAVER Machine Translation System for WAT 2015
 
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopSatoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
 
Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015
 
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
 

Kürzlich hochgeladen

TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...Nguyen Thanh Tu Collection
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.MaryamAhmad92
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...Poonam Aher Patil
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentationcamerronhm
 
ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701bronxfugly43
 
Dyslexia AI Workshop for Slideshare.pptx
Dyslexia AI Workshop for Slideshare.pptxDyslexia AI Workshop for Slideshare.pptx
Dyslexia AI Workshop for Slideshare.pptxcallscotland1987
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfSherif Taha
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfPoh-Sun Goh
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and ModificationsMJDuyan
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfNirmal Dwivedi
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxVishalSingh1417
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxVishalSingh1417
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxDr. Sarita Anand
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.christianmathematics
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.pptRamjanShidvankar
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptxMaritesTamaniVerdade
 

Kürzlich hochgeladen (20)

TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 
ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701
 
Dyslexia AI Workshop for Slideshare.pptx
Dyslexia AI Workshop for Slideshare.pptxDyslexia AI Workshop for Slideshare.pptx
Dyslexia AI Workshop for Slideshare.pptx
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptx
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
Spatium Project Simulation student brief
Spatium Project Simulation student briefSpatium Project Simulation student brief
Spatium Project Simulation student brief
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 

Meng Zhang - 2017 - Adversarial Training for Unsupervised Bilingual Lexicon Induction

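The adversarial game of Model 1 (panel (a) in the Models section) can be sketched in a few dozen lines of numpy. This is an illustrative toy, not the paper's implementation: the discriminator here is plain logistic regression rather than the feed-forward network the authors use, the data, dimensions, and learning rate are invented, and no claim is made that this toy converges to the true mapping.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

rng = np.random.default_rng(0)
d, n, batch, lr = 8, 512, 32, 0.1

# Toy "monolingual" embeddings: the target space is an unknown rotation of
# an anisotropic source space (the approximate-isomorphism assumption).
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))    # hidden ground-truth mapping
scales = np.linspace(0.5, 2.0, d)
src = rng.normal(size=(n, d)) * scales
tgt = (rng.normal(size=(n, d)) * scales) @ Q.T  # separate sample, then rotated

G = np.eye(d) + 0.01 * rng.normal(size=(d, d))  # generator: a linear map
w, b = 0.01 * rng.normal(size=d), 0.0           # discriminator: logistic regression

for step in range(500):
    xs = src[rng.integers(0, n, batch)]         # source batch -> fakes via G
    xt = tgt[rng.integers(0, n, batch)]         # target batch -> reals
    fake = xs @ G.T

    # Discriminator step: minimize -log D(real) - log(1 - D(fake)).
    p_real, p_fake = sigmoid(xt @ w + b), sigmoid(fake @ w + b)
    w -= lr * (-((1 - p_real) @ xt - p_fake @ fake) / batch)
    b -= lr * (-np.mean(1 - p_real) + np.mean(p_fake))

    # Generator step: minimize -log D(fake), i.e. try to fool the discriminator.
    p_fake = sigmoid(xs @ G.T @ w + b)
    G -= lr * (-np.outer(w, (1 - p_fake) @ xs) / batch)
```

Models 2 and 3 add a backward mapping with Gᵀ on top of this game; with a stronger discriminator and the training techniques described next, such a game can recover the alignment.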
Training Techniques

Regularizing the discriminator
• All forms of regularization help training.
• Multiplicative Gaussian noise injected into the input is the most effective. On top of that, hidden layer noise helps slightly.

Model selection
• Sharp drops of the generator loss correspond to good models.
• The reconstruction loss L_R and the value of ‖GᵀG − I‖_F drop synchronously → good models are indeed close to orthogonality.

Experiments

4.1 Experiments on Chinese-English Data

For this set of experiments, the data for training word embeddings comes from Wikipedia comparable corpora [5]. Following (Vulić and Moens, 2013), we retain only nouns with at least 1,000 occurrences. For the Chinese side, we first use OpenCC [6] to normalize characters to be simplified, and then perform Chinese word segmentation and POS tagging with THULAC [7]. The preprocessing of the English side involves tokenization, POS tagging, lemmatization, and lowercasing, which we carry out with the NLTK toolkit [8]. The statistics of the final training data are given in Table 1, along with the other experimental settings.

As the ground truth bilingual lexicon for evaluation, we use Chinese-English Translation Lexicon Version 3.0 (LDC2002L27).

Overall Performance

Table 2: Chinese-English top-1 accuracies of the MonoGiza baseline and our models, along with the translation matrix (TM) and isometric alignment (IA) methods that utilize 50 and 100 seeds.

method             # seeds   accuracy (%)
MonoGiza w/o emb.     0          0.05
MonoGiza w/ emb.      0          0.09
TM                   50          0.29
TM                  100         21.79
IA                   50         18.71
IA                  100         32.29
Model 1               0         39.25
Model 1 + ortho.      0         28.62
Model 2               0         40.28
Model 3               0         43.31

Table 2 lists the performance of the MonoGiza baseline and our four variants of adversarial training. MonoGiza obtains low performance, likely due to the harsh evaluation protocol (cf. Section 4.4).
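These techniques can be made concrete with a small numpy sketch (the noise level sigma and the shapes are arbitrary illustrative choices): multiplicative Gaussian noise on the discriminator's input, and the two model-selection diagnostics. Both diagnostics vanish exactly when G is orthogonal, which is why their synchronous drop signals a good model.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 50, 1000
x = rng.normal(size=(n, d))              # a batch of word embeddings (rows)

def multiplicative_noise(x, sigma=0.5):
    """Multiplicative Gaussian noise injected into the discriminator input."""
    return x * (1.0 + sigma * rng.normal(size=x.shape))

def ortho_gap(G):
    """Deviation from orthogonality: ||G^T G - I||_F."""
    return np.linalg.norm(G.T @ G - np.eye(G.shape[0]), ord="fro")

def reconstruction_loss(G, x):
    """Model 3's reconstruction loss L_R: mean squared error of x -> Gx -> G^T G x."""
    recon = x @ G.T @ G                  # rows are G^T (G x) for each embedding x
    return np.mean(np.sum((recon - x) ** 2, axis=1))

Q, _ = np.linalg.qr(rng.normal(size=(d, d)))  # an exactly orthogonal map
R = rng.normal(size=(d, d))                   # an unconstrained linear map
# For Q, both ortho_gap and reconstruction_loss are (numerically) zero;
# for R, both are large -- the two quantities move together.
```

Monitoring either quantity during training therefore serves as an unsupervised model-selection criterion.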
Providing it with syntactic information can help (Dou and Knight, 2013), but in a low-resource scenario with zero cross-lingual information, parsers are likely to be inaccurate or even unavailable.

[5] http://linguatools.org/tools/corpora/wikipedia-comparable-corpora
[6] https://github.com/BYVoid/OpenCC
[7] http://thulac.thunlp.org
[8] http://www.nltk.org

[Table 3: Top-5 English translations proposed by our approach; the legible row gives, for the Chinese word 城市 (chengshi): city, town, suburb, area, proximity. The caption and remaining rows are truncated in the source.]

[Figure 4: Top-1 accuracies of translation matrix (TM) and isometric alignment (IA) as the number of seeds grows, compared with our seedless approach; the accompanying discussion is truncated in the source. The legible fragments state that the unidirectional transformation attains reasonable accuracy but is rather sensitive to initialization, that training with a hard orthogonal constraint hurts performance and is about 20 times slower, and that the last two models, as relaxations of the orthogonal constraint, perform better, with the adversarial autoencoder giving the best performance and therefore used in the remaining experiments. Table 3 lists translation examples given by this approach.]

Comparison with seed-based methods

[The lead-in to this comparison is truncated in the source; the legible fragments indicate that TM and IA are provided with seed translation pairs, which are removed from the test set for this experiment, unlike our seedless approach. Figure 4 plots top-1 accuracy (%) against the number of seeds (0-1000) for Ours, IA, and TM.]

Table 4: Top-1 accuracies (%) of the MonoGiza baseline and our approach on Spanish-English, Italian-English, Japanese-Chinese, and Turkish-English. The results for translation matrix (TM) and isometric alignment (IA) using 50 and 100 seeds are also listed.

method                    # seeds   es-en   it-en   ja-zh   tr-en
MonoGiza w/o embeddings      0       0.35    0.30    0.04    0.00
MonoGiza w/ embeddings       0       1.19    0.27    0.23    0.09
TM                          50       1.24    0.76    0.35    0.09
TM                         100      48.61   37.95   26.67   11.15
IA                          50      39.89   27.03   19.04    7.58
IA                         100      60.44   46.52   36.35   17.11
Ours                         0      71.97   58.60   43.02   17.18

[Figure 5: Top-1 accuracies of our approach with respect to the input embedding dimensions in {20, 50, 100, 200}.]

When the seeds are few, the seed-based methods exhibit clear performance degradation.
In this case, we also observe the importance of the orthogonal constraint from the superiority of IA over TM, which supports our introduction of this constraint as we attempt zero supervision. Finally, in line with the finding of (Vulić and Korhonen, 2016), hundreds of seeds are needed for TM to generalize. Only then do the seed-based methods catch up with our approach, and the performance difference is marginal even when more seeds are provided.

Effect of Embedding Dimension

[Figure 5 plots top-1 accuracy against the input embedding dimension; the discussion of this experiment is truncated in the source.]

4.2 Experiments on Other Language Pairs

We also induce bilingual lexica from Wikipedia comparable corpora for the following language pairs: Spanish-English, Italian-English, Japanese-Chinese, and Turkish-English. For Spanish-English and Italian-English, we choose to use TreeTagger [9] for preprocessing, as in (Vulić and Moens, 2013). For the Japanese corpus, we use MeCab [10] for word segmentation and POS tagging. For Turkish, we utilize the preprocessing tools (tokenization and POS tagging) provided in LORELEI Language Packs (Strassel and Tracey, 2016), and its English side is preprocessed with NLTK. Unlike the other language pairs, the frequency cutoff threshold for Turkish-English is 100, as the amount of data is relatively small.

The ground truth bilingual lexica for Spanish-English and Italian-English are obtained from Open Multilingual WordNet [11] through NLTK. For Japanese-Chinese, we use an in-house lexicon. For Turkish-English, we build a set of ground truth translation pairs in the same way as we obtain seed word translation pairs from Google Translate, described above.

Results

[The results are given in Table 4; the surrounding discussion is partly truncated in the source.]

Large-scale settings

Table 5: Top-1 accuracies (%) of our approach to inducing bilingual lexica for Chinese-English from Wikipedia and Gigaword. Also listed are results for translation matrix (TM) and isometric alignment (IA) using 50 and 100 seeds.

method   # seeds   Wikipedia   Gigaword
TM          50        0.00       0.01
TM         100        4.79       2.07
IA          50        3.25       1.68
IA         100        7.08       4.18
Ours         0        7.92       2.53

The performance on Japanese-Chinese is lower, on a comparable level with Chinese-English (cf.
Table 2), and these languages are relatively distantly related. Turkish-English represents a low-resource scenario, and therefore the lexical semantic structure may be insufficiently captured by the embeddings. The agglutinative nature of Turkish can also add to the challenge.

4.3 Large-Scale Settings

We experiment with large-scale Chinese-English data from two sources: the whole Wikipedia dump and Gigaword (LDC2011T13 and LDC2011T07). We also simplify preprocessing by removing the noun restriction and the lemmatization step (cf. the preprocessing decisions for the experiments above).

[Table 6: Top-5 accuracies on frequent words for MonoGiza with and without embeddings, (Cao et al., 2016), and our approach; the table body and part of the caption are truncated in the source. The legible fragments state that the configuration used the adversarial autoencoder trained for 500k mini-batches.]

4.4 Comparison with (Cao et al., 2016)

[This section is largely truncated in the source. The legible fragments indicate that the authors replicate the French-English setting of Cao et al. (2016) for comparison, and that there are important differences in the evaluation protocol: top-5 accuracy is the evaluation metric, and the bilingual lexicon is automatically constructed by alignment on a parallel corpus, which is more lenient as it often lists several translations per word.]

Conclusion

• Feasible to connect the word embeddings of different languages without any cross-lingual signal
• Comparable performance with methods that require seeds to train
• Code available at http://thunlp.org/~zm/UBiLexAT/
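The top-1 accuracy reported throughout can be made concrete: a lexicon is induced by mapping each source embedding with the learned G and retrieving the nearest target embedding by cosine similarity. The sketch below substitutes a synthetic ground-truth rotation for a learned G, purely to illustrate the retrieval and scoring step; the vocabulary sizes and distractor count are invented.

```python
import numpy as np

rng = np.random.default_rng(2)
d, n_src = 50, 200

# Stand-in for a learned generator: an exact orthogonal map, so each
# source word's true translation is retrievable by construction.
G, _ = np.linalg.qr(rng.normal(size=(d, d)))
src_emb = rng.normal(size=(n_src, d))
# Target vocabulary: the true translations plus 100 distractor words.
tgt_emb = np.vstack([src_emb @ G.T, rng.normal(size=(100, d))])

def induce_lexicon(src_emb, tgt_emb, G):
    """For each source word, return the index of the nearest target word
    (by cosine similarity) after mapping its embedding with G."""
    mapped = src_emb @ G.T
    mapped = mapped / np.linalg.norm(mapped, axis=1, keepdims=True)
    t = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    return np.argmax(mapped @ t.T, axis=1)

pred = induce_lexicon(src_emb, tgt_emb, G)
top1 = np.mean(pred == np.arange(n_src))  # gold translation of word i is target i
```

With a perfect mapping this toy scores 1.0; with a learned G on real embeddings, the same scoring yields the accuracies in Tables 2, 4, and 5.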