Aaron L.-F. Han, Derek F. Wong, Lidia S. Chao, and Liangye He
Open source code: https://github.com/aaronlifenghan/aaron-project-hlepor
May 16th, 2012
Natural Language Processing & Portuguese-Chinese Machine Translation Laboratory
Department of Computer and Information Science
University of Macau
TSD 2013, LNAI Vol. 8082, pp. 121-128. Springer Verlag Berlin Heidelberg 2013
 Introduction and related work in MT evaluation
 Problems and design ideas for MT evaluation
 Employed linguistic features
 Designed measuring formulas
 Evaluation of the evaluation metric
 Experiments on WMT corpora
 Conclusion
 References
• Machine translation (MT) research began as early as the 1950s (Weaver,
1955)
• major progress since the 1990s due to the development of computers
(storage capacity and computational power) and enlarged bilingual
corpora (Marino et al. 2006)
• Some recent work in MT:
• (Och 2003) presented MERT (Minimum Error Rate Training) for log-linear
SMT
• (Su et al. 2009) used a Thematic Role Templates model to improve
translation
• (Xiong et al. 2011) employed a maximum-entropy segmentation model, etc.
• Rule-based and data-driven methods, including example-based MT
(Carl and Way 2003) and statistical MT (Koehn 2010), became the main
approaches in the MT literature.
• Due to the widespread development of MT systems, MT evaluation
becomes more and more important to tell us how well MT systems
perform and whether they are making progress.
• However, MT evaluation is difficult:
• language variability means there is no single correct translation
• natural languages are highly ambiguous, and different languages do
not always express the same content in the same way (Arnold 2003)
• Human evaluation criteria:
• intelligibility (measuring how understandable the sentence is)
• fidelity (measuring how much information the translated sentence retains
compared to the original), used by the Automatic Language Processing
Advisory Committee (ALPAC) around 1966 (Carroll 1966)
• adequacy (similar to fidelity), fluency (whether the sentence is well-
formed and fluent) and comprehension (improved intelligibility), used by
the US Defense Advanced Research Projects Agency (DARPA) (White et al.
1994)
• Problems with manual evaluation:
• time-consuming and thus too expensive to perform frequently.
• Automatic evaluation metrics:
• word error rate WER (Su et al. 1992): edit distance between the system
output and the closest reference translation
• position-independent word error rate PER (Tillmann et al. 1997): a variant
of WER that disregards word ordering
• BLEU (Papineni et al. 2002): the geometric mean of n-gram precisions of
the system output with respect to reference translations
• NIST (Doddington 2002): adds information weights
• GTM (Turian et al. 2003)
• Recently, many other methods:
• The METEOR metric (Banerjee and Lavie 2005) conducts flexible matching,
considering stems, synonyms and paraphrases.
• Its matching process involves computationally expensive word
alignment, and it has parameters, such as the relative weight of recall
to precision and the weights for stemming or synonymy, that must be tuned.
Meteor-1.3 (Denkowski and Lavie 2011), a modified version of Meteor,
includes ranking and adequacy versions and has overcome some
weaknesses of the previous version, such as noise in the paraphrase matching,
lack of punctuation handling, and lack of discrimination between word types.
• Snover et al. (2006) noted that one disadvantage of the
Levenshtein distance is that mismatches in word order require the
deletion and re-insertion of the misplaced words.
• They proposed TER by adding an editing step that allows the movement of
word sequences from one part of the output to another. This is something
a human post-editor would do with the cut-and-paste function of a word
processor.
• However, finding the shortest sequence of editing steps is a
computationally hard problem.
• AMBER (Chen and Kuhn 2011), including the AMBER-TI and AMBER-NL variants,
is a modified version of BLEU that attaches more kinds of penalty
coefficients, combining n-gram precision and recall with the arithmetic
average of the F-measure.
• Before evaluation, it provides eight kinds of preparation of the
corpus: whether or not the words are tokenized, extraction of stems,
prefixes and suffixes from the words, and splitting of the words into
several parts with different ratios.
• F15 (Bicici and Yuret 2011) and F15G3 perform evaluation with the F1
measure (assigning the same weight to precision and recall) over target
features as a metric for evaluating translation quality.
• The target features they defined include the TP (true positive), TN
(true negative), FP (false positive), and FN (false negative) rates, etc.
To consider the surrounding phrase for a missing token in the translation,
they employed the gapped word sequence kernel approach (Taylor and
Cristianini 2004) to evaluate translations.
• Other related work:
• (Wong and Kit 2008), (Isozaki et al. 2010) and (Talbot et al. 2011) on
word order
• ROSE (Song and Cohn 2011), MPF and WMPF (Popovic 2011) on the use of
POS information
• MP4IBM1 (Popovic et al. 2011), which does not rely on reference
translations, etc.
• The previously proposed evaluation methods suffer, to varying degrees,
from several main weaknesses:
• they perform well on certain language pairs but weakly on others, which
we call the language-bias problem;
• they consider no linguistic information (leading to low correlation with
human judgments) or too many linguistic features (making them difficult
to replicate), which we call the extremism problem;
• they rely on incomprehensive factors (e.g. BLEU focuses on precision only).
• What to do?
• This paper: address some of the above problems
• How?
• Enhanced factors
• Tunable parameters
• A principled, mathematically grounded combination of factors
• Concise linguistic features
• To address the variability phenomenon, researchers have employed
synonyms, paraphrasing or textual entailment as auxiliary information.
All of these approaches have their advantages and weaknesses, e.g.
• synonym dictionaries have difficulty covering all acceptable expressions.
• Instead, the designed metric performs the measuring on part-of-speech
(POS) information (also applied by ROSE (Song and Cohn 2011), and by
MPF and WMPF (Popovic 2011)).
• If a system output is a good translation, then the output sentence
likely carries semantic information similar to the reference (the two
sentences may not contain exactly the same words, but words with
similar semantic meaning).
• For example, "there is a big bag" and "there is a large bag" could be the
same expression, since "big" and "large" have similar meanings (both with
POS adjective).
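This POS-level matching can be illustrated with a toy lookup table; the tag dictionary below is hand-made for this single example (in practice a real POS tagger supplies the tags):

```python
# Toy illustration: "big" and "large" differ as surface words but match
# at the POS level. The tag dictionary is hand-made for this example;
# a real POS tagger would supply the tags in practice.
TAGS = {"there": "EX", "is": "VBZ", "a": "DT",
        "big": "JJ", "large": "JJ", "bag": "NN"}

def pos_sequence(sentence):
    """Map each token of a whitespace-tokenized sentence to its POS tag."""
    return [TAGS[w] for w in sentence.split()]

hyp = pos_sequence("there is a big bag")
ref = pos_sequence("there is a large bag")
# The word sequences mismatch at position 3, but the POS sequences
# are identical, so the two sentences align at the POS level.
```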
How to measure?
• Harmonic mean and weighted harmonic mean:

$\mathrm{Harmonic}(X_1, X_2, \dots, X_n) = \dfrac{n}{\sum_{i=1}^{n} \frac{1}{X_i}}$  (1)

$\mathrm{Harmonic}(w_{X_1}X_1, \dots, w_{X_n}X_n) = \dfrac{\sum_{i=1}^{n} w_{X_i}}{\sum_{i=1}^{n} \frac{w_{X_i}}{X_i}}$  (2)
• $hLEPOR = \mathrm{Harmonic}(w_{LP}\,LP,\; w_{NPosPenal}\,NPosPenal,\; w_{HPR}\,HPR)$

$= \dfrac{\sum_{i=1}^{n} w_i}{\sum_{i=1}^{n} \frac{w_i}{Factor_i}} = \dfrac{w_{LP} + w_{NPosPenal} + w_{HPR}}{\frac{w_{LP}}{LP} + \frac{w_{NPosPenal}}{NPosPenal} + \frac{w_{HPR}}{HPR}}$  (3)
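The weighted harmonic mean of Eq. (3) is straightforward to compute. A minimal Python sketch (the weights default to 1.0 here purely for illustration; the tuned values from the paper are not reproduced):

```python
def weighted_harmonic(weights, factors):
    """Weighted harmonic mean: sum(w_i) / sum(w_i / X_i), per Eq. (2)."""
    return sum(weights) / sum(w / x for w, x in zip(weights, factors))

def hlepor(lp, npos_penal, hpr, w_lp=1.0, w_npp=1.0, w_hpr=1.0):
    # Eq. (3): harmonic combination of length penalty, n-gram position
    # penalty, and harmonic precision-recall.
    return weighted_harmonic([w_lp, w_npp, w_hpr], [lp, npos_penal, hpr])

score = hlepor(lp=0.9, npos_penal=0.8, hpr=0.7)
```

Because the harmonic mean is dominated by its smallest argument, a low value in any one factor pulls the overall score down, which is the intended behavior of the combination.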
• $LP = \begin{cases} \exp\left(1 - \frac{r}{c}\right) & : c < r \\ 1 & : c = r \\ \exp\left(1 - \frac{c}{r}\right) & : c > r \end{cases}$  (4)
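Eq. (4) can be implemented directly, with c and r denoting the candidate (system output) and reference lengths; the penalty is 1 when the lengths match and decays exponentially as they diverge:

```python
import math

def length_penalty(c, r):
    """Eq. (4): penalize deviation of candidate length c from reference length r."""
    if c < r:
        return math.exp(1 - r / c)
    if c == r:
        return 1.0
    return math.exp(1 - c / r)
```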
• $NPosPenal = \exp(-NPD)$  (5)

• $NPD = \dfrac{1}{Length_{output}} \sum_{i=1}^{Length_{output}} |PD_i|$  (6)
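Eqs. (5) and (6) can be sketched as follows, assuming the n-gram POS alignment (Fig. 1) has already produced the matched position pairs. Defining PD_i as the difference between a token's relative position in the output and in the reference follows the LEPOR family of metrics; for simplicity, this sketch sums only over matched tokens:

```python
import math

def npd(match_positions, len_out, len_ref):
    """Eq. (6): mean absolute position difference, normalized by lengths.

    match_positions: (pos_in_output, pos_in_reference) index pairs, as
    produced by the n-gram POS alignment; unmatched tokens are ignored here.
    """
    total = sum(abs(i / len_out - j / len_ref) for i, j in match_positions)
    return total / len_out

def npos_penal(match_positions, len_out, len_ref):
    # Eq. (5): a perfectly monotone alignment gives NPD = 0 and penalty 1.
    return math.exp(-npd(match_positions, len_out, len_ref))
```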
Fig. 1. N-gram POS alignment algorithm
Fig. 2. Example of n-gram POS alignment
Fig. 3. Example of NPD calculation
• $\mathrm{Harmonic}(\alpha R, \beta P) = \dfrac{\alpha + \beta}{\frac{\alpha}{R} + \frac{\beta}{P}}$  (7)

• $P = \dfrac{Aligned_{num}}{Length_{system}}$  (8)

• $R = \dfrac{Aligned_{num}}{Length_{reference}}$  (9)
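Eqs. (7) through (9) combine precision and recall over the aligned matches. A sketch with illustrative defaults of α = β = 1 (with which Eq. (7) reduces to the ordinary F1 measure; the paper's tuned values are not reproduced):

```python
def harmonic_pr(p, r, alpha=1.0, beta=1.0):
    """Eq. (7): weighted harmonic mean of recall R and precision P."""
    return (alpha + beta) / (alpha / r + beta / p)

def hpr(aligned_num, system_len, reference_len, alpha=1.0, beta=1.0):
    p = aligned_num / system_len      # Eq. (8): precision
    r = aligned_num / reference_len   # Eq. (9): recall
    return harmonic_pr(p, r, alpha, beta)
```

Raising α relative to β weights recall more heavily, which is the usual knob for adapting the metric to a target language pair.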
• $hLEPOR_A = \dfrac{1}{SentNum} \sum_{i=1}^{SentNum} hLEPOR_i$  (10)

• $hLEPOR_B = \mathrm{Harmonic}(w_{LP}\,LP,\; w_{NPosPenal}\,NPosPenal,\; w_{HPR}\,HPR)$  (11)

• Two system-level variants: hLEPOR_A averages the sentence-level scores,
while hLEPOR_B combines the factor values computed at the system level.
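The two system-level variants can be sketched side by side; Eq. (10) averages per-sentence scores, while Eq. (11) applies the Eq. (3) combination once to system-level factor values (weights here are illustrative defaults, not the tuned ones):

```python
def weighted_harmonic(weights, factors):
    """Weighted harmonic mean, as in Eq. (2)."""
    return sum(weights) / sum(w / x for w, x in zip(weights, factors))

def hlepor_system_a(sentence_scores):
    """Eq. (10): arithmetic mean of sentence-level hLEPOR scores."""
    return sum(sentence_scores) / len(sentence_scores)

def hlepor_system_b(lp, npos_penal, hpr, weights=(1.0, 1.0, 1.0)):
    """Eq. (11): combine system-level factor values directly."""
    return weighted_harmonic(list(weights), [lp, npos_penal, hpr])
```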
How to evaluate the effectiveness of the
algorithms?
• Spearman rank correlation coefficient:

$\rho_{XY} = 1 - \dfrac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$  (12)

• $X = \{x_1, \dots, x_n\}$, $Y = \{y_1, \dots, y_n\}$, where $d_i$ is the
difference between the ranks of the $i$-th pair
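Eq. (12) compares the metric's ranking of systems with the human ranking. A minimal implementation for the tie-free case (ties would require averaged ranks, omitted here):

```python
def spearman(x, y):
    """Eq. (12): Spearman rank correlation, assuming no ties in x or y."""
    n = len(x)
    rank_x = {v: i + 1 for i, v in enumerate(sorted(x))}
    rank_y = {v: i + 1 for i, v in enumerate(sorted(y))}
    d2 = sum((rank_x[a] - rank_y[b]) ** 2 for a, b in zip(x, y))
    return 1 - 6 * d2 / (n * (n * n - 1))
```

A coefficient of 1 means the metric ranks the MT systems exactly as the human judges do; -1 means it ranks them in exactly the opposite order.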
Experiments on authoritative corpora:
the International Workshop on Statistical Machine Translation (WMT)
• Parameters tuned on WMT08 data and tested on WMT11:
• an evaluation metric based on the mathematically weighted harmonic mean
• tunable weights
• enhanced factors
• a concise linguistic feature: the POS of each word
• Better performance than similar POS-based metrics such as ROSE,
MPF and WMPF.
• Performance can be further enhanced by adding POS tagging tools and
adjusting the parameter values.
• BLEU uses n-grams; other researchers count POS tags, e.g.
(Avramidis et al. 2011); we combine the n-gram and POS information
together.
• Evaluation methods that need no reference perform poorly, e.g. the
MP4IBM1 metric ranked near the bottom in the experiments.
• More language pairs will be tested
• The combination of word and POS information will be explored
• Parameter tuning will be automated
• Evaluation without gold references will be developed
• 1. Weaver, Warren: Translation. In William Locke and A. Donald Booth, editors, Machine Translation of Languages: Fourteen Essays. John Wiley and Sons, New York, pages 15-23 (1955)
• 2. Marino, B. Jose, Rafael E. Banchs, Josep M. Crego, Adria de Gispert, Patrik Lambert, Jose A. Fonollosa, Marta R. Costa-jussa: N-gram based machine translation. Computational Linguistics, Vol. 32, No. 4, pp. 527-549, MIT Press (2006)
• 3. Och, F. J.: Minimum Error Rate Training for Statistical Machine Translation. In Proceedings of ACL 2003, pp. 160-167 (2003)
• 4. Su, Hung-Yu and Chung-Hsien Wu: Improving Structural Statistical Machine Translation for Sign Language With Small Corpus Using Thematic Role Templates as Translation Memory. IEEE Transactions on Audio, Speech, and Language Processing, Vol. 17, No. 7, September (2009)
• 5. Xiong, D., M. Zhang, H. Li: A Maximum-Entropy Segmentation Model for Statistical Machine Translation. IEEE Transactions on Audio, Speech, and Language Processing, Vol. 19, No. 8, pp. 2494-2505 (2011)
• 6. Carl, M. and A. Way (eds.): Recent Advances in Example-Based Machine Translation. Kluwer Academic Publishers, Dordrecht, The Netherlands (2003)
• 7. Koehn, P.: Statistical Machine Translation. Cambridge University Press (2010)
• 8. Arnold, D.: Why translation is difficult for computers. In Computers and Translation: A translator's guide. Benjamins Translation Library (2003)
• 9. Carroll, J. B.: An experiment in evaluating the quality of translation. In Pierce, J. (Chair), Languages and Machines: Computers in Translation and Linguistics. A report by the Automatic Language Processing Advisory Committee (ALPAC), Publication 1416, Division of Behavioral Sciences, National Academy of Sciences, National Research Council, pages 67-75 (1966)
• 10. White, J. S., O'Connell, T. A., and O'Mara, F. E.: The ARPA MT evaluation methodologies: Evolution, lessons, and future approaches. In Proceedings of AMTA 1994, pages 193-205 (1994)
• 11. Su, Keh-Yih, Wu Ming-Wen and Chang Jing-Shin: A New Quantitative Quality Measure for Machine Translation Systems. In Proceedings of the 14th International Conference on Computational Linguistics, pages 433-439, Nantes, France, July (1992)
• 12. Tillmann, C., Stephan Vogel, Hermann Ney, Arkaitz Zubiaga, and Hassan Sawaf: Accelerated DP Based Search for Statistical Translation. In Proceedings of the 5th European Conference on Speech Communication and Technology (EUROSPEECH 97) (1997)
• 13. Papineni, K., Roukos, S., Ward, T. and Zhu, W. J.: BLEU: a method for automatic evaluation of machine translation. In Proceedings of ACL 2002, pages 311-318, Philadelphia, PA, USA (2002)
• 14. Doddington, G.: Automatic evaluation of machine translation quality using n-gram co-occurrence statistics. In Proceedings of the Second International Conference on Human Language Technology Research (HLT 2002), pages 138-145, San Diego, California, USA (2002)
• 15. Turian, J. P., Shen, L. and Melamed, I. D.: Evaluation of machine translation and its evaluation. In Proceedings of MT Summit IX, pages 386-393, New Orleans, LA, USA (2003)
• 16. Banerjee, S. and Lavie, A.: METEOR: an automatic metric for MT evaluation with high levels of correlation with human judgments. In Proceedings of ACL-WMT, pages 65-72, Prague, Czech Republic (2005)
• 17. Denkowski, M. and Lavie, A.: Meteor 1.3: Automatic metric for reliable optimization and evaluation of machine translation systems. In Proceedings of ACL-WMT, pages 85-91, Edinburgh, Scotland, UK (2011)
• 18. Snover, M., Dorr, B., Schwartz, R., Micciulla, L. and Makhoul, J.: A study of translation edit rate with targeted human annotation. In Proceedings of AMTA, pages 223-231, Boston, USA (2006)
• 19. Chen, B. and Kuhn, R.: AMBER: A modified BLEU, enhanced ranking metric. In Proceedings of ACL-WMT, pages 71-77, Edinburgh, Scotland, UK (2011)
• 20. Bicici, E. and Yuret, D.: RegMT system for machine translation, system combination, and evaluation. In Proceedings of ACL-WMT, pages 323-329, Edinburgh, Scotland, UK (2011)
• 21. Shawe-Taylor, J. and N. Cristianini: Kernel Methods for Pattern Analysis. Cambridge University Press (2004)
• 22. Wong, B. T-M and Kit, C.: Word choice and word position for automatic MT evaluation. In Workshop: MetricsMATR of AMTA, short paper, 3 pages, Waikiki, Hawaii, USA (2008)
• 23. Isozaki, H., Hirao, T., Duh, K., Sudoh, K., and Tsukada, H.: Automatic evaluation of translation quality for distant language pairs. In Proceedings of EMNLP 2010, pages 944-952, Cambridge, MA (2010)
• 24. Talbot, D., Kazawa, H., Ichikawa, H., Katz-Brown, J., Seno, M. and Och, F.: A Lightweight Evaluation Framework for Machine Translation Reordering. In Proceedings of the Sixth ACL-WMT, pages 12-21, Edinburgh, Scotland, UK (2011)
• 25. Song, X. and Cohn, T.: Regression and ranking based optimisation for sentence level MT evaluation. In Proceedings of ACL-WMT, pages 123-129, Edinburgh, Scotland, UK (2011)
• 26. Popovic, M.: Morphemes and POS tags for n-gram based evaluation metrics. In Proceedings of ACL-WMT, pages 104-107, Edinburgh, Scotland, UK (2011)
• 27. Popovic, M., Vilar, D., Avramidis, E. and Burchardt, A.: Evaluation without references: IBM1 scores as evaluation metrics. In Proceedings of ACL-WMT, pages 99-103, Edinburgh, Scotland, UK (2011)
• 28. Petrov, S., Leon Barrett, Romain Thibaux, and Dan Klein: Learning accurate, compact, and interpretable tree annotation. In Proceedings of the 21st ACL, pages 433-440, Sydney, July (2006)
• 29. Callison-Burch, C., Koehn, P., Monz, C. and Zaidan, O. F.: Findings of the 2011 Workshop on Statistical Machine Translation. In Proceedings of ACL-WMT, pages 22-64, Edinburgh, Scotland, UK (2011)
• 30. Callison-Burch, C., Koehn, P., Monz, C., Peterson, K., Przybocki, M. and Zaidan, O. F.: Findings of the 2010 Joint Workshop on Statistical Machine Translation and Metrics for Machine Translation. In Proceedings of ACL-WMT, pages 17-53, PA, USA (2010)
• 31. Callison-Burch, C., Koehn, P., Monz, C. and Schroeder, J.: Findings of the 2009 Workshop on Statistical Machine Translation. In Proceedings of ACL-WMT, pages 1-28, Athens, Greece (2009)
• 32. Callison-Burch, C., Koehn, P., Monz, C. and Schroeder, J.: Further meta-evaluation of machine translation. In Proceedings of ACL-WMT, pages 70-106, Columbus, Ohio, USA (2008)
• 33. Avramidis, E., Popovic, M., Vilar, D., Burchardt, A.: Evaluate with Confidence Estimation: Machine ranking of translation outputs using grammatical features. In Proceedings of the Sixth Workshop on Statistical Machine Translation (ACL-WMT), pages 65-70, Edinburgh, Scotland, UK (2011)
 
A deep analysis of Multi-word Expression and Machine Translation
A deep analysis of Multi-word Expression and Machine TranslationA deep analysis of Multi-word Expression and Machine Translation
A deep analysis of Multi-word Expression and Machine TranslationLifeng (Aaron) Han
 
machine translation evaluation resources and methods: a survey
machine translation evaluation resources and methods: a surveymachine translation evaluation resources and methods: a survey
machine translation evaluation resources and methods: a surveyLifeng (Aaron) Han
 
Incorporating Chinese Radicals Into Neural Machine Translation: Deeper Than C...
Incorporating Chinese Radicals Into Neural Machine Translation: Deeper Than C...Incorporating Chinese Radicals Into Neural Machine Translation: Deeper Than C...
Incorporating Chinese Radicals Into Neural Machine Translation: Deeper Than C...Lifeng (Aaron) Han
 
Chinese Named Entity Recognition with Graph-based Semi-supervised Learning Model
Chinese Named Entity Recognition with Graph-based Semi-supervised Learning ModelChinese Named Entity Recognition with Graph-based Semi-supervised Learning Model
Chinese Named Entity Recognition with Graph-based Semi-supervised Learning ModelLifeng (Aaron) Han
 
Quality Estimation for Machine Translation Using the Joint Method of Evaluati...
Quality Estimation for Machine Translation Using the Joint Method of Evaluati...Quality Estimation for Machine Translation Using the Joint Method of Evaluati...
Quality Estimation for Machine Translation Using the Joint Method of Evaluati...Lifeng (Aaron) Han
 
PPT-CCL: A Universal Phrase Tagset for Multilingual Treebanks
PPT-CCL: A Universal Phrase Tagset for Multilingual TreebanksPPT-CCL: A Universal Phrase Tagset for Multilingual Treebanks
PPT-CCL: A Universal Phrase Tagset for Multilingual TreebanksLifeng (Aaron) Han
 

Mehr von Lifeng (Aaron) Han (18)

WMT2022 Biomedical MT PPT: Logrus Global and Uni Manchester
WMT2022 Biomedical MT PPT: Logrus Global and Uni ManchesterWMT2022 Biomedical MT PPT: Logrus Global and Uni Manchester
WMT2022 Biomedical MT PPT: Logrus Global and Uni Manchester
 
Measuring Uncertainty in Translation Quality Evaluation (TQE)
Measuring Uncertainty in Translation Quality Evaluation (TQE)Measuring Uncertainty in Translation Quality Evaluation (TQE)
Measuring Uncertainty in Translation Quality Evaluation (TQE)
 
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Profession...
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Profession...HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Profession...
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Profession...
 
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
 HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio... HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
 
Apply chinese radicals into neural machine translation: deeper than character...
Apply chinese radicals into neural machine translation: deeper than character...Apply chinese radicals into neural machine translation: deeper than character...
Apply chinese radicals into neural machine translation: deeper than character...
 
cushLEPOR uses LABSE distilled knowledge to improve correlation with human tr...
cushLEPOR uses LABSE distilled knowledge to improve correlation with human tr...cushLEPOR uses LABSE distilled knowledge to improve correlation with human tr...
cushLEPOR uses LABSE distilled knowledge to improve correlation with human tr...
 
Chinese Character Decomposition for Neural MT with Multi-Word Expressions
Chinese Character Decomposition for  Neural MT with Multi-Word ExpressionsChinese Character Decomposition for  Neural MT with Multi-Word Expressions
Chinese Character Decomposition for Neural MT with Multi-Word Expressions
 
Build moses on ubuntu (64 bit) system in virtubox recorded by aaron _v2longer
Build moses on ubuntu (64 bit) system in virtubox recorded by aaron _v2longerBuild moses on ubuntu (64 bit) system in virtubox recorded by aaron _v2longer
Build moses on ubuntu (64 bit) system in virtubox recorded by aaron _v2longer
 
Detection of Verbal Multi-Word Expressions via Conditional Random Fields with...
Detection of Verbal Multi-Word Expressions via Conditional Random Fields with...Detection of Verbal Multi-Word Expressions via Conditional Random Fields with...
Detection of Verbal Multi-Word Expressions via Conditional Random Fields with...
 
AlphaMWE: Construction of Multilingual Parallel Corpora with MWE Annotations ...
AlphaMWE: Construction of Multilingual Parallel Corpora with MWE Annotations ...AlphaMWE: Construction of Multilingual Parallel Corpora with MWE Annotations ...
AlphaMWE: Construction of Multilingual Parallel Corpora with MWE Annotations ...
 
ADAPT Centre and My NLP journey: MT, MTE, QE, MWE, NER, Treebanks, Parsing.
ADAPT Centre and My NLP journey: MT, MTE, QE, MWE, NER, Treebanks, Parsing.ADAPT Centre and My NLP journey: MT, MTE, QE, MWE, NER, Treebanks, Parsing.
ADAPT Centre and My NLP journey: MT, MTE, QE, MWE, NER, Treebanks, Parsing.
 
A deep analysis of Multi-word Expression and Machine Translation
A deep analysis of Multi-word Expression and Machine TranslationA deep analysis of Multi-word Expression and Machine Translation
A deep analysis of Multi-word Expression and Machine Translation
 
machine translation evaluation resources and methods: a survey
machine translation evaluation resources and methods: a surveymachine translation evaluation resources and methods: a survey
machine translation evaluation resources and methods: a survey
 
Incorporating Chinese Radicals Into Neural Machine Translation: Deeper Than C...
Incorporating Chinese Radicals Into Neural Machine Translation: Deeper Than C...Incorporating Chinese Radicals Into Neural Machine Translation: Deeper Than C...
Incorporating Chinese Radicals Into Neural Machine Translation: Deeper Than C...
 
Chinese Named Entity Recognition with Graph-based Semi-supervised Learning Model
Chinese Named Entity Recognition with Graph-based Semi-supervised Learning ModelChinese Named Entity Recognition with Graph-based Semi-supervised Learning Model
Chinese Named Entity Recognition with Graph-based Semi-supervised Learning Model
 
Quality Estimation for Machine Translation Using the Joint Method of Evaluati...
Quality Estimation for Machine Translation Using the Joint Method of Evaluati...Quality Estimation for Machine Translation Using the Joint Method of Evaluati...
Quality Estimation for Machine Translation Using the Joint Method of Evaluati...
 
Thesis-Master-MTE-Aaron
Thesis-Master-MTE-AaronThesis-Master-MTE-Aaron
Thesis-Master-MTE-Aaron
 
PPT-CCL: A Universal Phrase Tagset for Multilingual Treebanks
PPT-CCL: A Universal Phrase Tagset for Multilingual TreebanksPPT-CCL: A Universal Phrase Tagset for Multilingual Treebanks
PPT-CCL: A Universal Phrase Tagset for Multilingual Treebanks
 

Kürzlich hochgeladen

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 

Kürzlich hochgeladen (20)

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 

TSD2013 PPT.AUTOMATIC MACHINE TRANSLATION EVALUATION WITH PART-OF-SPEECH INFORMATION

  • 1. Aaron L.-F. Han, Derek F. Wong, Lidia S. Chao, and Liangye He Open source code: https://github.com/aaronlifenghan/aaron-project-hlepor May 16th, 2012 Natural Language Processing & Portuguese-Chinese Machine Translation Laboratory Department of Computer and Information Science University of Macau TSD 2013, LNAI Vol. 8082, pp. 121-128. Springer Verlag Berlin Heidelberg 2013
  • 2.  Introduction and some related work in MT Evaluation  Problem and designed idea for MT evaluation  Employed linguistic feature  Designed measuring formula  Evaluation method of evaluation metric  Experiment on WMT corpora  Conclusion  Reference
  • 3. • Machine translation (MT) began as early as the 1950s (Weaver, 1955)
• big progress since the 1990s, due to the development of computers (storage capacity and computational power) and enlarged bilingual corpora (Marino et al. 2006)
• Some recent works in MT:
• (Och 2003) presented MERT (Minimum Error Rate Training) for log-linear SMT
• (Su et al. 2009) used the Thematic Role Templates model to improve translation
• (Xiong et al. 2011) employed a maximum-entropy model, etc.
• Rule-based and data-driven methods, including example-based MT (Carl and Way 2003) and statistical MT (Koehn 2010), became the main approaches in the MT literature.
  • 4. • With the widespread development of MT systems, MT evaluation becomes more and more important: it tells us how well MT systems perform and whether they are making progress.
• However, MT evaluation is difficult:
• language variability means there is no single correct translation
• natural languages are highly ambiguous, and different languages do not always express the same content in the same way (Arnold 2003)
  • 5. • Human evaluation criteria:
• intelligibility (how understandable the sentence is)
• fidelity (how much information the translated sentence retains compared to the original), used by the Automatic Language Processing Advisory Committee (ALPAC) around 1966 (Carroll 1966)
• adequacy (similar to fidelity), fluency (whether the sentence is well-formed and fluent) and comprehension (improved intelligibility), used by the US Defense Advanced Research Projects Agency (DARPA) (White et al. 1994)
• Problem with manual evaluation:
• time-consuming, and thus too expensive to run frequently
  • 6. • Automatic evaluation metrics:
• word error rate WER (Su et al. 1992): edit distance between the system output and the closest reference translation
• position-independent word error rate PER (Tillmann et al. 1997): a variant of WER that disregards word ordering
• BLEU (Papineni et al. 2002): the geometric mean of n-gram precisions of the system output with respect to the reference translations
• NIST (Doddington 2002): adds information weights
• GTM (Turian et al. 2003)
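The core quantities behind these early metrics are simple to state. Below is a minimal, illustrative Python sketch (function names are my own, not from any metric's official implementation) of the clipped n-gram precision that BLEU builds on and the Levenshtein distance behind WER:

```python
from collections import Counter

def ngram_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate against one reference."""
    cand = [tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1)]
    ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
    # Each candidate n-gram counts at most as often as it occurs in the reference.
    matched = sum(min(c, ref[g]) for g, c in Counter(cand).items())
    return matched / max(len(cand), 1)

def wer(candidate, reference):
    """Word error rate: Levenshtein edit distance over reference length."""
    d = [[0] * (len(candidate) + 1) for _ in range(len(reference) + 1)]
    for i in range(len(reference) + 1):
        d[i][0] = i
    for j in range(len(candidate) + 1):
        d[0][j] = j
    for i in range(1, len(reference) + 1):
        for j in range(1, len(candidate) + 1):
            cost = 0 if reference[i - 1] == candidate[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[-1][-1] / len(reference)
```

PER would be the same idea as WER but computed over bags of words, ignoring order.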
  • 7. • Recently, many other methods:
• The METEOR (Banerjee and Lavie 2005) metric conducts flexible matching, considering stems, synonyms and paraphrases.
• The matching process involves computationally expensive word alignment, and parameters such as the relative weight of recall to precision and the weights for stemming or synonym matching must be tuned. Meteor-1.3 (Denkowski and Lavie 2011), a modified version of Meteor, includes ranking and adequacy versions and has overcome some weaknesses of the previous version, such as noise in paraphrase matching, lack of punctuation handling, and no discrimination between word types.
  • 8. • Snover et al. (2006) argued that one disadvantage of the Levenshtein distance is that mismatches in word order require the deletion and re-insertion of the misplaced words.
• They proposed TER, adding an editing step that allows the movement of word sequences from one part of the output to another: something a human post-editor would do with the cut-and-paste function of a word processor.
• However, finding the shortest sequence of editing steps is a computationally hard problem.
  • 9. • AMBER (Chen and Kuhn 2011), including the AMBER-TI and AMBER-NL variants, is declared a modified version of BLEU: it attaches more kinds of penalty coefficients, combining n-gram precision and recall with the arithmetic average of the F-measure.
• Before evaluation, it offers eight kinds of corpus preparation: whether or not the words are tokenized, extraction of stems, prefixes and suffixes, and splitting words into several parts with different ratios.
  • 10. • F15 (Bicici and Yuret 2011) and F15G3 perform evaluation with the F1 measure (assigning the same weight to precision and recall) over target features as a metric of translation quality.
• The target features they define include TP (true positive), TN (true negative), FP (false positive), and FN (false negative) rates, etc. To consider the surrounding phrase of a missing token in the translation, they employ the gapped word sequence kernel approach (Taylor and Cristianini 2004).
  • 11. • Other related works:
• (Wong and Kit 2008), (Isozaki et al. 2010) and (Talbot et al. 2011) on word order
• ROSE (Song and Cohn 2011), MPF and WMPF (Popovic 2011) on employing POS information
• MP4IBM1 (Popovic et al. 2011), which does not rely on reference translations, etc.
  • 12. • The evaluation methods proposed previously suffer, to varying degrees, from several main weaknesses:
• they perform well on certain language pairs but poorly on others, which we call the language-bias problem;
• they consider no linguistic information (leading to low correlation with human judgments) or too many linguistic features (making them difficult to replicate), which we call the extremism problem;
• they use an incomprehensive set of factors (e.g. BLEU focuses on precision only).
• What to do?
• This paper: address some of the above problems
  • 13. • How?
• Enhanced factors
• Tunable parameters
• An organic, mathematically grounded combination of factors
• Concise linguistic features
  • 14. • To address the variability phenomenon, researchers have employed synonyms, paraphrasing or textual entailment as auxiliary information. All of these approaches have their advantages and weaknesses, e.g.
• synonym lists can hardly cover all acceptable expressions.
• Instead, the designed metric performs its measurement on part-of-speech (POS) information (also applied by ROSE (Song and Cohn 2011), MPF and WMPF (Popovic 2011)).
• If a system output is a good translation, it is likely to carry semantic information similar to the reference: the two sentences may not contain exactly the same words, but words with similar meanings.
• For example, "there is a big bag" and "there is a large bag" could be the same expression, since "big" and "large" have similar meanings (both tagged as adjectives).
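The intuition can be made concrete with a toy comparison (POS tags hand-assigned here for illustration, not produced by a real tagger): the hypothesis and reference disagree on one surface word but match perfectly at the POS level.

```python
# Hypothesis and reference as (word, POS) pairs; tags are illustrative.
hyp = [("there", "EX"), ("is", "VBZ"), ("a", "DT"), ("big", "JJ"), ("bag", "NN")]
ref = [("there", "EX"), ("is", "VBZ"), ("a", "DT"), ("large", "JJ"), ("bag", "NN")]

# Exact-word matching misses the big/large synonym pair...
word_matches = sum(h[0] == r[0] for h, r in zip(hyp, ref))  # 4 of 5

# ...while POS-level matching credits it, since both are adjectives.
pos_matches = sum(h[1] == r[1] for h, r in zip(hyp, ref))   # 5 of 5
```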
  • 16. $$\mathrm{Harmonic}(X_1, X_2, \ldots, X_n) = \frac{n}{\sum_{i=1}^{n} \frac{1}{X_i}} \qquad (1)$$

$$\mathrm{Harmonic}(w_{X_1} X_1, \ldots, w_{X_n} X_n) = \frac{\sum_{i=1}^{n} w_{X_i}}{\sum_{i=1}^{n} \frac{w_{X_i}}{X_i}} \qquad (2)$$

$$hLEPOR = \mathrm{Harmonic}(w_{LP}\,LP,\; w_{NPosPenal}\,NPosPenal,\; w_{HPR}\,HPR) = \frac{\sum_{i=1}^{n} w_i}{\sum_{i=1}^{n} \frac{w_i}{Factor_i}} = \frac{w_{LP} + w_{NPosPenal} + w_{HPR}}{\frac{w_{LP}}{LP} + \frac{w_{NPosPenal}}{NPosPenal} + \frac{w_{HPR}}{HPR}} \qquad (3)$$
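A small Python sketch of the weighted harmonic mean combination of Eqs. (2) and (3); the function names are illustrative, and the weights are left as arguments since the slides treat them as tunable:

```python
def harmonic(weights, factors):
    """Weighted harmonic mean, Eq. (2): sum(w_i) / sum(w_i / X_i)."""
    return sum(weights) / sum(w / x for w, x in zip(weights, factors))

def hlepor(lp, npospenal, hpr, w_lp=1.0, w_np=1.0, w_hpr=1.0):
    """Eq. (3): hLEPOR as the weighted harmonic mean of its three factors
    (length penalty, n-gram position penalty, harmonic precision/recall)."""
    return harmonic([w_lp, w_np, w_hpr], [lp, npospenal, hpr])
```

Because the harmonic mean is dominated by its smallest term, a poor score on any single factor pulls the overall score down sharply.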
  • 17. $$LP = \begin{cases} \exp\!\left(1 - \frac{r}{c}\right) & c < r \\ 1 & c = r \\ \exp\!\left(1 - \frac{c}{r}\right) & c > r \end{cases} \qquad (4)$$

$$NPosPenal = \exp(-NPD) \qquad (5)$$

$$NPD = \frac{1}{Length_{output}} \sum_{i=1}^{Length_{output}} |PD_i| \qquad (6)$$
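Eqs. (4)-(6) are straightforward to compute once the sentence lengths and the per-word position differences are known. A minimal sketch (function names are my own, not from the released code):

```python
import math

def length_penalty(c, r):
    """Eq. (4): penalize outputs shorter or longer than the reference,
    where c is the output length and r the reference length."""
    if c < r:
        return math.exp(1 - r / c)
    if c > r:
        return math.exp(1 - c / r)
    return 1.0

def npos_penal(position_diffs, output_len):
    """Eqs. (5)-(6): NPD is the mean absolute position difference |PD_i|
    of aligned words; NPosPenal = exp(-NPD)."""
    npd = sum(abs(d) for d in position_diffs) / output_len
    return math.exp(-npd)
```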
  • 18. Fig. 1. N-gram POS alignment algorithm
  • 19. Fig. 2. Example of n-gram POS alignment. Fig. 3. Example of NPD calculation
  • 20. $$\mathrm{Harmonic}(\alpha R, \beta P) = \frac{\alpha + \beta}{\frac{\alpha}{R} + \frac{\beta}{P}} \qquad (7)$$

$$P = \frac{aligned\_num}{system\_length} \qquad (8)$$

$$R = \frac{aligned\_num}{reference\_length} \qquad (9)$$

$$hLEPOR_A = \frac{1}{SentNum} \sum_{i=1}^{SentNum} hLEPOR_i \qquad (10)$$

$$hLEPOR_B = \mathrm{Harmonic}(w_{LP}\,LP,\; w_{NPosPenal}\,NPosPenal,\; w_{HPR}\,HPR) \qquad (11)$$
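Eqs. (7)-(10) in a short sketch (again with illustrative names; alpha and beta are the tunable weights on recall and precision, left as required arguments since their values are tuned per language pair):

```python
def harmonic_pr(aligned, sys_len, ref_len, alpha, beta):
    """Eqs. (7)-(9): precision and recall over aligned POS n-grams,
    combined with a weighted harmonic mean (alpha weights recall,
    beta weights precision)."""
    p = aligned / sys_len   # Eq. (8)
    r = aligned / ref_len   # Eq. (9)
    return (alpha + beta) / (alpha / r + beta / p)  # Eq. (7)

def hlepor_system_a(sentence_scores):
    """Eq. (10): system-level variant A averages sentence-level scores.
    (Variant B instead combines system-level factor averages, Eq. (11).)"""
    return sum(sentence_scores) / len(sentence_scores)
```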
  • 21. How to evaluate the effectiveness of the algorithms?
  • 22. • Spearman correlation coefficient:

$$\rho_{XY} = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)} \qquad (12)$$

where $X = \{x_1, \ldots, x_n\}$, $Y = \{y_1, \ldots, y_n\}$.
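The coefficient of Eq. (12) can be computed directly from the rank differences; the sketch below assumes no ties, as when ranking distinct metric scores against distinct human ranks:

```python
def spearman_rho(x, y):
    """Eq. (12): Spearman rank correlation from squared rank
    differences d_i (tie-free case)."""
    n = len(x)

    def ranks(values):
        # Rank 1 = smallest value; assumes all values are distinct.
        order = sorted(range(n), key=lambda i: values[i])
        r = [0] * n
        for pos, i in enumerate(order):
            r[i] = pos + 1
        return r

    rx, ry = ranks(x), ranks(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))
```

Identical rankings give 1, fully reversed rankings give -1.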
  • 23. Experiments on authoritative corpora: the International Workshop on Statistical Machine Translation (WMT)
  • 24. • Parameters tuned on WMT08 and tested on WMT11:
  • 25.
  • 26.
  • 27. • An evaluation metric based on the mathematically weighted harmonic mean
• Tunable weights
• Enhanced factors
• Employs a concise linguistic feature: the POS of each word
• Better performance than similar POS-based metrics such as ROSE, MPF and WMPF
• Performance can be further enhanced by adding POS tools and adjusting the parameter values
• BLEU uses n-grams; other researchers count POS occurrences, e.g. (Avramidis et al. 2011); we combine n-gram and POS information together.
• Evaluation methods that need no reference perform poorly, e.g. the MP4IBM1 metric ranked near the bottom in the experiments.
  • 28. • More language pairs will be tested
• The combination of word and POS information will be explored
• Parameter tuning will be automated
• Evaluation without golden references will be developed
  • 29. 1. Weaver, W.: Translation. In: Locke, W., Booth, A.D. (eds.) Machine Translation of Languages: Fourteen Essays, pp. 15–23. John Wiley and Sons, New York (1955)
2. Marino, J.B., Banchs, R.E., Crego, J.M., de Gispert, A., Lambert, P., Fonollosa, J.A., Costa-jussa, M.R.: N-gram based machine translation. Computational Linguistics 32(4), 527–549. MIT Press (2006)
3. Och, F.J.: Minimum Error Rate Training for Statistical Machine Translation. In: Proceedings of ACL 2003, pp. 160–167 (2003)
4. Su, H.-Y., Wu, C.-H.: Improving Structural Statistical Machine Translation for Sign Language with Small Corpus Using Thematic Role Templates as Translation Memory. IEEE Transactions on Audio, Speech, and Language Processing 17(7) (2009)
5. Xiong, D., Zhang, M., Li, H.: A Maximum-Entropy Segmentation Model for Statistical Machine Translation. IEEE Transactions on Audio, Speech, and Language Processing 19(8), 2494–2505 (2011)
6. Carl, M., Way, A. (eds.): Recent Advances in Example-Based Machine Translation. Kluwer Academic Publishers, Dordrecht, The Netherlands (2003)
  • 30. 7. Koehn, P.: Statistical Machine Translation. Cambridge University Press (2010)
8. Arnold, D.: Why translation is difficult for computers. In: Computers and Translation: A Translator's Guide. Benjamins Translation Library (2003)
9. Carroll, J.B.: An experiment in evaluating the quality of translation. In: Pierce, J. (chair) Languages and Machines: Computers in Translation and Linguistics. Report by the Automatic Language Processing Advisory Committee (ALPAC), Publication 1416, Division of Behavioral Sciences, National Academy of Sciences, National Research Council, pp. 67–75 (1966)
10. White, J.S., O'Connell, T.A., O'Mara, F.E.: The ARPA MT evaluation methodologies: Evolution, lessons, and future approaches. In: Proceedings of AMTA 1994, pp. 193–205 (1994)
11. Su, K.-Y., Wu, M.-W., Chang, J.-S.: A New Quantitative Quality Measure for Machine Translation Systems. In: Proceedings of the 14th International Conference on Computational Linguistics, pp. 433–439, Nantes, France (1992)
  • 31. 12. Tillmann, C., Vogel, S., Ney, H., Zubiaga, A., Sawaf, H.: Accelerated DP Based Search for Statistical Translation. In: Proceedings of EUROSPEECH 1997 (1997)
13. Papineni, K., Roukos, S., Ward, T., Zhu, W.J.: BLEU: a method for automatic evaluation of machine translation. In: Proceedings of ACL 2002, pp. 311–318, Philadelphia, PA, USA (2002)
14. Doddington, G.: Automatic evaluation of machine translation quality using n-gram co-occurrence statistics. In: Proceedings of HLT 2002, pp. 138–145, San Diego, California, USA (2002)
15. Turian, J.P., Shen, L., Melamed, I.D.: Evaluation of machine translation and its evaluation. In: Proceedings of MT Summit IX, pp. 386–393, New Orleans, LA, USA (2003)
16. Banerjee, S., Lavie, A.: METEOR: an automatic metric for MT evaluation with high levels of correlation with human judgments. In: Proceedings of ACL-WMT, pp. 65–72, Prague, Czech Republic (2005)
  • 32. 17. Denkowski, M., Lavie, A.: Meteor 1.3: Automatic metric for reliable optimization and evaluation of machine translation systems. In: Proceedings of ACL-WMT, pp. 85–91, Edinburgh, Scotland, UK (2011)
18. Snover, M., Dorr, B., Schwartz, R., Micciulla, L., Makhoul, J.: A study of translation edit rate with targeted human annotation. In: Proceedings of AMTA, pp. 223–231, Boston, USA (2006)
19. Chen, B., Kuhn, R.: AMBER: A modified BLEU, enhanced ranking metric. In: Proceedings of ACL-WMT, pp. 71–77, Edinburgh, Scotland, UK (2011)
20. Bicici, E., Yuret, D.: RegMT system for machine translation, system combination, and evaluation. In: Proceedings of ACL-WMT, pp. 323–329, Edinburgh, Scotland, UK (2011)
21. Shawe-Taylor, J., Cristianini, N.: Kernel Methods for Pattern Analysis. Cambridge University Press (2004)
22. Wong, B.T.-M., Kit, C.: Word choice and word position for automatic MT evaluation. In: Workshop: MetricsMATR of AMTA, short paper, Waikiki, Hawai'i, USA (2008)
23. Isozaki, H., Hirao, T., Duh, K., Sudoh, K. and Tsukada, H.: Automatic evaluation of translation quality for distant language pairs. In Proceedings of the 2010 Conference on EMNLP, pages 944-952, Cambridge, MA (2010)
24. Talbot, D., Kazawa, H., Ichikawa, H., Katz-Brown, J., Seno, M. and Och, F.: A Lightweight Evaluation Framework for Machine Translation Reordering. In Proceedings of the Sixth ACL-WMT, pages 12-21, Edinburgh, Scotland, UK (2011)
25. Song, X. and Cohn, T.: Regression and ranking based optimisation for sentence level MT evaluation. In Proceedings of ACL-WMT, pages 123-129, Edinburgh, Scotland, UK (2011)
26. Popovic, M.: Morphemes and POS tags for n-gram based evaluation metrics. In Proceedings of ACL-WMT, pages 104-107, Edinburgh, Scotland, UK (2011)
27. Popovic, M., Vilar, D., Avramidis, E. and Burchardt, A.: Evaluation without references: IBM1 scores as evaluation metrics. In Proceedings of ACL-WMT, pages 99-103, Edinburgh, Scotland, UK (2011)
28. Petrov, S., Barrett, L., Thibaux, R. and Klein, D.: Learning accurate, compact, and interpretable tree annotation. In Proceedings of the 21st ACL, pages 433-440, Sydney, July (2006)
29. Callison-Burch, C., Koehn, P., Monz, C. and Zaidan, O. F.: Findings of the 2011 Workshop on Statistical Machine Translation. In Proceedings of ACL-WMT, pages 22-64, Edinburgh, Scotland, UK (2011)
30. Callison-Burch, C., Koehn, P., Monz, C., Peterson, K., Przybocki, M. and Zaidan, O. F.: Findings of the 2010 Joint Workshop on Statistical Machine Translation and Metrics for Machine Translation. In Proceedings of ACL-WMT, pages 17-53, PA, USA (2010)
31. Callison-Burch, C., Koehn, P., Monz, C. and Schroeder, J.: Findings of the 2009 Workshop on Statistical Machine Translation. In Proceedings of ACL-WMT, pages 1-28, Athens, Greece (2009)
32. Callison-Burch, C., Koehn, P., Monz, C. and Schroeder, J.: Further meta-evaluation of machine translation. In Proceedings of ACL-WMT, pages 70-106, Columbus, Ohio, USA (2008)
33. Avramidis, E., Popovic, M., Vilar, D. and Burchardt, A.: Evaluate with Confidence Estimation: Machine ranking of translation outputs using grammatical features. In Proceedings of the Sixth Workshop on Statistical Machine Translation (ACL-WMT), pages 65-70, Edinburgh, Scotland, UK (2011)