Internal Evaluation: German to English translation
Nervo Verdezoto D.
The Problem
Objectives
Literature Review
Road Map
Software & Hardware
Baseline

Figure 1. Sample of the training corpus
de: wiederaufnahme der sitzungsperiode
en: resumption of the session
de: ich erkläre die am freitag , dem 17. dezember unterbrochene sitzungsperiode des europäischen parlaments für wiederaufgenommen , wünsche ihnen nochmals
en: i declare resumed the session of the european parliament adjourned on friday 17 december 1999 , and i would like once again to wish you a happy new year in
de: alles gute zum jahreswechsel und hoffe , daß sie schöne ferien hatten .
en: the hope that you enjoyed a pleasant festive period .

Table 1. Statistics of the dataset
Set        Sentences   Words (German)   Words (English)
Training   78524       1581042          1684639
Dev        2000        55118            58761
Test       2000        55580            59153
Baseline - Results

Table 2. Performance of initial models
Measure   Model 0 (Baseline)   Model 1 (Tuning)
BLEU      23.24%               23.62%
NIST      6.5426               6.4539
WER       69.09%               70.90%
PER       18.82%               16.43%
Experiments

Setup              Description
Setup 1, 2         Filter sentences. Baseline: 40; Setup 1: 45; Setup 2: 35.
Setup 3, 4 and 5   Combination of the baseline, Setup 1 and Setup 2 with a lexicalized reordering model (reordering configuration msd-bidirectional-fe and distortion limit 6). Setup 3: filter (40); Setup 4: filter (45); Setup 5: filter (35).
Setup 6            I tried to split the source data, but it did not work.
Setup 7 and 8      Adding part-of-speech information using a factored translation model on the target data (English). LM: Setup 7 (3-gram), Setup 8 (5-gram).
Setup 9            I tried to use Moses for a factored translation model on the source (German), but it did not work. I tried to train the supertagger with a German corpus (TIGER corpus) and ran into a problem with the format of the files; see http://www.ims.uni-stuttgart.de/projekte/TIGER/TIGERCorpus/annotation
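The sentence filter used in Setups 1 and 2 can be sketched as follows. This is only an illustration: it assumes that the values 40, 45 and 35 are maximum sentence lengths in tokens, which the slides do not state explicitly, and the function name `filter_parallel` and its parameters are hypothetical rather than part of Moses (Moses ships its own corpus-cleaning script, `clean-corpus-n.perl`, for this kind of step).

```python
from typing import Iterable, List, Tuple


def filter_parallel(
    pairs: Iterable[Tuple[str, str]],
    max_len: int,
    min_len: int = 1,
) -> List[Tuple[str, str]]:
    """Keep sentence pairs whose two sides both have min_len..max_len tokens.

    Assumption: the corpus is already tokenized, so whitespace splitting
    gives the token count.
    """
    kept = []
    for src, tgt in pairs:
        n_src = len(src.split())
        n_tgt = len(tgt.split())
        if min_len <= n_src <= max_len and min_len <= n_tgt <= max_len:
            kept.append((src, tgt))
    return kept


if __name__ == "__main__":
    corpus = [
        ("wiederaufnahme der sitzungsperiode", "resumption of the session"),
        ("", "a pair with an empty source side is dropped"),
    ]
    # The three limits compared in the experiments (Setup 2, baseline, Setup 1).
    for limit in (35, 40, 45):
        print(limit, len(filter_parallel(corpus, max_len=limit)))
```

A pair is dropped as soon as either side falls outside the limits, so the German and English files stay aligned line by line.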
Factored translation model

Figure 3. Sample of pre-processing (supertagger)
nervo@nervo-laptop:~/supertagger/candc-1.00$ bin/pos --model models/pos --input data/baseline.de-en.tok.clean.en --output data/baseline.de-en.tok.clean.postag.en
nervo@nervo-laptop:~/supertagger/candc-1.00$ bin/super --model models/super --input data/baseline.de-en.tok.clean.postag.lowercased.en --output data/baseline.de-en.tok.clean.postag.lowercased.supertag.en
nervo@nervo-laptop:~/MOSESMT/baseline-system/baseline-system$ cat trainingcorpus/baseline.de-en.tok.clean.postag.en | perl ../../moses-scripts/lowercase.perl > trainingcorpus/baseline.de-en.tok.clean.postag.lowercased.en

Original corpus: resumption of the session
POS-tagged corpus: resumption|resumption|nn of|of|in the|the|dt session|session|nn

Figure 4. Sample of training data with supertags
resumption|resumption|nn of|of|in the|the|dt session|session|nn
i|i|prp declare|declare|vbp resumed|resumed|vbn the|the|dt session|session|nn of|of|in the|the|dt european|european|nnp parliament|parliament|nnp adjourned|adjourned|vbd on|on|in friday|friday|nnp 17|17|cd december|december|nnp 1999|1999|cd ,|,|, and|and|cc i|i|prp would|would|md like|like|vb once|once|rb again|again|rb to|to|to wish|wish|vb you|you|prp a|a|dt happy|happy|jj new|new|jj year|year|nn in|in|in
the|the|dt hope|hope|nn that|that|in you|you|prp enjoyed|enjoyed|vbd a|a|dt pleasant|pleasant|jj festive|festive|jj period|period|nn .|.|.
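The factored corpus in Figure 4 joins the factors of each word with `|` and separates words with spaces. A minimal sketch of writing and reading that format is given below. The helper names `to_factored` and `parse_factored` are hypothetical, and the reading of the three factors as surface form, a repeated word-level factor, and lowercased POS tag is inferred from the sample, not stated in the slides.

```python
from typing import List, Sequence, Tuple


def to_factored(words: Sequence[str], tags: Sequence[str], sep: str = "|") -> str:
    """Build one factored line: word|word|tag for each (word, tag) pair.

    Assumption: words and tags are already lowercased, as in Figure 4.
    """
    if len(words) != len(tags):
        raise ValueError("words and tags must have the same length")
    return " ".join(sep.join((w, w, t)) for w, t in zip(words, tags))


def parse_factored(line: str, sep: str = "|") -> List[Tuple[str, ...]]:
    """Split a factored line back into one tuple of factors per word."""
    return [tuple(token.split(sep)) for token in line.split()]


def surface(line: str, sep: str = "|") -> str:
    """Recover the plain sentence by keeping only the first factor."""
    return " ".join(factors[0] for factors in parse_factored(line, sep))


if __name__ == "__main__":
    line = to_factored(
        ["resumption", "of", "the", "session"],
        ["nn", "in", "dt", "nn"],
    )
    print(line)
    print(surface(line))
```

Splitting on `|` is enough for the sample shown here; a corpus that contains a literal `|` as a word would need that character escaped before the factors are joined.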
Training
Changes in moses.ini
SETUP7
Experimental results

Measure   Baseline (40)   Setup 1 (45)   Setup 2 (35)
BLEU      23.24%          23.20%         22.88%
NIST      6.5426          6.5490         6.4742
WER       69.09%          69.08%         69.58%
PER       18.82%          18.80%         18.84%

Measure   Setup 3 (40)    Setup 4 (45)   Setup 5 (35)
BLEU      23.03%          23.06%         22.56%
NIST      6.5168          6.5349         6.4485
WER       68.52%          68.29%         69.07%
PER       19.46%          19.72%         19.46%

Measure   Setup 7 (40, 3-gram)   Setup 8 (40, 5-gram)   Setup 9
BLEU      21.51%                 21.59%                 /
NIST      6.1699                 6.1754                 /
WER       73.29%                 73.45%                 /
PER       17.74%                 17.39%                 /
MODEL       EXAMPLE
REFERENCE   he wanted the presidency to outline the way forward at nice .
BASELINE    he HAS EXPRESSED THE WISH THAT the presidency IN NICE the way *** AHEAD AUFZEIGT
TUNING      he HAS EXPRESSED THE WISH THAT the presidency IN NICE the way *** AHEAD AUFZEIGT
SETUP1      he HAS EXPRESSED THE WISH THAT the presidency IN NICE the way *** *** AHEAD
SETUP2      he HAS EXPRESSED the WISH THAT THE PRESIDENCY IN NICE way AUFZEIGT THE FUTURE
SETUP3      he HAS EXPRESSED THE WISH THAT the presidency IN NICE , the way *** AHEAD AUFZEIGT
SETUP4      he HAS EXPRESSED THE WISH THAT the presidency IN NICE the way *** *** AHEAD
SETUP5      he HAS EXPRESSED THE WISH THAT the presidency IN NICE , THE FUTURE PATH AUFZEIGT
SETUP6      --
SETUP7      he HAS EXPRESSED THE WISH THAT the presidency IN NICE , THE FUTURE PATH SHOWS
SETUP8      he HAS EXPRESSED THE WISH THAT the presidency IN NICE , the way *** AHEAD SHOWS
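To make the WER and PER columns easier to read, here is a minimal sketch of both measures on a single sentence pair. It assumes the common textbook definitions: WER as word-level edit distance divided by reference length, and PER as a bag-of-words mismatch divided by reference length. The scoring tool used for the tables above is not named in the slides and may normalise differently, so the numbers from this sketch are not expected to reproduce the reported scores.

```python
from collections import Counter
from typing import List


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(ref: str, hyp: str) -> float:
    """Word error rate: edit distance divided by reference length."""
    r, h = ref.split(), hyp.split()
    if not r:
        raise ValueError("reference must not be empty")
    return edit_distance(r, h) / len(r)


def per(ref: str, hyp: str) -> float:
    """Position-independent error rate: word order is ignored."""
    r, h = ref.split(), hyp.split()
    if not r:
        raise ValueError("reference must not be empty")
    matches = sum((Counter(r) & Counter(h)).values())
    return (max(len(r), len(h)) - matches) / len(r)


if __name__ == "__main__":
    reference = "he wanted the presidency to outline the way forward at nice ."
    # Baseline output from the table, lowercased and without the *** marker.
    baseline = ("he has expressed the wish that the presidency in nice "
                "the way ahead aufzeigt")
    print(f"WER = {wer(reference, baseline):.2%}")
    print(f"PER = {per(reference, baseline):.2%}")
```

Because PER ignores word order, a hypothesis that contains the right words in the wrong places scores much better on PER than on WER, which matches the gap between the two columns in the results tables.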
Summary and Conclusion

Internal Evaluation for a MT System, German to English


Editor's Notes

  1. Let's now move to the problem.
  2. The problem addressed in this paper concerns coordinative and prepositional syntactic ambiguity in Spanish, since Spanish is considered a complex language because of its variable structure and its many different grammatical rules.
  3. So, turning to the objectives.
  4. This paper proposes a method to resolve coordinative and prepositional syntactic ambiguity in written natural-language text. The main aims are: to decrease the number of syntactic representations of a phrase; to define a set of heuristic rules to identify and resolve this type of ambiguity; and to implement this method of syntactic disambiguation for Spanish in Python (Natural Language Toolkit, NLTK).
  5. So, turning to the objectives.
  6. Next, I will give you a brief explanation about the implementation of this method
  7. Next, I will give you a brief explanation about the implementation of this method
  8. Next, I will give you a brief explanation about the implementation of this method
  9. Next, I will give you a brief explanation about the implementation of this method
  10. Next, I will give you a brief explanation about the implementation of this method
  11. For the extension of the work!!
  12. For the extension of the work!!