© 2011 #1
language technology
for
optimum localization
Diego Bartolomé, CEO
© 2011 #2
optimum workflow
gather in-domain data
train the translation solution
enrich solution with related text
terminology prioritization
update the translation solution
add rules to enhance quality
weekly updates
© 2011 #3
data issues 1
<large volume of heterogeneous data>
training with all the data
semantic classification for domain selection
fine tuning for each client
glossary prioritization
continuous machine learning
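A minimal sketch of what domain selection could look like, assuming a plain bag-of-words cosine similarity as a stand-in for the semantic classification mentioned on the slide. The function names and the sample texts are hypothetical and are not taken from the tauyou system.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts for a piece of text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_domain(segment, domain_samples):
    """Return the domain whose sample text is most similar to the segment."""
    seg = bag_of_words(segment)
    profiles = {name: bag_of_words(text) for name, text in domain_samples.items()}
    return max(profiles, key=lambda name: cosine(seg, profiles[name]))


# Hypothetical one-sentence domain samples, for illustration only.
samples = {
    "legal": "the contract party shall agree to the terms of the agreement",
    "medical": "the patient received a dose of the drug after diagnosis",
}
print(select_domain("The patient was given a second dose", samples))
```

In practice each domain profile would be built from a much larger in-domain corpus, and the selected domain would decide which client-tuned system handles the segment.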
© 2011 #4
data issues 2
<scarce data>
add dictionaries into corpora
complementary segments from memories
balance client data with generic texts
in-domain adaptation of generic system
increase the number of sentences with rules
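One possible reading of "balance client data with generic texts" is to repeat the scarce client segments until they reach a chosen share of the training mix. The sketch below assumes that reading; `balance_corpora` and `client_share` are hypothetical names.

```python
import math


def balance_corpora(client, generic, client_share=0.5):
    """Repeat client segments until they form about `client_share` of the mix."""
    if not client:
        raise ValueError("client corpus is empty")
    if not 0 < client_share < 1:
        raise ValueError("client_share must be strictly between 0 and 1")
    # Solve n / (n + len(generic)) >= client_share for the client count n.
    target = math.ceil(client_share * len(generic) / (1 - client_share))
    repeats = max(1, math.ceil(target / len(client)))
    return client * repeats + generic


# Toy data: 2 client segments against 10 generic ones.
client = [("invoice", "factura"), ("due date", "fecha de vencimiento")]
generic = [(f"sentence {i}", f"frase {i}") for i in range(10)]
mixed = balance_corpora(client, generic, client_share=0.5)
print(len(mixed))
```

Oversampling is only one option; weighting the client data inside the training tool, where supported, achieves a similar effect without duplicating segments.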
© 2011 #5
data issues 3
<dirty data>
remove multiple translations
eliminate text in other languages
correct spelling
select sentences with correct grammar
automatic alignment with client terminology
filter out other undesired segments
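Some of the cleaning steps above can be sketched as a single pass over a translation memory. This is a rough illustration under stated assumptions: "multiple translations" is taken to mean one source with several different targets, and the length-ratio check stands in for "other undesired segments". `clean_memory` and `max_ratio` are hypothetical.

```python
from collections import defaultdict


def clean_memory(pairs, max_ratio=3.0):
    """Drop empty, ambiguous, length-mismatched and duplicate segments."""
    # Collect every distinct target seen for each source.
    targets_by_source = defaultdict(set)
    for source, target in pairs:
        targets_by_source[source.strip()].add(target.strip())

    cleaned, seen = [], set()
    for source, target in pairs:
        source, target = source.strip(), target.strip()
        if not source or not target:
            continue  # one side is empty
        if len(targets_by_source[source]) > 1:
            continue  # same source with several different translations
        ratio = max(len(source), len(target)) / min(len(source), len(target))
        if ratio > max_ratio:
            continue  # lengths differ too much, likely misaligned
        if (source, target) in seen:
            continue  # exact duplicate
        seen.add((source, target))
        cleaned.append((source, target))
    return cleaned


memory = [
    ("Open the file", "Abrir el archivo"),
    ("Open the file", "Abrir el archivo"),
    ("Save", "Guardar"),
    ("Save", "Salvar"),
    ("Close", ""),
    ("OK", "Este texto es demasiado largo para ser correcto"),
]
print(clean_memory(memory))
```

Language identification, spelling correction and grammar checks from the same slide would need external resources and are left out here.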
© 2011 #6
data issues 4
<data creation and enhancement>
final client defined
unaligned translated documents
generic translations
optimum corpus/memories creation
rule-based extension/filtering
© 2011 #7
linguistic issues 1
<untranslated words>
dictionary creation
<grammatical errors>
post-processing rules
<blind quality filtering>
do not translate sentences below threshold
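A hedged sketch of blind quality filtering, assuming the score is simply the share of source words the system already knows. The slides do not say how the score is computed, so `coverage`, `split_by_threshold` and the 0.7 threshold are illustrative assumptions.

```python
import re


def coverage(sentence, vocabulary):
    """Fraction of the sentence's words that appear in the known vocabulary."""
    words = re.findall(r"\w+", sentence.lower())
    if not words:
        return 0.0
    return sum(w in vocabulary for w in words) / len(words)


def split_by_threshold(sentences, vocabulary, threshold=0.7):
    """Separate sentences worth machine-translating from those to skip."""
    to_translate, to_skip = [], []
    for sentence in sentences:
        if coverage(sentence, vocabulary) >= threshold:
            to_translate.append(sentence)
        else:
            to_skip.append(sentence)
    return to_translate, to_skip


# Hypothetical system vocabulary and input.
vocab = {"open", "the", "file", "menu"}
batch = ["Open the file menu", "Reticulate the flux capacitor"]
print(split_by_threshold(batch, vocab))
```

Sentences in the skipped list would go to a human translator untouched. A production system would more likely use a model-based confidence estimate than raw vocabulary coverage.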
© 2011 #8
linguistic issues 2
<source text cleaning>
spelling and grammar
sentence simplification
terminology homogenization
<special words detection>
people, places, organizations
alphanumeric codes
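Alphanumeric codes can be protected by masking them before translation and restoring them afterwards. The sketch below assumes a code is any token that mixes letters and digits, optionally joined by hyphens. The pattern and the `__CODEn__` placeholder format are illustrative choices, not the method used in the talk.

```python
import re

# A token containing at least one digit and at least one letter.
CODE_PATTERN = re.compile(
    r"\b(?=[A-Za-z0-9-]*\d)(?=[A-Za-z0-9-]*[A-Za-z])"
    r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*\b"
)


def mask_codes(text):
    """Replace each code with a numbered placeholder; return text and codes."""
    codes = []

    def replace(match):
        codes.append(match.group(0))
        return f"__CODE{len(codes) - 1}__"

    return CODE_PATTERN.sub(replace, text), codes


def unmask_codes(text, codes):
    """Put the original codes back in place of their placeholders."""
    for index, code in enumerate(codes):
        text = text.replace(f"__CODE{index}__", code)
    return text


original = "Replace part AB-1234 with X9Z before 2011."
masked, found = mask_codes(original)
print(masked)
print(unmask_codes(masked, found))
```

Names of people, places and organizations need a named-entity recognizer rather than a regular expression, so they are not covered here.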
© 2011 #9
use case
<recurrent small volumes>
frequent translations
clients from different domains
<workflow>
gather as much data as possible
receive a new file for translation
create an ad hoc domain for that file
train the translation solution + basic rules
<output>
optimum adaptation for a file in around 4 hours
© 2011 #10
<tauyou_text> snapshot
<fully customizable>
look and feel + functionality
© 2011 #11
Thanks!
// Diego Bartolomé, PhD
<address> C/ Les Planes 39 – 08201 Sabadell – Spain
<phone> +34 93 711 29 96
<cell> +34 670 331 225
<email> dbc@tauyou.com
<www> tauyou.com

2011 TAUS Executive Forum Barcelona: Language Technology for optimum localization
