Machine Translation
The Neural Frontier
John Tinsley
GALA, Amsterdam, March 2017
Source: http://nlp.stanford.edu/projects/nmt/Luong-Cho-Manning-NMT-ACL2016-v4.pdf
What we’re actually going to cover this morning!
How does it work?
What’s all the fuss about?
“Neural machine translation is ______.”
What is the status as of today?
Is it really that good?
What does all this mean for the future?
What they actually said...
“In some cases human and GNMT translations are nearly indistinguishable on the relatively simplistic and isolated sentences sampled from Wikipedia and news articles for this experiment.”
What was reported...
MT developers around the world
Evolution
or
Revolution?
Source: (modified from) http://nlp.stanford.edu/projects/nmt/Luong-Cho-Manning-NMT-ACL2016-v4.pdf
Rule Based
 Statistical
Neural
A brief history of MT…
“State of the Union”
The initial splash made by statistical MT
The initial splash made by neural MT
wow that’s pretty good!
We’re about here now
March 27th 2007
This is where the excitement is coming from
Statistical Machine Translation
MT Quality
Neural Machine Translation
20+ years’ worth of research
?
Neural machine translation is exciting!
Neural machine translation is the future
Neural machine translation is ultimately just another type of MT
Neural machine translation is not going to replace human translators
Neural machine translation is not a silver bullet
Neural Machine Translation March 27th 2017
Academia
• Still early stage
• Language independent
• Fundamental practical considerations not yet addressed
Industry
• Generic applications only
• No flexibility for customisation
• Significant hurdles for cost-effective scalable production performance
Output can be insanely fluent!
Source: https://www.nytimes.com/2016/12/14/magazine/the-great-ai-awakening.html
They needed more computers — “G.P.U.s,”
graphics processors reconfigured for neural
networks — for training…
“Should we ask for a
thousand G.P.U.s?”
“Why not 2,000?”
Ten days later, they had the additional 2,000
processors.
Is it really that good?
(Yes, it can be!)
What evaluations are out there?
Anecdotal
• “Yeah it looks better”
Academic
• Generally, neural is better*
• More obviously so for complex languages
• It falls over badly on long sentences
WIPO
• Stark improvements for Chinese and Arabic
• Comparable performance on other languages
WIPO large scale apples-to-apples comparison
English to Chinese
Arabic to Chinese
Spanish to Chinese
French to Chinese
What evaluations are out there?
Iconic
• Practical comparison with production MT
• Mixed results depending on content type
• Clear strengths and weaknesses emerging
“Real-world” comparative use case
Real-world languages and content
• Chinese to English patents, mature production engine, highly tuned.
Apples to apples comparison
• Access to same training data, test data, including all of the ugly parts.
Effective qualitative evaluation
• No one-size-fits-all, so what is MT good at, and where does it fall down?
[Chart: Iconic Production MT vs. Iconic Neural MT, compared on short sentences and on all sentences]
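A comparison like the one in the chart could be set up roughly as follows. This is a minimal sketch, assuming each test sentence already has one quality score per system (the slides do not say which metric was used). The function name `compare_by_length`, the field names, the 10-token threshold, and the example scores are all hypothetical and made up for illustration. Counting whitespace-separated tokens is a simplification; Chinese source text would need a word segmenter or a character count instead.

```python
from statistics import mean


def compare_by_length(segments, short_max_tokens=10):
    """Average per-sentence scores for two systems, on short sentences and on all.

    `segments` is a list of dicts with a 'source' string and numeric
    'production' and 'neural' scores (hypothetical field names).
    """
    buckets = {
        # Whitespace tokenisation is an assumption made for this sketch.
        "short": [s for s in segments
                  if len(s["source"].split()) <= short_max_tokens],
        "all": list(segments),
    }
    report = {}
    for name, segs in buckets.items():
        if not segs:
            continue  # skip empty buckets rather than averaging nothing
        report[name] = {
            "n": len(segs),
            "production": mean(s["production"] for s in segs),
            "neural": mean(s["neural"] for s in segs),
        }
    return report


# Made-up scores, purely to show the shape of the input and output.
example = [
    {"source": "a short claim", "production": 0.5, "neural": 0.7},
    {"source": "a much longer claim with many more words in it than the first one",
     "production": 0.6, "neural": 0.4},
]
print(compare_by_length(example, short_max_tokens=5))
```

Reporting both buckets side by side is what lets a claim such as "it falls over badly on long sentences" be checked: a system can lead on the short bucket and still trail on the full set.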
Neural MT works – and it’s good!
It is not a silver bullet
+ word order
+ agreement
- omitting phrases

+ terminology
+ error-free output
- sentence structure
New Opportunities = New Challenges
• Black Box: “Why is this error happening?”
• Customisation: “Can you fix this error please?”
• Production: “How much is that GPU??!”
Old Challenges
• Data: Still needed, now more than ever!
• Evaluation: Do we know how to quantify “quality”?
• Pricing: How much does it cost now?
What does this mean for the future?
Short term
• Research which takes time
• More effective use of general machine translation
2-5 years
• Emerging use cases, new types of hybrid, and clarity
Longer term
• “Zero-shot” translation?
Rule-based
Statistical
Neural
You are here
Thank you!
john@iconictranslation.com
@johntins

P.S. This is kind of how neural machine translation works…
[Diagram: an encoder-decoder network. The Gaelic input “GO RAIBH MAITH AGAT” is read by the 1st recurrent neural network (the encoder), which produces the encoded sentence, a list of numbers (0.034203423, 3.343423423, 2.234235234, …). The 2nd recurrent neural network (the decoder) turns the encoded sentence into the English output “THANK YOU”. Memory of previously translated words influences the next result.]
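The encoder-decoder flow in the diagram can be sketched in a few lines of plain Python. This is an illustrative toy, not the system described in the talk: the class `ToyEncoderDecoder`, its helper functions, the hidden size of 8, and the tiny vocabularies are all assumptions made here, and the weights are random rather than trained, so the output words are meaningless. What it does show is the data flow: the first recurrent network folds the input words into one encoded vector, and the second recurrent network produces output words one at a time, each step depending on the state left by the previous ones.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Random weights; a real system would learn these from parallel text."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rnn_step(w_in, w_rec, x, h):
    """One recurrent step: the new state mixes the input with the previous state."""
    return [math.tanh(a + b) for a, b in zip(matvec(w_in, x), matvec(w_rec, h))]


def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]


class ToyEncoderDecoder:
    def __init__(self, src_vocab, tgt_vocab, hidden=8, seed=0):
        rng = random.Random(seed)
        self.src_vocab, self.tgt_vocab, self.hidden = src_vocab, tgt_vocab, hidden
        self.enc_in = make_matrix(hidden, len(src_vocab), rng)
        self.enc_rec = make_matrix(hidden, hidden, rng)
        self.dec_in = make_matrix(hidden, len(tgt_vocab), rng)
        self.dec_rec = make_matrix(hidden, hidden, rng)
        self.out = make_matrix(len(tgt_vocab), hidden, rng)

    def encode(self, words):
        """1st recurrent network: read the input and return the encoded sentence."""
        h = [0.0] * self.hidden
        for word in words:
            x = one_hot(self.src_vocab.index(word), len(self.src_vocab))
            h = rnn_step(self.enc_in, self.enc_rec, x, h)
        return h

    def decode(self, h, max_len=5):
        """2nd recurrent network: emit words greedily, feeding each one back in."""
        prev = self.tgt_vocab.index("<s>")
        result = []
        for _ in range(max_len):
            x = one_hot(prev, len(self.tgt_vocab))
            h = rnn_step(self.dec_in, self.dec_rec, x, h)  # state carries the memory
            scores = matvec(self.out, h)
            prev = scores.index(max(scores))
            if self.tgt_vocab[prev] == "</s>":
                break
            result.append(self.tgt_vocab[prev])
        return result


model = ToyEncoderDecoder(["GO", "RAIBH", "MAITH", "AGAT"],
                          ["<s>", "</s>", "THANK", "YOU"])
encoded = model.encode(["GO", "RAIBH", "MAITH", "AGAT"])
print(encoded)                # the "encoded sentence": a list of 8 numbers
print(model.decode(encoded))  # untrained, so not expected to say THANK YOU
```

Training would adjust the five weight matrices so that the decoder's highest-scoring word at each step matches the reference translation; that step is left out of this sketch.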

Editor's notes

  1. What this talk is not! => An intractable, impenetrable technical deep dive into how neural MT works!
  2. Fuss – context; NMT is – significance / impact; Status – not media / meaning for YOU; Good? – examples and case studies; Future – short, mid, longer term
  3. Before we get into it… Fake news. Embellished, sensationalised, out-of-context reporting – not doing anyone any favours. Example.
  4. Google paper, 23 pages. Not the first time. History of false dawns.
  5. I’m going to be a more friendly expectation manager! Not bursting bubble – just bringing down to earth. Bombastic!
  6. What better way to start. Answer needs historical context.
  7. Paradigm shift. Rule based didn’t go away. Neural last 2 years – why the fuss?
  8. SMT became incremental. NMT here out of the box. Excitement – haven’t even had the chance to try all these things! Hype cycle.
  9. Brand new runway. Light at the end of the tunnel - EXCITING
  10. Hard to timeframe – will try later. Run in parallel. But definitely the way forward.
  11. HOWEVER still just MT + ML, like SMT. Same UI, same integrations, same problems (later). But better quality = ?
  12. Not US vs THEM, MAN vs MACHINE. Reframe conversation – competition, frustrating. Complementary technology. Own use cases.
  13. Promising, way forward, needs time… “Do you do neural machine translation?” With that being said, where are we today?
  14. Early – fringe in 2015; General – “German nouns”; Practical – not yet, glossaries, customisation BUT GOOD people; Generic – no use cases; Flexibility – no customisation; GPUs! revisit. MUCH OF IT IS A MATTER OF TIME BUT THAT’S WHERE WE STAND NOW
  15. Obvious question – depends on many factors, let’s look at evals.
  16. Anecdotal – good, surface, needs deeper; Academic – more effective in some cases. Room for improvement = improvement; WIPO – broadly on par but some interesting. LET’S LOOK
  17. Automatic scores. We don’t know why.
  18. Most highly optimised type of MT. We want to know ourselves!
  19. My first experience, very cool. What are the practical implications of this? What do we do now? What’s holding us back? Now we have some direction.
  20. Leveraging and utilising it in industry is challenging! How do we make decisions – where to use it, when, and how! Needs more practical field testing…
  21. “This first wave of NMT solutions are mostly generic systems, which are clearly improved in most language combinations over existing generic SMT solutions, especially to human evaluators. While we need to be wary of over exuberance about the progress, there is reason for optimism and we can expect further quality improvements as our understanding of the mystery of ‘hidden layers’ of deep learning improves”, “MT must be adaptable/customizable for specific business purposes, i.e. they need to learn specific terminology and specific customer domain. Comprehensive customization will take significantly more computing time and all the requirements for good quality data will only intensify.”