This was a talk given at the annual GALA conference in Amsterdam on March 27th, 2017. The topic: "Neural Machine Translation: Where are we now?"
Neural Machine Translation is at the peak of a hype cycle. There is no doubt it is an emerging technology with massive potential, but it is not yet a sweeping solution to all ills. Several factors prevent NMT from being commercially ready. Expectations, therefore, need to be managed. That is the goal of this presentation.
3. What we’re actually going to cover this morning!
How does it work?
What’s all the fuss about?
“Neural machine translation is ______.”
What is the status as of today?
Is it really that good?
What does all this mean for the future?
5. What they actually said...
"In some cases human and GNMT translations are nearly indistinguishable on the relatively simplistic and isolated sentences sampled from Wikipedia and news articles for this experiment."
What was reported...
MT developers around the world
8. A brief history of MT: Rule-Based → Statistical → Neural
Source: (modified from) http://nlp.stanford.edu/projects/nmt/Luong-Cho-Manning-NMT-ACL2016-v4.pdf
9. "State of the Union"
[Chart: MT quality over time, as of March 27th 2017. Labels: the initial splash made by statistical MT; 20+ years' worth of research; "we're about here now"; the initial splash made by neural MT ("wow, that's pretty good!"); "this is where the excitement is coming from"; and a "?" for what comes next.]
15. Neural Machine Translation, March 27th 2017: moving from academia into industry
• Still early stage
• Language independent
• Output can be insanely fluent!
• Fundamental practical considerations not yet addressed
• Generic applications only; no flexibility for customisation
• Significant hurdles for cost-effective, scalable production performance
18. What evaluations are out there?
Anecdotal
• "Yeah, it looks better"
Academic
• Generally, neural is better*
• More obviously so for complex languages
• It falls over badly on long sentences
WIPO
• Stark improvements for Chinese and Arabic
• Comparable performance on other languages
19. WIPO large-scale apples-to-apples comparison
English to Chinese
Arabic to Chinese
Spanish to Chinese
French to Chinese
20. What evaluations are out there?
Anecdotal
• "Yeah, it looks better"
Academic
• Generally, neural is better*
• More obviously so for complex languages
• It falls over badly on long sentences
WIPO
• Stark improvements for Chinese and Arabic
• Comparable performance on other languages
Iconic
• Practical comparison with production MT
• Mixed results depending on content type
• Clear strengths and weaknesses emerging
21. "Real-world" comparative use case
Real-world languages and content
• Chinese-to-English patents; mature production engine, highly tuned
Apples-to-apples comparison
• Access to the same training data and test data, including all of the ugly parts
Effective qualitative evaluation
• There is no one-size-fits-all, so where is MT good, and where does it fall down?
22. Neural MT works – and it's good! It is not a silver bullet.
[Chart: automatic scores for Iconic Production MT vs. Iconic Neural MT, on short sentences and on all sentences]
Strengths: + word order, + agreement, + terminology, + error-free output
Weaknesses: - omitting phrases, - sentence structure
23. New Opportunities = New Challenges
New challenges:
• Black box: "Why is this error happening?"
• Customisation: "Can you fix this error please?"
• Production: "How much is that GPU??!"
Old challenges:
• Data: still needed, now more than ever!
• Evaluation: do we know how to quantify "quality"?
• Pricing: how much does it cost now?
24. What does this mean for the future?
Short term
• Research, which takes time
• More effective use of general machine translation
2–5 years
• Emerging use cases, new types of hybrid, and clarity
Longer term
• "Zero-shot" translation?
[Timeline: Rule-based → Statistical → Neural. "You are here"]
What this talk is not! => An intractable, impenetrable technical deep dive into how neural MT works!
Fuss – context
NMT is – significance / impact
Status – not media / meaning for YOU
Good? – examples and case studies
Future – short, mid, longer term
Before we get into it…
Fake news
Embellished, sensationalised, out-of-context reporting – not doing anyone any favours
Example
Google paper, 23 pages
Not the first time
History of false dawns
I’m going to be a more friendly expectation manager!
Not bursting bubble – just bringing down to earth
Bombastic!
What better way to start
Answer needs historical context
Paradigm shift
Rule based didn’t go away
Neural last 2 years – why the fuss?
SMT became incremental
NMT here out of the box
Excitement – haven’t even had the chance to try all these things!
Hype cycle
Brand new runway
Light at the end of the tunnel – EXCITING
Hard to timeframe – will try later
Run in parallel
But definitely the way forward
HOWEVER
still just MT + ML, like SMT
Same UI, same integrations, same problems (later)
But better quality = ?
Not US vs THEM, MAN vs MACHINE
Reframe conversation – competition, frustrating
Complementary technology
Own use cases
Promising, way forward, needs time…
“Do you do neural machine translation?”
With that being said, where are we today
Early – fringe in 2015
General – “German nouns”
Practical – not yet, glossaries, customisation BUT GOOD people
Generic – no use cases
Flexibility – no customisation
GPUs! revisit
MUCH OF IT IS A MATTER OF TIME BUT THAT’S WHERE WE STAND NOW
Obvious question – depends on many factors, let’s look at evals
Anecdotal – good, surface, needs deeper
Academic – more effective in some cases. Room for improvement = improvement
WIPO – broadly on par but some interesting. LET’S LOOK
Automatic scores
We don’t know why
most highly optimised type of MT
we want to know ourselves!
My first experience, very cool.
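The "automatic scores" referred to here are typically BLEU-style n-gram overlap metrics. As a rough, hypothetical illustration only (not the actual metric implementation used in the WIPO or Iconic comparisons, which would rely on standardised tools such as sacrebleu), sentence-level BLEU with simple +1 smoothing can be sketched as:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU sketch: geometric mean of smoothed n-gram
    precisions times a brevity penalty. Whitespace tokenisation only."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clipped matches: a candidate n-gram only counts as many
        # times as it also appears in the reference
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        precisions.append((overlap + 1) / (total + 1))  # +1 smoothing
    # Brevity penalty: punish candidates shorter than the reference
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

An identical candidate and reference score 1.0; the score falls as n-gram overlap drops, and the brevity penalty pulls down translations shorter than the reference. Such surface-overlap scores are exactly why the deeper human evaluations above were needed: a fluent neural output can score well while omitting phrases.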
What are the practical implications of this? What do we do now? What’s holding us back?
Now we have some direction
Leveraging and utilising it in industry is challenging!
How do we make decisions – where to use it, when, and how! Needs more practical field testing…
"This first wave of NMT solutions are mostly generic systems, which are clearly improved in most language combinations over existing generic SMT solutions, especially to human evaluators. While we need to be wary of over exuberance about the progress, there is reason for optimism and we can expect further quality improvements as our understanding of the mystery of 'hidden layers' of deep learning improves."
“MT must be adaptable/customizable for specific business purposes, i.e. they need to learn specific terminology and specific customer domain. Comprehensive customization will take significantly more computing time and all the requirements for good quality data will only intensify.”