TAUS USER CONFERENCE 2010
LANGUAGE BUSINESS INNOVATION
4 – 6 OCTOBER / PORTLAND (OR), USA




MONDAY 4 OCTOBER / 15.00

MAN, MACHINE AND ADVANCED TRANSLATION
MEMORY LEVERAGING
Daniel Gervais, MultiCorpora
Five New Technologies...


     ...that will change enterprise computing.

          Search – the Next Generation
          Environments to create Virtual Companies
          Virtualization Management Consoles
          Secure Cloud Creation
          Management Technologies

            Source:
            Eric Lundquist, Editor-in-Chief, eWeek
            smartertechnology.com




© 2009 – 2010 | This confidential document is the property of MultiCorpora and cannot be shared, reproduced, distributed or used without permission.
So, what does that mean for us?

     • elastic capacity                                         • Search – the Next Generation
     • fault tolerant                                           • Environments to create Virtual Companies
     • Scalable                                                 • Virtualization Management Consoles
     • Secure                                                   • Secure Cloud Creation
     • and easily maintained                                    • Management Technologies


     Cool concepts, but...
      How does this affect our industry?
      How do we access them?
      How do we harness them for greater productivity?
      What are the real benefits?
      What is the cost?
      What are the best practices?
      Where are they going?


A brief roundup of SCbDS


      Super-Cloud based Data Sharing
           o      TDA
           o      MyMemory
           o      Google Translate
           o      Grand Dictionnaire Terminologique, Termium, IATE, ...
           o      EUR-Lex
           o      Other multilingual public-domain sources
           o      ...



SCbDS upsides

      Advances in technology support large translation memories
       o Build vs. existing
       o Proprietary vs. shared
       o Public domain mining
      Align large multilingual corpora
      Data mine within aligned corpora
      Measurable benefits have been obtained through ALTM (Advanced Leveraging
       Translation Memory) on top of large memories
      BUT THERE’S A DANGER: translation memory pollution and too much
       automation!




Translation Memory Pollution is...

      Correctly aligned segments containing poor translation:
           o      Inadequate editing
           o      Poor post-mortem cleanup
      Incorrectly aligned segments:
           o      Poor alignment technology
           o      Inadequate post-alignment proofing
          Rogue tags
          Correct translation of undesired content
          Correct translation of obsolete source
          Obsolete translation of correct source
          Poor translation of poorly written source content
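
Two of the pollution sources listed above, rogue tags and misaligned segments, lend themselves to simple automated screening. The following is a minimal sketch, not the method used in the talk: the function name `pollution_flags`, the tag pattern and the `max_ratio` threshold are all assumptions chosen for illustration, and the sample pairs are invented.

```python
import re

# Assumption: inline tags look like simple XML/HTML tags such as <b> or </i>.
TAG = re.compile(r"</?[A-Za-z][^>]*>")

def pollution_flags(source: str, target: str, max_ratio: float = 3.0) -> list[str]:
    """Return heuristic warnings for one translation-memory pair."""
    flags = []
    # Rogue tags: the inline tags in the target do not mirror those in the source.
    if sorted(TAG.findall(source)) != sorted(TAG.findall(target)):
        flags.append("tag-mismatch")
    # Possible misalignment: text lengths differ far more than translation
    # usually explains (max_ratio is an arbitrary illustrative threshold).
    src_len = len(TAG.sub("", source).strip())
    tgt_len = len(TAG.sub("", target).strip())
    shorter, longer = min(src_len, tgt_len), max(src_len, tgt_len)
    if shorter == 0 or longer / shorter > max_ratio:
        flags.append("length-ratio")
    return flags

# Invented sample pairs, for illustration only.
pairs = [
    ("Press <b>Start</b>.", "Appuyez sur <b>Démarrer</b>."),
    ("Press <b>Start</b>.", "Appuyez sur Démarrer.</i>"),
    ("OK", "Veuillez consulter le manuel avant de continuer."),
]
for src, tgt in pairs:
    print(pollution_flags(src, tgt), "|", src, "=>", tgt)
```

Checks like these only surface candidates. The other pollution types (obsolete, undesired or poorly written content) still need a human reviewer.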




Translation Memory Pollution: overall conclusion

           Sentence-level leveraging in absence of contextual information is too
           simplistic and can lead to unsatisfactory results!




                    [Diagram: TM → ??? → garbled output]




The Big Question

      Does increased
       matching through
       ALTM equate to
       REAL productivity
       gain?




          We say    YES!

Here's why we say YES!


      Large enterprise case
      Large government case
      Department of Justice
      Medax
      UNESCO
      Services Canada
The main problem
      Wide variation of Document Types
      Legacy files in PDF
      No TM for certain customers
     Secondary problems
      Content is often complex
      Highly sensitive to context and style
      Highly client-specific




Conventional TMs

     Mixed Results:
          No promised massive cost savings
          Useful enforcement tool
          Conventional terminology tool unwieldy
          Excel spreadsheets preferred!
     Time Investment Critical
      Therefore, selectivity of clients
      No ability to influence clients at the authoring stage - Documents are
       rarely repetitive on a traditional segment model
      Cost-benefit decisions: no TMs or truncated TMs




ALTM addressed needs for:

                 Context
                 Matches at the paragraph level
                 Matches at the segment and sub-segment levels
                 Interfacing/Compatibility with external vendors who
                  used various TM tools
                 Better integration with terminology management, live
                  online deployment
                 Server-based solution to link global production platform
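
To make the idea of sub-segment matching concrete, here is a minimal sketch that finds word sequences shared between a new sentence and a stored source segment. It is not MultiCorpora's algorithm: the function name `shared_subsegments`, the whitespace tokenization and the `min_words` cutoff are assumptions, and the sample sentences are invented.

```python
def shared_subsegments(query: str, memory_source: str, min_words: int = 3) -> list[str]:
    """Return maximal word runs (at least min_words long) found in both strings."""
    q = query.lower().split()
    m = memory_source.lower().split()
    # run[i][j] = length of the common word run ending at q[i-1] and m[j-1]
    run = [[0] * (len(m) + 1) for _ in range(len(q) + 1)]
    found = []
    for i in range(1, len(q) + 1):
        for j in range(1, len(m) + 1):
            if q[i - 1] != m[j - 1]:
                continue
            run[i][j] = run[i - 1][j - 1] + 1
            # Keep a run only when the next words differ, so it is maximal.
            extendable = i < len(q) and j < len(m) and q[i] == m[j]
            if run[i][j] >= min_words and not extendable:
                found.append(" ".join(q[i - run[i][j]:i]))
    return found

# Invented example sentences.
new_sentence = "The minister may issue a permit to the applicant"
stored_source = "Where the minister may issue a permit under this section"
print(shared_subsegments(new_sentence, stored_source))
```

A sentence-level TM would score this pair as a weak fuzzy match at best. The shared run is the part a sub-segment lookup could still offer to the translator, together with its context.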




ALTM Benefits:


          Alignment automation = Low overhead for maintaining memory
          Rapid creation of larger memories = Faster project scoping and bidding
          Higher probability of matches
          Context provided at all times = Reduce research time
          Identification of sub-expressions = Result in more matches
          Terminology integration = Reduce research time, increase consistency


                       In general, more matches reduce revision time
                       Used to rebuild out-of-date conventional TMs
                               Cost-effective competitiveness

Proof of proposal example


      Translation Bureau RFP
           o For 1200 licenses
           o Proof of Proposal – 5 consecutive business days:
                                Install full client-server setup, 20 workstations
                                Create a production TM from 15 000 pairs of unstructured
                                 documents in various formats (≈ 20 M source words)
                                1 day of user training for 10 people
                                1 day of simulated production use
                                Ensure no productivity loss and compute gains
                                              MultiCorpora won the RFP

Harmonize legacy documents


      Department of Justice Canada
           o Laws & Regulations in French and English
           o No harmonization of ambiguous terms
           o ALTM made it possible to extract terminology, see the translation
             discrepancies in context and identify corrections
           o ALTM combined with terminology management made it possible to build
             TermBases of ambiguous terms from the processing of one document, and
             to correct them in all other documents
           o Continuous learning process, powered by ALTM
                                        Do in computing minutes
                                      what used to take people months

Medax

• German Translation Service Provider
            •      geographically dispersed translator pool - roughly 250 doctors and pharmacists
            •      seven full-time employees oversee processing of nearly 5 million words per year
      • Historically no clear TM strategy
            •      Document types not conducive to TM
            •      Lacklustre productivity gains vs. overhead
      • Discovered ROI from the terminology management and sub-segment
        matching
            •      high number of shorter, domain-specific repeated sub-segment phrases
       Creates hybrid, partially pre-translated documents containing
        “pre-harmonized” terminology to send out
            o      90% comes from the TermBase, created through sub-segment matching and analysis
            o      Remaining 10% from the TextBase



UNESCO

       “On the Fly” translation memories
            o Analyse docs against all translation memories
            o Identify which docs and memories are the most used
            o Re-build specific memories from UNESCO documents and from related
              organisations’ documents referenced in them
            o Achieve a higher degree of recycling from partner organisations’
              documents
            o Ability to recycle / harmonize domain-specific terminology by
              example, powered by ALTM.
            o Continuous improvement virtuous circle
                Create a TM in minutes vs. what would take months to align
                                Add additional external content
                 Get domain-specific terminology through sub-expressions
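
The first two steps on this slide (analyse documents against all memories, then identify the most used ones) can be sketched as a coverage ranking. This is a minimal illustration that counts exact segment matches only. The function name `rank_memories`, the memory names and the sample sentences are invented for the example.

```python
def rank_memories(doc_segments, memories):
    """Rank memories by how many document segments they contain (exact match).

    memories maps a memory name to an iterable of source segments.
    Returns (name, hits, coverage) tuples, most-used memory first.
    """
    def norm(text):
        # Case-fold and collapse whitespace so trivial differences do not block a hit.
        return " ".join(text.lower().split())

    doc = [norm(s) for s in doc_segments]
    ranking = []
    for name, segments in memories.items():
        known = {norm(s) for s in segments}
        hits = sum(1 for s in doc if s in known)
        coverage = hits / len(doc) if doc else 0.0
        ranking.append((name, hits, coverage))
    return sorted(ranking, key=lambda row: row[1], reverse=True)

# Invented sample data.
document = [
    "Member States shall report annually.",
    "The budget is approved.",
    "The committee meets twice a year.",
]
memories = {
    "memory_b": ["Member States shall report annually."],
    "memory_a": ["The budget is approved.", "The committee meets twice a year."],
}
for name, hits, coverage in rank_memories(document, memories):
    print(f"{name}: {hits} hits, {coverage:.0%} coverage")
```

A real analysis would also count fuzzy and sub-segment matches. The ranking idea stays the same: memories that rarely contribute are candidates for rebuilding or retirement.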
Services Canada - Job Bank


      Distinctive Hybrid translation
       process
           o 90M words per year
           o TM / MT / post editing
           o Linguistic assets comprise
                                Previous job offers
                                Domain-specific terms
                                Shared data increased productivity




Translation Memory Pollution: Antidote


      Content selection
           o Too much unstructured content
           o Need to establish a mining hierarchy

      Use of statistics
           o Generate usage & translation distribution statistics per content
             repositories
           o Standardize in “live” Terminology Databases

      Use human intelligence
           o Humans need to be involved. Too much automation only
             propagates pollution…
           o Virtuous improvement circle
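
The "use of statistics" point can be illustrated with a small sketch that tallies how each source term has been translated and flags terms with no clearly dominant rendering. The function names, the `dominance` threshold and the sample term pairs are assumptions made for this example and do not come from the talk.

```python
from collections import Counter, defaultdict

def translation_distribution(term_pairs):
    """Map each source term to a Counter of the target renderings seen for it."""
    dist = defaultdict(Counter)
    for source_term, target_term in term_pairs:
        dist[source_term.lower()][target_term.lower()] += 1
    return dist

def inconsistent_terms(dist, dominance=0.9):
    """Return terms whose most frequent rendering covers less than `dominance` of uses."""
    flagged = {}
    for term, counts in dist.items():
        total = sum(counts.values())
        top_count = counts.most_common(1)[0][1]
        if top_count / total < dominance:
            flagged[term] = counts.most_common()
    return flagged

# Invented sample observations (source term, target rendering).
observed = [
    ("regulation", "règlement"),
    ("regulation", "règlement"),
    ("regulation", "réglementation"),
    ("act", "loi"),
    ("act", "loi"),
]
dist = translation_distribution(observed)
print(inconsistent_terms(dist))
```

The flagged terms are a worklist for a terminologist, who decides which rendering to standardize in the terminology database. The statistics inform the decision but do not make it.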

Other uses of ALTM


      Monolingual analysis
           o Identify single source candidates
           o Identify terms to standardize
           o Identify deviations of customized documents from
             baseline texts
           o Prioritize the localization order of baseline
             documents - 15% savings potential
                                TextBase repetitions
                                Term repetitions
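
As a rough illustration of monolingual repetition analysis, the sketch below counts sentences that recur across a set of source documents and reports the share of segments that repeat an earlier one. The naive sentence splitter, the function names and the sample documents are assumptions. It does not reproduce the TextBase analysis or the savings figure cited on the slide.

```python
import re
from collections import Counter

def split_segments(text):
    """Naively split on ., ! or ? followed by whitespace, then normalize case and spacing."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [" ".join(p.lower().split()) for p in parts if p.strip()]

def segment_counts(documents):
    """Count how often each normalized segment occurs across all documents."""
    return Counter(seg for doc in documents for seg in split_segments(doc))

def repeated_segments(documents, min_count=2):
    """Return (segment, count) pairs seen at least min_count times, most frequent first."""
    return [(s, n) for s, n in segment_counts(documents).most_common() if n >= min_count]

def repetition_rate(documents):
    """Share of all segments that repeat a segment already seen elsewhere."""
    counts = segment_counts(documents)
    total = sum(counts.values())
    return 1 - len(counts) / total if total else 0.0

# Invented sample documents.
docs = [
    "Read the safety notice. Install the unit.",
    "Read the safety notice. Connect the cable.",
    "Install the unit. Read the safety notice.",
]
print(repeated_segments(docs))
print(repetition_rate(docs))
```

Segments that recur across many documents are natural single-source candidates, and documents with the highest share of repeated segments are reasonable ones to localize first.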




The Journey Is Not Yet Finished

      More automation of the antidotes to pollution
      Recent improvement in term extraction algorithms can
       expose pollution sources
      Evangelization of the processes
      No quick fix: Human factor remains involved. Not yet at the
       vision of fully automated pre-translated ALTM.
      New collaboration models between linguists and TM
       systems
      Better support for linguistic decision-making
      Evangelization of the role of the post-editor


TAUS USER CONFERENCE 2010, Man, Machine and advanced translation memory leveraging
TAUS USER CONFERENCE 2010, Man, Machine and advanced translation memory leveraging

Weitere ähnliche Inhalte

Andere mochten auch

TAUS webinar The Big Picture View On The Translation Industry, March 2013
TAUS webinar The Big Picture View On The Translation Industry, March 2013TAUS webinar The Big Picture View On The Translation Industry, March 2013
TAUS webinar The Big Picture View On The Translation Industry, March 2013TAUS - The Language Data Network
 
The Future of Technical Communication is Marketing
The Future of Technical Communication is MarketingThe Future of Technical Communication is Marketing
The Future of Technical Communication is MarketingScott Abel
 
TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...
TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...
TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...TAUS - The Language Data Network
 
Antzinaroa eta erdi aroa nora taus
Antzinaroa eta erdi aroa nora tausAntzinaroa eta erdi aroa nora taus
Antzinaroa eta erdi aroa nora tausLourdes Macicior
 
Achieving Translation Efficiency and Accuracy for Video Content, Xiao Yuan (P...
Achieving Translation Efficiency and Accuracy for Video Content, Xiao Yuan (P...Achieving Translation Efficiency and Accuracy for Video Content, Xiao Yuan (P...
Achieving Translation Efficiency and Accuracy for Video Content, Xiao Yuan (P...TAUS - The Language Data Network
 
TAUS USER CONFERENCE 2010, The Deep Hybrid machine translation engine
TAUS USER CONFERENCE 2010, The Deep Hybrid machine translation engineTAUS USER CONFERENCE 2010, The Deep Hybrid machine translation engine
TAUS USER CONFERENCE 2010, The Deep Hybrid machine translation engineTAUS - The Language Data Network
 
The cognitive era and the future of content
The cognitive era and the future of contentThe cognitive era and the future of content
The cognitive era and the future of contentScott Abel
 

Andere mochten auch (10)

TAUS webinar The Big Picture View On The Translation Industry, March 2013
TAUS webinar The Big Picture View On The Translation Industry, March 2013TAUS webinar The Big Picture View On The Translation Industry, March 2013
TAUS webinar The Big Picture View On The Translation Industry, March 2013
 
TAUS Moses Roundtable, Prague, 11 September 2013
TAUS Moses Roundtable, Prague, 11 September 2013TAUS Moses Roundtable, Prague, 11 September 2013
TAUS Moses Roundtable, Prague, 11 September 2013
 
TAUS New Year's Reception 2014
TAUS New Year's Reception 2014TAUS New Year's Reception 2014
TAUS New Year's Reception 2014
 
The Future of Technical Communication is Marketing
The Future of Technical Communication is MarketingThe Future of Technical Communication is Marketing
The Future of Technical Communication is Marketing
 
TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...
TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...
TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...
 
TAUS MT Post-Editing Guidelines
TAUS MT Post-Editing GuidelinesTAUS MT Post-Editing Guidelines
TAUS MT Post-Editing Guidelines
 
Antzinaroa eta erdi aroa nora taus
Antzinaroa eta erdi aroa nora tausAntzinaroa eta erdi aroa nora taus
Antzinaroa eta erdi aroa nora taus
 
Achieving Translation Efficiency and Accuracy for Video Content, Xiao Yuan (P...
Achieving Translation Efficiency and Accuracy for Video Content, Xiao Yuan (P...Achieving Translation Efficiency and Accuracy for Video Content, Xiao Yuan (P...
Achieving Translation Efficiency and Accuracy for Video Content, Xiao Yuan (P...
 
TAUS USER CONFERENCE 2010, The Deep Hybrid machine translation engine
TAUS USER CONFERENCE 2010, The Deep Hybrid machine translation engineTAUS USER CONFERENCE 2010, The Deep Hybrid machine translation engine
TAUS USER CONFERENCE 2010, The Deep Hybrid machine translation engine
 
The cognitive era and the future of content
The cognitive era and the future of contentThe cognitive era and the future of content
The cognitive era and the future of content
 

Ähnlich wie TAUS USER CONFERENCE 2010, Man, Machine and advanced translation memory leveraging

DEC-16-UNLEASH THE POWER OF HUMAN COLLABORATION
DEC-16-UNLEASH THE POWER OF HUMAN COLLABORATIONDEC-16-UNLEASH THE POWER OF HUMAN COLLABORATION
DEC-16-UNLEASH THE POWER OF HUMAN COLLABORATIONMichael G. Schwarzwalder
 
Intro to watson bluemix services
Intro to watson bluemix servicesIntro to watson bluemix services
Intro to watson bluemix servicesVikas Manoria
 
Tempo - Mobile access with Governance
Tempo - Mobile access with GovernanceTempo - Mobile access with Governance
Tempo - Mobile access with GovernanceGabe Faraone
 
SDL Server 2009 Launch Presentation
SDL Server 2009 Launch PresentationSDL Server 2009 Launch Presentation
SDL Server 2009 Launch Presentationanthonytate88
 
Lotusphere BP304: Looking For the Right Document Management Alternative
Lotusphere BP304: Looking For the Right Document Management AlternativeLotusphere BP304: Looking For the Right Document Management Alternative
Lotusphere BP304: Looking For the Right Document Management AlternativeRoland Driesen
 
Tely Labs Webinar Intro May 22nd 2014
Tely Labs Webinar Intro May 22nd 2014Tely Labs Webinar Intro May 22nd 2014
Tely Labs Webinar Intro May 22nd 2014Paul Richards
 
Exponential e-unified-communications-presentations
Exponential e-unified-communications-presentationsExponential e-unified-communications-presentations
Exponential e-unified-communications-presentationsExponential_e
 
Unified Communications - Collaborative services that deliver greater busines...
Unified Communications  - Collaborative services that deliver greater busines...Unified Communications  - Collaborative services that deliver greater busines...
Unified Communications - Collaborative services that deliver greater busines...Exponential_e
 
The Human ROI: Past, Present and Future of Localization
The Human ROI: Past, Present and Future of LocalizationThe Human ROI: Past, Present and Future of Localization
The Human ROI: Past, Present and Future of LocalizationMichael Meinhardt
 
UG Software Technologies
UG Software TechnologiesUG Software Technologies
UG Software TechnologiesUg Webmart
 
Case Studies in Enterprise Messaging Federation
Case Studies in Enterprise Messaging FederationCase Studies in Enterprise Messaging Federation
Case Studies in Enterprise Messaging FederationAlan Quayle
 
Evolve Com Tec Presentation
Evolve Com Tec PresentationEvolve Com Tec Presentation
Evolve Com Tec Presentationsamanthahubbard
 
Microsoft Skype for Business and the quest for legacy video interoperability
Microsoft Skype for Business and the quest for legacy video interoperabilityMicrosoft Skype for Business and the quest for legacy video interoperability
Microsoft Skype for Business and the quest for legacy video interoperabilityAnders Løkke
 
OpenText PowerDOCS: A Cloud Solution for Document Generation
OpenText PowerDOCS: A Cloud Solution for Document GenerationOpenText PowerDOCS: A Cloud Solution for Document Generation
OpenText PowerDOCS: A Cloud Solution for Document GenerationMarc St-Pierre
 
IBM Lotus Sametime - IM for the Enterprise
IBM Lotus Sametime - IM for the EnterpriseIBM Lotus Sametime - IM for the Enterprise
IBM Lotus Sametime - IM for the EnterpriseDvir Reznik
 
The Growing Research that Open Source Owns the Future in Cloud
The Growing Research that Open Source Owns the Future in CloudThe Growing Research that Open Source Owns the Future in Cloud
The Growing Research that Open Source Owns the Future in CloudAll Things Open
 
Train foundation model for domain-specific language model
Train foundation model for domain-specific language modelTrain foundation model for domain-specific language model
Train foundation model for domain-specific language modelBenjaminlapid1
 
MiTiN 2013 Keynote in Detroit Michigan
MiTiN 2013 Keynote in Detroit MichiganMiTiN 2013 Keynote in Detroit Michigan
MiTiN 2013 Keynote in Detroit MichiganKirti Vashee
 

Ähnlich wie TAUS USER CONFERENCE 2010, Man, Machine and advanced translation memory leveraging (20)

DEC-16-UNLEASH THE POWER OF HUMAN COLLABORATION
DEC-16-UNLEASH THE POWER OF HUMAN COLLABORATIONDEC-16-UNLEASH THE POWER OF HUMAN COLLABORATION
DEC-16-UNLEASH THE POWER OF HUMAN COLLABORATION
 
Insights in the MT Market, by Jaap van der Meer, TAUS
Insights in the MT Market, by Jaap van der Meer, TAUSInsights in the MT Market, by Jaap van der Meer, TAUS
Insights in the MT Market, by Jaap van der Meer, TAUS
 
Intro to watson bluemix services
Intro to watson bluemix servicesIntro to watson bluemix services
Intro to watson bluemix services
 
Tempo - Mobile access with Governance
Tempo - Mobile access with GovernanceTempo - Mobile access with Governance
Tempo - Mobile access with Governance
 
SDL Server 2009 Launch Presentation
SDL Server 2009 Launch PresentationSDL Server 2009 Launch Presentation
SDL Server 2009 Launch Presentation
 
Lotusphere BP304: Looking For the Right Document Management Alternative
Lotusphere BP304: Looking For the Right Document Management AlternativeLotusphere BP304: Looking For the Right Document Management Alternative
Lotusphere BP304: Looking For the Right Document Management Alternative
 
Tely Labs Webinar Intro May 22nd 2014
Tely Labs Webinar Intro May 22nd 2014Tely Labs Webinar Intro May 22nd 2014
Tely Labs Webinar Intro May 22nd 2014
 
Exponential e-unified-communications-presentations
Exponential e-unified-communications-presentationsExponential e-unified-communications-presentations
Exponential e-unified-communications-presentations
 
Unified Communications - Collaborative services that deliver greater busines...
Unified Communications  - Collaborative services that deliver greater busines...Unified Communications  - Collaborative services that deliver greater busines...
Unified Communications - Collaborative services that deliver greater busines...
 
Open Source in Government / Graham Taylor
Open Source in Government / Graham TaylorOpen Source in Government / Graham Taylor
Open Source in Government / Graham Taylor
 
The Human ROI: Past, Present and Future of Localization
The Human ROI: Past, Present and Future of LocalizationThe Human ROI: Past, Present and Future of Localization
The Human ROI: Past, Present and Future of Localization
 
UG Software Technologies
UG Software TechnologiesUG Software Technologies
UG Software Technologies
 
Case Studies in Enterprise Messaging Federation
Case Studies in Enterprise Messaging FederationCase Studies in Enterprise Messaging Federation
Case Studies in Enterprise Messaging Federation
 
Evolve Com Tec Presentation
Evolve Com Tec PresentationEvolve Com Tec Presentation
Evolve Com Tec Presentation
 
Microsoft Skype for Business and the quest for legacy video interoperability
Microsoft Skype for Business and the quest for legacy video interoperabilityMicrosoft Skype for Business and the quest for legacy video interoperability
Microsoft Skype for Business and the quest for legacy video interoperability
 
OpenText PowerDOCS: A Cloud Solution for Document Generation
OpenText PowerDOCS: A Cloud Solution for Document GenerationOpenText PowerDOCS: A Cloud Solution for Document Generation
OpenText PowerDOCS: A Cloud Solution for Document Generation
 
IBM Lotus Sametime - IM for the Enterprise
IBM Lotus Sametime - IM for the EnterpriseIBM Lotus Sametime - IM for the Enterprise
IBM Lotus Sametime - IM for the Enterprise
 
The Growing Research that Open Source Owns the Future in Cloud
The Growing Research that Open Source Owns the Future in CloudThe Growing Research that Open Source Owns the Future in Cloud
The Growing Research that Open Source Owns the Future in Cloud
 
Train foundation model for domain-specific language model
Train foundation model for domain-specific language modelTrain foundation model for domain-specific language model
Train foundation model for domain-specific language model
 
MiTiN 2013 Keynote in Detroit Michigan
MiTiN 2013 Keynote in Detroit MichiganMiTiN 2013 Keynote in Detroit Michigan
MiTiN 2013 Keynote in Detroit Michigan
 

Mehr von TAUS - The Language Data Network

TAUS Global Content Summit Amsterdam 2019 / Beyond MT. A few premature reflec...
TAUS Global Content Summit Amsterdam 2019 / Beyond MT. A few premature reflec...TAUS Global Content Summit Amsterdam 2019 / Beyond MT. A few premature reflec...
TAUS Global Content Summit Amsterdam 2019 / Beyond MT. A few premature reflec...TAUS - The Language Data Network
 
TAUS Global Content Summit Amsterdam 2019 / Measure with DQF, Dace Dzeguze (T...
TAUS Global Content Summit Amsterdam 2019 / Measure with DQF, Dace Dzeguze (T...TAUS Global Content Summit Amsterdam 2019 / Measure with DQF, Dace Dzeguze (T...
TAUS Global Content Summit Amsterdam 2019 / Measure with DQF, Dace Dzeguze (T...TAUS - The Language Data Network
 
TAUS Global Content Summit Amsterdam 2019 / Automatic for the People by Domin...
TAUS Global Content Summit Amsterdam 2019 / Automatic for the People by Domin...TAUS Global Content Summit Amsterdam 2019 / Automatic for the People by Domin...
TAUS Global Content Summit Amsterdam 2019 / Automatic for the People by Domin...TAUS - The Language Data Network
 
TAUS Global Content Summit Amsterdam 2019 / The Quantum Leap: Human Parity, C...
TAUS Global Content Summit Amsterdam 2019 / The Quantum Leap: Human Parity, C...TAUS Global Content Summit Amsterdam 2019 / The Quantum Leap: Human Parity, C...
TAUS Global Content Summit Amsterdam 2019 / The Quantum Leap: Human Parity, C...TAUS - The Language Data Network
 
TAUS Global Content Summit Amsterdam 2019 / Growing Business by Connecting Co...
TAUS Global Content Summit Amsterdam 2019 / Growing Business by Connecting Co...TAUS Global Content Summit Amsterdam 2019 / Growing Business by Connecting Co...
TAUS Global Content Summit Amsterdam 2019 / Growing Business by Connecting Co...TAUS - The Language Data Network
 
Introduction Innovation Contest Shenzhen by Henri Broekmate (Lionbridge)
Introduction Innovation Contest Shenzhen by Henri Broekmate (Lionbridge)Introduction Innovation Contest Shenzhen by Henri Broekmate (Lionbridge)
Introduction Innovation Contest Shenzhen by Henri Broekmate (Lionbridge)TAUS - The Language Data Network
 
Game Changer for Linguistic Review: Shifting the Paradigm, Klaus Fleischmann...
 Game Changer for Linguistic Review: Shifting the Paradigm, Klaus Fleischmann... Game Changer for Linguistic Review: Shifting the Paradigm, Klaus Fleischmann...
Game Changer for Linguistic Review: Shifting the Paradigm, Klaus Fleischmann...TAUS - The Language Data Network
 
A translation memory P2P trading platform - to make global translation memory...
A translation memory P2P trading platform - to make global translation memory...A translation memory P2P trading platform - to make global translation memory...
A translation memory P2P trading platform - to make global translation memory...TAUS - The Language Data Network
 
Shiyibao — The Most Efficient Translation Feedback System Ever, Guanqing Hao ...
Shiyibao — The Most Efficient Translation Feedback System Ever, Guanqing Hao ...Shiyibao — The Most Efficient Translation Feedback System Ever, Guanqing Hao ...
Shiyibao — The Most Efficient Translation Feedback System Ever, Guanqing Hao ...TAUS - The Language Data Network
 
Stepes – Instant Human Translation Services for the Digital World, Carl Yao (...
Stepes – Instant Human Translation Services for the Digital World, Carl Yao (...Stepes – Instant Human Translation Services for the Digital World, Carl Yao (...
Stepes – Instant Human Translation Services for the Digital World, Carl Yao (...TAUS - The Language Data Network
 
Smart Translation Resource Management: Semantic Matching, Kirk Zhang (Wiitran...
Smart Translation Resource Management: Semantic Matching, Kirk Zhang (Wiitran...Smart Translation Resource Management: Semantic Matching, Kirk Zhang (Wiitran...
Smart Translation Resource Management: Semantic Matching, Kirk Zhang (Wiitran...TAUS - The Language Data Network
 
The Theory and Practice of Computer Aided Translation Training System, Liu Q...
 The Theory and Practice of Computer Aided Translation Training System, Liu Q... The Theory and Practice of Computer Aided Translation Training System, Liu Q...
The Theory and Practice of Computer Aided Translation Training System, Liu Q...TAUS - The Language Data Network
 
How to efficiently use large-scale TMs in translation, Jing Zhang (Tmxmall)
How to efficiently use large-scale TMs in translation, Jing Zhang (Tmxmall)How to efficiently use large-scale TMs in translation, Jing Zhang (Tmxmall)
How to efficiently use large-scale TMs in translation, Jing Zhang (Tmxmall)TAUS - The Language Data Network
 
A use-case for getting MT into your company, Kerstin Berns (berns language c...
 A use-case for getting MT into your company, Kerstin Berns (berns language c... A use-case for getting MT into your company, Kerstin Berns (berns language c...
A use-case for getting MT into your company, Kerstin Berns (berns language c...TAUS - The Language Data Network
 
How Existing Quality Models Get Challenged, by Katka Gasova (Moravia)
How Existing Quality Models Get Challenged, by Katka Gasova (Moravia)How Existing Quality Models Get Challenged, by Katka Gasova (Moravia)
How Existing Quality Models Get Challenged, by Katka Gasova (Moravia)TAUS - The Language Data Network
 

TAUS USER CONFERENCE 2010, Man, Machine and advanced translation memory leveraging

  • 1. TAUS USER CONFERENCE 2010 LANGUAGE BUSINESS INNOVATION 4 – 6 OCTOBER / PORTLAND (OR), USA MONDAY 4 OCTOBER / 15.00 MAN, MACHINE AND ADVANCED TRANSLATION MEMORY LEVERAGING Daniel Gervais, MultiCorpora
  • 2.
  • 3. Five New Technologies... ...that will change enterprise computing.
      • Search – the Next Generation
      • Environments to create Virtual Companies
      • Virtualization Management Consoles
      • Secure Cloud Creation
      • Management Technologies
      Source: Eric Lundquist, Editor-in-Chief, eWeek, smartertechnology.com
      © 2009 – 2010 | This confidential document is the property of MultiCorpora and cannot be shared, reproduced, distributed or used without permission.
  • 4. So, what does that mean for us?
      • Elastic capacity, fault tolerant, scalable, secure, and easily maintained
      • Search – the Next Generation; Environments to create Virtual Companies; Virtualization Management Consoles; Secure Cloud Creation; Management Technologies
      Cool concepts, but...
      • How does this affect our industry?
      • How do we access them?
      • How do we harness them for greater productivity?
      • What are the real benefits?
      • What is the cost?
      • What are the best practices?
      • Where are they going?
  • 5. A brief roundup of SCbDS
      • Super-Cloud based Data Sharing
          o TDA
          o MyMemory
          o Google Translate
          o Grand Dictionnaire Terminologique, Termium, IATE, ...
          o EUR-Lex
          o Other multilingual public-domain sources
          o ...
  • 6. SCbDS upsides
      • Advances in technology support large translation memories
          o Build vs. existing
          o Proprietary vs. shared
          o Public domain mining
      • Align large multilingual corpora
      • Data mine within aligned corpora
      • Measurable benefits have been obtained through ALTM on top of large memories
      • BUT THERE'S A DANGER: Translation memory pollution & too much automation!
  • 7. Translation Memory Pollution is...
      • Correctly aligned segments containing poor translation:
          o Inadequate editing
          o Poor post-mortem cleanup
      • Incorrectly aligned segments:
          o Poor alignment technology
          o Inadequate post-alignment proofing
      • Rogue tags
      • Correct translation of undesired content
      • Correct translation of obsolete source
      • Obsolete translation of correct source
      • Poor translation of poorly written source content
  • 8. Translation Memory Pollution: overall conclusion
      Sentence-level leveraging in absence of contextual information is too simplistic and can lead to unsatisfactory results!
  • 9. The Big Question
      • Does increased matching through ALTM equate to REAL productivity gain?
      • We say YES!
  • 10. Here's why we say YES!
      • Large enterprise case
      • Large government case
      • Department of Justice
      • Medax
      • UNESCO
      • Services Canada
  • 11. The main problem
      • Wide variation of document types
      • Legacy files in PDF
      • No TM for certain customers
      Secondary problems
      • Content is often complex
      • Highly sensitive to context and style
      • Highly client-specific
  • 12. Conventional TMs
      Mixed results:
      • No promised massive cost savings
      • Useful enforcement tool
      • Conventional terminology tool unwieldy
      • Excel spreadsheets preferred!
      Time investment critical
      • Therefore, selectivity of clients
      • No ability to influence clients at the authoring stage
          o Documents are rarely repetitive on a traditional segment model
      • Cost-benefit decisions: no TMs or truncated TMs
  • 13. ALTM addressed needs for:
      • Context
      • Matches at the paragraph level
      • Matches at the segment and sub-segment levels
      • Interfacing/compatibility with external vendors who used various TM tools
      • Better integration with terminology management, live online deployment
      • Server-based solution to link global production platform
  • 14. ALTM Benefits:
      • Alignment automation = Low overhead for maintaining memory
      • Rapid creation of larger memories = Faster project scoping and bidding
      • Higher probability of matches
      • Context provided at all times = Reduced research time
      • Identification of sub-expressions = More matches
      • Terminology integration = Reduced research time, increased consistency
      In general, more matches reduce revision time
      Used to rebuild out-of-date conventional TMs
      Cost-effective competitiveness
  • 15. Proof of proposal example
      • Translation Bureau RFP
          o For 1200 licenses
          o Proof of Proposal – 5 consecutive business days:
              - Install full client-server, 20 workstations
              - Create a production TM of 15 000 pairs of unstructured documents in various formats (≈ 20 M source words)
              - 1 day – 10 people user training
              - 1 day – production simulation use
              - Ensure no productivity loss, compute gains
      MultiCorpora won the RFP
  • 16. Harmonize legacy documents
      • Department of Justice Canada
          o Laws & Regulations in French and English
          o No harmonization of ambiguous terms
          o ALTM made it possible to extract terminology, see the translation discrepancies in context and identify corrections
          o ALTM combined with terminology made it possible to build TermBases of ambiguous terms from the processing of one document, and to apply the corrections in all other documents
          o Continuous learning process, powered by ALTM
      Do in computing minutes what used to take people months
  • 17.
      • German Translation Service Provider
      • Geographically dispersed translator pool - roughly 250 doctors and pharmacists
      • Seven full-time employees oversee processing of nearly 5 million words per year
      • Historically no clear TM strategy
      • Document types not conducive to TM
      • Lacklustre productivity gains vs. overhead
      • Discovered ROI from the terminology management and sub-segment matching
      • High number of shorter, domain-specific repeated sub-segment phrases
      • Creates hybrid, partially pre-translated documents containing “pre-harmonized” terminology to send out
          o 90% comes from the TermBase, created by sub-segment matches, analysis
          o Remaining 10% from the TextBase
  • 18. UNESCO
      • “On the Fly” translation memories
          o Analyse docs against all translation memories
          o Identify which docs and memories are the most used
          o Re-build specific memories from UNESCO documents, and related organisations’ documents referenced in documents
          o Achieve higher degree of recycling from partner organisations’ documents
          o Ability to recycle / harmonize domain-specific terminology by example, powered by ALTM
          o Continuous improvement virtuous circle
      Create a TM in minutes vs. what would take months to align
      Add additional external content
      Get domain-specific terminology through sub-expressions
  • 19. Services Canada - Job Bank
      • Distinctive hybrid translation process
          o 90M words per year
          o TM / MT / post editing
          o Linguistic assets comprise
              - Previous job offers
              - Domain-specific terms
      • Shared data increased productivity
  • 20. Translation Memory Pollution: Antidote
      • Content selection
          o Too much unstructured content
          o Need to establish a mining hierarchy
      • Use of statistics
          o Generate usage & translation distribution statistics per content repository
          o Standardize in “live” Terminology Databases
      • Use human intelligence
          o Humans need to be involved. Too much automation only propagates pollution…
          o Virtuous improvement circle
  • 21. Other uses of ALTM
      • Monolingual analysis
          o Identify single source candidates
          o Identify terms to standardize
          o Identify deviations of customized documents from baseline texts
          o Identify localization order prioritization of baseline documents - 15% savings potential
              - TextBase repetitions
              - Term repetitions
  • 22. The Journey Is Not Yet Finished
      • More automation of the antidotes to pollution
      • Recent improvement in term extraction algorithms can expose pollution sources
      • Evangelization of the processes
      • No quick fix: Human factor remains involved. Not yet at the vision of fully automated pre-translated ALTM.
      • New collaboration models between linguists and TM systems
      • Better support for linguistic decision-making
      • Evangelization of the role of the post-editor
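
The "use of statistics" antidote on slide 20 (translation distribution statistics per content repository, followed by human review) could be sketched roughly as below. This is only an illustration under stated assumptions, not MultiCorpora's method: the function names, the `(repository, source, target)` tuple layout, the whitespace/case normalization and the 0.8 agreement threshold are all hypothetical choices made for the example.

```python
from collections import Counter, defaultdict


def translation_distribution(units):
    """Count target translations per (repository, normalized source segment).

    `units` is an iterable of (repository, source, target) tuples; this
    layout is an assumption made for the sketch.
    """
    stats = defaultdict(Counter)
    for repo, source, target in units:
        # Normalize case and whitespace so trivially different sources group together.
        key = (repo, " ".join(source.lower().split()))
        stats[key][" ".join(target.split())] += 1
    return stats


def review_candidates(stats, min_share=0.8):
    """Return segments whose most frequent translation covers less than
    `min_share` of occurrences. These are candidates for human review,
    not automatic corrections."""
    flagged = []
    for (repo, source), counts in stats.items():
        total = sum(counts.values())
        _, top_count = counts.most_common(1)[0]
        if len(counts) > 1 and top_count / total < min_share:
            flagged.append((repo, source, dict(counts)))
    return flagged


# Hypothetical sample data, invented for illustration only.
units = [
    ("repo_a", "Job offer", "Offre d'emploi"),
    ("repo_a", "Job offer", "Offre d'emploi"),
    ("repo_a", "job  offer", "Offre de travail"),
    ("repo_b", "Job offer", "Offre d'emploi"),
]
stats = translation_distribution(units)
flagged = review_candidates(stats)
for repo, source, counts in flagged:
    print(repo, source, counts)
```

The sketch only surfaces competing translations; deciding which one to standardize in a terminology database is left to a person, in line with the slide's point that too much automation propagates pollution.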