Machine Translation with Statistical Approach
What is machine translation?
Machine translation is the study of designing systems that translate from one human language into another. A machine translation system essentially takes a text in one language (called the source language) and translates it into another language (called the target language). The source and target languages are natural languages such as English and Hindi.
This is a hard problem, since processing natural language requires work at several levels, and complexities and ambiguities arise at each of those levels. Hence an MT system can be said to be doing natural language processing (NLP). In fact, most machine translation applications require some degree of natural language understanding to do the translation.
History of Machine Translation
Machine translation as a discipline dates back to the early nineteen-fifties. The complexity of the problem was originally underestimated, and some early successful demonstrations of experimental systems led to unrealistic expectations which were hard to fulfil. In the early eighties, the Japanese Fifth Generation Computing Project revived interest in this work. The current approach to MT is more pragmatic and realistic.
It is now widely accepted that fully automatic, general-purpose, high-quality machine translation is a very difficult problem, but very useful and practical systems can nevertheless be developed by relaxing one or more of these criteria. Several useful systems have been built by doing so and are in use today. Such systems are being used to translate public announcements, weather bulletins, technical documents, and web pages.
Some machine translation services are starting to become available on the World Wide Web. For example, the web page of the Google search engine also provides a translation service that can translate simple sentences among a handful of languages.
Translation telephone technology (speech-to-speech translation)
The ‘Janus’ project at the Interactive Systems Lab, Carnegie Mellon University, is working on a set of translation projects. Suppose you dial your colleague in Tokyo. You do not speak Japanese, and he does not speak English. So you need a system such that you speak into the phone in English, your speech is automatically translated into Japanese for him, he replies in Japanese, and you hear the reply in English.
Research MT System Example: the Janus Translating Phone project
This prototype system allows two users to communicate in a given domain via a video-conferencing connection. Each party sees the other conversant, hears his/her original voice, and sees/hears a translation of what he/she says as subtitles, captions and synthetic speech. The situation is cooperative; that is, both users want to understand each other and collaborate via the system to achieve understanding.
After the record button is activated, the station accepts spoken input and first produces a paraphrase of the input sentence. Once the user has verified that the system properly understood the intended meaning, he/she activates the send button to send a translation of this intended meaning to the other side in the desired language. Various interactive correction mechanisms facilitate quick recovery, should processing errors or miscommunication have altered the intended meaning.
Machine Translation & Artificial Intelligence
MT is an important sub-discipline of the wider field of Artificial Intelligence (AI). AI (among other things) deals with getting machines to exhibit intelligent behaviour. As we might imagine, both AI and MT are interesting and challenging fields.
Components of MT
We can divide the machine translation task into three main phases. The system first has to analyse the source-language input to create some internal representation. It then typically manipulates this internal representation to transfer it to a form suitable for the target language. Finally, it generates the output in the target language.
[Diagram: Source Language → Analysis → Intermediate Representation based on source language → Transfer → Intermediate Representation based on target language → Generation → Target Language]
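The three phases can be sketched as three small functions chained together. This is only a toy illustration: the `Representation` class, the function names, and the two-word English-to-Hindi (romanised) transfer lexicon are assumptions made for the example, not part of any real MT system.

```python
from dataclasses import dataclass


@dataclass
class Representation:
    """Hypothetical internal representation: a language tag plus a token list."""
    language: str
    tokens: list[str]


# Toy transfer lexicon (English -> romanised Hindi); entries are illustrative only.
TRANSFER_LEXICON = {"cold": "thanda", "water": "paani"}


def analyse(text: str, source_lang: str) -> Representation:
    """Analysis: turn raw source text into a source-based representation."""
    return Representation(source_lang, text.lower().split())


def transfer(rep: Representation, target_lang: str) -> Representation:
    """Transfer: map the source-based representation to a target-based one."""
    mapped = [TRANSFER_LEXICON.get(token, token) for token in rep.tokens]
    return Representation(target_lang, mapped)


def generate(rep: Representation) -> str:
    """Generation: produce target-language output from the representation."""
    return " ".join(rep.tokens)


def translate(text: str, source_lang: str = "en", target_lang: str = "hi") -> str:
    """Run the three phases in order: analysis, transfer, generation."""
    return generate(transfer(analyse(text, source_lang), target_lang))
```

Calling `translate("Cold water")` passes the text through all three phases. A real system would replace the word-for-word lookup with proper syntactic and semantic processing at each stage.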
A typical MT system contains components for analysis, transfer and generation, as shown in the diagram. These components incorporate a lot of knowledge about words (lexical knowledge) and about the language (linguistic knowledge). Such knowledge is stored in one or more lexicons, and possibly in other sources of linguistic knowledge, such as a grammar.
The user interface is invariably a crucial part of most MT systems. The interface allows the user to verify, disambiguate and, if necessary, correct the output of the system. Another common feature of NLP work is the use of large ‘corpora’. A corpus is a large collection of text which is used for acquiring the required lexical and linguistic knowledge.
Some systems prefer to split the lexicon into a source lexicon, a target lexicon, and a transfer lexicon that maps between the two. An MT lexicon typically needs to be much more formal, precise and elaborate than a typical human dictionary, since it is meant for mechanical processing and not for reading by humans. The lexicon plays a central role in modern MT systems.
Lexicon
The lexicon is an important component of an MT system. A lexicon contains all the relevant information about words and phrases that is required for the various levels of analysis and generation. A typical lexicon entry for a word would contain the following information about the word: the part of speech, and information about the equivalent word in the target language.
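A lexicon entry of the kind described above could be modelled as a small record keyed by the source word. The field names, the `lookup` helper, and the romanised Hindi glosses below are assumptions chosen for illustration; a production lexicon would carry far more detail.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LexiconEntry:
    """Hypothetical lexicon record: the word, its part of speech, and target equivalents."""
    word: str
    part_of_speech: str
    target_equivalents: tuple[str, ...]


# Illustrative entries only; the Hindi glosses are romanised and not vetted.
LEXICON = {
    "party": LexiconEntry("party", "noun", ("dal", "daavat")),
    "join": LexiconEntry("join", "verb", ("shaamil hona",)),
}


def lookup(word: str) -> Optional[LexiconEntry]:
    """Return the entry for a word (case-insensitive), or None if it is not listed."""
    return LEXICON.get(word.lower())
```

An entry with more than one target equivalent, such as "party" here, is where later disambiguation has to pick the right sense.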
Approaches to MT
Based on how closely the internal representation depends on the source and target languages, approaches to MT can be divided into three major classes: direct, transfer-based, and inter-lingual.
A direct MT system tries to map the source language directly to the target language, and is therefore highly dependent on both the source and target languages. A transfer-based approach first converts the source language into an internal representation (IRs) which is dependent on the source but not the target language. The system then transforms IRs into a form IRt which is independent of the source language and depends only on the target language, and finally generates the target-language output from IRt.
The inter-lingual approach converts the input into a single internal representation (IR) that is independent of both the source and target languages, and then converts from this into the output.
Levels of Natural Language Processing
Dealing with natural language typically requires processing at various levels. In increasing order of difficulty, they are: the lexical level (or the word level); the syntactic level (or the sentence level); the semantic level (or the meaning level); and the discourse and pragmatic level (or the conversation context level).
The Lexical Level
This level deals with looking at the input string of characters and separating it into tokens, which may be words, spaces or punctuation. This level also deals with issues like hyphenated words and misspelt words. It is the lexical level which tells us that the input “He joined the parti” consists of four words, of which the last is incorrect. This level is sometimes called ‘tokenisation’ or ‘lexical analysis’.
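A minimal sketch of this level, assuming a regular-expression tokeniser and a four-word toy vocabulary (both invented for the example), splits the input into word, space and punctuation tokens and flags words it does not recognise.

```python
import re

# A word (optionally hyphenated), a run of whitespace, or one punctuation character.
TOKEN_PATTERN = re.compile(r"\w+(?:-\w+)*|\s+|[^\w\s]")

# Toy vocabulary; a real system would consult its lexicon instead.
VOCABULARY = {"he", "joined", "the", "party"}


def tokenise(text):
    """Split text into word, whitespace and punctuation tokens."""
    return TOKEN_PATTERN.findall(text)


def words(tokens):
    """Keep only the tokens that start with a letter or digit."""
    return [token for token in tokens if token[0].isalnum()]


def unknown_words(text):
    """Return the words that are missing from the vocabulary."""
    return [w for w in words(tokenise(text)) if w.lower() not in VOCABULARY]
```

For the input "He joined the parti", `words(tokenise(...))` yields four words and `unknown_words(...)` reports only "parti", matching the example in the text.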
The Syntactic Level
This level deals with identifying the structure of a sentence and verifying whether a sentence is grammatically correct. This level typically consists of a ‘parser’, which looks at the grammar of the language and at the input sentence, and tries to form a ‘parse tree’. If it can form a parse tree, the sentence is syntactically correct, and the parse tree gives us the structure and the function of the various components.
For example, a typical English sentence would consist of a subject and a predicate. The subject is normally a noun phrase and the predicate is a verb phrase, and so on. The syntactic level tells us that the sentence “He the party joined” is (syntactically) incorrect, even though each word in it is (lexically) correct.
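The subject-plus-predicate structure can be checked with a very small recursive-descent recogniser. The grammar (S → NP VP, NP → PRON | DET NOUN, VP → VERB NP) and the part-of-speech table are toy assumptions that cover only the example sentences; the sketch answers yes or no rather than building a full parse tree.

```python
# Toy part-of-speech table; a real parser would take tags from the lexicon.
POS = {"he": "PRON", "the": "DET", "party": "NOUN", "joined": "VERB", "ate": "VERB"}


def parse_np(tags, i):
    """NP -> PRON | DET NOUN. Return the index just after the NP, or None."""
    if i < len(tags) and tags[i] == "PRON":
        return i + 1
    if i + 1 < len(tags) and tags[i] == "DET" and tags[i + 1] == "NOUN":
        return i + 2
    return None


def parse_vp(tags, i):
    """VP -> VERB NP. Return the index just after the VP, or None."""
    if i < len(tags) and tags[i] == "VERB":
        return parse_np(tags, i + 1)
    return None


def is_grammatical(sentence):
    """S -> NP VP, and the whole sentence must be consumed."""
    tags = [POS.get(word) for word in sentence.lower().split()]
    if None in tags:
        return False  # an unknown word is a lexical failure, not a syntactic one
    after_subject = parse_np(tags, 0)
    if after_subject is None:
        return False
    return parse_vp(tags, after_subject) == len(tags)
```

Under this grammar "He joined the party" is accepted and "He the party joined" is rejected. A sentence such as "He ate the party" is also accepted, because the grammar knows nothing about meaning.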
The Semantic Level
This level deals with the meaning of the input and its components. It is the semantic level which tells us that the sentence “He ate the party” is semantically incorrect, though it is lexically and syntactically well formed. In general, semantic analysis involves knowledge about the world, or at least about the relevant aspects of the world.
The Conversation Context Level
This level deals with the information carried across multiple sentences, and with information that is not explicit in the input but is implicit in the socio-cultural context of the input passage or conversation. For example, the expected answer to the question “Do you know what the time is?” is something like “4 p.m.”, and not just “Yes”, though the latter is lexically, syntactically and semantically accurate.
Issues in Machine Translation
Machine translation (and natural language processing) is a difficult problem, for two main reasons. The first reason is that natural language is highly ambiguous. Ambiguity occurs at all levels: lexical, syntactic, semantic and pragmatic. A given word or sentence can have more than one meaning. For example, the word “party” could mean a political party or a social event, and deciding on the suitable one in a particular case is crucial to getting the right analysis and therefore the right translation.
The second reason is that when humans use natural language, they use an enormous amount of common sense and knowledge about the world, which helps to resolve the ambiguity. For example, in “He went to the bank, but it was closed for lunch”, we can infer that ‘bank’ refers to a financial institution and not a river bank, because we know from our knowledge of the world that only the former type of bank can be closed for lunch.
The Statistical Approach (Warren Weaver, 1949)
This approach considers only the translation of individual sentences. Usually, there are many acceptable translations of a particular sentence, the choice among them being largely a matter of taste. It takes the view that every sentence in one language is a possible translation of any sentence in the other.
It assigns to every pair of sentences (S, T) a probability $P(T \mid S)$, i.e. the probability that a translator will produce T in the target language when presented with S in the source language. Given a sentence T in the target language, it seeks the sentence S from which the translator produced T. The chance of error is minimized by choosing the sentence S that is most probable given T, so the aim is to choose S so as to maximize $P(S \mid T)$.
Using Bayes' theorem:
$$P(S \mid T) = \frac{P(S)\,P(T \mid S)}{P(T)}$$
The denominator on the right of this equation does not depend on S, and so it suffices to choose the S that maximizes the product $P(S)\,P(T \mid S)$, where $P(S)$ is the language model probability of S, and $P(T \mid S)$ is the translation probability of T given S.
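The selection rule can be illustrated with a handful of candidate sentences S for one fixed observed sentence T. All probabilities below are invented for the example, and the names `LANGUAGE_MODEL`, `TRANSLATION_MODEL`, `score` and `decode` are hypothetical. A real system would estimate both models from corpora and search a far larger candidate space.

```python
import math

# Hypothetical language model P(S): how plausible each candidate is as a sentence.
LANGUAGE_MODEL = {
    "he joined the party": 0.02,
    "he the party joined": 0.0001,
    "he ate the party": 0.005,
}

# Hypothetical translation model P(T | S) for the single observed sentence T.
TRANSLATION_MODEL = {
    "he joined the party": 0.3,
    "he the party joined": 0.4,
    "he ate the party": 0.05,
}


def score(candidate):
    """log P(S) + log P(T | S); P(T) is constant across candidates and is dropped."""
    return math.log(LANGUAGE_MODEL[candidate]) + math.log(TRANSLATION_MODEL[candidate])


def decode(candidates):
    """Return the candidate S that maximises P(S) * P(T | S)."""
    return max(candidates, key=score)
```

With these numbers the translation model alone would prefer the ungrammatical "he the party joined", but the language model term pulls the decision to "he joined the party". Working in log space avoids numerical underflow when many small probabilities are multiplied.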
Conclusion
Two phenomena have given a new impetus to machine translation work: the globalisation of the world economy, and the explosion of the internet and the World Wide Web. Both these developments mean that there is a need to make an immense collection of natural language documents available to a multilingual global audience, and translation tools and systems can go a long way in meeting that need.
The global translation market is estimated to be at least 12 billion dollars. Systems that automatically translate Kalidasa and Shakespeare may still be a distant dream, but systems that translate stock market reports, weather bulletins and technical measures are a reality today, and will continue to play an increasingly important role in the society of the next millennium.
THANK YOU

Machine translation with statistical approach

  • 2. Whatis the machine translation??? Machine translation is the study of designingsystemsthat translate from one humanlanguage in to another. Machine translation system essentiallytakes a text in one language (called the source language), and translate itintoanotherlanguage(calledtargetlanguage). The source and targetlanguage are naturallanguagessuch as english and hindi. 2
  • 3. Contd…….. This is the hard problem, sinceprocessingnaturallanguagerequiresworkatseverallevles, and complexities and ambiguitiesariesateach of thoselevles. Hence an MT system canbesaid to bedoingnaturallanguageprocessing(NLP).In fact,most machine translation application requiressomedegree of naturallanguageunderstanding to do the translation. 3
  • 4. History of Machine Translation Machine translation as a discipline dates back to the earlynineteen-fifties. The complexity of the problemwasoriginallyunderestimated, and someearlysuccessfuldemonstrations of experimental system lead to unrealisticexpectionswhichwere hard to fulfil. In the early eighties, the JapaneseFifthGenerationComputing Project revivedinterest in thiswork. The currentapproach to MT is more pragmatic and realistic. 4
  • 5. Contd…. It isnowwidelyacceptedthatfullyautomatic, general-purpose , highquality machine translation is a verydifficultproblem, but veryuseful and pratical system canneverthelessbedeveloped by realxing one or more of thesecriteria,andseveralusefulsystems have been built by doingso,and are in use today. Suchsystems are beingused to translate public announcements,weather bulletins, technical documents, and web pages. 5
  • 6. Contd.. Some machine translation services are starting to becomeavailable on the world wide web. For example,the web page of the Google searchenginealsoprovides a translation service thatcan translate simple sentences among a handful of languages. 6
  • 7. Translation telephone technology (speech-to-speech translation): The "Janus" project at the Interactive Systems Lab, Carnegie Mellon University, is working on a set of translation projects. You dial your colleague in Tokyo. You do not speak Japanese, and he does not speak English. So you need a system such that you speak into the phone in English, your speech is automatically translated into Japanese for him, he replies in Japanese, and you hear it in English.
  • 8. Research MT System Example: the Janus translating phone project. This prototype system allows two users to communicate in a given domain via a videoconferencing connection. Each party sees the other conversant, hears his/her original voice, and sees/hears a translation of what he/she says as subtitles, captions and synthetic speech. The situation is cooperative; that is, both users want to understand each other and collaborate via the system to achieve understanding.
  • 9. Contd. After the record button is activated, the station accepts spoken input and first produces a paraphrase of the input sentence. Once the user has verified that the system properly understood the intended meaning, he/she activates the send button to send a translation of this intended meaning to the other side in the desired language. Various interactive correction mechanisms facilitate quick recovery, should processing errors or miscommunication have altered the intended meaning.
  • 10. Machine Translation & Artificial Intelligence: MT is an important sub-discipline of the wider field of Artificial Intelligence (AI). AI (among other things) deals with getting machines to exhibit intelligent behaviour. As we might imagine, both AI and MT are interesting and challenging fields.
  • 11. Components of MT: We can divide the machine translation task into three main phases. The system first has to analyse the source language input to create some internal representation. It then typically manipulates this internal representation to transfer it to a form suitable for the target language. Finally, it generates the output in the target language.
  • 12. [Diagram: Source Language → Analysis → Intermediate Representation based on source language → Transfer → Intermediate Representation based on target language → Generation → Target Language]
  • 13. Contd. A typical MT system contains components for analysis, transfer and generation, as shown in the diagram. These components incorporate a lot of knowledge about words (lexical knowledge) and about the language (linguistic knowledge). Such knowledge is stored in one or more lexicons, and possibly in other sources of linguistic knowledge, such as a grammar.
  • 14. Contd. The user interface is invariably a crucial part of most MT systems. The interface allows the user to verify, disambiguate and, if necessary, correct the output of the system. Another common feature of NLP work is the use of large "corpora". A corpus is a large collection of text which is used for acquiring the required lexical and linguistic knowledge.
  • 15. Contd. Some systems prefer to split the lexicon into a source lexicon, a target lexicon, and a transfer lexicon that maps between the two. An MT lexicon typically needs to be much more formal, precise and elaborate than a typical human dictionary, since it is meant for mechanical processing and not for reading by humans. The lexicon plays a central role in modern MT systems.
  • 16. Lexicon: The lexicon is an important component of an MT system. A lexicon contains all the relevant information about words and phrases that is required for the various levels of analysis and generation. A typical lexicon entry for a word would contain the following information about the word: the part of speech, and information about the equivalent word in the target language.
  • 17. Approaches to MT: Based on how closely the internal representation depends on the source and target languages, approaches to MT can be divided into three major classes: direct, transfer-based, and inter-lingual.
  • 18. A direct MT system tries to map the source language directly to the target language, and is therefore highly dependent on both the source and target languages. A transfer-based approach first converts the source language into an internal representation (IRs) which is dependent on the source but not the target language. The system then transforms IRs into a form IRt which is independent of the source language and depends only on the target language, and finally generates the target language output from IRt.
  • 19. The inter-lingual approach converts the input into a single internal representation (IR) that is independent of both source and target languages, and then converts from this into the output.
  • 20. Levels of Natural Language Processing: Dealing with natural language typically requires processing at various levels. In increasing order of difficulty, they are: the Lexical Level (or the Word Level); the Syntactic Level (or the Sentence Level); the Semantic Level (or the Meaning Level); and the Discourse and Pragmatic Level (or the Conversation Context Level).
  • 21. The Lexical Level: This level deals with looking at the input string of characters and separating them into tokens, which may be words, spaces or punctuation. This level also deals with issues like hyphenated words and misspelt words. It is the lexical level which tells us that the input "he joined the parti" consists of four words, of which the last is incorrect. This level is sometimes called "tokenisation" or "lexical analysis".
  • 22. The Syntactic Level: This level deals with identifying the structure of a sentence and verifying whether a sentence is grammatically correct. This level typically consists of a "parser", which looks at the grammar of the language and the input sentence, and tries to form a "parse tree". If it can form a parse tree, the sentence is syntactically correct, and the parse tree gives us the structure and the function of its various components.
  • 23. For example, a typical English sentence would consist of a subject and a predicate. The subject is normally a noun phrase and the predicate is a verb phrase, and so on. The syntactic level tells us that the sentence "He the party joined" is (syntactically) incorrect, even though each word in it is (lexically) correct.
  • 24. The Semantic Level: This level deals with the meaning of the input and its components. It is the semantic level which tells us that the sentence "He ate the party" is semantically incorrect, though it is lexically and syntactically well formed. In general, semantic analysis involves knowledge about the world, or at least the relevant aspects of the world.
  • 25. The Conversation Context Level: This level deals with the information carried across multiple sentences, and with information that is not explicit in the input but is implicit in the socio-cultural context of the input passage or conversation. For example, the expected answer to the question "Do you know what the time is?" is something like "4 p.m.", and not just "Yes", though the latter is lexically, syntactically and semantically accurate.
  • 26. Issues in Machine Translation: Machine translation (and natural language processing) is a difficult problem, for two main reasons. The first reason is that natural language is highly ambiguous. The ambiguity occurs at all levels: lexical, syntactic, semantic and pragmatic. A given word or sentence can have more than one meaning. For example, the word "party" could mean a political party or a social event, and deciding the suitable one in a particular case is crucial to getting the right analysis and therefore the right translation.
  • 27. The second reason is that when humans use natural language, they use an enormous amount of common sense and knowledge about the world, which helps to resolve the ambiguity. For example, in "He went to the bank, but it was closed for lunch", we can infer that "bank" refers to a financial institution, and not a river bank, because we know from our knowledge of the world that only the former type of bank can be closed for lunch.
  • 28. The Statistical Approach (Warren Weaver, 1949): They consider only the translation of individual sentences. Usually, there are many acceptable translations of a particular sentence, the choice among them being largely a matter of taste. They take the view that every sentence in one language is a possible translation of any sentence in the other.
  • 29. They assign to every pair of sentences (S, T) a probability P(T|S), i.e. the probability that a translator will produce T in the target language when presented with S in the source language. Given a sentence T in the target language, they seek the sentence S from which the translator produced T. The chance of error is minimized by choosing the sentence S that is most probable given T. Thus, they wish to choose S so as to maximize P(S|T).
  • 30. Using Bayes' theorem: P(S|T) = P(S) P(T|S) / P(T). The denominator on the right of this equation does not depend on S, so it suffices to choose the S that maximizes the product P(S) P(T|S), where P(S) is the language model probability of S and P(T|S) is the translation probability of T given S.
  • 31. Conclusion: Two phenomena have given a new impetus to machine translation work: the globalisation of the world economy, and the explosion of the internet and the World Wide Web. Both these developments mean that there is a need to make an immense collection of natural language documents available to a multilingual global audience, and translation tools and systems can go a long way in meeting that need.
  • 32. The global translation market is estimated to be at least 12 billion dollars. Systems that automatically translate Kalidasa and Shakespeare may still be a distant dream, but systems that translate stock market reports, weather bulletins and technical manuals are a reality today, and will continue to play an increasingly important role in the society of the next millennium.
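
The decision rule of the statistical approach (slides 29 and 30) can be illustrated with a minimal Python sketch. It picks, from a small set of candidate source sentences, the one that maximizes the product of the language model probability and the translation probability for one fixed observed sentence T. The candidate sentences, the probability values, and the function name `decode` are all hypothetical, invented for illustration; a real system would estimate both models from a large corpus and search a much larger space.

```python
import math

# Hypothetical toy language model: P(S) for each candidate source sentence.
# The numbers are invented for illustration, not estimated from any corpus.
LANGUAGE_MODEL = {
    "he joined the party": 0.6,
    "he the party joined": 0.1,
    "he ate the party": 0.3,
}

# Hypothetical toy translation model: P(T | S) for one fixed observed
# target sentence T, keyed by the candidate source sentence S.
TRANSLATION_MODEL = {
    "he joined the party": 0.5,
    "he the party joined": 0.5,
    "he ate the party": 0.1,
}


def decode(candidates, language_model, translation_model):
    """Return the candidate S that maximizes P(S) * P(T | S).

    P(T) is omitted because it is the same for every candidate S.
    Log probabilities are summed to avoid underflow on long sentences.
    """
    def score(s):
        return math.log(language_model[s]) + math.log(translation_model[s])

    return max(candidates, key=score)


best = decode(list(LANGUAGE_MODEL), LANGUAGE_MODEL, TRANSLATION_MODEL)
print(best)
```

In this toy setup the translation model alone cannot separate the first two candidates (both have 0.5), so the language model decides in favour of the well-formed word order. This is the role the slides assign to P(S).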