Machine Translation
4. • 72.1 percent of consumers spend most or all of their time on sites in their own language.
• 72.4 percent say they would be more likely to buy a product with information in their own language.
• 56.2 percent say that the ability to obtain information in their own language is more important than price.
6. • Content that does not need to be perfect, only approximately understandable (e.g. any website for a quick review).
• Content that would normally be too expensive or too slow to translate with a human-only approach (e.g. projects with insufficient budget for a human-only approach).
• High-value, time-sensitive content that changes every hour or every day (e.g. stock market news).
7. RULE BASED MACHINE TRANSLATION (RBMT)
STATISTICAL MACHINE TRANSLATION (SMT)
8. • Rules-based systems use a combination of
language and grammar rules plus dictionaries for
common words. Specialist dictionaries are
created to focus on certain industries or
disciplines.
[Figure: RULE BASED APPROACH, combining GRAMMAR RULES, a LEXICON and a SOFTWARE PROGRAM]
11. • The direct method is based on dictionary entries: words are translated as a dictionary does, word by word, usually without much correlation of meaning between them, followed by some syntactic rearrangement.
• Dictionary lookups may be done with or without morphological analysis (analysis of word structure).
• Direct machine translation is well suited to translating long lists of phrases.
[Figure: SL TEXT → SL-TL DICTIONARY → TL TEXT]
13. [Figure: SL TEXT → ANALYSIS → TRANSFER → GENERATION → TL TEXT. ANALYSIS uses the SL dictionary/grammar, TRANSFER uses the SL-TL dictionary/grammar, and GENERATION uses the TL grammar/dictionary.]
• In this translation system, a database of translation rules is used to translate text from the source to the target language. Whenever a sentence matches one of the rules or examples, it is translated directly using a dictionary.
• A transfer-based approach first converts the source language into an internal representation (IRs) that depends on the source language but not on the target language. The system then transforms IRs into a form IRt that is independent of the source language and depends only on the target language, and finally generates the target-language output from IRt.
[Figure: analysis at the lexical, syntactic and semantic levels; intermediate representation based on the source language → intermediate representation based on the target language]
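The analysis, transfer and generation stages can be sketched in a few lines of Python. This is only an illustrative toy, not the design of any real system: the `Clause` class, the three-word sentence pattern, and the tiny English to romanised Hindi `LEXICON` are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Clause:
    """Hypothetical intermediate representation (IR) of a simple clause."""
    subject: str
    verb: str
    obj: str
    order: str  # word-order pattern of the language the IR belongs to


# Toy transfer lexicon (English -> romanised Hindi); illustrative only.
LEXICON = {"ram": "Ram", "eats": "khata hai", "apples": "seb"}


def analyse(sentence: str) -> Clause:
    """Analysis: build IRs, which depends only on the source language.
    Assumes exactly three words in subject-verb-object order."""
    subject, verb, obj = sentence.lower().split()
    return Clause(subject, verb, obj, order="SVO")


def transfer(ir_s: Clause) -> Clause:
    """Transfer: map IRs to IRt, which depends only on the target language."""
    return Clause(
        LEXICON[ir_s.subject], LEXICON[ir_s.verb], LEXICON[ir_s.obj], order="SOV"
    )


def generate(ir_t: Clause) -> str:
    """Generation: linearise IRt using the target word order."""
    slots = {"S": ir_t.subject, "V": ir_t.verb, "O": ir_t.obj}
    return " ".join(slots[symbol] for symbol in ir_t.order)


print(generate(transfer(analyse("Ram eats apples"))))  # Ram seb khata hai
```

Keeping the three stages as separate functions mirrors the modular structure of the approach: only `transfer` knows about both languages.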
15. • The interlingual approach converts the input into a single internal representation (IR) that is independent of both the source and target languages, and then converts from this into the output.
[Figure: SL TEXT → ANALYSIS → INTERLINGUA REPRESENTATION → GENERATION → TL TEXT]
• The advantage for multilingual machine translation is that no separate transfer component has to be created for each language pair.
• The obvious disadvantage is that defining an interlingua is difficult, and maybe even impossible, for a wider domain.
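To make the multilingual advantage concrete, the short Python sketch below counts components under two simplifying assumptions that are not stated on the slide: every language needs one analyser and one generator in both designs, and the transfer design additionally needs one transfer component per directed language pair. The function names are hypothetical.

```python
def transfer_system_modules(n_languages: int) -> int:
    """n analysers + n generators + one transfer component per directed pair."""
    return 2 * n_languages + n_languages * (n_languages - 1)


def interlingua_system_modules(n_languages: int) -> int:
    """n analysers into the interlingua + n generators out of it."""
    return 2 * n_languages


for n in (2, 5, 10):
    print(n, transfer_system_modules(n), interlingua_system_modules(n))
```

Under these assumptions the transfer design grows quadratically with the number of languages, while the interlingual design grows linearly.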
16. STATISTICAL MACHINE TRANSLATION
• Statistical machine translation (SMT) learns how
to translate by analyzing existing human
translations (known as bilingual text corpora).
• The machine translator can use a database of such translations as the source of all the information it needs for translating.
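As one concrete illustration of learning from a bilingual corpus, the sketch below trains word-translation probabilities with a simplified IBM Model 1 (expectation-maximisation, no NULL word). The slide does not name a specific model, so this choice, the three-sentence toy corpus, and the function names are assumptions made for the example.

```python
from collections import defaultdict

# Toy bilingual corpus (English source, German target); illustrative only.
CORPUS = [
    ("the house", "das haus"),
    ("the book", "das buch"),
    ("a book", "ein buch"),
]


def train_ibm1(corpus, iterations=10):
    """Estimate t[e][f], an approximation of P(target word f | source word e)."""
    pairs = [(e.split(), f.split()) for e, f in corpus]
    target_vocab = {f for _, fs in pairs for f in fs}
    # Start from a uniform distribution over the target vocabulary.
    t = defaultdict(lambda: defaultdict(lambda: 1.0 / len(target_vocab)))
    for _ in range(iterations):
        count = defaultdict(lambda: defaultdict(float))
        total = defaultdict(float)
        # E-step: share each target word among the source words in its sentence.
        for es, fs in pairs:
            for f in fs:
                norm = sum(t[e][f] for e in es)
                for e in es:
                    share = t[e][f] / norm
                    count[e][f] += share
                    total[e] += share
        # M-step: renormalise the expected counts into probabilities.
        t = defaultdict(lambda: defaultdict(float))
        for e in count:
            for f in count[e]:
                t[e][f] = count[e][f] / total[e]
    return t


def best_translation(t, word):
    """Most probable target word for a source word seen in training."""
    return max(t[word], key=t[word].get)


model = train_ibm1(CORPUS)
print(best_translation(model, "book"))  # buch
```

No dictionary is supplied: the pairing of "book" with "buch" emerges only from which words co-occur across the translated sentence pairs.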
17. ISSUES IN MACHINE TRANSLATION
• Word order
Word order differs between languages. Languages can be classified by the typical order of subject (S), verb (V) and object (O) in a sentence. Some languages have SOV word order, and the target language may have a different word order from the source. In such cases, word-for-word translation is difficult. For example, English has SVO and Hindi has SOV sentence structure.
18. • Ambiguity
A given word or sentence can have more than one meaning. For example, the word "party" could mean a political party or a social event, and deciding which one is suitable in a particular case is crucial to getting the right analysis and therefore the right translation.
• A further difficulty is that when humans use natural language, they draw on an enormous amount of common sense and knowledge about the world, which helps to resolve ambiguity. For example, in "He went to the bank, but it was closed for lunch", we can infer that "bank" refers to a financial institution and not a river bank, because we know from our knowledge of the world that only the former type of bank can be closed for lunch.
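A very rough way to approximate this kind of inference in software is to compare the words around "bank" with a hand-written list of cue words for each sense, in the spirit of overlap-based word sense disambiguation. The sense labels and cue words in the Python sketch below are invented for illustration and are far smaller than real world knowledge.

```python
import re

# Hypothetical cue words per sense of "bank"; illustrative only.
SENSES = {
    "financial institution": {"money", "account", "loan", "closed", "open", "lunch"},
    "river bank": {"river", "water", "shore", "fishing", "mud"},
}


def disambiguate_bank(sentence: str) -> str:
    """Pick the sense whose cue words overlap most with the sentence.
    On a tie, the first sense listed in SENSES wins."""
    context = set(re.findall(r"[a-z]+", sentence.lower()))
    return max(SENSES, key=lambda sense: len(SENSES[sense] & context))


print(disambiguate_bank("He went to the bank, but it was closed for lunch"))
print(disambiguate_bank("He sat on the bank of the river, fishing"))
```

The first call selects the financial sense because "closed" and "lunch" appear among its cue words; the second selects the river sense via "river" and "fishing".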
19. EXISTING MACHINE TRANSLATION SYSTEMS
SYSTRAN TRANSLATOR
• RULE BASED MACHINE TRANSLATION SYSTEM.
• SUPPORTS 45 LANGUAGES.
BING TRANSLATOR
• STATISTICAL MACHINE TRANSLATION SYSTEM.
• SUPPORTS 47 LANGUAGES.
GOOGLE TRANSLATOR
• STATISTICAL MACHINE TRANSLATION SYSTEM.
• SUPPORTS 80 LANGUAGES.
Editor's notes
If you want to read a novel written in French, just feed the novel into the machine and you will get a translated version in your own language. Machine translation helps to remove language barriers.
There is a lot of information available on the internet, but the same information is not available in vernacular languages such as Hindi and Malayalam. Taking the case of India, only 3% of people know English, so only a small set of people can access this information. This phenomenon is called the digital divide.
1. The direct MT system starts with morphological analysis, which removes inflections from the source-language words to obtain their root forms. The next step is bilingual dictionary lookup: a bilingual dictionary is consulted to find the target-language words corresponding to the source-language words. The last step is syntactic rearrangement, in which the word order is changed to the one that best matches the word order of the target language.
2. Disadvantages
a). Direct MT involves only lexical analysis. It does not consider structure or the relationships between words.
b). Direct MT systems can be quite expensive in multilingual scenarios.
c). Some of the meaning of the source text can be lost in translation.
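The three steps of direct MT described in note 1 can be sketched as follows. This is a toy under stated assumptions, not a real system: the suffix list, the three-entry English to romanised Hindi dictionary, and the rule that a three-word sentence is subject-verb-object are all invented for the example.

```python
# Toy bilingual dictionary (English root -> romanised Hindi); illustrative only.
BILINGUAL_DICT = {"ram": "Ram", "eat": "khata hai", "apple": "seb"}


def strip_inflection(word: str) -> str:
    """Step 1, crude morphological analysis: strip a known suffix to get a root."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def direct_translate(sentence: str) -> str:
    roots = [strip_inflection(word) for word in sentence.lower().split()]
    # Step 2, dictionary lookup: unknown roots are passed through unchanged.
    target = [BILINGUAL_DICT.get(root, root) for root in roots]
    # Step 3, syntactic rearrangement: assume a 3-word SVO input, emit SOV.
    if len(target) == 3:
        subject, verb, obj = target
        target = [subject, obj, verb]
    return " ".join(target)


print(direct_translate("Ram eats apples"))  # Ram seb khata hai
```

The sketch also shows the first disadvantage listed above: nothing in it models sentence structure, so any input that is not a plain three-word SVO sentence is simply translated word by word in the original order.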
1. These components incorporate a lot of knowledge about words (lexical knowledge) and about the language (linguistic knowledge).
2. Such knowledge is stored in one or more lexicons, and possibly in other sources of linguistic knowledge, such as a grammar.
3. The analysis phase consists of three levels.
The analysis phase is used to produce the source-language structure.
LEXICAL LEVEL
This level deals with looking at the input string of characters and separating it into tokens, which may be words, spaces or punctuation.
This level also deals with issues such as hyphenated and misspelt words.
It is the lexical level which tells us that the input "he joined the parti" consists of four words, of which the last is incorrect.
This level is sometimes called "tokenisation" or "lexical analysis".
SYNTACTIC LEVEL
1) This level deals with identifying the structure of a sentence and verifying whether the sentence is grammatically correct.
2) For example, a typical English sentence consists of a subject and a predicate. The subject is normally a noun phrase and the predicate is a verb phrase, and so on.
3) The syntactic level tells us that the sentence "He the party joined" is (syntactically) incorrect, even though each word in it is (lexically) correct.
SEMANTIC LEVEL
This level deals with the meaning of the input.
It is the semantic level which tells us that the sentence "He ate the party" is semantically incorrect, though it is lexically and syntactically well formed.
4. The transfer phase is used to convert the source-language representation into a target-language representation.
5. The generation phase is used to generate target-language text from the target-language structure.
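The three analysis levels can be illustrated with a toy checker that reproduces the examples from the notes. Everything in this Python sketch is an assumption made for illustration: the six-word vocabulary with its part-of-speech tags, the single allowed sentence pattern, and the one semantic rule that the object of "ate" must be edible.

```python
import re

# Hypothetical mini-lexicon with part-of-speech tags; illustrative only.
VOCAB = {
    "he": "PRON", "joined": "VERB", "ate": "VERB",
    "the": "DET", "party": "NOUN", "apple": "NOUN",
}
EDIBLE = {"apple"}


def lexical_level(sentence):
    """Tokenise and report tokens that are not in the lexicon."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    unknown = [token for token in tokens if token not in VOCAB]
    return tokens, unknown


def syntactic_level(tokens):
    """Accept only the pattern pronoun + verb + determiner + noun."""
    return [VOCAB[token] for token in tokens] == ["PRON", "VERB", "DET", "NOUN"]


def semantic_level(tokens):
    """Toy selectional restriction: the object of 'ate' must be edible."""
    verb, noun = tokens[1], tokens[3]
    return verb != "ate" or noun in EDIBLE


def check(sentence):
    tokens, unknown = lexical_level(sentence)
    if unknown:
        return "lexical error"
    if not syntactic_level(tokens):
        return "syntactic error"
    if not semantic_level(tokens):
        return "semantic error"
    return "ok"


for example in ("he joined the parti", "He the party joined",
                "He ate the party", "He joined the party"):
    print(example, "->", check(example))
```

Each level runs only if the previous one succeeded, which matches the ordering in the notes: a sentence must be lexically valid before its structure is checked, and structurally valid before its meaning is checked.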
The transfer-based approach has the following advantages.
a). It has a modular structure.
b). The system easily handles ambiguities that carry over from one language to another.
In the direct approach, words are translated directly without passing through an additional representation.
In the transfer approach, the source language is transformed into an abstract, less language-specific representation.