SlideShare ist ein Scribd-Unternehmen logo
1 von 3
Downloaden Sie, um offline zu lesen
Abhay Adapanawar, Anita Garje, Paurnima Thakare, Prajakta Gundawar, Priyanka Kulkarni
  / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622
                  www.ijera.com Vol. 3, Issue 2, March -April 2013, pp.516-518
         English To Marathi Translation Of Assertive Sentences

       Abhay Adapanawar1, Anita Garje2, Paurnima Thakare3, Prajakta
                    Gundawar4, Priyanka Kulkarni5
              Department of Information Technology, STES’s Sinhgad Academy of Engineering,
                                        Pune-48, Maharashtra, India.



ABSTRACT                                                          Natural Language Processing (NLP) is
         This paper presents the conversion for         quite a difficult task. There were many researches in
simple English Assertive sentences to Marathi           the field of language translation but there is no fully
sentences. This is basically a machine translation.
In this proposed system we are going through            successful language translation machine so far.
various processes such as morphological analysis,       Since it is a Human Language Technology (HLT),
part of speech, local word grouping, for                there are lots of varieties and lots of opportunities
converting the meaning of simple assertive              for research. It is not possible to work on the whole
English sentence into corresponding Marathi             language translation process together. Therefore, it is
sentence.                                               segmented into many parts[1].
         Here English to Marathi bilingual                        But, the fact is, most of them choose a part
dictionary has been formed for the purpose of           of the source language to be translated to the target
language translation. And English sentence are          language[2]. For example, in our paper the source
translated into the Marathi sentences using the         language is English and the target language is
rules that are produced earlier in the language         Marathi. There are varieties types of sentences both
translation system.                                     in English and Marathi language. In this paper, we
                                                        have taken assertive sentences. We have mainly
Keywords-      Machine      Translation,     Natural    focused on the process of language translation and
Language Processing, Artificial Intelligence            the effectiveness of the language translation.

1. Introduction                                         3. Motivation for the work:
          Marathi is one of the richest languages                The people from rural area are not able
among all the languages exist in the world and one      understand high level of English. And also it’s not
of the largely spoken languages in the world. More      possible for them to carry dictionary for conversion
than 72 million people speak in Marathi as their        everywhere every time. Also these people are
native language. It is ranked 19th based on the         hesitating while using internet. Hence they can’t able
number of speakers .Marathi is the mother language      to help themselves for using services like internet
of India and also a large number of people in           banking or other.
southern area of India (Maharashtra) speak and write             Therefore, the purpose behind this project
in Marathi.                                             is to help people who had done their primary
          Marathi is a member of the Indo-Aryan         education but not up to the level of understanding
languages. It is derived from Sanskrit. It is written   each and every meaning of English word.
left-to-right, top-to-bottom of page (same as
English). Its vocabulary is akin to Sanskrit. Though    4. Literature Survey
the vocabularies are quite difficult at first, but to             Machine Translation in India is relatively
some extent there are similarities with English as      young. The earliest efforts date from the late 80s and
exemplified by the following words in Table 1.          early 90s. The prominent among these are the
                                                        projects at IIT Kanpur, University of Hyderabad,
TABLE 1: Comparison of the similarities between         NCST Mumbai and CDAC Pune. The Technology
different languages Major Headings                      Development in Indian Languages (TDIL), an
                                                        initiative of the Department of IT, Ministry of
Language    Word                                        Communications and Information Technology,
English     month             mother   new      night   Government of India, has played an instrumental
Sanskrit    mĂŁs               matar    nava     nakt    role by funding these projects.[3]
Marathi     mahina            mata     navin    ratra             Since the mid and late 90’s, a few more
                                                        projects have been initiated—at IIT Bombay, IIIT
2. Need of translation                                  Hyderabad, AU-KBC Centre Chennai and Jabalpur
                                                        University Kolkata. There are also a couple of
                                                        efforts from the private sector - from Super



                                                                                               516 | P a g e
Abhay Adapanawar, Anita Garje, Paurnima Thakare, Prajakta Gundawar, Priyanka Kulkarni
  / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622
                  www.ijera.com Vol. 3, Issue 2, March -April 2013, pp.516-518
InfosoftPvt Ltd, and more recently, the IBM India       and provide disambiguation information using an
Research Lab.                                           intuitive GUI, allowing the system to produce a
   We now look at some of the major Indian MT           single correct translation. The system uses rule-bases
projects in more detail. The parameters we look at      and heuristics to resolve ambiguities to the extent
are: language pair(s), formalism, and strategy for      possible – for example, a rule-base is used to map
handling complexity/ambiguity, and application          English prepositions into Hindi postpositions.
domain, wherever this information is available.                  The system can work in a fully automatic
                                                        mode and produce rough translations for end users,
4.1. Anglabharat (and Anubharati):                      but is primarily meant for translators, editors and
          Anglabharati deals with machine translation   content providers. Currently, it works for simple
from English to Indian languages, primarily Hindi,      sentences, and work is on to extend the coverage to
using a rule-based transfer approach. The primary       complex sentences. The MaTra lexicon and
strategy for handling ambiguity/complexity is post-     approach is general-purpose, but the system has been
editing—in case of ambiguity, the system retains all    applied mainly in the domains of news, annual
possible ambiguous constructs, and the user has to      reports and technical phrases, and has been funded
select the correct choices using a post-editing         by TDIL.
window to get the correct translation. The system’s
approach and lexicon is general-purpose, but has        4.4. Mantra:
been applied mainly in the domain of public health.              The Mantra project is based on the TAG
The project is primarily based at IIT-Kanpur, in        formalism from University of Pennsylvania. A sub-
collaboration with ER&DCI, Noida, and has been          language English-Hindi MT system has been
funded by TDIL.                                         developed for the domain of gazette notifications
          Anubharati is a recent project at IIT         pertaining to government appointments. In addition
Kanpur, dealing with template-based machine             to translating the content, the system can also
translation from Hindi to English, using a variation    preserve the formatting of input Word documents
of example-based machine translation. An early          across the translation. The Mantra approach is
prototype has been developed and is being extended.     general, but the lexicon/grammar has been limited to
                                                        the sub-language of the domain. Recently, work has
4.2. Anusaaraka:                                        been initiated on other language pairs such as Hindi-
          The focus in Anusaaraka is not mainly on      English and Hindi-Bengali, as well as on extending
machine translation, but on Language Access             to the domain of parliament proceeding summaries.
between Indian languages. Using principles of           The project has been funded by TDIL, and later by
Paninian Grammar (PG), and exploiting the close         the Department of Official Languages.
similarity of Indian languages, an Anusaaraka           4.5. UCSG-based English-Kannada MT:
essentially maps local word groups between the          The CS Department at the Univ of Hyderabad has
source and target languages. Where there are            worked on an English-Kannada MT system, using
differences between the languages, the system           the Universal Clause Structure Grammar (UCSG)
introduces extra notation to preserve the information   formalism, also invented there. This is essentially a
of the source language. Thus, the user needs some       transfer-based approach, and has been applied to the
training to understand the output of the system. The    domain of government circulars, and funded by the
project has developed Language Assessors from           Karnataka government.
Punjabi, Bengali, Telugu, Kannada and Marathi into
Hindi. The approach and lexicon is general, but the     4.6. UNL-based MT between English, Hindi
system has mainly been applied for children’s           and Marathi:
stories. The project originated at IIT Kanpur, and               The Universal Networking Language
later shifted mainly to the Centre for Applied          (UNL) is an international project of the United
Linguistics and Translation Studies (CALTS),            Nations University, with an aim to create an
Department of Humanities and Social Sciences,           Interlingua for all major human languages. IIT
University of Hyderabad. It was funded by TDIL.         Bombay is the Indian participant in UNL, and is
Of late, the Language Technology Research Centre        working on MT systems between English, Hindi and
(LTRC) at IIIT Hyderabad is attempting an English-      Marathi using the UNL formalism. This essentially
Hindi Anusaaraka/MT system.                             uses an interlingual approach—the source language
                                                        is converted into UNL using an ‘enconverter’, and
4.3. MaTra:                                             then converted into the target language using a
         MaTra is a Human-Assisted translation          ‘deconverter’. [4]
project for English to Indian languages, currently      4.7. Tamil-Hindi Anusaaraka and English-
Hindi, essentially based on a transfer approach using   Tamil MT:
a frame-like structured representation. The focus is             The Anna University KB Chandrasekhar
on the innovative use of man-machine synergy—the        Research Centre at Chennai was established
user can visually inspect the analysis of the system,   recently, and is active in the area of Tamil NLP. A



                                                                                              517 | P a g e
Abhay Adapanawar, Anita Garje, Paurnima Thakare, Prajakta Gundawar, Priyanka Kulkarni
  / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622
                  www.ijera.com Vol. 3, Issue 2, March -April 2013, pp.516-518
Tamil-Hindi language accessor has been built using      5. The grouping of words forms the sentence and
the Anusaaraka formalism described above.               then Syntax checking of sentences according to
Recently, the group has begun work on an English-       language rules is done
Tamil MT system.                                        6. Grouped words are separated by using local word
                                                        separator for mapping from English to Marathi.
4.8. English-Hindi MAT for news sentences:              7. Separated words, corresponding Marathi match
         The Jadavpur University at Kolkata has         words are fetched from dictionary.
recently worked on a rule-based English-Hindi MAT       8. Finally, Marathi sentence is generated by applying
for news sentences using the transfer approach.         Marathi grammar.

4.9. Anuvadak English-Hindi software:                   6. Features:
         Super InfosoftPvt Ltd is one of the very few   1. Can be used for localization purpose.
private sector efforts in MT in India. They have been   2. Can be useful for students in learning phase.
working on software called Anuvadak, which is a         3. Useful for the people who have taken not
general-purpose English-Hindi translation tool that     understand English of higher level
supports post-editing.                                  4. No need to learn other languages.
                                                        5. Can be useful for document conversion.
4.10. English-Hindi Statistical MT
         The IBM India Research Lab at New Delhi        7. Utilities:
has recently initiated work on statistical MT between   1. Tokenization: User is able to see the separation of
English and Indian languages, building on IBM’s         given sentences into words by lexical analyzer.
existing work on statistical MT.                        2. English tokenized words and their exact Marathi
                                                        word are shown parallel to the user.
5. Flow of translation                                  3. Parts of speech: User is able to see the parts of
                                                        speech of each and every English words.

                                                        8. Conclusion
                                                                  In this paper the system is proposed for
                                                        user, where the user will get the English to Marathi
                                                        translation of simple assertive sentences. The system
                                                        also proposes that user will get the details of parts of
                                                        speech of English words. The approach gives easy
                                                        translation; hence it can be used for developing other
                                                        language translators.

                                                        References:
                                                          [1]    Elaine     Rich    and    Kevin     Knight,
                                                                 Shivashankar Nair, Artificial Intelligence
                                                                 (Chapter Natural Language Processing),
                                                                 3rd Edition, Tata McGraw-Hill, ISBN-10-
                                                                 0070087709, ISBN-13- 9780070087705
                                                          [2]    S. M. Chaware, Srikanta Rao, Domain
                                                                 Specific     Information    Retrieval   in
                                                                 Multilingual Environment, International
                                                                 Journal of Recent Trends in Engineering,
                                                                 2(4), 2009, 179-181
                                                          [3]    Bharati, Akshar, Vineet Chaitanya, Rajeev
Fig 1 : flow of translation                                      Sangal, Natural Language Processing: A
                                                                 PaninianPrespective, Prentice-Hall of
5.1 Explanation of flow:                                         India,1995.
1. English sentence is taken as input                      [4]   Sangal, Rajeev,AksharBharati, DiptiMisra
                                                                 Sharma, Lakshmi Bai, Guidelines For POS
2. Lexical analyzer creates lexicons i.e it divides              And Chunk Annotation For Indian
sentences into words.                                            Languages, December 2006.
3. Morphological analyzer is responsible for              [5]    Sangal,      Rajeev,DiptiMisra     Sharma,
formation      of    morphemes       and    performs             Lakshmi Bai, KaruneshArora, Developing
identification, analysis, description of structure of            Indian languages corpora: Standards and
words.                                                           practice, November 2006.
4. Then with according to English grammar words
are grouped.


                                                                                                518 | P a g e

Weitere Àhnliche Inhalte

Was ist angesagt?

Interpretation of Sadhu into Cholit Bhasha by Cataloguing and Translation System
Interpretation of Sadhu into Cholit Bhasha by Cataloguing and Translation SystemInterpretation of Sadhu into Cholit Bhasha by Cataloguing and Translation System
Interpretation of Sadhu into Cholit Bhasha by Cataloguing and Translation System
ijtsrd
 
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text EditorDynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
Waqas Tariq
 

Was ist angesagt? (19)

Transliteration by orthography or phonology for hindi and marathi to english ...
Transliteration by orthography or phonology for hindi and marathi to english ...Transliteration by orthography or phonology for hindi and marathi to english ...
Transliteration by orthography or phonology for hindi and marathi to english ...
 
Survey on Indian CLIR and MT systems in Marathi Language
Survey on Indian CLIR and MT systems in Marathi LanguageSurvey on Indian CLIR and MT systems in Marathi Language
Survey on Indian CLIR and MT systems in Marathi Language
 
**JUNK** (no subject)
**JUNK** (no subject)**JUNK** (no subject)
**JUNK** (no subject)
 
Implementation of English-Text to Marathi-Speech (ETMS) Synthesizer
Implementation of English-Text to Marathi-Speech (ETMS) SynthesizerImplementation of English-Text to Marathi-Speech (ETMS) Synthesizer
Implementation of English-Text to Marathi-Speech (ETMS) Synthesizer
 
Interpretation of Sadhu into Cholit Bhasha by Cataloguing and Translation System
Interpretation of Sadhu into Cholit Bhasha by Cataloguing and Translation SystemInterpretation of Sadhu into Cholit Bhasha by Cataloguing and Translation System
Interpretation of Sadhu into Cholit Bhasha by Cataloguing and Translation System
 
Applying Rule-Based Maximum Matching Approach for Verb Phrase Identification ...
Applying Rule-Based Maximum Matching Approach for Verb Phrase Identification ...Applying Rule-Based Maximum Matching Approach for Verb Phrase Identification ...
Applying Rule-Based Maximum Matching Approach for Verb Phrase Identification ...
 
IRJET -Survey on Named Entity Recognition using Syntactic Parsing for Hindi L...
IRJET -Survey on Named Entity Recognition using Syntactic Parsing for Hindi L...IRJET -Survey on Named Entity Recognition using Syntactic Parsing for Hindi L...
IRJET -Survey on Named Entity Recognition using Syntactic Parsing for Hindi L...
 
Tamil Morphological Analysis
Tamil Morphological AnalysisTamil Morphological Analysis
Tamil Morphological Analysis
 
Tamil-English Document Translation Using Statistical Machine Translation Appr...
Tamil-English Document Translation Using Statistical Machine Translation Appr...Tamil-English Document Translation Using Statistical Machine Translation Appr...
Tamil-English Document Translation Using Statistical Machine Translation Appr...
 
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
 
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text EditorDynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
 
Implementation of Marathi Language Speech Databases for Large Dictionary
Implementation of Marathi Language Speech Databases for Large DictionaryImplementation of Marathi Language Speech Databases for Large Dictionary
Implementation of Marathi Language Speech Databases for Large Dictionary
 
A Review on Speech Corpus Development for Automatic Speech Recognition in Ind...
A Review on Speech Corpus Development for Automatic Speech Recognition in Ind...A Review on Speech Corpus Development for Automatic Speech Recognition in Ind...
A Review on Speech Corpus Development for Automatic Speech Recognition in Ind...
 
G1803013542
G1803013542G1803013542
G1803013542
 
S URVEY O N M ACHINE T RANSLITERATION A ND M ACHINE L EARNING M ODELS
S URVEY  O N M ACHINE  T RANSLITERATION A ND  M ACHINE L EARNING M ODELSS URVEY  O N M ACHINE  T RANSLITERATION A ND  M ACHINE L EARNING M ODELS
S URVEY O N M ACHINE T RANSLITERATION A ND M ACHINE L EARNING M ODELS
 
HMM BASED POS TAGGER FOR HINDI
HMM BASED POS TAGGER FOR HINDIHMM BASED POS TAGGER FOR HINDI
HMM BASED POS TAGGER FOR HINDI
 
A Context-based Numeral Reading Technique for Text to Speech Systems
A Context-based Numeral Reading Technique for Text to Speech Systems A Context-based Numeral Reading Technique for Text to Speech Systems
A Context-based Numeral Reading Technique for Text to Speech Systems
 
A decision tree based word sense disambiguation system in manipuri language
A decision tree based word sense disambiguation system in manipuri languageA decision tree based word sense disambiguation system in manipuri language
A decision tree based word sense disambiguation system in manipuri language
 
A Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to MarathiA Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to Marathi
 

Ähnlich wie Cf32516518

Ähnlich wie Cf32516518 (20)

Summer Research Project (Anusaaraka) Report
Summer Research Project (Anusaaraka) ReportSummer Research Project (Anusaaraka) Report
Summer Research Project (Anusaaraka) Report
 
HANDLING CHALLENGES IN RULE BASED MACHINE TRANSLATION FROM MARATHI TO ENGLISH
HANDLING CHALLENGES IN RULE BASED MACHINE TRANSLATION FROM MARATHI TO ENGLISHHANDLING CHALLENGES IN RULE BASED MACHINE TRANSLATION FROM MARATHI TO ENGLISH
HANDLING CHALLENGES IN RULE BASED MACHINE TRANSLATION FROM MARATHI TO ENGLISH
 
MACHINE TRANSLATION WITH SPECIAL REFERENCE TO MALAYALAM LANGUAGE
MACHINE TRANSLATION WITH SPECIAL REFERENCE TO MALAYALAM LANGUAGEMACHINE TRANSLATION WITH SPECIAL REFERENCE TO MALAYALAM LANGUAGE
MACHINE TRANSLATION WITH SPECIAL REFERENCE TO MALAYALAM LANGUAGE
 
Design and Development of a Malayalam to English Translator- A Transfer Based...
Design and Development of a Malayalam to English Translator- A Transfer Based...Design and Development of a Malayalam to English Translator- A Transfer Based...
Design and Development of a Malayalam to English Translator- A Transfer Based...
 
Survey of machine translation systems in india
Survey of machine translation systems in indiaSurvey of machine translation systems in india
Survey of machine translation systems in india
 
Speech To Speech Translation
Speech To Speech TranslationSpeech To Speech Translation
Speech To Speech Translation
 
A New Approach: Automatically Identify Naming Word from Bengali Sentence for ...
A New Approach: Automatically Identify Naming Word from Bengali Sentence for ...A New Approach: Automatically Identify Naming Word from Bengali Sentence for ...
A New Approach: Automatically Identify Naming Word from Bengali Sentence for ...
 
Grapheme-To-Phoneme Tools for the Marathi Speech Synthesis
Grapheme-To-Phoneme Tools for the Marathi Speech SynthesisGrapheme-To-Phoneme Tools for the Marathi Speech Synthesis
Grapheme-To-Phoneme Tools for the Marathi Speech Synthesis
 
A COMPREHENSIVE ANALYSIS OF STEMMERS AVAILABLE FOR INDIC LANGUAGES
A COMPREHENSIVE ANALYSIS OF STEMMERS AVAILABLE FOR INDIC LANGUAGES A COMPREHENSIVE ANALYSIS OF STEMMERS AVAILABLE FOR INDIC LANGUAGES
A COMPREHENSIVE ANALYSIS OF STEMMERS AVAILABLE FOR INDIC LANGUAGES
 
Cross language information retrieval in indian
Cross language information retrieval in indianCross language information retrieval in indian
Cross language information retrieval in indian
 
Ijetcas14 444
Ijetcas14 444Ijetcas14 444
Ijetcas14 444
 
A Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to MarathiA Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to Marathi
 
A Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to MarathiA Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to Marathi
 
Ac04507168175
Ac04507168175Ac04507168175
Ac04507168175
 
Hindi –tamil text translation
Hindi –tamil text translationHindi –tamil text translation
Hindi –tamil text translation
 
SENTIMENT ANALYSIS OF MIXED CODE FOR THE TRANSLITERATED HINDI AND MARATHI TEXTS
SENTIMENT ANALYSIS OF MIXED CODE FOR THE TRANSLITERATED HINDI AND MARATHI TEXTSSENTIMENT ANALYSIS OF MIXED CODE FOR THE TRANSLITERATED HINDI AND MARATHI TEXTS
SENTIMENT ANALYSIS OF MIXED CODE FOR THE TRANSLITERATED HINDI AND MARATHI TEXTS
 
IRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language Models
IRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language ModelsIRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language Models
IRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language Models
 
An Unsupervised Approach to Develop Stemmer
An Unsupervised Approach to Develop StemmerAn Unsupervised Approach to Develop Stemmer
An Unsupervised Approach to Develop Stemmer
 
Marathi-English CLIR using detailed user query and unsupervised corpus-based WSD
Marathi-English CLIR using detailed user query and unsupervised corpus-based WSDMarathi-English CLIR using detailed user query and unsupervised corpus-based WSD
Marathi-English CLIR using detailed user query and unsupervised corpus-based WSD
 
IMPROVING THE QUALITY OF GUJARATI-HINDI MACHINE TRANSLATION THROUGH PART-OF-S...
IMPROVING THE QUALITY OF GUJARATI-HINDI MACHINE TRANSLATION THROUGH PART-OF-S...IMPROVING THE QUALITY OF GUJARATI-HINDI MACHINE TRANSLATION THROUGH PART-OF-S...
IMPROVING THE QUALITY OF GUJARATI-HINDI MACHINE TRANSLATION THROUGH PART-OF-S...
 

KĂŒrzlich hochgeladen

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
 

KĂŒrzlich hochgeladen (20)

Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 

Cf32516518

  • 1. Abhay Adapanawar, Anita Garje, Paurnima Thakare, Prajakta Gundawar, Priyanka Kulkarni / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 www.ijera.com Vol. 3, Issue 2, March -April 2013, pp.516-518 English To Marathi Translation Of Assertive Sentences Abhay Adapanawar1, Anita Garje2, Paurnima Thakare3, Prajakta Gundawar4, Priyanka Kulkarni5 Department of Information Technology, STES’s Sinhgad Academy of Engineering, Pune-48, Maharashtra, India. ABSTRACT Natural Language Processing (NLP) is This paper presents the conversion for quite a difficult task. There were many researches in simple English Assertive sentences to Marathi the field of language translation but there is no fully sentences. This is basically a machine translation. In this proposed system we are going through successful language translation machine so far. various processes such as morphological analysis, Since it is a Human Language Technology (HLT), part of speech, local word grouping, for there are lots of varieties and lots of opportunities converting the meaning of simple assertive for research. It is not possible to work on the whole English sentence into corresponding Marathi language translation process together. Therefore, it is sentence. segmented into many parts[1]. Here English to Marathi bilingual But, the fact is, most of them choose a part dictionary has been formed for the purpose of of the source language to be translated to the target language translation. And English sentence are language[2]. For example, in our paper the source translated into the Marathi sentences using the language is English and the target language is rules that are produced earlier in the language Marathi. There are varieties types of sentences both translation system. in English and Marathi language. In this paper, we have taken assertive sentences. We have mainly Keywords- Machine Translation, Natural focused on the process of language translation and Language Processing, Artificial Intelligence the effectiveness of the language translation. 1. Introduction 3. Motivation for the work: Marathi is one of the richest languages The people from rural area are not able among all the languages exist in the world and one understand high level of English. And also it’s not of the largely spoken languages in the world. More possible for them to carry dictionary for conversion than 72 million people speak in Marathi as their everywhere every time. Also these people are native language. It is ranked 19th based on the hesitating while using internet. Hence they can’t able number of speakers .Marathi is the mother language to help themselves for using services like internet of India and also a large number of people in banking or other. southern area of India (Maharashtra) speak and write Therefore, the purpose behind this project in Marathi. is to help people who had done their primary Marathi is a member of the Indo-Aryan education but not up to the level of understanding languages. It is derived from Sanskrit. It is written each and every meaning of English word. left-to-right, top-to-bottom of page (same as English). Its vocabulary is akin to Sanskrit. Though 4. Literature Survey the vocabularies are quite difficult at first, but to Machine Translation in India is relatively some extent there are similarities with English as young. The earliest efforts date from the late 80s and exemplified by the following words in Table 1. early 90s. The prominent among these are the projects at IIT Kanpur, University of Hyderabad, TABLE 1: Comparison of the similarities between NCST Mumbai and CDAC Pune. The Technology different languages Major Headings Development in Indian Languages (TDIL), an initiative of the Department of IT, Ministry of Language Word Communications and Information Technology, English month mother new night Government of India, has played an instrumental Sanskrit mĂŁs matar nava nakt role by funding these projects.[3] Marathi mahina mata navin ratra Since the mid and late 90’s, a few more projects have been initiated—at IIT Bombay, IIIT 2. Need of translation Hyderabad, AU-KBC Centre Chennai and Jabalpur University Kolkata. There are also a couple of efforts from the private sector - from Super 516 | P a g e
  • 2. Abhay Adapanawar, Anita Garje, Paurnima Thakare, Prajakta Gundawar, Priyanka Kulkarni / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 www.ijera.com Vol. 3, Issue 2, March -April 2013, pp.516-518 InfosoftPvt Ltd, and more recently, the IBM India and provide disambiguation information using an Research Lab. intuitive GUI, allowing the system to produce a We now look at some of the major Indian MT single correct translation. The system uses rule-bases projects in more detail. The parameters we look at and heuristics to resolve ambiguities to the extent are: language pair(s), formalism, and strategy for possible – for example, a rule-base is used to map handling complexity/ambiguity, and application English prepositions into Hindi postpositions. domain, wherever this information is available. The system can work in a fully automatic mode and produce rough translations for end users, 4.1. Anglabharat (and Anubharati): but is primarily meant for translators, editors and Anglabharati deals with machine translation content providers. Currently, it works for simple from English to Indian languages, primarily Hindi, sentences, and work is on to extend the coverage to using a rule-based transfer approach. The primary complex sentences. The MaTra lexicon and strategy for handling ambiguity/complexity is post- approach is general-purpose, but the system has been editing—in case of ambiguity, the system retains all applied mainly in the domains of news, annual possible ambiguous constructs, and the user has to reports and technical phrases, and has been funded select the correct choices using a post-editing by TDIL. window to get the correct translation. The system’s approach and lexicon is general-purpose, but has 4.4. Mantra: been applied mainly in the domain of public health. The Mantra project is based on the TAG The project is primarily based at IIT-Kanpur, in formalism from University of Pennsylvania. A sub- collaboration with ER&DCI, Noida, and has been language English-Hindi MT system has been funded by TDIL. developed for the domain of gazette notifications Anubharati is a recent project at IIT pertaining to government appointments. In addition Kanpur, dealing with template-based machine to translating the content, the system can also translation from Hindi to English, using a variation preserve the formatting of input Word documents of example-based machine translation. An early across the translation. The Mantra approach is prototype has been developed and is being extended. general, but the lexicon/grammar has been limited to the sub-language of the domain. Recently, work has 4.2. Anusaaraka: been initiated on other language pairs such as Hindi- The focus in Anusaaraka is not mainly on English and Hindi-Bengali, as well as on extending machine translation, but on Language Access to the domain of parliament proceeding summaries. between Indian languages. Using principles of The project has been funded by TDIL, and later by Paninian Grammar (PG), and exploiting the close the Department of Official Languages. similarity of Indian languages, an Anusaaraka 4.5. UCSG-based English-Kannada MT: essentially maps local word groups between the The CS Department at the Univ of Hyderabad has source and target languages. Where there are worked on an English-Kannada MT system, using differences between the languages, the system the Universal Clause Structure Grammar (UCSG) introduces extra notation to preserve the information formalism, also invented there. This is essentially a of the source language. Thus, the user needs some transfer-based approach, and has been applied to the training to understand the output of the system. The domain of government circulars, and funded by the project has developed Language Assessors from Karnataka government. Punjabi, Bengali, Telugu, Kannada and Marathi into Hindi. The approach and lexicon is general, but the 4.6. UNL-based MT between English, Hindi system has mainly been applied for children’s and Marathi: stories. The project originated at IIT Kanpur, and The Universal Networking Language later shifted mainly to the Centre for Applied (UNL) is an international project of the United Linguistics and Translation Studies (CALTS), Nations University, with an aim to create an Department of Humanities and Social Sciences, Interlingua for all major human languages. IIT University of Hyderabad. It was funded by TDIL. Bombay is the Indian participant in UNL, and is Of late, the Language Technology Research Centre working on MT systems between English, Hindi and (LTRC) at IIIT Hyderabad is attempting an English- Marathi using the UNL formalism. This essentially Hindi Anusaaraka/MT system. uses an interlingual approach—the source language is converted into UNL using an ‘enconverter’, and 4.3. MaTra: then converted into the target language using a MaTra is a Human-Assisted translation ‘deconverter’. [4] project for English to Indian languages, currently 4.7. Tamil-Hindi Anusaaraka and English- Hindi, essentially based on a transfer approach using Tamil MT: a frame-like structured representation. The focus is The Anna University KB Chandrasekhar on the innovative use of man-machine synergy—the Research Centre at Chennai was established user can visually inspect the analysis of the system, recently, and is active in the area of Tamil NLP. A 517 | P a g e
  • 3. Abhay Adapanawar, Anita Garje, Paurnima Thakare, Prajakta Gundawar, Priyanka Kulkarni / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 www.ijera.com Vol. 3, Issue 2, March -April 2013, pp.516-518 Tamil-Hindi language accessor has been built using 5. The grouping of words forms the sentence and the Anusaaraka formalism described above. then Syntax checking of sentences according to Recently, the group has begun work on an English- language rules is done Tamil MT system. 6. Grouped words are separated by using local word separator for mapping from English to Marathi. 4.8. English-Hindi MAT for news sentences: 7. Separated words, corresponding Marathi match The Jadavpur University at Kolkata has words are fetched from dictionary. recently worked on a rule-based English-Hindi MAT 8. Finally, Marathi sentence is generated by applying for news sentences using the transfer approach. Marathi grammar. 4.9. Anuvadak English-Hindi software: 6. Features: Super InfosoftPvt Ltd is one of the very few 1. Can be used for localization purpose. private sector efforts in MT in India. They have been 2. Can be useful for students in learning phase. working on software called Anuvadak, which is a 3. Useful for the people who have taken not general-purpose English-Hindi translation tool that understand English of higher level supports post-editing. 4. No need to learn other languages. 5. Can be useful for document conversion. 4.10. English-Hindi Statistical MT The IBM India Research Lab at New Delhi 7. Utilities: has recently initiated work on statistical MT between 1. Tokenization: User is able to see the separation of English and Indian languages, building on IBM’s given sentences into words by lexical analyzer. existing work on statistical MT. 2. English tokenized words and their exact Marathi word are shown parallel to the user. 5. Flow of translation 3. Parts of speech: User is able to see the parts of speech of each and every English words. 8. Conclusion In this paper the system is proposed for user, where the user will get the English to Marathi translation of simple assertive sentences. The system also proposes that user will get the details of parts of speech of English words. The approach gives easy translation; hence it can be used for developing other language translators. References: [1] Elaine Rich and Kevin Knight, Shivashankar Nair, Artificial Intelligence (Chapter Natural Language Processing), 3rd Edition, Tata McGraw-Hill, ISBN-10- 0070087709, ISBN-13- 9780070087705 [2] S. M. Chaware, Srikanta Rao, Domain Specific Information Retrieval in Multilingual Environment, International Journal of Recent Trends in Engineering, 2(4), 2009, 179-181 [3] Bharati, Akshar, Vineet Chaitanya, Rajeev Fig 1 : flow of translation Sangal, Natural Language Processing: A PaninianPrespective, Prentice-Hall of 5.1 Explanation of flow: India,1995. 1. English sentence is taken as input [4] Sangal, Rajeev,AksharBharati, DiptiMisra Sharma, Lakshmi Bai, Guidelines For POS 2. Lexical analyzer creates lexicons i.e it divides And Chunk Annotation For Indian sentences into words. Languages, December 2006. 3. Morphological analyzer is responsible for [5] Sangal, Rajeev,DiptiMisra Sharma, formation of morphemes and performs Lakshmi Bai, KaruneshArora, Developing identification, analysis, description of structure of Indian languages corpora: Standards and words. practice, November 2006. 4. Then with according to English grammar words are grouped. 518 | P a g e