OCR System
Presented By:- Vijay Apurva (9910103462), 4th year, CSE
Guided By:- Mr. Ankur Kulhari
ABSTRACT:-
The ability to translate paper documents quickly and accurately into
machine-readable form using optical character recognition (OCR)
technology widens the opportunities for document searching and
storage, as well as for automated document processing. Translating
large collections of image-based electronic documents into structured
electronic documents quickly is still a problem. The large number of
processing units available in Grid environments, together with free
optical character recognition tools, can be exploited to speed up
this translation.
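The idea of spreading a large document collection over many processing units can be sketched as follows. This is only an illustration under assumptions: `recognize_page` is a hypothetical stand-in for whatever OCR tool is run on one page, and a local thread pool stands in for the processing units of a Grid environment.

```python
from concurrent.futures import ThreadPoolExecutor


def recognize_page(page_id):
    """Hypothetical placeholder for running an OCR tool on one page image."""
    return f"text of page {page_id}"


def translate_collection(page_ids, workers=4):
    """Recognize every page, spreading the work over `workers` units."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so results line up with page_ids.
        return list(pool.map(recognize_page, page_ids))


results = translate_collection(range(6))
```

Because each page is recognized independently, the work divides cleanly: adding workers shortens the wall-clock time for the collection without changing the result for any single page.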
CONTENTS:-
• What is OCR?
• When and Why OCR?
• Existing System
• Proposed System
• Architecture of OCR
• Algorithms of OCR
• Modules of OCR
• Design of OCR
• Design of Screen shots for OCR
• Conclusion
WHAT IS OCR?:-
OCR stands for Optical Character Recognition. It is a system that
lets us scan printed, typewritten or handwritten text (numerals,
letters or symbols) and convert the scanned image into a
computer-processable format, either plain text or a Word document.
• The converted documents can later be edited, used or reused
in other documents. Thus the documents become editable.
WHEN AND WHY OCR?:-
• OCR is used when recreating a paper document by hand as an
electronic document would take more time.
• The converted text files take less space than the original
image file and can be indexed. OCR is therefore an advantage to
users who have to convert large amounts of paperwork into
electronic form.
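To illustrate why recognized text "can be indexed" while the original image cannot, here is a minimal sketch of an inverted index over OCR output. The function names and the sample documents are assumptions made for illustration only.

```python
from collections import defaultdict


def build_index(documents):
    """Map each word to the set of document ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in text.lower().split():
            word = token.strip(".,;:!?")
            if word:  # skip tokens that were only punctuation
                index[word].add(doc_id)
    return index


def search(index, word):
    """Return the sorted ids of documents containing `word`."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical recognized texts, keyed by document id.
docs = {1: "OCR converts paper to text.", 2: "Text from OCR can be searched."}
index = build_index(docs)
```

With this index, `search(index, "ocr")` finds both documents without rescanning any image.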
EXISTING SYSTEM:-
There is a growing demand from users to convert printed documents
into electronic documents in order to keep their data secure. The
basic OCR system was therefore invented to convert the data
available on paper into computer-processable documents, so that the
documents can be edited and reused.
PROPOSED SYSTEM:-
Our proposed system is OCR ON A GRID INFRASTRUCTURE, a character
recognition system that supports recognition of the characters of
multiple languages. This feature is what we call grid infrastructure,
and it eliminates the problem of heterogeneous character recognition.
In this context, grid infrastructure means the infrastructure that
supports a specific set of languages. Thus OCR on a grid
infrastructure is multilingual.
ARCHITECTURE:-
The architecture of the optical character recognition system on a
grid infrastructure consists of three main components:
• Scanner
• OCR Hardware or Software
• Output Interface
[Figure: architecture block diagram. Scanner (Document, Illuminator, Detector) -> Document image -> OCR Hardware or Software (Document Analysis, Character Recognition, Contextual Processing) -> Recognition Results -> Output Interface -> To application user]
TYPES OF TRAINING:-
There are two major types of training that can be used to train a
neural network system:
• Supervised Training
• Unsupervised Training
FLOWCHART FOR UNSUPERVISED LEARNING:-
KOHONEN NETWORK:-
The Kohonen network is presented with data, but the correct
output corresponding to that data is not specified. The Kohonen
network can then be used to classify this data into groups.
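A minimal sketch of how such a network can group unlabeled data, using winner-take-all competitive learning. It is an assumption-laden illustration, not the presenters' implementation: the function names are hypothetical, the learning rate of 0.3 is borrowed from the `LearnRate` value in the class diagram, and weights start from random unit vectors.

```python
import math
import random


def normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def winner(weights, vec):
    """Index of the output neuron whose weights best match `vec`."""
    scores = [sum(w * x for w, x in zip(row, vec)) for row in weights]
    return scores.index(max(scores))


def train(samples, n_outputs, learn_rate=0.3, epochs=50, seed=0):
    """Unsupervised training: only the winning neuron moves toward each sample."""
    rng = random.Random(seed)
    samples = [normalize(s) for s in samples]
    dim = len(samples[0])
    weights = [normalize([rng.uniform(-1, 1) for _ in range(dim)])
               for _ in range(n_outputs)]
    for _ in range(epochs):
        for s in samples:
            k = winner(weights, s)
            moved = [w + learn_rate * (x - w) for w, x in zip(weights[k], s)]
            weights[k] = normalize(moved)
    return weights


data = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
weights = train(data, n_outputs=2)
groups = [winner(weights, normalize(s)) for s in data]
```

No target output is ever supplied: each sample is simply assigned to whichever output neuron responds most strongly, and that assignment is the group label.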
FLOWCHART FOR KOHONEN TRAINING:-
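ALGORITHMS OF OCR:-
TRAINING ALGORITHM:-
One of the most common learning algorithms is Hebb's Rule. This rule
was developed to assist with unsupervised training.
• Hebb's rule is expressed as:
$\Delta w_{ij} = \mu \, a_i \, a_j \, (d - a)$
Here $\mu$ is the learning rate and $a_i$, $a_j$ are the activations of the two neurons joined by weight $w_{ij}$. The classical form of Hebb's rule is $\Delta w_{ij} = \mu \, a_i \, a_j$; the extra factor $(d - a)$, the difference between desired and actual output, is the error term of the delta rule.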
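The update rule above can be sketched in a few lines. The slides do not define the symbols, so this sketch assumes that µ is the learning rate, a_i and a_j are the activations of the input and output neuron joined by the weight, and (d - a) is the desired output minus the actual output of neuron j. All function names are hypothetical.

```python
def weight_delta(mu, a_i, a_j, d, a):
    """Change in weight w_ij as written on the slide: mu * a_i * a_j * (d - a)."""
    return mu * a_i * a_j * (d - a)


def update_weights(weights, mu, inputs, outputs, desired):
    """Return new weights; weights[i][j] joins input neuron i to output neuron j.

    Assumption: the `a` in (d - a) is the actual activation of output j.
    """
    return [
        [w + weight_delta(mu, inputs[i], outputs[j], desired[j], outputs[j])
         for j, w in enumerate(row)]
        for i, row in enumerate(weights)
    ]


new_weights = update_weights([[0.0]], mu=0.5, inputs=[1.0],
                             outputs=[0.5], desired=[1.0])
```

Setting the error factor aside (that is, treating `d - a` as 1) gives the plain Hebbian update, in which a weight grows whenever the two neurons it connects are active together.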
MODULES:-
The modules identified in the Optical Character Recognition system
are as follows:
• Document Processing
• Neural Network System Training
• Document Recognition
• Document Editing
• Document Searching
DESIGN OF OCR:-
The design of our OCR system is best explained with the following
diagram:
[Figure: design diagram. Labels: Document and users, Scan, Store, Database, Recognize, Editing, Searching]
OVERALL USECASE DIAGRAM:-
[Figure: use case diagram]
Actors: end-user, end-user1, end-user2, administrator
Use cases: scan documents, store documents, Document processing, Document recognition, Document editing, Document modification, Document deletion, Trains the system
Relationships: two <<includes>> relationships
OVERALL CLASS DIAGRAM:-
[Figure: class diagram. Classes and members as follows]
Document
  Attributes: docid : integer, docname : String, docsize : integer, doctype : String
  Operations: getDocumentDetails(), scanDocument(), covertToImage(), storeImage()
Editor
  Operations: cut(), copy(), paste(), new(), open(), find()
HelpFrame
HEntry
  Operations: hLineClear(), vLineClear(), findBounds()
TrainingSet
  Attributes: inputCount : int, outputcount : int, trainingSetCount : int
  Operations: setInputCount(), setOutputCount(), setTrainingSetCount(), setClassify()
MainScreen
  Operations: editor(), helpFrame(), printedFrame(), handWrittenFrame()
Entry
  Attributes: recog : int, downSampleLeft : int, downSampleRight : int, downSampleTop : int, downSampleBottom : int
  Operations: hLineClear(), hLineClearWithin(), vLineClear(), vLineClearWithin()
PrintedFrame
  Operations: open_action(), train_action(), topen_action(), recogniseAll_action()
KohenNetwork
  Attributes: LearnMethod : int = 1, LearnRate : double = 0.3, quitError : double
  Operations: copyWeights(), clearWeights(), winner(), normalizeInput()
Associations between the classes carry the multiplicities 1 and 1..*.
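The operations hLineClear(), vLineClear() and findBounds() in the class diagram suggest that a character image is cropped to its bounding box before recognition. The diagram gives only the names, so the behavior sketched here is an assumption: the image is taken to be a grid of 0/1 pixels, and blank rows and columns are trimmed from each edge.

```python
def h_line_clear(grid, y):
    """True if row y contains no set pixels (assumed meaning of hLineClear)."""
    return not any(grid[y])


def v_line_clear(grid, x):
    """True if column x contains no set pixels (assumed meaning of vLineClear)."""
    return not any(row[x] for row in grid)


def find_bounds(grid):
    """Return (left, top, right, bottom) of the drawn character, inclusive."""
    top, bottom = 0, len(grid) - 1
    left, right = 0, len(grid[0]) - 1
    while top < bottom and h_line_clear(grid, top):
        top += 1
    while bottom > top and h_line_clear(grid, bottom):
        bottom -= 1
    while left < right and v_line_clear(grid, left):
        left += 1
    while right > left and v_line_clear(grid, right):
        right -= 1
    return left, top, right, bottom


# A small hypothetical character image: 1 marks an inked pixel.
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
bounds = find_bounds(image)  # (1, 1, 3, 2)
```

Cropping to these bounds before downsampling means the same character produces a similar input pattern regardless of where on the page or drawing area it appears.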
DESIGN OF SCREEN SHOTS FOR OCR:-
The screenshots that describe the operations carried out by our
system are as follows:
• Main Screen
• Hand Written Recognition Screen
• Scanned Document Recognition Screen
• Training Screen
• Recognition Screen
• Editor Screen
CONCLUSION:-
The Grid infrastructure used in the implementation of the
Optical Character Recognition system can be used efficiently to
speed up the translation of image-based documents into structured
documents that are easy to discover, search and process.
The automated entry of data by OCR is one of the most
attractive labor-reducing technologies.
The recognition of new font characters by the system is easy
and quick.
We can edit the information in the documents more
conveniently and reuse the edited information as and
when required.
Extending the software beyond editing and searching is a
topic for future work.
• Training and recognition speeds can be increased further,
and the system can be made more user-friendly.
• Many applications exist where it would be desirable to read
handwritten entries. Reading handwriting is a very difficult task
considering the diversity that exists in ordinary penmanship.
However, progress is being made.
optical character recognition system

Weitere ähnliche Inhalte

Was ist angesagt?

A STUDY ON OPTICAL CHARACTER RECOGNITION TECHNIQUES
A STUDY ON OPTICAL CHARACTER RECOGNITION TECHNIQUESA STUDY ON OPTICAL CHARACTER RECOGNITION TECHNIQUES
A STUDY ON OPTICAL CHARACTER RECOGNITION TECHNIQUESijcsitcejournal
 
Optical Character Recognition
Optical Character RecognitionOptical Character Recognition
Optical Character RecognitionRahul Mallik
 
Optical Character Recognition
Optical Character RecognitionOptical Character Recognition
Optical Character RecognitionDurjoy Saha
 
Presentation on OCR
Presentation on OCRPresentation on OCR
Presentation on OCRxsconfused
 
Handwritten character recognition using artificial neural network
Handwritten character recognition using artificial neural networkHandwritten character recognition using artificial neural network
Handwritten character recognition using artificial neural networkHarshana Madusanka Jayamaha
 
Optical Character Reader - Project Report BTech
Optical Character Reader - Project Report BTechOptical Character Reader - Project Report BTech
Optical Character Reader - Project Report BTechKushagraChadha1
 
Final Report on Optical Character Recognition
Final Report on Optical Character Recognition Final Report on Optical Character Recognition
Final Report on Optical Character Recognition Vidyut Singhania
 
Handwriting Recognition Using Deep Learning and Computer Version
Handwriting Recognition Using Deep Learning and Computer VersionHandwriting Recognition Using Deep Learning and Computer Version
Handwriting Recognition Using Deep Learning and Computer VersionNaiyan Noor
 
Handwriting Recognition
Handwriting RecognitionHandwriting Recognition
Handwriting RecognitionBindu Karki
 
Optical Character Recognition Using Python
Optical Character Recognition Using PythonOptical Character Recognition Using Python
Optical Character Recognition Using PythonYogeshIJTSRD
 
Automatic handwriting recognition
Automatic handwriting recognitionAutomatic handwriting recognition
Automatic handwriting recognitionBIJIT GHOSH
 
Handwritten Character Recognition: A Comprehensive Review on Geometrical Anal...
Handwritten Character Recognition: A Comprehensive Review on Geometrical Anal...Handwritten Character Recognition: A Comprehensive Review on Geometrical Anal...
Handwritten Character Recognition: A Comprehensive Review on Geometrical Anal...iosrjce
 
Hand Written Character Recognition Using Neural Networks
Hand Written Character Recognition Using Neural Networks Hand Written Character Recognition Using Neural Networks
Hand Written Character Recognition Using Neural Networks Chiranjeevi Adi
 
Artificial intelligence for speech recognition
Artificial intelligence for speech recognitionArtificial intelligence for speech recognition
Artificial intelligence for speech recognitionsowmith chatlapally
 

Was ist angesagt? (20)

A STUDY ON OPTICAL CHARACTER RECOGNITION TECHNIQUES
A STUDY ON OPTICAL CHARACTER RECOGNITION TECHNIQUESA STUDY ON OPTICAL CHARACTER RECOGNITION TECHNIQUES
A STUDY ON OPTICAL CHARACTER RECOGNITION TECHNIQUES
 
Text reader [OCR]
Text reader [OCR]Text reader [OCR]
Text reader [OCR]
 
Optical Character Recognition
Optical Character RecognitionOptical Character Recognition
Optical Character Recognition
 
Optical Character Recognition
Optical Character RecognitionOptical Character Recognition
Optical Character Recognition
 
OCR Text Extraction
OCR Text ExtractionOCR Text Extraction
OCR Text Extraction
 
Ocr abstract
Ocr abstractOcr abstract
Ocr abstract
 
Presentation on OCR
Presentation on OCRPresentation on OCR
Presentation on OCR
 
Handwritten character recognition using artificial neural network
Handwritten character recognition using artificial neural networkHandwritten character recognition using artificial neural network
Handwritten character recognition using artificial neural network
 
Optical Character Reader - Project Report BTech
Optical Character Reader - Project Report BTechOptical Character Reader - Project Report BTech
Optical Character Reader - Project Report BTech
 
Final Report on Optical Character Recognition
Final Report on Optical Character Recognition Final Report on Optical Character Recognition
Final Report on Optical Character Recognition
 
Text Detection and Recognition
Text Detection and RecognitionText Detection and Recognition
Text Detection and Recognition
 
Handwriting Recognition Using Deep Learning and Computer Version
Handwriting Recognition Using Deep Learning and Computer VersionHandwriting Recognition Using Deep Learning and Computer Version
Handwriting Recognition Using Deep Learning and Computer Version
 
Basics of-optical-character-recognition
Basics of-optical-character-recognitionBasics of-optical-character-recognition
Basics of-optical-character-recognition
 
Tamil OCR using Tesseract OCR Engine
Tamil OCR using Tesseract OCR EngineTamil OCR using Tesseract OCR Engine
Tamil OCR using Tesseract OCR Engine
 
Handwriting Recognition
Handwriting RecognitionHandwriting Recognition
Handwriting Recognition
 
Optical Character Recognition Using Python
Optical Character Recognition Using PythonOptical Character Recognition Using Python
Optical Character Recognition Using Python
 
Automatic handwriting recognition
Automatic handwriting recognitionAutomatic handwriting recognition
Automatic handwriting recognition
 
Handwritten Character Recognition: A Comprehensive Review on Geometrical Anal...
Handwritten Character Recognition: A Comprehensive Review on Geometrical Anal...Handwritten Character Recognition: A Comprehensive Review on Geometrical Anal...
Handwritten Character Recognition: A Comprehensive Review on Geometrical Anal...
 
Hand Written Character Recognition Using Neural Networks
Hand Written Character Recognition Using Neural Networks Hand Written Character Recognition Using Neural Networks
Hand Written Character Recognition Using Neural Networks
 
Artificial intelligence for speech recognition
Artificial intelligence for speech recognitionArtificial intelligence for speech recognition
Artificial intelligence for speech recognition
 

Andere mochten auch

Glossario Domotica Scuola 3.0
Glossario Domotica Scuola 3.0Glossario Domotica Scuola 3.0
Glossario Domotica Scuola 3.0STUDIO BARONI
 
OCR vs. Urjanet
OCR vs. UrjanetOCR vs. Urjanet
OCR vs. UrjanetUrjanet
 
SPARK16 Presentation: Urjanet Product Vision
SPARK16 Presentation: Urjanet Product VisionSPARK16 Presentation: Urjanet Product Vision
SPARK16 Presentation: Urjanet Product VisionUrjanet
 
Spark 2017 Key Takeaways
Spark 2017 Key TakeawaysSpark 2017 Key Takeaways
Spark 2017 Key TakeawaysUrjanet
 
How to Access Utility Data
How to Access Utility DataHow to Access Utility Data
How to Access Utility DataUrjanet
 
The Credit Score Present and Future
The Credit Score Present and FutureThe Credit Score Present and Future
The Credit Score Present and FutureUrjanet
 
SPARK15: Architecting The Future of Energy & Sustainability
SPARK15: Architecting The Future of Energy & SustainabilitySPARK15: Architecting The Future of Energy & Sustainability
SPARK15: Architecting The Future of Energy & SustainabilityUrjanet
 
SPARK16 Presentation: Measuring for Results: Data and the Changing Energy Lan...
SPARK16 Presentation: Measuring for Results: Data and the Changing Energy Lan...SPARK16 Presentation: Measuring for Results: Data and the Changing Energy Lan...
SPARK16 Presentation: Measuring for Results: Data and the Changing Energy Lan...Urjanet
 
SPARK15: Simplifying Sustainability Through Gamification
SPARK15: Simplifying Sustainability Through GamificationSPARK15: Simplifying Sustainability Through Gamification
SPARK15: Simplifying Sustainability Through GamificationUrjanet
 

Andere mochten auch (9)

Glossario Domotica Scuola 3.0
Glossario Domotica Scuola 3.0Glossario Domotica Scuola 3.0
Glossario Domotica Scuola 3.0
 
OCR vs. Urjanet
OCR vs. UrjanetOCR vs. Urjanet
OCR vs. Urjanet
 
SPARK16 Presentation: Urjanet Product Vision
SPARK16 Presentation: Urjanet Product VisionSPARK16 Presentation: Urjanet Product Vision
SPARK16 Presentation: Urjanet Product Vision
 
Spark 2017 Key Takeaways
Spark 2017 Key TakeawaysSpark 2017 Key Takeaways
Spark 2017 Key Takeaways
 
How to Access Utility Data
How to Access Utility DataHow to Access Utility Data
How to Access Utility Data
 
The Credit Score Present and Future
The Credit Score Present and FutureThe Credit Score Present and Future
The Credit Score Present and Future
 
SPARK15: Architecting The Future of Energy & Sustainability
SPARK15: Architecting The Future of Energy & SustainabilitySPARK15: Architecting The Future of Energy & Sustainability
SPARK15: Architecting The Future of Energy & Sustainability
 
SPARK16 Presentation: Measuring for Results: Data and the Changing Energy Lan...
SPARK16 Presentation: Measuring for Results: Data and the Changing Energy Lan...SPARK16 Presentation: Measuring for Results: Data and the Changing Energy Lan...
SPARK16 Presentation: Measuring for Results: Data and the Changing Energy Lan...
 
SPARK15: Simplifying Sustainability Through Gamification
SPARK15: Simplifying Sustainability Through GamificationSPARK15: Simplifying Sustainability Through Gamification
SPARK15: Simplifying Sustainability Through Gamification
 

Ähnlich wie optical character recognition system

IRJET- Intelligent Character Recognition of Handwritten Characters using ...
IRJET-  	  Intelligent Character Recognition of Handwritten Characters using ...IRJET-  	  Intelligent Character Recognition of Handwritten Characters using ...
IRJET- Intelligent Character Recognition of Handwritten Characters using ...IRJET Journal
 
Opticalcharacter recognition
Opticalcharacter recognition Opticalcharacter recognition
Opticalcharacter recognition Shobhit Saxena
 
Optical Recognition of Handwritten Text
Optical Recognition of Handwritten TextOptical Recognition of Handwritten Text
Optical Recognition of Handwritten TextIRJET Journal
 
Optical character recognization word
Optical character recognization wordOptical character recognization word
Optical character recognization wordDhana K
 
How to create a corpus of machine-readable texts: challenges and solutions
How to create a corpus of machine-readable texts: challenges and solutionsHow to create a corpus of machine-readable texts: challenges and solutions
How to create a corpus of machine-readable texts: challenges and solutionsMonika Renate Barget
 
Volume 2-issue-6-2009-2015
Volume 2-issue-6-2009-2015Volume 2-issue-6-2009-2015
Volume 2-issue-6-2009-2015Editor IJARCET
 
Volume 2-issue-6-2009-2015
Volume 2-issue-6-2009-2015Volume 2-issue-6-2009-2015
Volume 2-issue-6-2009-2015Editor IJARCET
 
300GroupProject_handwritingsoftware.pptx
300GroupProject_handwritingsoftware.pptx300GroupProject_handwritingsoftware.pptx
300GroupProject_handwritingsoftware.pptxDanielJDanso
 
BLOB DETECTION TECHNIQUE USING IMAGE PROCESSING FOR IDENTIFICATION OF MACHINE...
BLOB DETECTION TECHNIQUE USING IMAGE PROCESSING FOR IDENTIFICATION OF MACHINE...BLOB DETECTION TECHNIQUE USING IMAGE PROCESSING FOR IDENTIFICATION OF MACHINE...
BLOB DETECTION TECHNIQUE USING IMAGE PROCESSING FOR IDENTIFICATION OF MACHINE...ijiert bestjournal
 
Project report of OCR Recognition
Project report of OCR RecognitionProject report of OCR Recognition
Project report of OCR RecognitionBharat Kalia
 
IRJET- Offline Transcription using AI
IRJET-  	  Offline Transcription using AIIRJET-  	  Offline Transcription using AI
IRJET- Offline Transcription using AIIRJET Journal
 
IRJET- Intelligent Character Recognition of Handwritten Characters
IRJET- Intelligent Character Recognition of Handwritten CharactersIRJET- Intelligent Character Recognition of Handwritten Characters
IRJET- Intelligent Character Recognition of Handwritten CharactersIRJET Journal
 
Smart Assistant for Blind Humans using Rashberry PI
Smart Assistant for Blind Humans using Rashberry PISmart Assistant for Blind Humans using Rashberry PI
Smart Assistant for Blind Humans using Rashberry PIijtsrd
 

Ähnlich wie optical character recognition system (20)

50120130406005
5012013040600550120130406005
50120130406005
 
D017222226
D017222226D017222226
D017222226
 
IRJET- Intelligent Character Recognition of Handwritten Characters using ...
IRJET-  	  Intelligent Character Recognition of Handwritten Characters using ...IRJET-  	  Intelligent Character Recognition of Handwritten Characters using ...
IRJET- Intelligent Character Recognition of Handwritten Characters using ...
 
Opticalcharacter recognition
Opticalcharacter recognition Opticalcharacter recognition
Opticalcharacter recognition
 
Z04405149151
Z04405149151Z04405149151
Z04405149151
 
PB.docx
PB.docxPB.docx
PB.docx
 
Optical Recognition of Handwritten Text
Optical Recognition of Handwritten TextOptical Recognition of Handwritten Text
Optical Recognition of Handwritten Text
 
A12REVIEW.pptx
A12REVIEW.pptxA12REVIEW.pptx
A12REVIEW.pptx
 
Optical character recognization word
Optical character recognization wordOptical character recognization word
Optical character recognization word
 
How to create a corpus of machine-readable texts: challenges and solutions
How to create a corpus of machine-readable texts: challenges and solutionsHow to create a corpus of machine-readable texts: challenges and solutions
How to create a corpus of machine-readable texts: challenges and solutions
 
Volume 2-issue-6-2009-2015
Volume 2-issue-6-2009-2015Volume 2-issue-6-2009-2015
Volume 2-issue-6-2009-2015
 
Volume 2-issue-6-2009-2015
Volume 2-issue-6-2009-2015Volume 2-issue-6-2009-2015
Volume 2-issue-6-2009-2015
 
300GroupProject_handwritingsoftware.pptx
300GroupProject_handwritingsoftware.pptx300GroupProject_handwritingsoftware.pptx
300GroupProject_handwritingsoftware.pptx
 
Ocr 1
Ocr 1Ocr 1
Ocr 1
 
BLOB DETECTION TECHNIQUE USING IMAGE PROCESSING FOR IDENTIFICATION OF MACHINE...
BLOB DETECTION TECHNIQUE USING IMAGE PROCESSING FOR IDENTIFICATION OF MACHINE...BLOB DETECTION TECHNIQUE USING IMAGE PROCESSING FOR IDENTIFICATION OF MACHINE...
BLOB DETECTION TECHNIQUE USING IMAGE PROCESSING FOR IDENTIFICATION OF MACHINE...
 
Project report of OCR Recognition
Project report of OCR RecognitionProject report of OCR Recognition
Project report of OCR Recognition
 
IRJET- Offline Transcription using AI
IRJET-  	  Offline Transcription using AIIRJET-  	  Offline Transcription using AI
IRJET- Offline Transcription using AI
 
CRC Final Report
CRC Final ReportCRC Final Report
CRC Final Report
 
IRJET- Intelligent Character Recognition of Handwritten Characters
IRJET- Intelligent Character Recognition of Handwritten CharactersIRJET- Intelligent Character Recognition of Handwritten Characters
IRJET- Intelligent Character Recognition of Handwritten Characters
 
Smart Assistant for Blind Humans using Rashberry PI
Smart Assistant for Blind Humans using Rashberry PISmart Assistant for Blind Humans using Rashberry PI
Smart Assistant for Blind Humans using Rashberry PI
 

Kürzlich hochgeladen

Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
PSYCHIATRIC History collection FORMAT.pptx
PSYCHIATRIC   History collection FORMAT.pptxPSYCHIATRIC   History collection FORMAT.pptx
PSYCHIATRIC History collection FORMAT.pptxPoojaSen20
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsanshu789521
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxRoyAbrique
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon AUnboundStockton
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxmanuelaromero2013
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentInMediaRes1
 
MENTAL STATUS EXAMINATION format.docx
MENTAL     STATUS EXAMINATION format.docxMENTAL     STATUS EXAMINATION format.docx
MENTAL STATUS EXAMINATION format.docxPoojaSen20
 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting DataJhengPantaleon
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3JemimahLaneBuaron
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...Marc Dusseiller Dusjagr
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionSafetyChain Software
 

Kürzlich hochgeladen (20)

Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
PSYCHIATRIC History collection FORMAT.pptx
PSYCHIATRIC   History collection FORMAT.pptxPSYCHIATRIC   History collection FORMAT.pptx
PSYCHIATRIC History collection FORMAT.pptx
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
Staff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSDStaff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSD
 
MENTAL STATUS EXAMINATION format.docx
MENTAL     STATUS EXAMINATION format.docxMENTAL     STATUS EXAMINATION format.docx
MENTAL STATUS EXAMINATION format.docx
 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 

optical character recognition system

  • 1. OCR System Presented By:- Vijay apurva(9910103462), From 4th year,CSEGuided By:- Mr. Ankur kulhari
  • 2. The current capacity to translate paper documents quickly and accurately into machine readable form using optical character recognition technology augments the opportunities in document searching and storing, as well as the automated document processing. A fast response in translating large collections of image- based electronic documents into structured electronic documents is still a problem. The availability of a large number of processing units in Grid environments and of free optical character recognition tools can be exploited to produce a fast translation. ABSTRACT:-
  • 3. CONTENTS :-  What is OCR?  When and Why OCR?  Existing System.  Proposed System.  Architecture of OCR.  Algorithms of OCR.  Modules of OCR.  Design of OCR.  Design of Screen shots for OCR.  Conclusion.
  • 4. WHAT IS OCR? :- OCR stands for Optical Character Recognition. It is one such system that allows us to scan printed, typewritten or hand written text (numerals, letters or symbols) and/or convert scanned image in to a computer process able format, either in the form of a plain text or a word document.  Later the converted documents can be edited, used or reused in other documents. Thus the documents become editable.
  • 5. WHEN AND WHY OCR? :-  OCR is used when recreating a similar document in paper as a document in electronic form takes more time.  The converted text files take less space than the original image file and can be indexed. Hence the use of OCR adds an advantage to the user who had to deal with conversion of great amount of paper works in to electronic form.
  • 6. EXISTING SYSTEM:- In the running world there is a growing demand for the users to convert the printed documents in to electronic documents for maintaining the security of their data. Hence the basic OCR system was invented to convert the data available on papers in to computer process able documents, So that the documents can be editable and reusable.
  • 7. PROPOSED SYSTEM:- Our proposed system is OCR ON A GRID INFRASTRUCTURE which is a character recognition system that supports recognition of the characters of multiple languages. This feature is what we call grid infrastructure which eliminates the problem of heterogeneous character recognition. In this context, Grid infrastructure means the infrastructure that supports group of specific set of languages. Thus OCR on a grid infrastructure is multi- lingual.
  • 8. ARCHITECTURE :-  The Architecture of the optical character recognition system on a grid infrastructure consists of the three main components. They are:-  Scanner  OCR Hardware or Software  Output Interface
  • 9. [Architecture diagram] Scanner (document, illuminator, detector) -> document image -> OCR hardware or software (document analysis, character recognition, contextual processing) -> recognition results -> output interface -> application / user.
  • 10. TYPES OF TRAINING:- There are two major types of training that can be used to train a neural network system: Supervised Training; Unsupervised Training.
  • 12. KOHONEN NETWORK:- The Kohonen network is presented with data, but the correct output that corresponds to that data is not specified. Using the Kohonen network this data can be classified into groups.
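The grouping behaviour described for the Kohonen network can be illustrated with a small winner-take-all sketch. This is not the presenters' implementation: the names `normalize_input` and `winner` and the learning rate 0.3 only echo the class diagram, while `train`, the seeding of weights from the first samples, and the epoch count are assumptions made for illustration.

```python
import math

LEARN_RATE = 0.3  # echoes "LearnRate = 0.3" in the class diagram; its use here is an assumption


def normalize_input(vec):
    """Scale a vector to unit length so that matching depends on direction only."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def winner(weights, vec):
    """Return the index of the output neuron whose weight vector best matches vec."""
    scores = [sum(w * x for w, x in zip(row, vec)) for row in weights]
    return scores.index(max(scores))


def train(samples, n_outputs, epochs=50):
    """Unsupervised training: no target output is supplied for any sample."""
    data = [normalize_input(s) for s in samples]
    # Simplification (assumption): seed each output neuron from one of the first samples.
    weights = [list(v) for v in data[:n_outputs]]
    for _ in range(epochs):
        for vec in data:
            k = winner(weights, vec)
            # Only the winning neuron moves toward the presented input.
            moved = [w + LEARN_RATE * (x - w) for w, x in zip(weights[k], vec)]
            weights[k] = normalize_input(moved)
    return weights
```

After training, calling `winner(weights, normalize_input(sample))` returns the group index for a sample; samples that point in similar directions end up sharing a winning neuron.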
  • 13. FLOWCHART FOR KOHONEN TRAINING:-
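  • 14. ALGORITHMS OF OCR:- TRAINING ALGORITHM:- One of the most common learning algorithms is called Hebb’s Rule. This rule was developed to assist with unsupervised training. Hebb’s rule is expressed as: Δw_ij = µ a_i a_j (d - a), where Δw_ij is the change in the weight between neurons i and j, µ is the learning rate, a_i and a_j are the activations of the two neurons, and (d - a) is the difference between the desired and the actual output. The plain Hebbian form, Δw_ij = µ a_i a_j, omits the (d - a) factor, since unsupervised training supplies no desired output.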
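As a worked illustration of the weight update, the sketch below applies one Hebbian step to a square weight matrix. The function name `hebb_update` and the parameter `error` are hypothetical; `error` stands in for the (d - a) factor in the slide's formula, and its default of 1.0 gives the plain Hebbian update Δw_ij = µ a_i a_j.

```python
def hebb_update(weights, activations, mu=0.1, error=1.0):
    """Return new weights after one step: w[i][j] += mu * a[i] * a[j] * error.

    `error` plays the role of (d - a); leave it at 1.0 for plain Hebbian learning.
    """
    n = len(activations)
    return [
        [weights[i][j] + mu * activations[i] * activations[j] * error
         for j in range(n)]
        for i in range(n)
    ]
```

For example, starting from zero weights with activations `[1.0, 0.5]` and `mu=0.5`, the connection between the two neurons grows by 0.5 × 1.0 × 0.5 = 0.25: neurons that are active together have their connection strengthened.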
  • 15. MODULES:- The modules identified in the Optical Character Recognition system are: Document Processing; Neural Network System Training; Document Recognition; Document Editing; Document Searching.
  • 16. DESIGN OF OCR:- The design of our OCR system is best explained by the following diagram: [diagram labels: Scan, Store, Recognize, Editing, Searching; Document and users; Database]
  • 17. OVERALL USECASE DIAGRAM:- [diagram labels] Actors: end-user, end-user1, end-user2, administrator. Use cases: scan documents, store documents, Document processing, Document recognition, Document editing, Document modification, Document deletion, Trains the system. Relationships: two <<includes>> links involving Document processing.
  • 18. OVERALL CLASS DIAGRAM:- Document (docid : integer, docname : String, docsize : integer, doctype : String; getDocumentDetails(), scanDocument(), covertToImage(), storeImage()). Editor (cut(), copy(), paste(), new(), open(), find()). HelpFrame. HEntry (hLineClear(), vLineClear(), findBounds()). TrainingSet (inputCount : int, outputcount : int, trainingSetCount : int; setInputCount(), setOutputCount(), setTrainingSetCount(), setClassify()). MainScreen (editor(), helpFrame(), printedFrame(), handWrittenFrame()). Entry (recog : int, downSampleLeft : int, downSampleRight : int, downSampleTop : int, downSampleBottom : int; hLineClear(), hLineClearWithin(), vLineClear(), vLineClearWithin()). PrintedFrame (open_action(), train_action(), topen_action(), recogniseAll_action()). KohenNetwork (LearnMethod = 1 : int, LearnRate = 0.3 : double, quitError : double; copyWeights(), clearWeights(), winner(), normalizeInput()). Associations between the classes carry multiplicities 1..* to 1 and 1..* to 1..*.
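The class diagram gives only method names, so the following is a guess at what `hLineClear()`, `vLineClear()` and `findBounds()` might do: locate the bounding box of the drawn character before it is downsampled. The image representation (a list of rows of 0/1 pixels) and the snake_case names are assumptions made for this sketch, not the presenters' code.

```python
def h_line_clear(image, y):
    """True if row y contains no ink (every pixel is 0)."""
    return not any(image[y])


def v_line_clear(image, x):
    """True if column x contains no ink."""
    return not any(row[x] for row in image)


def find_bounds(image):
    """Return (left, top, right, bottom) of the inked region, or None if blank."""
    height, width = len(image), len(image[0])
    rows = [y for y in range(height) if not h_line_clear(image, y)]
    cols = [x for x in range(width) if not v_line_clear(image, x)]
    if not rows:
        return None
    return cols[0], rows[0], cols[-1], rows[-1]
```

The four returned values would correspond to fields such as `downSampleLeft`, `downSampleTop`, `downSampleRight` and `downSampleBottom` in the diagram, if those fields indeed hold the crop rectangle.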
  • 19. DESIGN OF SCREENSHOTS FOR OCR:- The screenshots that describe the operations carried out by our system cover the following screens: Main Screen; Handwritten Recognition Screen; Scanned Document Recognition Screen; Training Screen; Recognition Screen; Editor Screen.
  • 20-32. [Screenshot slides; no text content.]
  • 33. CONCLUSION:- The grid infrastructure used in the implementation of the optical character recognition system can be used efficiently to speed up the translation of image-based documents into structured documents that are easy to discover, search and process. Automated data entry by OCR is one of the most attractive labor-reducing technologies. The system recognizes new font characters easily and quickly. We can edit the information in the documents more conveniently and reuse the edited information as and when required. Extending the software beyond editing and searching is a topic for future work.
  • 34. • Training and recognition speeds can be increased further, and the system can be made more user-friendly. • Many applications exist where it would be desirable to read handwritten entries. Reading handwriting is a very difficult task, considering the diversity that exists in ordinary penmanship. However, progress is being made.