2. Contents
፩. Introduction
፪. What is OCR
፫. When and Why OCR
፬. What Motivates Us
፭. Problem Overview
፮. Proposed solution
፯. Results
፰. Conclusion
3. Introduction
Ge'ez (ግዕዝ) (also known as Ethiopic)
is a script used as an abugida for several languages of Ethiopia
and Eritrea. It was first used to write Ge'ez, now the liturgical language of
the Ethiopian Orthodox Tewahedo Church and the Eritrean
Orthodox Tewahedo Church. In Amharic and Tigrinya, the script is
often called fidäl (ፊደል), meaning "script" or "alphabet".
The Ge'ez script has been adapted to write other, mostly
Semitic, languages, particularly Amharic in Ethiopia, and
Tigrinya in both Eritrea and Ethiopia.
4. What is OCR
OCR stands for Optical Character Recognition. It is a
technology used to convert scanned images of text into
computer-editable and searchable text.
• Input: scanned images of printed text
• Output: computer-readable version of the input's contents
5. When and Why OCR?
OCR is used when retyping a paper document by hand to
produce an electronic copy would take too much time.
Since we Ethiopians have many historical and religious
books, we have to preserve and reproduce them. But how?
The converted text files also take less space than the original
image files.
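To make the space claim concrete, here is a rough back-of-the-envelope sketch. The page size, scan resolution, and sample text are assumptions chosen for illustration, not measurements from this project, and the scan size is for an uncompressed grayscale image (real scans are usually compressed, so the gap is smaller in practice).

```python
# Rough size comparison between a scanned page and its recognized text.
# All figures below are illustrative assumptions, not project measurements.

def scan_size_bytes(width_in, height_in, dpi, bytes_per_pixel=1):
    """Size of an uncompressed scan (1 byte per pixel = 8-bit grayscale)."""
    return round(width_in * dpi) * round(height_in * dpi) * bytes_per_pixel

def text_size_bytes(text):
    """Size of the same content stored as UTF-8 text."""
    return len(text.encode("utf-8"))

# Assumed: an A4 page (8.27 x 11.69 inches) scanned at 300 dpi.
image_bytes = scan_size_bytes(8.27, 11.69, 300)

# Assumed: a page holding about 2000 characters; the word "ፊደል" is
# repeated only as filler text.
page_text = "ፊደል " * 500
text_bytes = text_size_bytes(page_text)

print("scan:", image_bytes, "bytes")
print("text:", text_bytes, "bytes")
print("ratio: about", image_bytes // text_bytes, "to 1")
```

Under these assumptions the text version is smaller by roughly three orders of magnitude, which is also what makes it practical to search and edit.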
6. Problem Overview
Today there is a growing demand from users to convert printed
documents into electronic documents in order to keep their data
secure. OCR systems were invented to meet this need: they convert
the data available on paper into computer-processable documents,
so that the documents can be edited and reused.
It is no exaggeration to say that Ethiopia's intellectual
property is hardly digitized. It is stored on paper, either as
centuries-old parchment in monasteries or in file cabinets in
various regional and federal offices.
Digitizing such a large number of documents by hand is not feasible.
7. What Motivates Us
• The absence of any production-ready OCR software for
Ge’ez characters, whether locally or internationally
developed.
• The script is not covered by the ASCII standard, so
ASCII alone cannot represent it on a computer.
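The ASCII limitation is easy to check. The short sketch below uses the word ፊደል from the introduction and shows that its characters cannot be encoded as ASCII. It also relies on a fact not stated in these slides: Unicode assigns the main Ethiopic block to the range U+1200 to U+137F, which is how modern software represents the script. The helper names are hypothetical.

```python
# Check whether Ge'ez characters fit in ASCII (helper names are illustrative).

ETHIOPIC_START, ETHIOPIC_END = 0x1200, 0x137F  # main Unicode Ethiopic block

def can_encode_ascii(text):
    """Return True if every character of text is representable in ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False

def in_ethiopic_block(ch):
    """Return True if ch lies in the main Unicode Ethiopic block."""
    return ETHIOPIC_START <= ord(ch) <= ETHIOPIC_END

word = "ፊደል"
print("ASCII can encode it:", can_encode_ascii(word))
for ch in word:
    # Each character needs a code point far above ASCII's 0..127 range.
    print(ch, hex(ord(ch)), "in Ethiopic block:", in_ethiopic_block(ch))
print("UTF-8 bytes needed:", len(word.encode("utf-8")))
```

An OCR system for Ge'ez would therefore emit Unicode text (for example UTF-8) rather than ASCII.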
8. Proposed Solution
Our proposed system is an OCR system that supports
identifying and digitizing documents written in Ethiopic
characters.
By applying OCR technology together with an Artificial
Neural Network (ANN), we are going to develop
application software that recognizes
Ge'ez characters in given images.
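As a minimal illustration of the idea, and not the system proposed here, the sketch below trains the simplest kind of neural network (a single softmax layer, no hidden layers) to tell apart three toy 5x5 bitmaps. The bitmaps are made-up shapes, not real Ge'ez letterforms, and the labels, function names, and training settings are all assumptions. A real recognizer would need segmented character images, many samples per character, and a larger network.

```python
import math

# Toy 5x5 bitmaps standing in for scanned glyphs. The shapes are
# hypothetical and do not resemble the real letters used as labels.
GLYPHS = {
    "ሀ": ["#...#", "#...#", "#...#", "#...#", "#####"],
    "ለ": ["..#..", ".#.#.", "#...#", "#...#", "#...#"],
    "መ": ["#####", "#.#.#", "#.#.#", "#####", "....."],
}
LABELS = list(GLYPHS)

def to_vector(rows):
    """Flatten a bitmap into a list of 0.0/1.0 pixel values."""
    return [1.0 if c == "#" else 0.0 for row in rows for c in row]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def forward(weights, biases, x):
    """One layer: weighted sum of the pixels per class, then softmax."""
    scores = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, biases)]
    return softmax(scores)

def train(samples, n_classes, epochs=300, lr=0.1):
    """Stochastic gradient descent on the cross-entropy loss."""
    n_inputs = len(samples[0][0])
    weights = [[0.0] * n_inputs for _ in range(n_classes)]
    biases = [0.0] * n_classes
    for _ in range(epochs):
        for x, target in samples:
            probs = forward(weights, biases, x)
            for k in range(n_classes):
                # Gradient of the loss with respect to the score of class k.
                grad = probs[k] - (1.0 if k == target else 0.0)
                biases[k] -= lr * grad
                for i, xi in enumerate(x):
                    weights[k][i] -= lr * grad * xi
    return weights, biases

def recognize(weights, biases, rows):
    """Return the label with the highest predicted probability."""
    probs = forward(weights, biases, to_vector(rows))
    return LABELS[probs.index(max(probs))]

samples = [(to_vector(rows), idx) for idx, rows in enumerate(GLYPHS.values())]
weights, biases = train(samples, len(LABELS))

# The first glyph with one extra pixel, imitating scanner noise.
noisy = ["#...#", "#...#", "#..##", "#...#", "#####"]
print(recognize(weights, biases, noisy))
```

The design point this shows is that the network learns pixel weights from labeled examples instead of relying on hand-written rules per character, which matters for a script with several hundred distinct characters.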
10. Conclusion
An OCR system for Ge’ez characters can be used to
efficiently digitize:
• Ethiopian books held in Ethiopia and in other countries
• Old books of the EOTC
• Many other documents written in Ge’ez, Amharic, or
Tigrinya