IMPACT is supported by the European Community under the FP7 ICT Work Programme. The project is coordinated by the National Library of the Netherlands.




Tools for Document Image Analysis and Search
   Ioannis Pratikakis, Basilis Gatos and Anastasios Kesidis

   Computational Intelligence Laboratory
   Institute of Informatics and Telecommunications
   National Center for Scientific Research "Demokritos"
   GR-153 10 Agia Paraskevi, Athens, Greece
   May 7, 2010
   Bratislava




Word spotting Architecture

   The main operational parts of the Word Spotting engine are:

    - Marking character templates
    - Feature extraction & word matching
    - User feedback
    - Searching
    - User access control

   [Figure: word spotting architecture. Blocks, top to bottom:
    Document corpus; PRE-PROCESSING (TR1 Image enhancement, TR2 Segmentation);
    Segmented words; Keywords list; Character templates; Synthetic keyword;
    Feature extraction (for segmented words and for the synthetic keyword);
    Similarity measurement; Initial ranking results; Final ranking results;
    Mark keyword instances in documents]
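   The path from feature extraction through similarity measurement to ranking results can be illustrated with a minimal sketch. The slide does not say which features or which similarity measure the engine uses, so everything here is an assumption made for illustration: binary word images stored as lists of 0/1 rows, a simple zoning (ink density per grid cell) feature, Euclidean distance, and the hypothetical names zone_features, distance and rank_words.

```python
from math import sqrt


def zone_features(image, rows=2, cols=4):
    """Ink density in each cell of a rows x cols grid (assumed feature, not from the slides)."""
    h, w = len(image), len(image[0])
    feats = []
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * h // rows, (r + 1) * h // rows
            x0, x1 = c * w // cols, (c + 1) * w // cols
            area = max(1, (y1 - y0) * (x1 - x0))  # avoid division by zero on tiny images
            ink = sum(image[y][x] for y in range(y0, y1) for x in range(x0, x1))
            feats.append(ink / area)
    return feats


def distance(a, b):
    """Euclidean distance between two feature vectors (assumed similarity measure)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_words(keyword_image, word_images):
    """Return indices of word_images ordered from most to least similar to the keyword."""
    q = zone_features(keyword_image)
    scored = [(distance(q, zone_features(img)), idx) for idx, img in enumerate(word_images)]
    return [idx for _, idx in sorted(scored)]


# Toy data: a 4x8 "synthetic keyword" image and three segmented "words".
keyword = [
    [1, 1, 0, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0, 1, 1],
    [0, 0, 1, 1, 0, 0, 1, 1],
]
inverse = [[1 - p for p in row] for row in keyword]   # very different word
near = [row[:] for row in keyword]
near[0][0] = 0                                        # keyword with one pixel of noise

print(rank_words(keyword, [inverse, near, keyword]))  # [2, 1, 0]
```

   In this sketch the returned list plays the role of the initial ranking results. A user feedback step would reorder or filter that list before the final ranking is produced; how the engine does that is not described on the slide and is not modelled here.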




For future inquiries:



 Ioannis PRATIKAKIS (ipratika@iit.demokritos.gr)

 Basilis GATOS (bgat@iit.demokritos.gr)




