SlideShare ist ein Scribd-Unternehmen logo
1 von 27
ICFHR 2014 COMPETITION ON HANDWRITTEN 
KEYWORD SPOTTING (H-KWS 2014) 
IOANNIS PRATIKAKIS1 
KONSTANTINOS ZAGORIS1,2 
BASILIS GATOS2 
Visual Computing Group 
Democritus University of Thrace 
Dept. of Electrical and Computer Engineering 
Xanthi, Greece 
National Centre of Scientific Research “Demokritos” 
Institute of Informatics and Telecommunications 
Athens, Greece 
1 
2 
GEORGIOS LOULOUDIS2 
NIKOLAOS STAMATOPOULOS2
INTRODUCING ICFHR 2014 H-KWS 
Handwritten keyword spotting is the task of detecting query words in handwritten 
document image collections without explicitly recognizing it. 
The objective of the H-KWS 2014 is threefold: 
• Record current advances in keyword spotting. 
• Provide benchmarking handwritten datasets containing both historical and 
modern documents from multiple writers. 
• Explore established evaluation performance measures frequently encountered 
in the information retrieval literature while providing the software for these 
measures as implementation reference.
ORGANIZING ICFHR 2014 H-KWS 
TRACK I - SEGMENTATION-BASED 
• 50 document images of Bentham 
dataset. 
• 100 document images of Modern 
dataset (25 documents per language). 
• Word Location in XML format. 
TRACK II - SEGMENTATION-FREE 
• 50 document images of Bentham 
dataset. 
• 100 document images of Modern 
dataset (25 documents per language).
ICFHR 2014 H-KWS TIMELINE 
Registration to the contest 
7/4/2014 
Submission of the Results 
from the participants 
2014 April May June July August September 2014 
ICFHR 2014 H-KWS 
Presentation 
4/9/2014 
12/5/2014 
Submission of 
competition report/paper 
21/4/2014 
14/4/2014 
Release of the datasets 
and the query sets 
Today
BENTHAM DATASET 
It consists of high quality 
(approximately 3000 pixel 
width and 4000 pixel height) 
handwritten manuscripts. 
The documents are written 
by Jeremy Bentham (1748- 
1832) himself as well as by 
Bentham’s secretarial staff 
over a period of sixty years.
MODERN DATASET 
It consists of modern 
handwritten documents from 
the ICDAR 2009 Handwritten 
Segmentation Contest. 
These documents originate from 
several writers that were asked 
to copy a given text. 
They do not include any non-text 
elements (lines, drawings, 
etc.). 
They are written in four (4) 
languages: English, French, 
German and Greek.
CHALLENGES 
They both contain several 
very difficult problems to be 
addressed, wherein the most 
difficult is the word 
variability. 
The variation of the same 
word is high and involves: 
• writing style 
• font size 
• noise 
• their combination 
BENTHAM 
MODERN
WORD-LENGTH STATISTICS FOR EACH DATASET 
BENTHAM MODERN 
2500 
2000 
1500 
1000 
500 
0 
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 
Frequency 
Word Length (characters) 
TRACK I TRACK II 
3500 
3000 
2500 
2000 
1500 
1000 
500 
0 
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 
Frequency 
Word Length (characters) 
TRACK I TRACK II
QUERY STATISTICS 
The query set of each dataset is provided in XML format and it contains word image queries of 
length greater than 6 and frequency greater than 5. 
TRACK I - SEGMENTATION-BASED 
• 320 queries for the Bentham Dataset 
• 300 queries for the Modern Dataset 
TRACK II - SEGMENTATION-FREE 
• 290 queries for the Bentham Dataset 
• 300 queries for the Modern Dataset 
180 
160 
140 
120 
100 
80 
60 
40 
20 
0 
Query Word Lengths 
5 6 7 8 9 10 11 12 13 14 15 16 17 
Queries 
Query Word Length (characters) 
TRACK I - Bentham TRACK I - Modern TRACK II - Bentham TRACK II - Modern 
160 
140 
120 
100 
80 
60 
40 
20 
0 
Query Word Frequencies 
4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 
Queries 
Query Word Frequency 
TRACK I - Bentham TRACK I - Modern TRACK II - Bentham TRACK II - Modern
http://vc.ee.duth.gr/H-KWS2014/#Datasets
EVALUATION CHALLENGES 
• Small variations of the query word that can be found in the datasets. 
For example the word “husband” appears as well as : 
• husband, 
• husband: 
• husband. 
• Husband. 
• Husband] 
• Evaluating overall performance as well as precision. 
• Segmentation - free systems may not detect the whole word or include 
parts of another word.
CHOSEN EVALUATION MEASURES 
• Precision at Top 5 Retrieved words (P@5) for evaluating precision 
performance. 
• The Mean Average Precision (MAP) for evaluating overall 
performance. 
• Normalized Discounted Cumulative Gain (NDCG) with binary 
judgment relevancies for evaluating precision-oriented overall 
performance. 
• Normalized Discounted Cumulative Gain (NDCG) with non-binary 
judgment relevancies for evaluating small variations of the query 
word.
PRECISION AT TOP K RETRIEVED WORD (P@K) 
Precision is the fraction of retrieved words that are relevant to the query. 
P@k is the precision for the k top retrieved words. 
In the proposed evaluation, P@5 is used which is the precision at top 5 
retrieved words. 
This metric defines how successfully the algorithms produce relevant results 
to the first 5 positions of the ranking list. 
푃@푘 = 
푟푒푙푒푣푎푛푡 푤표푟푑푠 ∩ 푘 푟푒푡푟푖푒푣푒푑 푤표푟푑푠 
푘 푟푒푡푟푖푒푣푒푑 푤표푟푑푠
MEAN AVERAGE PRECISION (MAP) 
It is a typical measure for the performance of information retrieval systems. 
It is implemented from the Text REtrieval Conference (TREC) community by the 
National Institute of Standards and Technology (NIST). 
It is defined as the average of the precision value obtained after each relevant word is 
retrieved: 
퐴푃 = 
푛 푘=1 
푃@퐾 × 푟푒푙 푘 
푟푒푙푒푣푎푛푡 푤표푟푑푠 
푟푒푙 푘 = 
1, 푖푓 푤표푟푑 푎푡 푟푎푛푘 푘 푖푠 푟푒푙푒푣푎푛푡 
0, 푖푓 푤표푟푑 푎푡 푟푎푛푘 푘 푖푠 푛표푡 푟푒푙푒푣푎푛푡 
where:
NORMALIZED DISCOUNTED CUMULATIVE GAIN (NDCG) 
The NDCG measures the performance of a retrieval system based on the graded 
relevance of the retrieved entities. 
It varies from 0.0 to 1.0, with 1.0 representing the ideal ranking of the entities. 
It is defined as: 
푛퐷퐶퐺 = 
퐷퐶퐺 
퐼퐷퐶퐺 
퐷퐶퐺 = 푟푒푙1 + 
푛 
푖=1 
푟푒푙푖 
푙표푔2 푖 + 1 
where: 
reli is the relevance judgment at position i and IDCG is the ideal DCG which is 
computed from the perfect retrieval result.
NON-BINARY VS BINARY RELEVANCE JUDGMENT 
VALUES 
NON-BINARY 
Word Relevance Judgment 
husband 1.0 
husband 1.0 
husband 1.0 
husband 1.0 
husband 1.0 
husband, 0.9 
husband: 0.9 
husband. 0.9 
Husband. 0.8 
Husband] 0.8 
BINARY 
Word Relevance Judgment 
husband 1.0 
husband 1.0 
husband 1.0 
husband 1.0 
husband 1.0 
husband, 1.0 
husband: 1.0 
husband. 1.0 
Husband. 1.0 
Husband] 1.0
SEGMENTATION – FREE OVERLAPPING THRESHOLD 
A word instance is considered as detected only if there is a significant overlap with the 
ground truth word. 
The overlap is expressed by the intersection over the ground truth word area metric 
(IOA) and it is defined as: 
퐼푂퐴 = 
퐴 ∩ 퐵 
퐴 
where A and B denote the bounding box areas of the ground truth word and the method output 
word, respectively. 
The IOA metric ranges from 0 to 1, where 1 corresponds to exact matching. 
A thresholdT is applied in order to decide whether the word instance and the segmented 
word match sufficiently. 
The performance evaluation for three different thresholds (0.6, 0.7 and 0.8) is used for 
testing.
EVALUATION APPLICATION 
• An evaluation application is developed as referenced 
implementation for each metric. 
• It is available for Windows, Mac OS X and Linux 
operating systems as both command-line and GUI 
form. 
• It accepts as input the experimental results file and 
the relevance judgment file, which represents the 
ground truth. Afterwards, it calculates the 
aforementioned evaluation metrics.
http://vc.ee.duth.gr/H-KWS2014/#VCGEval
http://vc.ee.duth.gr/H-KWS2014/#Resources
PARTICIPANTS 
Method Affiliation Participating 
G1 The Blavatnik School of Computer Science, 
Tel-Aviv University, Israel 
TRACK I 
TRACK II 
G2 Computer Vision Center, Barcelona, Spain TRACK I 
G3 Smith College Department of Computer Science, 
Northampton MA, USA 
TRACK I 
TRACK II 
G4 Université de Lyon, CNRS INSA - Lyon, LIRIS, 
France TRACK II 
G5 Institute for Communications Technology (IfN) 
of Technische Universität Braunschweig, 
Braunschweig, Germany 
TRACK II
TRACK I: SEGMENTATION-BASED - EXPERIMENTAL RESULTS 
BENTHAM DATASET 
Method P@5 MAP NDCG(Binary) NDCG 
G1 0.738 (1) 0.524 (1) 0.742 (2) 0.762 (2) 
G2 0.724 (2) 0.513 (2) 0.744 (1) 0.764 (1) 
G3 0.718 (3) 0.462 (3) 0.638 (3) 0.657 (3) 
MODERN DATASET 
Method P@5 MAP NDCG(Binary) NDCG 
G1 0.588 (2) 0.338 (2) 0.611 (2) 0.612 (2) 
G2 0.706 (1) 0.523 (1) 0.757 (1) 0.757 (1) 
G3 0.569 (3) 0.278 (3) 0.484 (3) 0.484 (3)
TRACK I: SEGMENTATION-BASED - PRECISION – RECALL CURVES 
BENTHAM DATASET 
1 
0.9 
0.8 
0.7 
0.6 
0.5 
0.4 
0.3 
0.2 
0.1 
0 
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 
Precision 
Recall 
G1 G2 G3 
MODERN DATASET 
1 
0.9 
0.8 
0.7 
0.6 
0.5 
0.4 
0.3 
0.2 
0.1 
0 
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 
Precision 
Recall 
G1 G2 G3
TRACK II: SEGMENTATION-FREE - EXPERIMENTAL RESULTS 
BENTHAM DATASET 
P@5 MAP NDCG (Binary) NDCG 
Method 
Overlapping Threshold 
Average 
Overlapping Threshold 
Average 
Overlapping Threshold 
Average 
Overlapping Threshold 
Average 
0.6 0.7 0.8 0.6 0.7 0.8 0.6 0.7 0.8 0.6 0.7 0.8 
G1 0.617 0.611 0.599 0.609 (1) 0.428 0.419 0.402 0.416 (1) 0.653 0.640 0.621 0.638 (1) 0.671 0.657 0.640 0.560 (1) 
G3 0.596 0.568 0.506 0.556 (2) 0.397 0.372 0.321 0.363 (2) 0.551 0.518 0.457 0.509 (2) 0.569 0.536 0.474 0.526 (2) 
G4 0.351 0.341 0.313 0.335 (4) 0.219 0.209 0.187 0.205 (4) 0.386 0.363 0.319 0.356 (4) 0.400 0.376 0.331 0.369 (4) 
G5 0.597 0.55 0.477 0.543 (3) 0.385 0.347 0.280 0.337(3) 0.569 0.513 0.424 0.502 (3) 0.586 0.531 0.440 0.519 (3) 
MODERN DATASET 
P@5 MAP NDCG (Binary) NDCG 
Method 
Overlapping Threshold 
Average 
Overlapping Threshold 
Average 
Overlapping Threshold 
Average 
Overlapping Threshold 
Average 
0.6 0.7 0.8 0.6 0.7 0.8 0.6 0.7 0.8 0.6 0.7 0.8 
G1 0.541 0.541 0.535 0.539 (1) 0.265 0.265 0.259 0.263 (1) 0.491 0.484 0.473 0.483 (1) 0.491 0.485 0.474 0.483 (1) 
G3 0.429 0.422 0.399 0.417 (2) 0.170 0.165 0.152 0.163 (2) 0.310 0.301 0.277 0.296 (2) 0.310 0.301 0.277 0.296 (2) 
G4 0.250 0.241 0.211 0.234 (4) 0.095 0.089 0.077 0.087 (4) 0.218 0.195 0.161 0.191 (4) 0.218 0.195 0.161 0.191 (4) 
G5 0.264 0.247 0.223 0.245 (3) 0.100 0.092 0.081 0.091 (3) 0.229 0.201 0.168 0.199 (3) 0.229 0.202 0.168 0.200 (3)
TRACK I: SEGMENTATION-BASED - PRECISION – RECALL CURVES 
BENTHAM DATASET 
1 
0.9 
0.8 
0.7 
0.6 
0.5 
0.4 
0.3 
0.2 
0.1 
0 
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 
Precision 
Recall 
G1 G3 G4 G5 
MODERN DATASET 
1 
0.9 
0.8 
0.7 
0.6 
0.5 
0.4 
0.3 
0.2 
0.1 
0 
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 
Precision 
Recall 
G1 G3 G4 G5
FINAL RANKING 
TRACK I: SEGMENTATION-BASED 
Rank Method Score 
1 G2 10 
2 G1 14 
3 G3 24 
TRACK II: SEGMENTATION-FREE 
Rank Method Score 
1 G1 8 
2 G3 16 
3 G5 24 
4 G4 32
AND THE WINNER IS: 
FOR TRACK I – SEGMENTATION-BASED: 
Method G2 which has been submitted by Jon 
Almazán, Albert Gordo, Ernest Valveny 
affiliated to the: 
Computer Vision Center, Universitat Autònoma 
de Barcelona, Spain. 
TRACK II – SEGMENTATION-FREE 
Method G1 which has been submitted by Alon 
Kovalchuk, Lior Wolf, Nachum Dershowitz 
affiliated to the: 
Blavatnik School of Computer Science, Tel-Aviv 
University, Israel.

Weitere ähnliche Inhalte

Was ist angesagt?

An effective and robust technique for the binarization of degraded document i...
An effective and robust technique for the binarization of degraded document i...An effective and robust technique for the binarization of degraded document i...
An effective and robust technique for the binarization of degraded document i...eSAT Publishing House
 
[Chung il kim] 0829 thesis
[Chung il kim] 0829 thesis[Chung il kim] 0829 thesis
[Chung il kim] 0829 thesisChung-Il Kim
 
Recsys 2016: Modeling Contextual Information in Session-Aware Recommender Sys...
Recsys 2016: Modeling Contextual Information in Session-Aware Recommender Sys...Recsys 2016: Modeling Contextual Information in Session-Aware Recommender Sys...
Recsys 2016: Modeling Contextual Information in Session-Aware Recommender Sys...Bartlomiej Twardowski
 
A framework for outlier detection in
A framework for outlier detection inA framework for outlier detection in
A framework for outlier detection inijfcstjournal
 
IRJET - Object Detection using Deep Learning with OpenCV and Python
IRJET - Object Detection using Deep Learning with OpenCV and PythonIRJET - Object Detection using Deep Learning with OpenCV and Python
IRJET - Object Detection using Deep Learning with OpenCV and PythonIRJET Journal
 
Kernel based similarity estimation and real time tracking of moving
Kernel based similarity estimation and real time tracking of movingKernel based similarity estimation and real time tracking of moving
Kernel based similarity estimation and real time tracking of movingIAEME Publication
 
Distribution Similarity based Data Partition and Nearest Neighbor Search on U...
Distribution Similarity based Data Partition and Nearest Neighbor Search on U...Distribution Similarity based Data Partition and Nearest Neighbor Search on U...
Distribution Similarity based Data Partition and Nearest Neighbor Search on U...Editor IJMTER
 
Two Stage Reversible Data Hiding Based On Image Interpolation and Histogram ...
Two Stage Reversible Data Hiding Based On Image Interpolation  and Histogram ...Two Stage Reversible Data Hiding Based On Image Interpolation  and Histogram ...
Two Stage Reversible Data Hiding Based On Image Interpolation and Histogram ...IJMER
 
Bhadale group of companies ai neural networks and algorithms catalogue
Bhadale group of companies ai neural networks and algorithms catalogueBhadale group of companies ai neural networks and algorithms catalogue
Bhadale group of companies ai neural networks and algorithms catalogueVijayananda Mohire
 
Network embedding
Network embeddingNetwork embedding
Network embeddingSOYEON KIM
 
DataEngConf: Feature Extraction: Modern Questions and Challenges at Google
DataEngConf: Feature Extraction: Modern Questions and Challenges at GoogleDataEngConf: Feature Extraction: Modern Questions and Challenges at Google
DataEngConf: Feature Extraction: Modern Questions and Challenges at GoogleHakka Labs
 
Graph Centric Analysis of Road Network Patterns for CBD’s of Metropolitan Cit...
Graph Centric Analysis of Road Network Patterns for CBD’s of Metropolitan Cit...Graph Centric Analysis of Road Network Patterns for CBD’s of Metropolitan Cit...
Graph Centric Analysis of Road Network Patterns for CBD’s of Metropolitan Cit...Punit Sharnagat
 
Dynamics in graph analysis (PyData Carolinas 2016)
Dynamics in graph analysis (PyData Carolinas 2016)Dynamics in graph analysis (PyData Carolinas 2016)
Dynamics in graph analysis (PyData Carolinas 2016)Benjamin Bengfort
 
Currency recognition on mobile phones
Currency recognition on mobile phonesCurrency recognition on mobile phones
Currency recognition on mobile phoneshabeebsab
 
Super Resolution with OCR Optimization
Super Resolution with OCR OptimizationSuper Resolution with OCR Optimization
Super Resolution with OCR OptimizationniveditJain
 
Parallel k nn on gpu architecture using opencl
Parallel k nn on gpu architecture using openclParallel k nn on gpu architecture using opencl
Parallel k nn on gpu architecture using opencleSAT Publishing House
 

Was ist angesagt? (20)

An effective and robust technique for the binarization of degraded document i...
An effective and robust technique for the binarization of degraded document i...An effective and robust technique for the binarization of degraded document i...
An effective and robust technique for the binarization of degraded document i...
 
[Chung il kim] 0829 thesis
[Chung il kim] 0829 thesis[Chung il kim] 0829 thesis
[Chung il kim] 0829 thesis
 
Ijetcas14 527
Ijetcas14 527Ijetcas14 527
Ijetcas14 527
 
Recsys 2016: Modeling Contextual Information in Session-Aware Recommender Sys...
Recsys 2016: Modeling Contextual Information in Session-Aware Recommender Sys...Recsys 2016: Modeling Contextual Information in Session-Aware Recommender Sys...
Recsys 2016: Modeling Contextual Information in Session-Aware Recommender Sys...
 
50120140501016
5012014050101650120140501016
50120140501016
 
A framework for outlier detection in
A framework for outlier detection inA framework for outlier detection in
A framework for outlier detection in
 
IRJET - Object Detection using Deep Learning with OpenCV and Python
IRJET - Object Detection using Deep Learning with OpenCV and PythonIRJET - Object Detection using Deep Learning with OpenCV and Python
IRJET - Object Detection using Deep Learning with OpenCV and Python
 
Kernel based similarity estimation and real time tracking of moving
Kernel based similarity estimation and real time tracking of movingKernel based similarity estimation and real time tracking of moving
Kernel based similarity estimation and real time tracking of moving
 
Distribution Similarity based Data Partition and Nearest Neighbor Search on U...
Distribution Similarity based Data Partition and Nearest Neighbor Search on U...Distribution Similarity based Data Partition and Nearest Neighbor Search on U...
Distribution Similarity based Data Partition and Nearest Neighbor Search on U...
 
Two Stage Reversible Data Hiding Based On Image Interpolation and Histogram ...
Two Stage Reversible Data Hiding Based On Image Interpolation  and Histogram ...Two Stage Reversible Data Hiding Based On Image Interpolation  and Histogram ...
Two Stage Reversible Data Hiding Based On Image Interpolation and Histogram ...
 
Bhadale group of companies ai neural networks and algorithms catalogue
Bhadale group of companies ai neural networks and algorithms catalogueBhadale group of companies ai neural networks and algorithms catalogue
Bhadale group of companies ai neural networks and algorithms catalogue
 
G1802053147
G1802053147G1802053147
G1802053147
 
H1802054851
H1802054851H1802054851
H1802054851
 
Network embedding
Network embeddingNetwork embedding
Network embedding
 
DataEngConf: Feature Extraction: Modern Questions and Challenges at Google
DataEngConf: Feature Extraction: Modern Questions and Challenges at GoogleDataEngConf: Feature Extraction: Modern Questions and Challenges at Google
DataEngConf: Feature Extraction: Modern Questions and Challenges at Google
 
Graph Centric Analysis of Road Network Patterns for CBD’s of Metropolitan Cit...
Graph Centric Analysis of Road Network Patterns for CBD’s of Metropolitan Cit...Graph Centric Analysis of Road Network Patterns for CBD’s of Metropolitan Cit...
Graph Centric Analysis of Road Network Patterns for CBD’s of Metropolitan Cit...
 
Dynamics in graph analysis (PyData Carolinas 2016)
Dynamics in graph analysis (PyData Carolinas 2016)Dynamics in graph analysis (PyData Carolinas 2016)
Dynamics in graph analysis (PyData Carolinas 2016)
 
Currency recognition on mobile phones
Currency recognition on mobile phonesCurrency recognition on mobile phones
Currency recognition on mobile phones
 
Super Resolution with OCR Optimization
Super Resolution with OCR OptimizationSuper Resolution with OCR Optimization
Super Resolution with OCR Optimization
 
Parallel k nn on gpu architecture using opencl
Parallel k nn on gpu architecture using openclParallel k nn on gpu architecture using opencl
Parallel k nn on gpu architecture using opencl
 

Andere mochten auch

Handwritten and Machine Printed Text Separation in Document Images using the ...
Handwritten and Machine Printed Text Separation in Document Images using the ...Handwritten and Machine Printed Text Separation in Document Images using the ...
Handwritten and Machine Printed Text Separation in Document Images using the ...Konstantinos Zagoris
 
Svm based cbir of breast masses on mammograms
Svm based cbir of breast masses on mammogramsSvm based cbir of breast masses on mammograms
Svm based cbir of breast masses on mammogramsKonstantinos Zagoris
 
Color reduction using the combination of the kohonen self organized feature m...
Color reduction using the combination of the kohonen self organized feature m...Color reduction using the combination of the kohonen self organized feature m...
Color reduction using the combination of the kohonen self organized feature m...Konstantinos Zagoris
 
Content and Metadata Based Image Document Retrieval (in Greek)
Content and Metadata Based Image Document Retrieval (in Greek)Content and Metadata Based Image Document Retrieval (in Greek)
Content and Metadata Based Image Document Retrieval (in Greek)Konstantinos Zagoris
 
Comparative Performance Evaluation of Image Descriptors Over IEEE 802.11b Noi...
Comparative Performance Evaluation of Image Descriptors Over IEEE 802.11b Noi...Comparative Performance Evaluation of Image Descriptors Over IEEE 802.11b Noi...
Comparative Performance Evaluation of Image Descriptors Over IEEE 802.11b Noi...Konstantinos Zagoris
 
A Semi-Automatic Annotation Tool For Arabic Online Handwritten Text
A Semi-Automatic Annotation Tool For Arabic Online Handwritten TextA Semi-Automatic Annotation Tool For Arabic Online Handwritten Text
A Semi-Automatic Annotation Tool For Arabic Online Handwritten TextRanda Elanwar
 
A simple text ine segmentation method for handwritten documents1
A simple text ine segmentation method for handwritten documents1A simple text ine segmentation method for handwritten documents1
A simple text ine segmentation method for handwritten documents1Prasad Babu
 
Word segmentation method for handwritten documents based on structured learning
Word segmentation method for handwritten documents based on structured learningWord segmentation method for handwritten documents based on structured learning
Word segmentation method for handwritten documents based on structured learningI3E Technologies
 
Holistic Approach for Arabic Word Recognition
Holistic Approach for Arabic Word RecognitionHolistic Approach for Arabic Word Recognition
Holistic Approach for Arabic Word RecognitionEditor IJCATR
 
Query expansion based on visual content new
Query expansion based on visual content newQuery expansion based on visual content new
Query expansion based on visual content newLazaros Tsochatzidis
 
Online handwritten script recognition
Online handwritten script recognitionOnline handwritten script recognition
Online handwritten script recognitionDhiraj Singh
 
Performance of Statistics Based Line Segmentation System for Unconstrained H...
Performance of Statistics Based Line Segmentation  System for Unconstrained H...Performance of Statistics Based Line Segmentation  System for Unconstrained H...
Performance of Statistics Based Line Segmentation System for Unconstrained H...AM Publications
 
Arabic Handwritten Script Recognition Towards Generalization: A Survey
Arabic Handwritten Script Recognition Towards Generalization: A Survey Arabic Handwritten Script Recognition Towards Generalization: A Survey
Arabic Handwritten Script Recognition Towards Generalization: A Survey Randa Elanwar
 

Andere mochten auch (14)

Handwritten and Machine Printed Text Separation in Document Images using the ...
Handwritten and Machine Printed Text Separation in Document Images using the ...Handwritten and Machine Printed Text Separation in Document Images using the ...
Handwritten and Machine Printed Text Separation in Document Images using the ...
 
Svm based cbir of breast masses on mammograms
Svm based cbir of breast masses on mammogramsSvm based cbir of breast masses on mammograms
Svm based cbir of breast masses on mammograms
 
Color reduction using the combination of the kohonen self organized feature m...
Color reduction using the combination of the kohonen self organized feature m...Color reduction using the combination of the kohonen self organized feature m...
Color reduction using the combination of the kohonen self organized feature m...
 
Content and Metadata Based Image Document Retrieval (in Greek)
Content and Metadata Based Image Document Retrieval (in Greek)Content and Metadata Based Image Document Retrieval (in Greek)
Content and Metadata Based Image Document Retrieval (in Greek)
 
Comparative Performance Evaluation of Image Descriptors Over IEEE 802.11b Noi...
Comparative Performance Evaluation of Image Descriptors Over IEEE 802.11b Noi...Comparative Performance Evaluation of Image Descriptors Over IEEE 802.11b Noi...
Comparative Performance Evaluation of Image Descriptors Over IEEE 802.11b Noi...
 
Spotting Customers in Trouble
Spotting Customers in TroubleSpotting Customers in Trouble
Spotting Customers in Trouble
 
A Semi-Automatic Annotation Tool For Arabic Online Handwritten Text
A Semi-Automatic Annotation Tool For Arabic Online Handwritten TextA Semi-Automatic Annotation Tool For Arabic Online Handwritten Text
A Semi-Automatic Annotation Tool For Arabic Online Handwritten Text
 
A simple text ine segmentation method for handwritten documents1
A simple text ine segmentation method for handwritten documents1A simple text ine segmentation method for handwritten documents1
A simple text ine segmentation method for handwritten documents1
 
Word segmentation method for handwritten documents based on structured learning
Word segmentation method for handwritten documents based on structured learningWord segmentation method for handwritten documents based on structured learning
Word segmentation method for handwritten documents based on structured learning
 
Holistic Approach for Arabic Word Recognition
Holistic Approach for Arabic Word RecognitionHolistic Approach for Arabic Word Recognition
Holistic Approach for Arabic Word Recognition
 
Query expansion based on visual content new
Query expansion based on visual content newQuery expansion based on visual content new
Query expansion based on visual content new
 
Online handwritten script recognition
Online handwritten script recognitionOnline handwritten script recognition
Online handwritten script recognition
 
Performance of Statistics Based Line Segmentation System for Unconstrained H...
Performance of Statistics Based Line Segmentation  System for Unconstrained H...Performance of Statistics Based Line Segmentation  System for Unconstrained H...
Performance of Statistics Based Line Segmentation System for Unconstrained H...
 
Arabic Handwritten Script Recognition Towards Generalization: A Survey
Arabic Handwritten Script Recognition Towards Generalization: A Survey Arabic Handwritten Script Recognition Towards Generalization: A Survey
Arabic Handwritten Script Recognition Towards Generalization: A Survey
 

Ähnlich wie ICFHR 2014 Competition on Handwritten KeyWord Spotting (H-KWS 2014)

IA3_presentation.pptx
IA3_presentation.pptxIA3_presentation.pptx
IA3_presentation.pptxKtonNguyn2
 
Session 1 - The Current Landscape of Big Data Benchmarks
Session 1 - The Current Landscape of Big Data BenchmarksSession 1 - The Current Landscape of Big Data Benchmarks
Session 1 - The Current Landscape of Big Data BenchmarksDataBench
 
How to valuate and determine standard essential patents
How to valuate and determine standard essential patentsHow to valuate and determine standard essential patents
How to valuate and determine standard essential patentsMIPLM
 
DISCUSSION 1The Internet of Things (IoT) is based upon emerging .docx
DISCUSSION 1The Internet of Things (IoT) is based upon emerging .docxDISCUSSION 1The Internet of Things (IoT) is based upon emerging .docx
DISCUSSION 1The Internet of Things (IoT) is based upon emerging .docxelinoraudley582231
 
Rag-Fusion: A New Take on Retrieval Augmented Generation
Rag-Fusion: A New Take on Retrieval Augmented GenerationRag-Fusion: A New Take on Retrieval Augmented Generation
Rag-Fusion: A New Take on Retrieval Augmented Generationkevig
 
RAG-FUSION: A NEW TAKE ON RETRIEVALAUGMENTED GENERATION
RAG-FUSION: A NEW TAKE ON RETRIEVALAUGMENTED GENERATIONRAG-FUSION: A NEW TAKE ON RETRIEVALAUGMENTED GENERATION
RAG-FUSION: A NEW TAKE ON RETRIEVALAUGMENTED GENERATIONkevig
 
Document clustering for forensic analysis
Document clustering for forensic analysisDocument clustering for forensic analysis
Document clustering for forensic analysissrinivasa teja
 
ACT Talk, Giuseppe Totaro: High Performance Computing for Distributed Indexin...
ACT Talk, Giuseppe Totaro: High Performance Computing for Distributed Indexin...ACT Talk, Giuseppe Totaro: High Performance Computing for Distributed Indexin...
ACT Talk, Giuseppe Totaro: High Performance Computing for Distributed Indexin...Advanced-Concepts-Team
 
Association Rule Mining using RHadoop
Association Rule Mining using RHadoopAssociation Rule Mining using RHadoop
Association Rule Mining using RHadoopIRJET Journal
 
Michigan Information Retrieval Enthusiasts Group Meetup - August 19, 2010
Michigan Information Retrieval Enthusiasts Group Meetup - August 19, 2010Michigan Information Retrieval Enthusiasts Group Meetup - August 19, 2010
Michigan Information Retrieval Enthusiasts Group Meetup - August 19, 2010ivan provalov
 
Seven Degrees Presentation for 2015 ICEAA
Seven Degrees Presentation for 2015 ICEAASeven Degrees Presentation for 2015 ICEAA
Seven Degrees Presentation for 2015 ICEAAJames Lawlor
 
Creating a dataset of peer review in computer science conferences published b...
Creating a dataset of peer review in computer science conferences published b...Creating a dataset of peer review in computer science conferences published b...
Creating a dataset of peer review in computer science conferences published b...Aliaksandr Birukou
 
Information-Flow Analysis of Design Breaks up
Information-Flow Analysis of Design Breaks upInformation-Flow Analysis of Design Breaks up
Information-Flow Analysis of Design Breaks upEswar Publications
 
Predictive Model and Record Description with Segmented Sensitivity Analysis (...
Predictive Model and Record Description with Segmented Sensitivity Analysis (...Predictive Model and Record Description with Segmented Sensitivity Analysis (...
Predictive Model and Record Description with Segmented Sensitivity Analysis (...Greg Makowski
 
TRECVID 2016 : Video to Text Description
TRECVID 2016 : Video to Text DescriptionTRECVID 2016 : Video to Text Description
TRECVID 2016 : Video to Text DescriptionGeorge Awad
 
Tomorrows language technology
Tomorrows language technologyTomorrows language technology
Tomorrows language technologyRoss Turner
 

Ähnlich wie ICFHR 2014 Competition on Handwritten KeyWord Spotting (H-KWS 2014) (20)

IA3_presentation.pptx
IA3_presentation.pptxIA3_presentation.pptx
IA3_presentation.pptx
 
Session 1 - The Current Landscape of Big Data Benchmarks
Session 1 - The Current Landscape of Big Data BenchmarksSession 1 - The Current Landscape of Big Data Benchmarks
Session 1 - The Current Landscape of Big Data Benchmarks
 
How to valuate and determine standard essential patents
How to valuate and determine standard essential patentsHow to valuate and determine standard essential patents
How to valuate and determine standard essential patents
 
DISCUSSION 1The Internet of Things (IoT) is based upon emerging .docx
DISCUSSION 1The Internet of Things (IoT) is based upon emerging .docxDISCUSSION 1The Internet of Things (IoT) is based upon emerging .docx
DISCUSSION 1The Internet of Things (IoT) is based upon emerging .docx
 
Rag-Fusion: A New Take on Retrieval Augmented Generation
Rag-Fusion: A New Take on Retrieval Augmented GenerationRag-Fusion: A New Take on Retrieval Augmented Generation
Rag-Fusion: A New Take on Retrieval Augmented Generation
 
RAG-FUSION: A NEW TAKE ON RETRIEVALAUGMENTED GENERATION
RAG-FUSION: A NEW TAKE ON RETRIEVALAUGMENTED GENERATIONRAG-FUSION: A NEW TAKE ON RETRIEVALAUGMENTED GENERATION
RAG-FUSION: A NEW TAKE ON RETRIEVALAUGMENTED GENERATION
 
Cocomo models
Cocomo modelsCocomo models
Cocomo models
 
Document clustering for forensic analysis
Document clustering for forensic analysisDocument clustering for forensic analysis
Document clustering for forensic analysis
 
Cocomo
CocomoCocomo
Cocomo
 
Generator of pseudorandom sequences
Generator of pseudorandom sequences Generator of pseudorandom sequences
Generator of pseudorandom sequences
 
ACT Talk, Giuseppe Totaro: High Performance Computing for Distributed Indexin...
ACT Talk, Giuseppe Totaro: High Performance Computing for Distributed Indexin...ACT Talk, Giuseppe Totaro: High Performance Computing for Distributed Indexin...
ACT Talk, Giuseppe Totaro: High Performance Computing for Distributed Indexin...
 
Association Rule Mining using RHadoop
Association Rule Mining using RHadoopAssociation Rule Mining using RHadoop
Association Rule Mining using RHadoop
 
Michigan Information Retrieval Enthusiasts Group Meetup - August 19, 2010
Michigan Information Retrieval Enthusiasts Group Meetup - August 19, 2010Michigan Information Retrieval Enthusiasts Group Meetup - August 19, 2010
Michigan Information Retrieval Enthusiasts Group Meetup - August 19, 2010
 
Seven Degrees Presentation for 2015 ICEAA
Seven Degrees Presentation for 2015 ICEAASeven Degrees Presentation for 2015 ICEAA
Seven Degrees Presentation for 2015 ICEAA
 
Creating a dataset of peer review in computer science conferences published b...
Creating a dataset of peer review in computer science conferences published b...Creating a dataset of peer review in computer science conferences published b...
Creating a dataset of peer review in computer science conferences published b...
 
Information-Flow Analysis of Design Breaks up
Information-Flow Analysis of Design Breaks upInformation-Flow Analysis of Design Breaks up
Information-Flow Analysis of Design Breaks up
 
Predictive Model and Record Description with Segmented Sensitivity Analysis (...
Predictive Model and Record Description with Segmented Sensitivity Analysis (...Predictive Model and Record Description with Segmented Sensitivity Analysis (...
Predictive Model and Record Description with Segmented Sensitivity Analysis (...
 
TRECVID 2016 : Video to Text Description
TRECVID 2016 : Video to Text DescriptionTRECVID 2016 : Video to Text Description
TRECVID 2016 : Video to Text Description
 
Software metrics
Software metricsSoftware metrics
Software metrics
 
Tomorrows language technology
Tomorrows language technologyTomorrows language technology
Tomorrows language technology
 

Kürzlich hochgeladen

Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bSérgio Sacani
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisDiwakar Mishra
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​kaibalyasahoo82800
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfSumit Kumar yadav
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRDelhi Call girls
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfrohankumarsinghrore1
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfSumit Kumar yadav
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPirithiRaju
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTSérgio Sacani
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfSumit Kumar yadav
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡anilsa9823
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptxRajatChauhan518211
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...RohitNehra6
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)Areesha Ahmad
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Sérgio Sacani
 
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINChromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINsankalpkumarsahoo174
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000Sapana Sha
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptxanandsmhk
 

Kürzlich hochgeladen (20)

Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdf
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdf
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdf
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdf
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptx
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINChromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
 

ICFHR 2014 Competition on Handwritten KeyWord Spotting (H-KWS 2014)

  • 1. ICFHR 2014 COMPETITION ON HANDWRITTEN KEYWORD SPOTTING (H-KWS 2014) IOANNIS PRATIKAKIS1 KONSTANTINOS ZAGORIS1,2 BASILIS GATOS2 Visual Computing Group Democritus University of Thrace Dept. of Electrical and Computer Engineering Xanthi, Greece National Centre of Scientific Research “Demokritos” Institute of Informatics and Telecommunications Athens, Greece 1 2 GEORGIOS LOULOUDIS2 NIKOLAOS STAMATOPOULOS2
  • 2. INTRODUCING ICFHR 2014 H-KWS Handwritten keyword spotting is the task of detecting query words in handwritten document image collections without explicitly recognizing it. The objective of the H-KWS 2014 is threefold: • Record current advances in keyword spotting. • Provide benchmarking handwritten datasets containing both historical and modern documents from multiple writers. • Explore established evaluation performance measures frequently encountered in the information retrieval literature while providing the software for these measures as implementation reference.
  • 3. ORGANIZING ICFHR 2014 H-KWS TRACK I - SEGMENTATION-BASED • 50 document images of Bentham dataset. • 100 document images of Modern dataset (25 documents per language). • Word Location in XML format. TRACK II - SEGMENTATION-FREE • 50 document images of Bentham dataset. • 100 document images of Modern dataset (25 documents per language).
  • 4. ICFHR 2014 H-KWS TIMELINE Registration to the contest 7/4/2014 Submission of the Results from the participants 2014 April May June July August September 2014 ICFHR 2014 H-KWS Presentation 4/9/2014 12/5/2014 Submission of competition report/paper 21/4/2014 14/4/2014 Release of the datasets and the query sets Today
  • 5. BENTHAM DATASET It consists of high quality (approximately 3000 pixel width and 4000 pixel height) handwritten manuscripts. The documents are written by Jeremy Bentham (1748- 1832) himself as well as by Bentham’s secretarial staff over a period of sixty years.
  • 6. MODERN DATASET It consists of modern handwritten documents from the ICDAR 2009 Handwritten Segmentation Contest. These documents originate from several writers that were asked to copy a given text. They do not include any non-text elements (lines, drawings, etc.). They are written in four (4) languages: English, French, German and Greek.
  • 7. CHALLENGES They both contain several very difficult problems to be addressed, wherein the most difficult is the word variability. The variation of the same word is high and involves: • writing style • font size • noise • their combination BENTHAM MODERN
  • 8. WORD-LENGTH STATISTICS FOR EACH DATASET BENTHAM MODERN 2500 2000 1500 1000 500 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 Frequency Word Length (characters) TRACK I TRACK II 3500 3000 2500 2000 1500 1000 500 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 Frequency Word Length (characters) TRACK I TRACK II
  • 9. QUERY STATISTICS The query set of each dataset is provided in XML format and it contains word image queries of length greater than 6 and frequency greater than 5. TRACK I - SEGMENTATION-BASED • 320 queries for the Bentham Dataset • 300 queries for the Modern Dataset TRACK II - SEGMENTATION-FREE • 290 queries for the Bentham Dataset • 300 queries for the Modern Dataset 180 160 140 120 100 80 60 40 20 0 Query Word Lengths 5 6 7 8 9 10 11 12 13 14 15 16 17 Queries Query Word Length (characters) TRACK I - Bentham TRACK I - Modern TRACK II - Bentham TRACK II - Modern 160 140 120 100 80 60 40 20 0 Query Word Frequencies 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 Queries Query Word Frequency TRACK I - Bentham TRACK I - Modern TRACK II - Bentham TRACK II - Modern
  • 11. EVALUATION CHALLENGES • Small variations of the query word that can be found in the datasets. For example the word “husband” appears as well as : • husband, • husband: • husband. • Husband. • Husband] • Evaluating overall performance as well as precision. • Segmentation - free systems may not detect the whole word or include parts of another word.
  • 12. CHOSEN EVALUATION MEASURES • Precision at Top 5 Retrieved words (P@5) for evaluating precision performance. • The Mean Average Precision (MAP) for evaluating overall performance. • Normalized Discounted Cumulative Gain (NDCG) with binary judgment relevancies for evaluating precision-oriented overall performance. • Normalized Discounted Cumulative Gain (NDCG) with non-binary judgment relevancies for evaluating small variations of the query word.
  • 13. PRECISION AT TOP K RETRIEVED WORD (P@K) Precision is the fraction of retrieved words that are relevant to the query. P@k is the precision for the k top retrieved words. In the proposed evaluation, P@5 is used which is the precision at top 5 retrieved words. This metric defines how successfully the algorithms produce relevant results to the first 5 positions of the ranking list. 푃@푘 = 푟푒푙푒푣푎푛푡 푤표푟푑푠 ∩ 푘 푟푒푡푟푖푒푣푒푑 푤표푟푑푠 푘 푟푒푡푟푖푒푣푒푑 푤표푟푑푠
  • 14. MEAN AVERAGE PRECISION (MAP) It is a typical measure for the performance of information retrieval systems. It is implemented from the Text REtrieval Conference (TREC) community by the National Institute of Standards and Technology (NIST). It is defined as the average of the precision value obtained after each relevant word is retrieved: 퐴푃 = 푛 푘=1 푃@퐾 × 푟푒푙 푘 푟푒푙푒푣푎푛푡 푤표푟푑푠 푟푒푙 푘 = 1, 푖푓 푤표푟푑 푎푡 푟푎푛푘 푘 푖푠 푟푒푙푒푣푎푛푡 0, 푖푓 푤표푟푑 푎푡 푟푎푛푘 푘 푖푠 푛표푡 푟푒푙푒푣푎푛푡 where:
  • 15. NORMALIZED DISCOUNTED CUMULATIVE GAIN (NDCG) The NDCG measures the performance of a retrieval system based on the graded relevance of the retrieved entities. It varies from 0.0 to 1.0, with 1.0 representing the ideal ranking of the entities. It is defined as: 푛퐷퐶퐺 = 퐷퐶퐺 퐼퐷퐶퐺 퐷퐶퐺 = 푟푒푙1 + 푛 푖=1 푟푒푙푖 푙표푔2 푖 + 1 where: reli is the relevance judgment at position i and IDCG is the ideal DCG which is computed from the perfect retrieval result.
  • 16. NON-BINARY VS BINARY RELEVANCE JUDGMENT VALUES NON-BINARY Word Relevance Judgment husband 1.0 husband 1.0 husband 1.0 husband 1.0 husband 1.0 husband, 0.9 husband: 0.9 husband. 0.9 Husband. 0.8 Husband] 0.8 BINARY Word Relevance Judgment husband 1.0 husband 1.0 husband 1.0 husband 1.0 husband 1.0 husband, 1.0 husband: 1.0 husband. 1.0 Husband. 1.0 Husband] 1.0
  • 17. SEGMENTATION – FREE OVERLAPPING THRESHOLD A word instance is considered as detected only if there is a significant overlap with the ground truth word. The overlap is expressed by the intersection over the ground truth word area metric (IOA) and it is defined as: 퐼푂퐴 = 퐴 ∩ 퐵 퐴 where A and B denote the bounding box areas of the ground truth word and the method output word, respectively. The IOA metric ranges from 0 to 1, where 1 corresponds to exact matching. A thresholdT is applied in order to decide whether the word instance and the segmented word match sufficiently. The performance evaluation for three different thresholds (0.6, 0.7 and 0.8) is used for testing.
  • 18. EVALUATION APPLICATION • An evaluation application is developed as referenced implementation for each metric. • It is available for Windows, Mac OS X and Linux operating systems as both command-line and GUI form. • It accepts as input the experimental results file and the relevance judgment file, which represents the ground truth. Afterwards, it calculates the aforementioned evaluation metrics.
  • 21. PARTICIPANTS Method Affiliation Participating G1 The Blavatnik School of Computer Science, Tel-Aviv University, Israel TRACK I TRACK II G2 Computer Vision Center, Barcelona, Spain TRACK I G3 Smith College Department of Computer Science, Northampton MA, USA TRACK I TRACK II G4 Université de Lyon, CNRS INSA - Lyon, LIRIS, France TRACK II G5 Institute for Communications Technology (IfN) of Technische Universität Braunschweig, Braunschweig, Germany TRACK II
  • 22. TRACK I: SEGMENTATION-BASED - EXPERIMENTAL RESULTS BENTHAM DATASET Method P@5 MAP NDCG(Binary) NDCG G1 0.738 (1) 0.524 (1) 0.742 (2) 0.762 (2) G2 0.724 (2) 0.513 (2) 0.744 (1) 0.764 (1) G3 0.718 (3) 0.462 (3) 0.638 (3) 0.657 (3) MODERN DATASET Method P@5 MAP NDCG(Binary) NDCG G1 0.588 (2) 0.338 (2) 0.611 (2) 0.612 (2) G2 0.706 (1) 0.523 (1) 0.757 (1) 0.757 (1) G3 0.569 (3) 0.278 (3) 0.484 (3) 0.484 (3)
  • 23. TRACK I: SEGMENTATION-BASED - PRECISION – RECALL CURVES BENTHAM DATASET 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 Precision Recall G1 G2 G3 MODERN DATASET 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 Precision Recall G1 G2 G3
  • 24. TRACK II: SEGMENTATION-FREE - EXPERIMENTAL RESULTS BENTHAM DATASET P@5 MAP NDCG (Binary) NDCG Method Overlapping Threshold Average Overlapping Threshold Average Overlapping Threshold Average Overlapping Threshold Average 0.6 0.7 0.8 0.6 0.7 0.8 0.6 0.7 0.8 0.6 0.7 0.8 G1 0.617 0.611 0.599 0.609 (1) 0.428 0.419 0.402 0.416 (1) 0.653 0.640 0.621 0.638 (1) 0.671 0.657 0.640 0.560 (1) G3 0.596 0.568 0.506 0.556 (2) 0.397 0.372 0.321 0.363 (2) 0.551 0.518 0.457 0.509 (2) 0.569 0.536 0.474 0.526 (2) G4 0.351 0.341 0.313 0.335 (4) 0.219 0.209 0.187 0.205 (4) 0.386 0.363 0.319 0.356 (4) 0.400 0.376 0.331 0.369 (4) G5 0.597 0.55 0.477 0.543 (3) 0.385 0.347 0.280 0.337(3) 0.569 0.513 0.424 0.502 (3) 0.586 0.531 0.440 0.519 (3) MODERN DATASET P@5 MAP NDCG (Binary) NDCG Method Overlapping Threshold Average Overlapping Threshold Average Overlapping Threshold Average Overlapping Threshold Average 0.6 0.7 0.8 0.6 0.7 0.8 0.6 0.7 0.8 0.6 0.7 0.8 G1 0.541 0.541 0.535 0.539 (1) 0.265 0.265 0.259 0.263 (1) 0.491 0.484 0.473 0.483 (1) 0.491 0.485 0.474 0.483 (1) G3 0.429 0.422 0.399 0.417 (2) 0.170 0.165 0.152 0.163 (2) 0.310 0.301 0.277 0.296 (2) 0.310 0.301 0.277 0.296 (2) G4 0.250 0.241 0.211 0.234 (4) 0.095 0.089 0.077 0.087 (4) 0.218 0.195 0.161 0.191 (4) 0.218 0.195 0.161 0.191 (4) G5 0.264 0.247 0.223 0.245 (3) 0.100 0.092 0.081 0.091 (3) 0.229 0.201 0.168 0.199 (3) 0.229 0.202 0.168 0.200 (3)
  • 25. TRACK I: SEGMENTATION-BASED - PRECISION – RECALL CURVES BENTHAM DATASET 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 Precision Recall G1 G3 G4 G5 MODERN DATASET 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 Precision Recall G1 G3 G4 G5
  • 26. FINAL RANKING TRACK I: SEGMENTATION-BASED Rank Method Score 1 G2 10 2 G1 14 3 G3 24 TRACK II: SEGMENTATION-FREE Rank Method Score 1 G1 8 2 G3 16 3 G5 24 4 G4 32
  • 27. AND THE WINNER IS: FOR TRACK I – SEGMENTATION-BASED: Method G2 which has been submitted by Jon Almazán, Albert Gordo, Ernest Valveny affiliated to the: Computer Vision Center, Universitat Autònoma de Barcelona, Spain. TRACK II – SEGMENTATION-FREE Method G1 which has been submitted by Alon Kovalchuk, Lior Wolf, Nachum Dershowitz affiliated to the: Blavatnik School of Computer Science, Tel-Aviv University, Israel.

Hinweis der Redaktion

  1. Recently, handwritten word spotting has attracted the attention of the research community in the field of document image analysis and recognition since it appears to be a feasible solution for indexing and retrieval of handwritten documents in the case that OCR-based methods fail to deliver satisfactory results
  2. Dataset word length frequencies. Αυτά που μπορείς να πεις ότι οι κατανομές είναι ίδιες περίπου παρόλο που είναι διαφορετικά κείμενα
  3. Με τα στατιστικά μπορείς να πεις ότι κυρίως έχουμε 7-10 χαρακτήρων λέξεις αλλά υπάρχουν και μερικές αρκετά μεγάλες όπως 15 χαρακτήρων. Μερικά παραδείγματα είναι «straightforward» , «philosophical» και αρκετές γερμανικές. (οι πολύ μεγάλες είναι στο Modern) Για την συχνότητα. Μπορείς να πεις ότι κυρίως εμφανίζονται 12-14 φορές αλλά έχουμε και κάποιες που εμφανίζονται στην βάση μας 50 φορές.
  4. Από εδώ μπορούν να κατεβάσουν τις εικόνες και τo query set.
  5. Αρχικά τα παρουσιάζω και τι προβλήματα λύνουν και μετά τα εξηγώ.
  6. Αν σε ρωτήσουν το map, p@5 κ.τ.λ. χρησιμοποιεί τα binary relevance values. Δηλαδή π.χ. θεωρεί το «husband:» σωστό. Κάθε αλλαγή μειώνει κατά ένα το relevance
  7. Evaluation εργαλείο
  8. Από εδώ μπορούν να κατεβάσουν τα relevance judgment και το segmentation ground truth
  9. For the sake of clarity, it should be noted that each algorithm appears with the enumeration of the group. Θύμισέ τους για G1, G2. the order of appearance reflects the chronological order of expressing an interest to participate in the competition
  10. For the sake of clarity, it should be noted that each algorithm appears with the enumeration of the group as presented previous slide. For example, a method submitted by the group No. 3 will appear as G3 (Group 3). the order of appearance reflects the chronological order of expressing an interest to participate in the competition. Inside the parenthesis is the ranking value between the competing methods for the corresponding algorithm. The summation of all accumulated ranking values for all evaluation metrics denote the final score
  11. The PR curves are not participating to the calculation of the final ranking. It is for the sake of clarity
  12. The PR curves are not participating to the calculation of the final ranking. It is for the sake of clarity