Text detection in video images using adaptive
 edge detection and stroke width verification

Haojin Yang, Bernhard Quehl, Harald Sack


 April 11 – 13, IWSSIP2012, Vienna (Austria)

     Hasso Plattner Institute | H-J. Yang, B.Quehl, H. Sack
Agenda




                             (1)  Introduction / Motivation
                             (2)  Related works
                             (3)  Text detection in video frames
                             (4)  Evaluation and experimental results
                             (5)  Conclusion




Jörg Waitelonis, HPI | THESEUS | Innovationszentrum | 28.-29.11.2012
Project Mediaglobe

     • Semantic Search Engine for Media Archives

     •  Enable exploratory and semantic search in
        Audiovisual Media Archives

     http://www.projekt-mediaglobe.de/

        Search in multimedia archives?!
Automated Audiovisual Analysis!

     •  Concept Analysis / Classification: Studio, Indoor, News Show
     •  Logo Detection
     •  Overlay Text
     •  Face Detection
     •  Scene Text
     •  Audio-Mining
     •  Structural Analysis
     •  Automated Speech Recognition
     •  Speaker Identification

Common OCR vs. Video OCR
     Common OCR:
     •  optimized for scans
     •  high resolution
     •  usually black on white
     •  homogeneous background

     Video OCR:
     •  low resolution
     •  heterogeneous background
     •  (motion) blurring
     •  perspective distortion
     •  uneven illumination
     •  shading, rotation
     •  large amounts of data (images)

Related Works
    Most proposed text detection methods make use of texture features, edges, colors,
    and text-specific features such as stroke width.


    Chen et al. [4], Text detection and recognition in images and video frames:
              •  edge based approaches achieve a high recall rate
              •  but may also produce many false alarms


    Epshtein et al. [1] proposed the SWT (Stroke Width Transform) for text detection in natural
    scene images. Characteristics of the original SWT approach:
              •  Robust at distinguishing text from text-like non-text objects
              •  Shortcoming: computing the SWT is quite costly for images
                 with complex content.

Text Detector
    Workflow of edge based text detector:

       (a) Original image
       (b) Vertical edge map
       (c) Vertical dilation map
       (d) Binary map of (c)
       (e) Binary map after connected component analysis
       (f) After projection-profiling refinement
       (g) Detection result

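
The first stages of the workflow above, (b) through (e), can be sketched roughly as follows. This is a minimal illustration in plain Python and not the authors' implementation: the simple horizontal difference used as the edge operator, the fixed threshold, the dilation radius, and all function names are assumptions made for the example.

```python
def vertical_edge_map(gray):
    """Absolute horizontal intensity difference; it responds to vertical edges."""
    h, w = len(gray), len(gray[0])
    return [[abs(gray[y][min(x + 1, w - 1)] - gray[y][max(x - 1, 0)])
             for x in range(w)] for y in range(h)]


def dilate_vertically(img, radius=1):
    """Replace each pixel by the maximum over a vertical window (assumed radius)."""
    h, w = len(img), len(img[0])
    return [[max(img[yy][x] for yy in range(max(0, y - radius), min(h, y + radius + 1)))
             for x in range(w)] for y in range(h)]


def binarize(img, threshold):
    """Fixed global threshold (an assumption; the slides do not give the rule)."""
    return [[1 if v >= threshold else 0 for v in row] for row in img]


def component_boxes(binary):
    """Bounding boxes (top, left, bottom, right) of 4-connected components."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            stack = [(y, x)]
            seen[y][x] = True
            top, left, bottom, right = y, x, y, x
            while stack:
                cy, cx = stack.pop()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def detect_candidates(gray, threshold=100, radius=1):
    """Edge map -> vertical dilation -> binary map -> candidate boxes."""
    edges = vertical_edge_map(gray)
    dilated = dilate_vertically(edges, radius)
    return component_boxes(binarize(dilated, threshold))
```

Each returned box is only a text candidate; the refinement and verification stages decide which candidates survive.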
Text Verification – Workflow

SWT Based Text Verification
    Stroke Width Transform
        (a) Boundary detection
        (b) From each boundary pixel p, send a ray along the gradient direction until it
            reaches another boundary pixel q.
        (c) Calculate the potential stroke width value between p and q.

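
Steps (a) to (c) can be illustrated with a simplified sketch. It assumes a binary mask (1 = stroke pixel) instead of a gray-level edge map, uses a central-difference gradient of the mask as the ray direction, and counts unit steps as the width; none of these choices are taken from the slides, and the function name is hypothetical.

```python
import math


def stroke_width_transform(mask):
    """Per-pixel stroke width estimate for a binary mask (1 = stroke pixel)."""
    h, w = len(mask), len(mask[0])

    def px(y, x):
        # Pixels outside the image count as background.
        return mask[y][x] if 0 <= y < h and 0 <= x < w else 0

    swt = [[math.inf] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            # (a) Gradient of the mask: non-zero only near the stroke boundary,
            #     where it points into the stroke.
            gx = px(y, x + 1) - px(y, x - 1)
            gy = px(y + 1, x) - px(y - 1, x)
            norm = math.hypot(gx, gy)
            if norm == 0:
                continue  # interior pixel or symmetric neighbourhood: no ray
            dx, dy = gx / norm, gy / norm
            # (b) Walk from p along the gradient until the ray leaves the stroke.
            ray, t = [], 0
            while True:
                cy, cx = round(y + t * dy), round(x + t * dx)
                if not px(cy, cx):
                    break
                ray.append((cy, cx))
                t += 1
            # (c) The number of steps approximates the width between p and q;
            #     every pixel on the ray keeps the smallest width seen so far.
            for cy, cx in ray:
                swt[cy][cx] = min(swt[cy][cx], t)
    return swt
```

Pixels that no ray crosses keep the value `math.inf`, so background and unvisited pixels are easy to tell apart from measured stroke pixels.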
SWT Based Text Verification
     Stroke Width Transform result example

     An example output image from the stroke width transform for the character "w".

SWT Based Text Verification
     SWT verification constraints:
      A text candidate component is discarded if:
            •    its stroke width variance lies within the (MinVar, MaxVar) threshold range
            •    its mean stroke width lies within the (MinStroke, MaxStroke) threshold range

      •    Character components are generated by merging candidate
           components with similar stroke width values.

      •    Character chains are then created by merging character components with
           a similar color and a small distance between them.

      •    The final verified text line must have more than 2 character chains.

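
The threshold test and the chain-count rule can be sketched as below. The slide wording leaves open whether the thresholds describe the accepted or the rejected range; this sketch assumes the common reading that a component is kept when both statistics fall inside their ranges, so invert the predicate if the opposite is intended. The function names, the inclusive bounds, and the threshold values in the usage note are hypothetical.

```python
from statistics import mean, pvariance


def stroke_stats_in_range(widths, min_var, max_var, min_stroke, max_stroke):
    """True if variance and mean of the stroke widths fall inside both ranges."""
    return (min_var <= pvariance(widths) <= max_var
            and min_stroke <= mean(widths) <= max_stroke)


def is_text_line(chains):
    """A verified text line must have more than 2 character chains."""
    return len(chains) > 2
```

For example, `stroke_stats_in_range([3, 3, 3, 4], 0.0, 1.0, 2.0, 10.0)` accepts a component with nearly constant stroke width, while widths such as `[1, 9, 2, 12]` fail the variance test under the same assumed thresholds.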
SWT Based Text Verification
     Edge detection projection profiling

     SWT text verification on profiling candidates

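
Projection profiling is commonly done by summing the edge pixels of each row (or column) of a candidate region and splitting the region where the profile drops. The sketch below shows the row-wise case under that assumption; the adaptive threshold selection used in the talk is not described on the slides, so `min_count` is a hypothetical fixed parameter.

```python
def split_rows_by_profile(binary, min_count=1):
    """Return (first_row, last_row) spans whose row sums reach min_count."""
    profile = [sum(row) for row in binary]  # horizontal projection profile
    spans, start = [], None
    for y, count in enumerate(profile):
        if count >= min_count and start is None:
            start = y                      # a text-line band begins
        elif count < min_count and start is not None:
            spans.append((start, y - 1))   # the band ended on the previous row
            start = None
    if start is not None:
        spans.append((start, len(profile) - 1))
    return spans
```

Applying the same function to the transposed region gives the column-wise profile, which can trim the left and right borders of a candidate box.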
Evaluation and Experimental Results
     Experiment setup:
        Test sets:
       •  Mediaglobe test set (31 images)
       •  German TV news test set (72 images)
       •  Microsoft common test set (45 images)

Evaluation Results
     •  Evaluation on the Microsoft common test set

        Method                       Recall     Precision     F1 measure
        Zhao et al. [10]             0.94       0.98          0.96
        Thillou et al. [11]          0.91       0.94          0.92
        Lienhart et al. [12]         0.91       0.94          0.92
        Shivakumara et al. [4]       0.92       0.90          0.91
        Gllavata et al. [13]         0.90       0.87          0.88
        Ours                         0.93       0.94          0.93

     •  Evaluation on the other test sets

        Test set                     Recall     Precision     F1 measure
        TV News                      0.86       0.81          0.83
        Mediaglobe                   0.75       0.81          0.77

     •  Example images: http://yovisto.com/labs/VideoOCR/visualResult/

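
For reference, the F1 measure in these tables is the harmonic mean of recall and precision. A minimal sketch follows; the function name is hypothetical, and recomputing from the rounded table values may differ from a reported figure in the last digit.

```python
def f1_measure(recall, precision):
    """Harmonic mean of recall and precision; 0.0 when both are zero."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)
```

For instance, `f1_measure(0.94, 0.98)` rounds to 0.96, matching the first row of the Microsoft test set table.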
Conclusion
     We have presented a localization-verification
     scheme for text detection in video images.

     •  It uses a fast edge-based text detector and an adaptive refinement
        to reduce false alarms

     •  The proposed method is quite competitive with
        other existing methods

     •  It detects different writing systems (English, Japanese, Arabic)

References
      [1] B. Epshtein, E. Ofek, and Y. Wexler, “Detecting Text in Natural Scenes with Stroke Width Transform,” in
           Proc. of Computer Vision and Pattern Recognition, 2010, pp. 2963–2970.

      [2] Y. Zhong, H-J. Zhang, and A. Jain, “Automatic caption localization in compressed video,” IEEE
           Transactions on Pattern Analysis and Machine Intelligence, pp. 385–392, 2000.

      [3] X. Qian, G. Liu, H. Wang, and R. Su, “Text detection, localization and tracking in compressed
           video,” in Proc. of Signal Processing: Image Communication, 2007, pp. 752–768.

      [4] J. Canny, “A computational approach to edge detection,” IEEE Transactions on Pattern Analysis
           and Machine Intelligence, vol. 8, no. 6, pp. 679–698, 1986. DOI 10.1109/TPAMI.1986.4767851.
           URL http://dx.doi.org/10.1109/TPAMI.1986.4767851

      [5] http://yovisto.com/labs/VideoOCR/

      [6] http://www.cs.cityu.edu.hk/~liuwy/PE_VTDetect/

                             Thank you for your attention!


                             Bernhard Quehl
                             Hasso-Plattner-Institut Potsdam
                             Prof.-Dr.-Helmert Str. 2-4
                             14482 Potsdam
                             phone: +49 (0)331-5509-548
                             email: bernhard.quehl@hpi.uni-potsdam.de
                             web:   http://www.hpi.uni-potsdam.de/
