Presentation iwssip2012

Text detection in video images using adaptive
edge detection and stroke width verification

Haojin Yang, Bernhard Quehl, Harald Sack

April 11 – 13, IWSSIP2012, Vienna (Austria)

Hasso Plattner Institute | H-J. Yang, B.Quehl, H. Sack

Agenda

(1)  Introduction / Motivation
(2)  Related works
(3)  Text detection in video frames
(4)  Evaluation and experimental results
(5)  Conclusion

Jörg Waitelonis, HPI | THESEUS | Innovationszentrum | 28.-29.11.2012

Project Mediaglobe

• Semantic Search Engine for Media Archives

•  Enable exploratory and semantic search in
Audiovisual Media Archives

http://www.projekt-mediaglobe.de/


Search in multimedia archives?!

Automated Audiovisual Analysis!

Analysis components shown in the overview diagram:
•  Concept classification: Studio, Indoor, News Show
•  Logo detection
•  Overlay text detection
•  Face detection
•  Scene text
•  Audio mining
•  Structural analysis
•  Automated speech recognition
•  Speaker identification


Common OCR vs. Video OCR
Common OCR:
•  optimized for scans
•  high resolution
•  usually white on black
•  homogeneous background

Video OCR:
•  low resolution
•  heterogeneous background
•  (motion) blurring
•  perspective distortion
•  uneven illumination
•  shading, rotation
•  large amounts of data (images)


Related Works
Most proposed text detection methods make use of texture features, edges, colors,
and text-specific features such as stroke width.

Chen et al. [4], "Text detection and recognition in images and video frames":
•  edge-based approaches achieve a high recall rate
•  but may also produce many false alarms

Epshtein et al. [1] proposed the SWT (Stroke Width Transform) for text detection in natural
scene images. Shortcomings of the original SWT approach:
•  Not robust at distinguishing text from text-like non-text objects
•  Computing the SWT is quite costly for images
with complex content.


Text Detector
Workflow of the edge-based text detector:

(a) Original image (b) Vertical edge map (c) Vertical dilation map

(d) Binary map of (c) (e) Binary map after connected component analysis
(f) After projection-profiling refinement (g) Detection result
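
The slide does not spell out how the projection-profiling step (f) works. As a minimal sketch, assuming it relies on a plain horizontal projection profile (the number of foreground pixels per row), a binary map could be split into candidate text-line row ranges like this. The function name `split_rows_by_projection` and the `min_fill` threshold are illustrative and not taken from the presentation.

```python
def split_rows_by_projection(binary_map, min_fill=1):
    """Return inclusive (start_row, end_row) ranges whose row sum is >= min_fill.

    binary_map is a list of rows, each a list of 0/1 values.
    min_fill is a hypothetical threshold on foreground pixels per row.
    """
    # Horizontal projection profile: foreground pixel count of every row.
    profile = [sum(row) for row in binary_map]

    runs = []
    start = None
    for i, count in enumerate(profile):
        if count >= min_fill and start is None:
            start = i                      # a new candidate line begins
        elif count < min_fill and start is not None:
            runs.append((start, i - 1))    # the current line ends above row i
            start = None
    if start is not None:                  # line touching the bottom border
        runs.append((start, len(profile) - 1))
    return runs


example_map = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]
line_ranges = split_rows_by_projection(example_map)
```

The same idea can be applied column-wise to trim each candidate box horizontally.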

Text Verification – Workflow



SWT Based Text Verification
Stroke Width Transform
(a) Boundary detection
(b) From each boundary pixel p, send a ray along the gradient direction until it
reaches another boundary pixel q.
(c) Calculate the potential stroke width value between p and q.
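
Steps (b) and (c) can be illustrated with a small ray-casting sketch. It assumes the boundary map and the gradient direction at p are already available, and it leaves out parts of the full transform described by Epshtein et al. [1], such as checking that the gradient at q points roughly opposite to the one at p. The name `cast_ray` and the `max_steps` limit are illustrative choices.

```python
import math


def cast_ray(edge_map, p, direction, max_steps=50):
    """Walk from boundary pixel p along `direction` until a boundary pixel q is hit.

    edge_map is a list of rows with 1 for boundary pixels and 0 otherwise.
    p is (row, col); direction is (d_row, d_col), e.g. the gradient at p.
    Returns (q, stroke_width) or None if no boundary pixel is reached.
    """
    rows, cols = len(edge_map), len(edge_map[0])
    r0, c0 = p
    d_row, d_col = direction
    norm = math.hypot(d_row, d_col)
    if norm == 0:
        return None
    d_row, d_col = d_row / norm, d_col / norm   # unit step along the ray

    for step in range(1, max_steps + 1):
        r = int(round(r0 + d_row * step))
        c = int(round(c0 + d_col * step))
        if not (0 <= r < rows and 0 <= c < cols):
            return None                          # ray left the image
        if (r, c) != (r0, c0) and edge_map[r][c]:
            # Potential stroke width: Euclidean distance between p and q.
            return (r, c), math.hypot(r - r0, c - c0)
    return None


# A vertical stroke whose two boundaries lie in columns 2 and 6.
edge_map = [[1 if col in (2, 6) else 0 for col in range(9)] for _ in range(3)]
hit = cast_ray(edge_map, (1, 2), (0, 1))
miss = cast_ray(edge_map, (1, 2), (0, -1))
```

Here `hit` holds the opposite boundary pixel and a width of 4 pixels, while `miss` is `None` because the ray leaves the image.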

Stroke Width Transform result example

An example output image of the stroke width transform for
the character w.


SWT Verification Constraints:
A text candidate component is discarded if:
•  Its stroke width variance lies outside the threshold range (MinVar, MaxVar)
•  Its mean stroke width lies outside the threshold range (MinStroke, MaxStroke)

•  Character components are generated by merging candidate
components with similar stroke width values.

•  Character chains are then created by merging character components with
a similar color and a small distance.

•  The final verified text line must have more than 2 character chains.
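
The variance and mean constraints can be sketched as a simple filter. This sketch assumes the thresholds are read as acceptance ranges, so a component survives only when both statistics fall inside their range; the bounds are treated as inclusive and the numeric values below are made-up placeholders, since the slides give no concrete thresholds.

```python
from statistics import mean, pvariance


def passes_swt_constraints(stroke_widths, min_var, max_var, min_stroke, max_stroke):
    """Return True if a candidate component should be kept.

    stroke_widths holds the stroke width values of one candidate component.
    All four thresholds are hypothetical parameters named after the slide.
    """
    if not stroke_widths:
        return False
    variance = pvariance(stroke_widths)   # population variance of the widths
    average = mean(stroke_widths)
    # Assumption: bounds are inclusive; the slide does not say either way.
    return min_var <= variance <= max_var and min_stroke <= average <= max_stroke


# Placeholder thresholds for illustration only.
LIMITS = dict(min_var=0.0, max_var=2.0, min_stroke=2.0, max_stroke=20.0)

uniform_component = [4, 4, 5, 4, 5]      # consistent strokes, text-like
noisy_component = [1, 12, 3, 20, 7]      # erratic widths, likely clutter

keep_uniform = passes_swt_constraints(uniform_component, **LIMITS)
keep_noisy = passes_swt_constraints(noisy_component, **LIMITS)
```

With these placeholder limits the uniform component is kept and the noisy one is discarded.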


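[Figure: edge detection and projection profiling, shown as a sequence of intermediate images]

[Figure: SWT text verification on the profiling candidates, shown as a sequence of intermediate images]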

Evaluation and Experimental Results
Experiment setup:
Test sets:

•  Mediaglobe test set (31 images)

•  German TV news test set (72 images)

•  Microsoft common test set (45 images)


Evaluation Results
•  Evaluation on the Microsoft common test set

Method                    Recall  Precision  F1 measure
Zhao et al. [10]          0.94    0.98       0.96
Thillou et al. [11]       0.91    0.94       0.92
Lienhart et al. [12]      0.91    0.94       0.92
Shivakumara et al. [4]    0.92    0.90       0.91
Gllavata et al. [13]      0.90    0.87       0.88
Our method                0.93    0.94       0.93

•  Evaluation on the other test sets

Test set      Recall  Precision  F1 measure
TV News       0.86    0.81       0.83
Mediaglobe    0.75    0.81       0.77
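
The slides do not state the formula, but the F1 measure is conventionally the harmonic mean of recall and precision. A minimal sketch for recomputing it from the tabulated values follows; because the table entries are already rounded, a recomputed value can differ from the reported one in the last digit.

```python
def f1_measure(recall, precision):
    """Harmonic mean of recall and precision; 0.0 when both are zero."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# Recompute one row of the table from its rounded recall and precision.
zhao_f1 = round(f1_measure(0.94, 0.98), 2)
```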

•  Example images: http://yovisto.com/labs/VideoOCR/visualResult/

Conclusion
We have presented a localization-verification
scheme for text detection in video images.

•  It uses a fast edge-based text detector and an adaptive refinement
to reduce false alarms

•  The proposed method is competitive with
other existing methods

•  It detects different writing systems (English, Japanese, Arabic)


References
[1] B. Epshtein, E. Ofek, and Y. Wexler, “Detecting Text in Natural Scenes with Stroke Width Transform,” in
Proc. of Computer Vision and Pattern Recognition, 2010, pp. 2963–2970.

[2] Y. Zhong, H-J. Zhang, and A. Jain, “Automatic caption localization in compressed video,” IEEE
Transactions on Pattern Analysis and Machine Intelligence, pp. 385–392, 2000.

[3] X. Qian, G. Liu, H. Wang, and R. Su, “Text detection, localization and tracking in compressed
video,” in Proc. of Signal Processing: Image Communication, 2007, pp. 752–768.

[4] J. Canny, “A computational approach to edge detection,” IEEE Trans. Pattern Anal. Mach. Intell.,
vol. 8, no. 6, pp. 679–698, 1986. DOI 10.1109/TPAMI.1986.4767851.
URL http://dx.doi.org/10.1109/TPAMI.1986.4767851

[5] http://yovisto.com/labs/VideoOCR/
[6] http://www.cs.cityu.edu.hk/~liuwy/PE_VTDetect/


Text detection in video images using
adaptive edge detection and
stroke width verification

Thank you for your attention!

Bernhard Quehl
Hasso-Plattner-Institut Potsdam
Prof.-Dr.-Helmert Str. 2-4
14482 Potsdam
phone: +49 (0)331-5509-548

email: bernhard.quehl@hpi.uni-potsdam.de
web: http://www.hpi.uni-potsdam.de/

