An Efficient Method for Text Detection in Video Based on Stroke Width Similarity

  • Conference paper
Computer Vision – ACCV 2007 (ACCV 2007)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4843)

Abstract

Text appearing in video provides semantic knowledge and significant information for video indexing and retrieval systems. This paper proposes an effective method for text detection in video based on the similarity in stroke width of text, where stroke width is defined as the distance between the two edges of a stroke. Based on the observation that text regions can be characterized by a dominant, fixed stroke width, edge detection with local adaptive thresholds is first applied to retain text regions while suppressing background regions. Second, a morphological dilation operator whose structuring element size is determined by the stroke width is used to roughly localize text regions. Finally, to reduce false alarms and refine text locations, a new multi-frame refinement method is applied. Experimental results show that the proposed method is robust to different levels of background complexity and effective for text in different fonts (sizes, colors) and languages.
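
The abstract names the first two stages but gives no formulas, so the following is only a minimal sketch of how they might fit together, not the authors' implementation. Everything specific in it is an assumption made for illustration: the helper names (`edge_map`, `dominant_stroke_width`, `dilate`), the per-block threshold of mean plus `k` standard deviations, the horizontal-only gradient, and the rule that the structuring element side is `2 * width + 1`. It uses plain Python lists so that it runs without dependencies; the multi-frame refinement stage is not covered.

```python
from collections import Counter

def edge_map(gray, block=8, k=1.0):
    """Horizontal-gradient edges, thresholded per block (assumed: mean + k * std)."""
    h, w = len(gray), len(gray[0])
    grad = [[0] + [abs(row[x] - row[x - 1]) for x in range(1, w)] for row in gray]
    edges = [[False] * w for _ in range(h)]
    for y0 in range(0, h, block):
        for x0 in range(0, w, block):
            cells = [(y, x) for y in range(y0, min(y0 + block, h))
                     for x in range(x0, min(x0 + block, w))]
            vals = [grad[y][x] for y, x in cells]
            mean = sum(vals) / len(vals)
            std = (sum((v - mean) ** 2 for v in vals) / len(vals)) ** 0.5
            thr = mean + k * std  # local adaptive threshold for this block
            for y, x in cells:
                edges[y][x] = grad[y][x] > thr
    return edges

def dominant_stroke_width(edges, max_width=32):
    """Most frequent horizontal distance between consecutive edge pixels."""
    counts = Counter()
    for row in edges:
        cols = [x for x, is_edge in enumerate(row) if is_edge]
        for a, b in zip(cols, cols[1:]):
            if 1 < b - a <= max_width:  # skip adjacent pixels and long gaps
                counts[b - a] += 1
    return counts.most_common(1)[0][0] if counts else 0

def dilate(mask, size):
    """Binary dilation with a square structuring element of odd side `size`."""
    h, w, r = len(mask), len(mask[0]), size // 2
    out = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                for yy in range(max(0, y - r), min(h, y + r + 1)):
                    for xx in range(max(0, x - r), min(w, x + r + 1)):
                        out[yy][xx] = True
    return out

# Synthetic frame: five vertical strokes, each 3 px wide, spaced 10 px apart.
frame = [[0] * 120 for _ in range(40)]
for x0 in range(20, 61, 10):
    for y in range(10, 30):
        for x in range(x0, x0 + 3):
            frame[y][x] = 255

edges = edge_map(frame)
width = dominant_stroke_width(edges)
text_mask = dilate(edges, 2 * width + 1)  # assumed size rule, not from the paper
```

On this synthetic frame the dominant distance between edge pixels is the 3-pixel stroke width, and dilating with a 7-pixel element merges the individual strokes into one candidate text region. Real video frames would need a two-dimensional gradient and a more careful width estimate.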

Editor information

Yasushi Yagi, Sing Bing Kang, In So Kweon, Hongbin Zha

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Dinh, V.C., Chun, S.S., Cha, S., Ryu, H., Sull, S. (2007). An Efficient Method for Text Detection in Video Based on Stroke Width Similarity. In: Yagi, Y., Kang, S.B., Kweon, I.S., Zha, H. (eds) Computer Vision – ACCV 2007. ACCV 2007. Lecture Notes in Computer Science, vol 4843. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76386-4_18

  • DOI: https://doi.org/10.1007/978-3-540-76386-4_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-76385-7

  • Online ISBN: 978-3-540-76386-4

  • eBook Packages: Computer Science, Computer Science (R0)
