A Robust Video Text Extraction and Recognition Approach Using OCR Feedback Information

  • Conference paper
Advances in Multimedia Information Processing -- PCM 2015 (PCM 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9314)

Abstract

Video text carries important semantic information and provides precise, meaningful clues for video indexing and retrieval. However, most previous approaches perform video text extraction and recognition separately, and the main difficulty, extraction and recognition against complex backgrounds, has not been handled well. In this paper, this difficulty is addressed by combining text extraction and recognition and by using OCR feedback information. The approach has the following features: (i) An efficient character image segmentation method is proposed that takes most prior knowledge into account. (ii) Text extraction is performed on both text-row images and segmented single-character images, since text-row based extraction maintains the color consistency of characters and backgrounds, while a single character has a simpler background. The best binary image is then chosen for recognition using OCR feedback. (iii) The K-means algorithm is used for extraction, which ensures that the best extraction result, namely the binary image that clearly separates text strokes from background, is among the candidates. Finally, extensive experiments and empirical evaluations on several video text images demonstrate the satisfactory performance of the proposed approach.
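
The idea in features (ii) and (iii) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the feature space, the number of clusters, or how OCR feedback is scored. The sketch assumes K-means on grayscale intensity, treats each cluster in turn as the candidate text layer, and selects among the candidates with a caller-supplied scoring function. The names `kmeans_1d`, `candidate_binaries`, `pick_best`, and `toy_score` are hypothetical, and `toy_score` is only a stand-in for a real OCR confidence value.

```python
import numpy as np

def kmeans_1d(values, k, iters=20):
    """Plain Lloyd's k-means on scalar values; returns (labels, centers)."""
    # Deterministic initialisation: spread centers evenly over the value range.
    centers = np.linspace(values.min(), values.max(), k)
    for _ in range(iters):
        # Assign every value to its nearest center.
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        # Move each non-empty cluster's center to the mean of its members.
        for j in range(k):
            members = values[labels == j]
            if members.size > 0:
                centers[j] = members.mean()
    labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
    return labels, centers

def candidate_binaries(gray, k=2):
    """Cluster pixel intensities and return one boolean mask per cluster.

    Each mask is a candidate binary image in which that cluster is
    assumed to be the text strokes (an assumption of this sketch).
    """
    pixels = gray.astype(float).ravel()
    labels, _ = kmeans_1d(pixels, k)
    label_img = labels.reshape(gray.shape)
    return [label_img == j for j in range(k)]

def pick_best(candidates, score):
    """Return the candidate with the highest score.

    `score` stands in for OCR feedback, e.g. a recognition confidence
    returned by an OCR engine for the candidate binary image.
    """
    return max(candidates, key=score)

def toy_score(binary):
    # Hypothetical placeholder for OCR confidence: prefer the candidate
    # with less foreground, since strokes usually cover less area.
    return -binary.mean()

# Toy 5x7 "character image": one bright vertical stroke on a dark background.
gray = np.full((5, 7), 30, dtype=np.uint8)
gray[1:4, 3] = 220
best = pick_best(candidate_binaries(gray, k=2), toy_score)
```

In a real pipeline the same selection step would be applied to candidates produced from both the text-row image and the segmented single-character images, with `score` replaced by the confidence reported by the OCR engine in use.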

This work was supported by the National Natural Science Foundation of China (Grant No. 61401023) and the Fundamental University Research Fund of BIT (Grant No. 20140842001).



Corresponding author

Correspondence to Guangyu Gao.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Gao, G., Zhang, H., Chen, H. (2015). A Robust Video Text Extraction and Recognition Approach Using OCR Feedback Information. In: Ho, YS., Sang, J., Ro, Y., Kim, J., Wu, F. (eds) Advances in Multimedia Information Processing -- PCM 2015. PCM 2015. Lecture Notes in Computer Science, vol 9314. Springer, Cham. https://doi.org/10.1007/978-3-319-24075-6_49

  • DOI: https://doi.org/10.1007/978-3-319-24075-6_49

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-24074-9

  • Online ISBN: 978-3-319-24075-6

  • eBook Packages: Computer Science (R0)
