
Korean-English bilingual videotext recognition for news headline generation based on a split-merge strategy

  • Original Research Paper
  • Journal of Real-Time Image Processing

Abstract

This paper addresses Korean-English bilingual videotext recognition for news headline generation. Because videotext carries semantic information about the content, it can be used effectively for understanding videos. Despite this usefulness, applying text recognition technologies to practical video applications is a challenging task because of the computational complexity and the recognition accuracy such applications demand. We propose a novel Korean-English bilingual videotext recognition method that reduces the computational complexity while achieving comparable recognition accuracy. To recognize both Korean and English characters effectively, the proposed method employs an elaborate split-merge strategy in which the split segments are merged into characters using the recognition scores. Moreover, it avoids unnecessary computation by using geometric features such as squareness and internal gap, which remarkably reduces its computational overhead. As a result, the proposed method is successfully employed in generating news headlines. Its effectiveness and efficiency are verified by extensive experiments on a challenging database containing 51,290 text images (176,884 characters).
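
The split-merge idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the dynamic-programming formulation, the aspect-ratio test standing in for squareness, the gap threshold, the segment-count weighting of scores, and every name and parameter (`merge_segments`, `score`, `max_merge`, `max_aspect`, `max_gap`) are assumptions made for illustration. The sketch takes over-split segments of a text line, uses cheap geometric checks to discard implausible merges before the (expensive) recognizer is called, and keeps the grouping with the highest total recognition score.

```python
from typing import Callable, List, Tuple

Segment = Tuple[int, int]  # (left, right) pixel columns of one split segment


def merge_segments(
    segments: List[Segment],
    line_height: int,
    score: Callable[[int, int], float],  # hypothetical recognizer: score of segments[start:end]
    max_merge: int = 4,        # assumed cap on segments per character
    max_aspect: float = 1.5,   # assumed squareness-style limit on width / height
    max_gap: int = 6,          # assumed limit on the internal gap, in pixels
) -> List[Tuple[int, int]]:
    """Group consecutive segments into characters, maximizing total score.

    Returns a list of (start, end) index pairs; each pair covers
    segments[start:end] and is treated as one character.
    """
    n = len(segments)
    best = [float("-inf")] * (n + 1)  # best[i]: best total score for segments[:i]
    back = [-1] * (n + 1)             # back[i]: start index of the last group
    best[0] = 0.0

    for end in range(1, n + 1):
        for start in range(max(0, end - max_merge), end):
            if best[start] == float("-inf"):
                continue
            group = segments[start:end]
            if len(group) > 1:
                # Geometric pruning: skip the recognizer for implausible merges.
                width = group[-1][1] - group[0][0]
                gaps = [b[0] - a[1] for a, b in zip(group, group[1:])]
                if width / line_height > max_aspect or max(gaps) > max_gap:
                    continue
            # Weight by segment count so merged and unmerged paths are comparable.
            candidate = best[start] + score(start, end) * (end - start)
            if candidate > best[end]:
                best[end] = candidate
                back[end] = start

    # Trace back the chosen groups from the end of the line.
    groups: List[Tuple[int, int]] = []
    end = n
    while end > 0:
        groups.append((back[end], end))
        end = back[end]
    return groups[::-1]


if __name__ == "__main__":
    # Toy line: the first two segments belong to one character.
    toy_segments = [(0, 4), (5, 10), (14, 24), (26, 30)]
    toy_scores = {(0, 2): 0.9, (2, 3): 0.8, (3, 4): 0.7}
    result = merge_segments(
        toy_segments, 10, lambda s, e: toy_scores.get((s, e), 0.1)
    )
    print(result)  # -> [(0, 2), (2, 3), (3, 4)]
```

In this toy run the geometric checks reject five of the ten possible groupings before any score is computed, which is the kind of saving the abstract attributes to squareness and internal gap. Single segments are always scored, so a valid grouping exists even when every merge is pruned.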





Acknowledgments

Part of the work reported in this paper was conducted while the first author was with Samsung Electronics. The authors are grateful to Prof. Jinhyung Kim and Mr. Kyutae Cho at KAIST for their helpful discussions and to the anonymous reviewers for their useful comments. This work was supported by the National Natural Science Foundation of China (Nos. 61050110144, 60803097, 60972148, 60971128, 60970066, 61072106, 61075041, 61003198, 61001206, and 61077009), the National Research Foundation for the Doctoral Program of Higher Education of China (Nos. 200807010003 and 20100203120005), the National Science and Technology Ministry of China (Nos. 9140A07011810DZ0107 and 9140A07021010DZ0131), the Key Project of Ministry of Education of China (No. 108115), and the Fundamental Research Funds for the Central Universities (Nos. JY10000902001, K50510020001, and JY10000902045).

Author information

Corresponding author

Correspondence to Cheolkon Jung.


About this article

Cite this article

Jung, C., Jiao, L. Korean-English bilingual videotext recognition for news headline generation based on a split-merge strategy. J Real-Time Image Proc 11, 167–177 (2016). https://doi.org/10.1007/s11554-012-0298-x


  • DOI: https://doi.org/10.1007/s11554-012-0298-x
