Korean-English bilingual videotext recognition for news headline generation based on a split-merge strategy

Jung, Cheolkon; Jiao, Licheng

doi:10.1007/s11554-012-0298-x

Original Research Paper
Published: 17 November 2012

Volume 11, pages 167–177, (2016)
Journal of Real-Time Image Processing

Abstract

This paper deals with Korean-English bilingual videotext recognition for news headline generation. Because videotext carries semantic information about the content, it can be used effectively for understanding videos. Despite this usefulness, applying text recognition technologies to practical video applications is a challenging task because of the computational complexity involved and the recognition accuracy required. In this paper, we propose a novel Korean-English bilingual videotext recognition method that reduces the computational complexity while achieving comparable recognition accuracy. To recognize both Korean and English characters effectively, the proposed method employs an elaborate split-merge strategy in which the split segments are merged into characters using the recognition scores. Moreover, it avoids unnecessary computation by using geometric features such as squareness and internal gap, and thus its computational overhead is remarkably reduced. As a result, the proposed method is successfully employed in generating news headlines. The effectiveness and efficiency of the proposed method are verified by extensive experiments on a challenging database containing 51,290 text images (176,884 characters).
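
The abstract names the ingredients of the split-merge strategy (split segments, recognition scores, and the geometric features squareness and internal gap) but not the exact algorithm. The following is a minimal sketch of how such a scheme could work, not the authors' implementation. Everything in it is an assumption made for illustration: the segment representation, the definitions of `squareness` and `internal_gap`, the thresholds, the length-weighted dynamic-programming objective, and the `toy_score` function standing in for a real character recognizer.

```python
from typing import Callable, List, Tuple

Segment = Tuple[int, int]  # (left, right) pixel columns of one split segment


def squareness(width: int, height: int) -> float:
    """Assumed definition: shorter side over longer side (1.0 = square)."""
    return min(width, height) / max(width, height)


def internal_gap(group: List[Segment]) -> int:
    """Assumed definition: largest horizontal gap between adjacent segments."""
    return max((b[0] - a[1] for a, b in zip(group, group[1:])), default=0)


def toy_score(box: Segment, ideal_width: int = 10) -> float:
    """Hypothetical recognizer: boxes close to ideal_width score near 1.0."""
    width = box[1] - box[0]
    return max(0.0, 1.0 - abs(width - ideal_width) / ideal_width)


def merge_segments(
    segments: List[Segment],
    line_height: int,
    score: Callable[[Segment], float],
    max_group: int = 4,           # assumed cap on segments per character
    min_squareness: float = 0.3,  # assumed threshold
    max_gap: int = 3,             # assumed threshold, in pixels
) -> List[Segment]:
    """Merge consecutive split segments into character boxes.

    Dynamic programming over segment boundaries: best[i] is the highest
    total score for grouping the first i segments. Each candidate group's
    score is weighted by the number of segments it covers, so groupings
    with different numbers of characters remain comparable.
    """
    n = len(segments)
    best = [float("-inf")] * (n + 1)
    back = [0] * (n + 1)
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_group), end):
            group = segments[start:end]
            box = (group[0][0], group[-1][1])
            width = box[1] - box[0]
            # Geometric pre-filter: reject implausible merges cheaply,
            # before the (expensive) recognizer is ever called.
            if len(group) > 1 and (
                internal_gap(group) > max_gap
                or squareness(width, line_height) < min_squareness
            ):
                continue
            total = best[start] + score(box) * len(group)
            if total > best[end]:
                best[end] = total
                back[end] = start
    # Walk the back-pointers to recover the chosen character boxes.
    chars: List[Segment] = []
    end = n
    while end > 0:
        start = back[end]
        chars.append((segments[start][0], segments[end - 1][1]))
        end = start
    return chars[::-1]
```

Calling `merge_segments(segments, line_height, toy_score)` returns one `(left, right)` box per merged character. Single segments are never filtered out, so a valid grouping always exists; the geometric tests only prune multi-segment candidates, which is where the savings in recognizer calls would come from in this sketch.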

Acknowledgments

Part of the work reported in this paper was conducted while the first author was with Samsung Electronics. The authors are grateful to Prof. Jinhyung Kim and Mr. Kyutae Cho at KAIST for their helpful discussions and to the anonymous reviewers for their useful comments. This work was supported by the National Natural Science Foundation of China (Nos. 61050110144, 60803097, 60972148, 60971128, 60970066, 61072106, 61075041, 61003198, 61001206, and 61077009), the National Research Foundation for the Doctoral Program of Higher Education of China (Nos. 200807010003 and 20100203120005), the National Science and Technology Ministry of China (Nos. 9140A07011810DZ0107 and 9140A07021010DZ0131), the Key Project of Ministry of Education of China (No. 108115), and the Fundamental Research Funds for the Central Universities (Nos. JY10000902001, K50510020001, and JY10000902045).

Author information

Authors and Affiliations

Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, Xidian University, Xi’an, 710071, China
Cheolkon Jung & Licheng Jiao

Authors

Cheolkon Jung
Licheng Jiao

Corresponding author

Correspondence to Cheolkon Jung.

About this article

Cite this article

Jung, C., Jiao, L. Korean-English bilingual videotext recognition for news headline generation based on a split-merge strategy. J Real-Time Image Proc 11, 167–177 (2016). https://doi.org/10.1007/s11554-012-0298-x

Received: 25 October 2011
Accepted: 25 October 2012
Published: 17 November 2012
Issue Date: January 2016
DOI: https://doi.org/10.1007/s11554-012-0298-x
