Abstract
Video text often carries information that is valuable for video analysis, indexing, and retrieval. However, detecting text in video images remains challenging because text patterns vary widely and backgrounds are complex. This paper proposes an automatic video text detection method. First, K-means is used to classify the pixels of gradient images into text and non-text regions. Next, morphological operations are applied to the text regions to form connected candidate text components, whose boundaries are then refined using projection profiles. Finally, the detection results are verified by geometric and BP-Adaboost identification. Experimental results on our manually selected dataset and on the publicly available Microsoft Asia dataset demonstrate the effectiveness and feasibility of the proposed method.
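The first and third stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the gradient operator, the features passed to K-means, or how the projection profile is thresholded, so the central-difference gradient, the one-dimensional two-cluster K-means, and the `min_count` parameter below are all assumptions, and every function name is hypothetical. The morphological step and the BP-Adaboost verification are omitted.

```python
def gradient_magnitude(img):
    """Approximate the gradient magnitude with central differences.

    Assumption: the paper's actual gradient operator is not given in the abstract.
    Border pixels reuse the nearest valid neighbour.
    """
    h, w = len(img), len(img[0])
    grad = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            grad[y][x] = (gx * gx + gy * gy) ** 0.5
    return grad


def two_means_threshold(values, iterations=20):
    """Run 1-D K-means with k=2 and return the midpoint of the two centres.

    In one dimension, assigning each value to its nearest centre is the same
    as thresholding at the midpoint between the centres.
    """
    lo, hi = min(values), max(values)
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        low = [v for v in values if v <= mid]
        high = [v for v in values if v > mid]
        if not low or not high:
            break  # degenerate case: all values fall in one cluster
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if new_lo == lo and new_hi == hi:
            break  # centres stopped moving
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2.0


def text_mask(img):
    """Label high-gradient pixels as candidate text (1) and the rest as 0."""
    grad = gradient_magnitude(img)
    threshold = two_means_threshold([g for row in grad for g in row])
    return [[1 if g > threshold else 0 for g in row] for row in grad]


def row_bounds(mask, min_count=1):
    """Use a horizontal projection profile to find the top and bottom rows.

    min_count is a hypothetical parameter: rows with fewer candidate pixels
    are treated as background.
    """
    profile = [sum(row) for row in mask]
    rows = [y for y, count in enumerate(profile) if count >= min_count]
    return (rows[0], rows[-1]) if rows else None
```

Given a grayscale frame stored as a list of rows, `row_bounds(text_mask(frame))` returns the first and last rows that contain candidate text pixels, or `None` if no row qualifies. A column-wise profile over the same mask would refine the left and right boundaries in the same way.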
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-015-2690-6/MediaObjects/11042_2015_2690_Fig1_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-015-2690-6/MediaObjects/11042_2015_2690_Fig2_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-015-2690-6/MediaObjects/11042_2015_2690_Fig3_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-015-2690-6/MediaObjects/11042_2015_2690_Fig4_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-015-2690-6/MediaObjects/11042_2015_2690_Fig5_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-015-2690-6/MediaObjects/11042_2015_2690_Fig6_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-015-2690-6/MediaObjects/11042_2015_2690_Fig7_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-015-2690-6/MediaObjects/11042_2015_2690_Fig8_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-015-2690-6/MediaObjects/11042_2015_2690_Fig9_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-015-2690-6/MediaObjects/11042_2015_2690_Fig10_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-015-2690-6/MediaObjects/11042_2015_2690_Fig11_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-015-2690-6/MediaObjects/11042_2015_2690_Fig12_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-015-2690-6/MediaObjects/11042_2015_2690_Fig13_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-015-2690-6/MediaObjects/11042_2015_2690_Fig14_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-015-2690-6/MediaObjects/11042_2015_2690_Fig15_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-015-2690-6/MediaObjects/11042_2015_2690_Fig16_HTML.gif)
Acknowledgments
This work was partly supported by the National Natural Science Foundation of China (Grant Nos. 61172184, 61173122, 61174210, 61379107, and 61402539), the Key Project of Hunan Provincial Natural Science Foundation of China (Grant No. 12JJ2038), the Program for New Century Excellent Talents in University of Education Ministry in China (Grant No. NCET-13-0603), the Specialized Research Fund for the Doctoral Program of Higher Education in China (Grant No. 20130162110016), the Program for Hunan Province Science and Technology Basic Construction (Grant No. 20131199), the China Postdoctoral Science Foundation (Grant No. 2012M521554), and the Fundamental Research Funds for the Central Universities of Central South University (Grant No. 2015zzts052).
Cite this article
Wu, H., Zou, Bj., Zhao, Yq. et al. An automatic video text detection method based on BP-adaboost. Multimed Tools Appl 75, 7715–7738 (2016). https://doi.org/10.1007/s11042-015-2690-6
DOI: https://doi.org/10.1007/s11042-015-2690-6