Comparative Analysis of Multi-scale Wavelet Decomposition and k-Means Clustering Based Text Extraction

Ghai, Deepika; Jain, Neelu

doi:10.1007/s11277-019-06574-w

Published: 18 May 2019

Volume 109, pages 455–490, (2019)

Wireless Personal Communications

Abstract

Text present in images provides important information for automatic annotation, indexing and retrieval, which makes its extraction a well-known research area in computer vision. However, variations in text orientation, alignment, font, size and contrast, together with complex backgrounds, make text extraction extremely challenging. In this paper, we propose an efficient method to extract text regions even under complex backgrounds using the discrete wavelet transform (DWT) and k-means clustering along with a voting decision process. Because text shows abrupt variations and an irregular texture in the wavelet transform domain, the wavelet transform appears to be the best choice for image segmentation. A small overlapping sliding window scans the high-frequency sub-bands, from which texture features are extracted. On the basis of these features, k-means clustering classifies the image into text and background clusters. Finally, a voting decision process and area-based filtering are used to locate text regions accurately. We evaluated the performance while varying the wavelet function and the decomposition level. The proposed method is evaluated on four standard datasets (ICDAR 2013, KAIST, MSRA-TD500, SVT) and on a dataset created by the authors. Performance analysis indicates that the method is robust and efficient for extracting text regions under various conditions.
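
The pipeline outlined in the abstract (wavelet decomposition, sliding-window texture features, two-cluster k-means, voting) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the authors' implementation: a single-level Haar wavelet, the mean absolute coefficient per detail sub-band as the texture feature, a 3x3 window with step 1, a deterministic k-means seeding, the rule that the higher-energy cluster is text, and a strict-majority vote. Area-based filtering is omitted. The code uses only the Python standard library so that it runs as is; a practical version would more likely build on libraries such as NumPy and PyWavelets.

```python
def haar_dwt2(img):
    """One-level 2-D Haar DWT of a grayscale image (list of rows, even size).
    Returns (approx, horiz, vert, diag), each half the input size."""
    h, w = len(img) // 2, len(img[0]) // 2
    approx, horiz, vert, diag = ([[0.0] * w for _ in range(h)] for _ in range(4))
    for i in range(h):
        for j in range(w):
            a, b = img[2 * i][2 * j], img[2 * i][2 * j + 1]
            c, d = img[2 * i + 1][2 * j], img[2 * i + 1][2 * j + 1]
            approx[i][j] = (a + b + c + d) / 2.0
            horiz[i][j] = (a + b - c - d) / 2.0  # top row minus bottom row
            vert[i][j] = (a - b + c - d) / 2.0   # left column minus right column
            diag[i][j] = (a - b - c + d) / 2.0   # diagonal difference
    return approx, horiz, vert, diag


def window_features(bands, size, step):
    """Slide an overlapping window over the detail sub-bands. Each feature is
    the mean absolute coefficient per sub-band (an assumed texture feature)."""
    h, w = len(bands[0]), len(bands[0][0])
    positions, features = [], []
    for r in range(0, h - size + 1, step):
        for c in range(0, w - size + 1, step):
            features.append([sum(abs(band[i][j])
                                 for i in range(r, r + size)
                                 for j in range(c, c + size)) / (size * size)
                             for band in bands])
            positions.append((r, c))
    return positions, features


def two_means(features, max_iter=50):
    """k-means with k=2, seeded with the lowest- and highest-energy windows.
    Returns one boolean per window (True = assigned to the text cluster)."""
    def sq_dist(p, q):
        return sum((x - y) ** 2 for x, y in zip(p, q))

    background = list(min(features, key=sum))
    text = list(max(features, key=sum))
    labels = None
    for _ in range(max_iter):
        new_labels = [sq_dist(f, text) < sq_dist(f, background) for f in features]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for is_text in (False, True):
            members = [f for f, lab in zip(features, labels) if lab == is_text]
            if members:
                centroid = [sum(col) / len(members) for col in zip(*members)]
                if is_text:
                    text = centroid
                else:
                    background = centroid
    return labels


def vote_mask(shape, positions, labels, size):
    """Every window votes for each coefficient it covers; a coefficient is
    text when a strict majority of its covering windows are text."""
    h, w = shape
    text_votes = [[0] * w for _ in range(h)]
    all_votes = [[0] * w for _ in range(h)]
    for (r, c), is_text in zip(positions, labels):
        for i in range(r, r + size):
            for j in range(c, c + size):
                all_votes[i][j] += 1
                text_votes[i][j] += int(is_text)
    return [[2 * text_votes[i][j] > all_votes[i][j] for j in range(w)]
            for i in range(h)]


def extract_text_mask(img, size=3, step=1):
    """Return a text/background mask at sub-band (half) resolution."""
    _, horiz, vert, diag = haar_dwt2(img)
    positions, features = window_features((horiz, vert, diag), size, step)
    labels = two_means(features)
    return vote_mask((len(diag), len(diag[0])), positions, labels, size)


# Synthetic 16x16 image: checkerboard texture on the left half, flat gray on
# the right. The 8x8 mask marks the four left columns of every row as text.
image = [[(255 if (r + c) % 2 == 0 else 0) if c < 8 else 100
          for c in range(16)] for r in range(16)]
mask = extract_text_mask(image)
```

The mask lives at sub-band resolution, so each entry corresponds to a 2x2 block of input pixels. Overlapping windows are what make the vote meaningful: with a step equal to the window size, every coefficient would receive exactly one vote and the voting stage would add nothing.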

Acknowledgements

The authors would like to thank the ECE Department, PEC University of Technology, Chandigarh, for providing the necessary facilities, and CSIR for providing the funds (grant file No. 08/423(0001)/2015-EMR-1) required to carry out this research work.

Author information

Authors and Affiliations

ECE Department, PEC University of Technology, Sector-12, Chandigarh, UT, 160 012, India
Deepika Ghai & Neelu Jain

Corresponding author

Correspondence to Deepika Ghai.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ghai, D., Jain, N. Comparative Analysis of Multi-scale Wavelet Decomposition and k-Means Clustering Based Text Extraction. Wireless Pers Commun 109, 455–490 (2019). https://doi.org/10.1007/s11277-019-06574-w

Published: 18 May 2019
Issue Date: November 2019
DOI: https://doi.org/10.1007/s11277-019-06574-w
