Cloud of Line Distribution and Random Forest Based Text Detection from Natural/Video Scene Images

Wang, Wenhai; Wu, Yirui; Shivakumara, Palaiahnakote; Lu, Tong

doi:10.1007/978-3-319-73600-6_5

Wenhai Wang²¹,
Yirui Wu²²,
Palaiahnakote Shivakumara²³ &
…
Tong Lu²¹

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 10705))

Included in the following conference series:

International Conference on Multimedia Modeling

2806 Accesses

Abstract

Text detection in natural and video scene images is still considered to be challenging due to unpredictable nature of scene texts. This paper presents a new method based on Cloud of Line Distribution (COLD) and Random Forest Classifier for text detection in both natural and video images. The proposed method extracts unique shapes of text components by studying the relationship between dominant points such as straight or cursive over contours of text components, which is called COLD in polar domain. We consider edge components as text candidates if the edge components in Canny and Sobel of an input image share the COLD property. For each text candidate, we further study its COLD distribution at component level to extract statistical features and angle oriented features. Next, these features are fed to a random forest classifier to eliminate false text candidates, which results representatives. We then perform grouping using representatives to form text lines based on the distances between edge components in the edge image. The statistical and angle orientated features are finally extracted at word level for eliminating false positives, which results in text detection. The proposed method is tested on standard database, namely, SVT, ICDAR 2015 scene, ICDAR2013 scene and video databases, to show its effectiveness and usefulness compared with the existing methods.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Ye, Q., Doermann, D.: Text detection and recognition in imagery: a survey. IEEE Trans. PAMI 1480–1500 (2015)
Google Scholar
Yin, X.C., Zuo, Z.Y., Tian, S., Liu, C.L.: Text detection, tracking and recognition in video: a comprehensive survey. IEEE Trans. Image Process. 2752–2773 (2016)
Google Scholar
Feng, Y., Song, Y., Zhang, Y.: Scene text detection based on multi-scale SWT and edge filtering. In: Proceedings of ICPR, pp. 634–639 (2016)
Google Scholar
Pei, W.Y., Yang, C., Kau, L.J., Yin, X.C.: Multi-orientation scene text detection with multi-information fusion. In: Proceedings of ICPR, pp. 646–651 (2016)
Google Scholar
Wu, H., Zou, B., Zhao, Y.Q., Chen, Z., Zhu, C., Guo, J.: Natural scene text detection by multi-scale adaptive color clustering and non-text filtering. Neurocomputing 1011–1025 (2016)
Google Scholar
Zheng, Y., Li, Q., Ju, J., Hu, H., Li, G., Zhang, S.: A cascaded method for text detection in natural scene images. Neurocomputing 1–9 (2017)
Google Scholar
Yin, X., Yin, X., Huang, K., Hao, H.: Robust text detection in natural scene images. IEEE Trans. PAMI 36(5), 970–983 (2014)
Article Google Scholar
Neumann, L., Matas, J.: Real-time scene text localization and recognition. In: Proceedings of CVPR, pp. 3538–3545 (2012)
Google Scholar
Li, Y., Jia, W., Shen, C., Hengel, A.V.D.: Characterness: an indicator of text in the wild. IEEE Trans. IP 23(4), 1666–1677 (2014)
MathSciNet MATH Google Scholar
Mosleh, A., Bouguila, N., Hamza, A.B.: Automatic inpainting scheme for video text detection and removal. IEEE Trans. IP 22(11), 4460–4472 (2013)
MathSciNet MATH Google Scholar
Mittal, A., Roy, P.P., Singh, P., Raman, B.: Rotation and script independent text detection from video using sub pixel mapping. Vis. Commun. Image Represent. (2017, to appear)
Google Scholar
Wu, Y., Shivakumara, P., Lu, T., Lim Tan, C., Blumenstein, M., Kumar, G.H.: Contour restoration of text components for recognition in video/scene images. IEEE Trans. IP 25(12), 5622–5634 (2016)
MathSciNet Google Scholar
Shivakumara, P., Wu, L., Lu, T., Tan, C.L.: Fractals based multi-oriented text detection system for recognition in mobile video images. Pattern Recogn. 158–174 (2017)
Google Scholar
Shivakumara, P., Raghavendra, R., Qin, L., Raja, K.B., Lu, T., Pal, U.: A new multi-modal approach to bib number/text detection and recognition in Marathon images. Pattern Recogn. 479–491 (2017)
Google Scholar
Huang, W., Qiao, Y., Tang, X.: Robust scene text detection with convolution neural network induced MSER trees. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8692, pp. 497–511. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10593-2_33
Google Scholar
Jaderberg, M., Simonyan, K., Vedaldi, A., Zisserman, A.: Reading text in the wild with convolutional neural networks. IJCV 116(1), 1–20 (2016)
Article MathSciNet Google Scholar
He, S., Schomaker, L.: Beyond OCR: multi-faceted understanding of handwritten document characteristics. Pattern Recogn. 321–333 (2017)
Google Scholar
Karatzas, D., Shafait, F., Uchida, S., Iwamura, M., Boorda, L.G.I., Mestre, S.R., Mas, J., Mota, D.F., Almazan, J.A., De las Heras, L.P.: ICDAR 2013 robust reading competition. In: Proceedings of ICDAR, pp. 1115–1124 (2013)
Google Scholar
Karatzas, D., Gomez-Bigorda, L., Nicolaou, A., Ghosh, S., Bagdanow, A., Iwamura, M., Matas, J., Neumann, L., Chandrsekhar, V.R.: ICDAR 2015 competition on robust reading. In: Proceedings of ICDAR, pp. 1156–1160 (2015)
Google Scholar

Download references

Acknowledgements

This work was supported by the Natural Science Foundation of China under Grant 61672273, Grant 61272218, and Grant 61321491, the Science Foundation for Distinguished Young Scholars of Jiangsu under Grant BK20160021, the Science Foundation of JiangSu under Grant BK20170892, the Fundamental Research Funds for the Central Universities under Grant 2013/B16020141, and the open Project of the National Key Lab for Novel Software Technology in NJU under Grant KFKT2017B05.

Author information

Authors and Affiliations

National Key Lab for Novel Software Technology, Nanjing University, Nanjing, China
Wenhai Wang & Tong Lu
College of Computer and Information, Hohai University, Nanjing, China
Yirui Wu
Department of Computer System and Information Technology, University of Malaya, Kuala Lumpur, Malaysia
Palaiahnakote Shivakumara

Authors

Wenhai Wang
View author publications
You can also search for this author in PubMed Google Scholar
Yirui Wu
View author publications
You can also search for this author in PubMed Google Scholar
Palaiahnakote Shivakumara
View author publications
You can also search for this author in PubMed Google Scholar
Tong Lu
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Tong Lu .

Editor information

Editors and Affiliations

Alpen-Adria-Universität Klagenfurt, Klagenfurt, Austria
Klaus Schoeffmann
Chulalongkorn University, Bangkok, Thailand
Thanarat H. Chalidabhongse
City University of Hong Kong, Hong Kong, China
Chong Wah Ngo
Chulalongkorn University, Bangkok, Thailand
Supavadee Aramvith
Dublin City University, Dublin, Ireland
Noel E. O’Connor
Gwangju Institute of Science and Technology, Gwangju, Korea (Republic of)
Yo-Sung Ho
Tampere University of Technology, Tampere, Finland
Moncef Gabbouj
Rutgers University, Piscataway, New Jersey, USA
Ahmed Elgammal

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Wang, W., Wu, Y., Shivakumara, P., Lu, T. (2018). Cloud of Line Distribution and Random Forest Based Text Detection from Natural/Video Scene Images. In: Schoeffmann, K., et al. MultiMedia Modeling. MMM 2018. Lecture Notes in Computer Science(), vol 10705. Springer, Cham. https://doi.org/10.1007/978-3-319-73600-6_5

Download citation

DOI: https://doi.org/10.1007/978-3-319-73600-6_5
Published: 13 January 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73599-3
Online ISBN: 978-3-319-73600-6
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics