Scene Text Detection with Cascaded Filtering and Grouping Modules

  • Conference paper
Internet Multimedia Computing and Service (ICIMCS 2017)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 819)

Abstract

In this paper, we present a new scene text detection approach built on cascaded filtering and grouping modules. First, after Extremal Regions are extracted and filtered, a coarse-to-fine, distance-based pair validation scheme determines the pairwise relations between character candidates. Second, an additional module placed after the text-line grouping module detects text lines that contain only one or two characters. Third, a text-line-level classifier based on the similarity of characters excludes non-text objects. Experimental results on the ICDAR 2011 and ICDAR 2013 robust reading competition datasets demonstrate that our method achieves state-of-the-art performance in both recall and precision.
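The abstract does not give the details of the pair validation scheme, so the following is only a minimal sketch of what a coarse-to-fine, distance-based check on character candidates could look like. The `Box` type, the functions `coarse_ok`, `fine_ok`, and `group_candidates`, and all threshold values are hypothetical choices for illustration and are not taken from the paper. A cheap centre-distance test rejects far-apart pairs first, a stricter gap and overlap test runs only on the survivors, and validated pairs are merged into groups with union-find.

```python
from __future__ import annotations

from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of one character candidate (hypothetical type)."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height

    @property
    def cx(self) -> float:
        return self.x + self.w / 2

    @property
    def cy(self) -> float:
        return self.y + self.h / 2


def coarse_ok(a: Box, b: Box, max_center_ratio: float = 3.0) -> bool:
    """Coarse test: centre distance bounded by a multiple of the larger height.

    The ratio 3.0 is an assumed value, not one reported in the paper.
    """
    dist = ((a.cx - b.cx) ** 2 + (a.cy - b.cy) ** 2) ** 0.5
    return dist <= max_center_ratio * max(a.h, b.h)


def fine_ok(a: Box, b: Box, max_gap_ratio: float = 1.0,
            min_overlap: float = 0.5) -> bool:
    """Fine test: small horizontal gap and sufficient vertical overlap.

    Both thresholds are assumed values chosen for illustration.
    """
    left, right = (a, b) if a.x <= b.x else (b, a)
    gap = right.x - (left.x + left.w)
    overlap = min(a.y + a.h, b.y + b.h) - max(a.y, b.y)
    return (gap <= max_gap_ratio * max(a.h, b.h)
            and overlap >= min_overlap * min(a.h, b.h))


def group_candidates(boxes: list[Box]) -> list[list[int]]:
    """Link validated pairs with union-find and return groups of box indices."""
    parent = list(range(len(boxes)))

    def find(i: int) -> int:
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(boxes)), 2):
        # The fine test is evaluated only when the coarse test passes.
        if coarse_ok(boxes[i], boxes[j]) and fine_ok(boxes[i], boxes[j]):
            parent[find(i)] = find(j)

    groups: dict[int, list[int]] = {}
    for i in range(len(boxes)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

For three adjacent boxes of similar height plus one distant box, `group_candidates` returns one group of three indices and one singleton. Singleton and two-element groups are the kind of case that the abstract's additional short-text-line module is meant to handle, although how the paper does so is not described here.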

Acknowledgments

This work was supported in part by the Natural Science Foundation of China under Grants 61301106, 61327013, and U1611461.

Author information

Corresponding author

Correspondence to Xinguang Xiang.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Zhang, L., Xiang, X. (2018). Scene Text Detection with Cascaded Filtering and Grouping Modules. In: Huet, B., Nie, L., Hong, R. (eds) Internet Multimedia Computing and Service. ICIMCS 2017. Communications in Computer and Information Science, vol 819. Springer, Singapore. https://doi.org/10.1007/978-981-10-8530-7_46

  • DOI: https://doi.org/10.1007/978-981-10-8530-7_46

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-8529-1

  • Online ISBN: 978-981-10-8530-7

  • eBook Packages: Computer Science, Computer Science (R0)
