A new ring radius transform-based thinning method for multi-oriented video characters

  • Special Issue Paper
International Journal on Document Analysis and Recognition (IJDAR)

Abstract

Thinning that preserves the visual topology of characters in video is a challenging problem in document analysis and video text analysis because of low resolution and complex backgrounds. This paper explores the ring radius transform (RRT) to generate a radius map from the Canny edges of each input image and thereby obtain its medial axis. Each radius value in the radius map is the distance from a pixel to the nearest edge pixel on the contours. From the radius map, the method introduces a novel way of identifying the medial axis (the middle pixels between two strokes) for characters of arbitrary orientation. Iterative-maximal-growing is then proposed to connect missing medial axis pixels at junctions and intersections. Next, histogram analysis with clustering is applied to the color information of the medial axes to eliminate false medial axis segments. Finally, the method restores the shape of the character from the radius values of the medial axis pixels so that it can be recognized with Google's open-source OCR engine (Tesseract). The method has been tested on video, natural scene and handwritten characters from ICDAR 2013 and SVT, arbitrarily oriented data from MSRA-TD500, multi-script character data and MPEG7 object data to evaluate its performance at both the thinning level and the recognition level. Experimental comparisons with state-of-the-art methods show that the proposed method is generic and outperforms existing methods in obtaining the skeleton, preserving visual topology and recognition rate. The method is also robust to characters of arbitrary orientation.
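The radius map and the final shape restoration described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the edge map has already been extracted (for example by a Canny detector), uses a brute-force nearest-edge search, and replaces the paper's medial axis identification with a simple per-column maximum, which is only adequate for the single horizontal stroke used here. The function names `radius_map`, `column_maxima` and `restore_shape` are hypothetical.

```python
import math

def radius_map(edges):
    """Distance from every pixel to its nearest edge pixel (brute force)."""
    h, w = len(edges), len(edges[0])
    pts = [(r, c) for r in range(h) for c in range(w) if edges[r][c]]
    return [[min(math.hypot(r - er, c - ec) for er, ec in pts)
             for c in range(w)] for r in range(h)]

def column_maxima(rmap):
    """Simplified stand-in for medial axis detection (assumption):
    keep the row with the largest radius in each column."""
    h, w = len(rmap), len(rmap[0])
    return [(max(range(h), key=lambda r: rmap[r][c]), c) for c in range(w)]

def restore_shape(shape, axis_pixels, rmap):
    """Paint a disc of the stored radius around each medial axis pixel."""
    h, w = shape
    out = [[0] * w for _ in range(h)]
    for r, c in axis_pixels:
        rad = rmap[r][c]
        k = int(math.floor(rad))
        for rr in range(max(0, r - k), min(h, r + k + 1)):
            for cc in range(max(0, c - k), min(w, c + k + 1)):
                if math.hypot(rr - r, cc - c) <= rad:
                    out[rr][cc] = 1
    return out

# Toy input: a horizontal stroke whose two edges lie on rows 1 and 5.
H, W = 7, 9
edges = [[1 if r in (1, 5) else 0 for _ in range(W)] for r in range(H)]

rmap = radius_map(edges)
axis = column_maxima(rmap)
restored = restore_shape((H, W), axis, rmap)

for row in restored:
    print("".join("#" if v else "." for v in row))
```

In this toy case the axis falls on row 3, midway between the two edges, and painting discs of the stored radius recovers the stroke between (and including) its edges. A real pipeline would need the orientation-independent axis detection, junction growing and false-segment removal that the abstract describes.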

Acknowledgments

The work described in this paper was supported by the Natural Science Foundation of China under Grant Nos. 61272218 and 61321491, and the Program for Chinese New Century Excellent Talents under NCET-11-0232. This research is also supported in part under Grant No. UM.TNC2/IPPP/UPGP/261/15 (BKP010-2013). We thank the anonymous reviewers for their constructive comments, which helped to improve the paper.

Author information

Corresponding author

Correspondence to Tong Lu.

About this article

Cite this article

Wu, Y., Shivakumara, P., Wei, W. et al. A new ring radius transform-based thinning method for multi-oriented video characters. IJDAR 18, 137–151 (2015). https://doi.org/10.1007/s10032-015-0238-y

  • DOI: https://doi.org/10.1007/s10032-015-0238-y