A New Multi-modal Technique for Bib Number/Text Detection in Natural Images

Roy, Sangheeta; Shivakumara, Palaiahnakote; Mondal, Prabir; Raghavendra, R.; Pal, Umapada; Lu, Tong

doi:10.1007/978-3-319-24075-6_47

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9314)

Included in the following conference series:

Pacific Rim Conference on Multimedia

Abstract

Detecting and recognizing racing bib numbers/text, printed on paper, cardboard tags, or t-shirts, in natural images of marathons, races and other sports events is challenging because of person movement, non-rigid surfaces, distortion caused by poor illumination, severe occlusions, orientation variations, etc. In this paper, we present a multi-modal technique that combines biometric and textual features for bib number/text detection. We explore face and skin features in a new way to identify text candidate regions in input natural images. For each text candidate region, we apply text detection and recognition methods to detect and recognize bib numbers/text, respectively. To validate the usefulness of the proposed multi-modal technique, we conduct text detection and recognition experiments both before and after text candidate region detection, and report recall, precision and f-measure. Experimental results show that the proposed multi-modal technique outperforms the existing bib number detection method.
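As a rough illustration of the two ideas in the abstract, the sketch below places a text candidate region under a detected face and computes recall, precision and f-measure from detection counts. It is a minimal sketch under stated assumptions, not the authors' method: the Box type, the torso_candidate helper and its scale factors are hypothetical, and the face box is assumed to come from some external face detector.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixels; (x, y) is the top-left corner."""
    x: int
    y: int
    w: int
    h: int


def torso_candidate(face: Box, img_w: int, img_h: int,
                    width_scale: float = 3.0,
                    height_scale: float = 4.0) -> Box:
    """Hypothetical heuristic: a candidate region directly below a face.

    The scale factors are illustrative assumptions, not values from the paper.
    The region is centred on the face horizontally and clipped to the image.
    """
    cand_w = int(face.w * width_scale)
    cand_h = int(face.h * height_scale)
    centre_x = face.x + face.w // 2
    left = centre_x - cand_w // 2
    x0 = max(0, left)
    x1 = min(img_w, left + cand_w)
    y0 = min(img_h, face.y + face.h)   # start just under the face
    y1 = min(img_h, y0 + cand_h)
    return Box(x0, y0, x1 - x0, y1 - y0)


def precision_recall_f(true_pos: int, false_pos: int, false_neg: int):
    """Standard precision, recall and f-measure from detection counts."""
    detected = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / actual if actual else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    face = Box(x=100, y=50, w=40, h=40)          # assumed detector output
    print(torso_candidate(face, img_w=640, img_h=480))
    print(precision_recall_f(true_pos=8, false_pos=2, false_neg=2))
```

Restricting text detection to such candidate regions is what lets the before/after comparison in the abstract be measured with the same three metrics.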

Acknowledgment

The work described in this paper was supported by the Natural Science Foundation of China under Grant No. 61272218 and No. 61321491, and the Program for New Century Excellent Talents under NCET-11-0232.

Author information

Authors and Affiliations

Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia
Sangheeta Roy & Palaiahnakote Shivakumara
Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, India
Prabir Mondal & Umapada Pal
Norwegian Biometric Laboratory, Gjovik University College, Gjovik, Norway
R. Raghavendra
National Key Lab for Novel Software Technology, Nanjing University, Nanjing, China
Tong Lu

Corresponding author

Correspondence to Palaiahnakote Shivakumara.

Editor information

Editors and Affiliations

Gwangju Institute of Science and Technology, Gwangju, Korea (Republic of)
Yo-Sung Ho
Chinese Academy of Sciences, Institute of Automation, Beijing, China
Jitao Sang
ICU, IVY Lab, KAIST, Daejeon, Korea (Republic of)
Yong Man Ro
KAIST, Daejeon, Korea (Republic of)
Junmo Kim
College of Computer Science, Zhejiang University, Hangzhou, China
Fei Wu

About this paper

Cite this paper

Roy, S., Shivakumara, P., Mondal, P., Raghavendra, R., Pal, U., Lu, T. (2015). A New Multi-modal Technique for Bib Number/Text Detection in Natural Images. In: Ho, YS., Sang, J., Ro, Y., Kim, J., Wu, F. (eds) Advances in Multimedia Information Processing -- PCM 2015. PCM 2015. Lecture Notes in Computer Science, vol 9314. Springer, Cham. https://doi.org/10.1007/978-3-319-24075-6_47

DOI: https://doi.org/10.1007/978-3-319-24075-6_47
Published: 22 November 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24074-9
Online ISBN: 978-3-319-24075-6
eBook Packages: Computer Science, Computer Science (R0)
