ABSTRACT
Scene text detection and recognition have a wide range of applications, driven by the increasing popularity of portable digital devices. However, large-scale evaluation benchmarks with multilingual and multi-oriented text, which are needed to facilitate research on scene text detection and recognition, have been slow to emerge. In this paper, a large-scale and well-annotated scene text dataset, namely STV2k, is presented, which can be used for scene text detection as well as scene text recognition. Since all the images were collected from streets by smartphone, the textual scenes are rich in variations of layout, color, font, and background. Two state-of-the-art algorithms for scene text recognition are tested on this newly built dataset. The preliminary experiments demonstrate how challenging scene text recognition is in real-world scenarios.
Index Terms
- STV2k: A New Benchmark for Scene Text Detection and Recognition