STV2k: A New Benchmark for Scene Text Detection and Recognition

Authors:
Pingping Xiao, School of Information Science and Technology, Xiamen University, Fujian, China
Wan-Lei Zhao, School of Information Science and Technology, Xiamen University, Fujian, China
Da-Han Wang, School of Computer and Information Engineering, Xiamen University of Technology, Fujian, China
Hanzi Wang, School of Information Science and Technology, Xiamen University, Fujian, China

ICIMCS'16: Proceedings of the International Conference on Internet Multimedia Computing and Service, August 2016, Pages 344–348
https://doi.org/10.1145/3007669.3008270

Published: 19 August 2016

ABSTRACT

Scene text detection and recognition have a wide range of applications, driven by the increasing popularity of portable digital devices. However, large-scale evaluation benchmarks with multilingual and multi-oriented text have been slow to appear, which hinders research in this area. This paper presents STV2k, a large-scale, well-annotated scene text dataset that can be used for both scene text detection and scene text recognition. Because all images were collected from streets with smartphones, the textual scenes are rich in variations of layout, color, font, and background. Two state-of-the-art scene text recognition algorithms are tested on the newly built dataset. The preliminary experiments demonstrate how challenging scene text recognition is in real-world scenarios.

Index Terms

1. Applied computing
  1. Document management and text processing
    1. Document capture
      1. Optical character recognition
2. Computing methodologies
  1. Artificial intelligence
    1. Computer vision

Published in

ICIMCS'16: Proceedings of the International Conference on Internet Multimedia Computing and Service
August 2016
360 pages
ISBN:9781450348508
DOI:10.1145/3007669

Copyright © 2016 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Annotation tool
Chinese dataset
Scene text
Street image
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
ICIMCS'16 Paper Acceptance Rate: 77 of 118 submissions, 65%
Overall Acceptance Rate: 163 of 456 submissions, 36%