research-article

A Robust Ensemble of ResNets for Character Level End-to-end Text Detection in Natural Scene Images

Authors:
Jinsu Kim
Korea Advanced Institute of Science and Technology, Daehak-ro, Yuseong-gu, Daejeon

Yoonhyung Kim
Korea Advanced Institute of Science and Technology, Daehak-ro, Yuseong-gu, Daejeon

Changick Kim
Korea Advanced Institute of Science and Technology, Daehak-ro, Yuseong-gu, Daejeon

CBMI '17: Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing
June 2017, Article No.: 10, Pages 1–6
https://doi.org/10.1145/3095713.3095724

Published: 19 June 2017

ABSTRACT

Detecting text in natural scene images is a challenging task. In this paper, we propose a character-level, end-to-end text detection algorithm for natural scene images. In general, text detection is divided into three subtasks: text localization, text segmentation, and text recognition. The proposed method aims not only to localize text but also to recognize it. To this end, it consists of four steps: character candidate patch extraction, patch classification using an ensemble of ResNets, non-character region elimination, and character region grouping via self-tuning spectral clustering. In the first step, character candidate patches are extracted from the image using both edge information from multi-scale images and Maximally Stable Extremal Regions (MSERs). Each patch is then classified as either a character patch or a non-character patch by a deep network composed of three ResNets with different hyper-parameters. Text regions are determined by filtering out the non-character patches. To further reduce classification errors, character characteristics are used to compensate for the classification results of the ensemble of ResNets. To evaluate text detection performance, character regions are grouped via self-tuning spectral clustering. The proposed method shows competitive performance on the ICDAR 2013 dataset.
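
As a rough illustration of two of the steps named in the abstract, the following Python sketch shows one way an ensemble decision and a self-tuning affinity could be computed. It is not the authors' implementation. The abstract does not say how the outputs of the three ResNets are combined, so averaging their per-patch character probabilities is an assumption, and the function names `ensemble_char_probability` and `local_scaling_affinity` are hypothetical. The affinity follows the local-scaling idea of self-tuning spectral clustering (Zelnik-Manor and Perona), where each point gets its own scale taken from the distance to its k-th nearest neighbour. Using patch centres as the clustering feature is likewise an assumption, and the automatic choice of the number of clusters in that method is not covered here.

```python
import numpy as np


def ensemble_char_probability(member_probs):
    """Average character probabilities over ensemble members.

    member_probs: shape (n_members, n_patches), each value in [0, 1].
    Averaging is an assumed combination rule, not taken from the paper.
    """
    member_probs = np.asarray(member_probs, dtype=float)
    return member_probs.mean(axis=0)


def local_scaling_affinity(points, k=2):
    """Locally scaled affinity A_ij = exp(-d_ij^2 / (sigma_i * sigma_j)).

    sigma_i is the distance from point i to its k-th nearest neighbour.
    points: shape (n_points, n_features), with n_points > k.
    """
    points = np.asarray(points, dtype=float)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Each sorted row starts with the zero self-distance, so index k
    # holds the distance to the k-th nearest neighbour.
    sigma = np.sort(dist, axis=1)[:, k]
    sigma = np.maximum(sigma, 1e-12)  # guard against duplicate points
    affinity = np.exp(-(dist ** 2) / (sigma[:, None] * sigma[None, :]))
    np.fill_diagonal(affinity, 0.0)
    return affinity


# Hypothetical scores from three classifiers for four candidate patches.
scores = [[0.9, 0.2, 0.8, 0.7],
          [0.8, 0.4, 0.9, 0.6],
          [0.7, 0.3, 0.7, 0.8]]
is_character = ensemble_char_probability(scores) > 0.5  # assumed threshold

# Hypothetical patch centres (x, y) for the four candidates.
centres = np.array([[0.0, 0.0], [50.0, 40.0], [1.0, 0.0], [2.0, 0.0]])
affinity = local_scaling_affinity(centres[is_character], k=1)
```

The resulting matrix can be passed to any spectral clustering routine that accepts a precomputed affinity, for example scikit-learn's `SpectralClustering(affinity="precomputed")`, which requires the number of clusters to be supplied.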

Index Terms

Computing methodologies → Artificial intelligence → Computer vision → Computer vision tasks → Visual content-based indexing and retrieval

Published in

CBMI '17: Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing
June 2017
237 pages
ISBN: 9781450353335
DOI: 10.1145/3095713

Copyright © 2017 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
ResNet
Text detection
deep network
ensemble
spectral clustering
text recognition
Qualifiers
- research-article
- Research
- Refereed limited