Text extraction from web images
7 February 2011
Changsong Liu, Cheng Yang, Xiaoqing Ding, Jian Fan
Proceedings Volume 7879, Imaging and Printing in a Web 2.0 World II; 78790P (2011) https://doi.org/10.1117/12.880027
Event: IS&T/SPIE Electronic Imaging, 2011, San Francisco Airport, California, United States
Abstract
Web images constitute an important part of web documents and have become a powerful medium of expression, especially images that contain text. The text embedded in web images often carries semantic information related to the layout and content of the pages. Statistics show that there is a significant need to detect and recognize text in web images. In this paper, we first give a short review of the methods proposed for text detection and recognition in web images; then a framework to extract text from web images is presented, comprising text localization and recognition stages. In the text localization stage, a localization method is applied to generate text candidates and a two-stage strategy is used to select among them; text regions are then localized using a coarse-to-fine text line extraction algorithm. For text recognition, two text region binarization methods are proposed to improve recognition performance in web images. Experimental results for text localization and recognition demonstrate the effectiveness of these methods. Additionally, a recognition evaluation for text regions in web images has been conducted as a benchmark.
© (2011) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Changsong Liu, Cheng Yang, Xiaoqing Ding, and Jian Fan "Text extraction from web images", Proc. SPIE 7879, Imaging and Printing in a Web 2.0 World II, 78790P (7 February 2011); https://doi.org/10.1117/12.880027
KEYWORDS
Image segmentation
Optical character recognition
Image resolution
Detection and tracking algorithms
Image processing algorithms and systems
Video
Binary data
