21 March 2013 Non-Manhattan layout extraction algorithm
Aziza Satkhozhina, Ildus Ahmadullin, Jan P. Allebach, Qian Lin, Jerry Liu, Daniel Tretter, Eamonn O'Brien-Strain, Andrew Hunter
Proceedings Volume 8664, Imaging and Printing in a Web 2.0 World IV; 86640A (2013) https://doi.org/10.1117/12.2009424
Event: IS&T/SPIE Electronic Imaging, 2013, Burlingame, California, United States
Abstract
Automated publishing requires large databases containing document page layout templates. The number of layout templates that need to be created and stored grows exponentially with the complexity of the document layouts. A better approach for automated publishing is to reuse layout templates of existing documents for the generation of new documents. In this paper, we present an algorithm for template extraction from a document page image. We use the cost-optimized segmentation algorithm (COS) to segment the image, and Voronoi decomposition to cluster the text regions. Then, we create a block image where each block represents a homogeneous region of the document page. We construct a geometrical tree that describes the hierarchical structure of the document page. We also implement a font recognition algorithm to analyze the font of each text region. We present a detailed description of the algorithm and our preliminary results.
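As an illustration of the geometrical tree mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. The abstract does not say how the tree is built, so the sketch assumes that each block is approximated by an axis-aligned bounding box and that a block's parent is the smallest strictly larger block containing it. A true non-Manhattan layout would need polygonal regions instead. The names `Block` and `build_tree` and all coordinates are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Block:
    """A homogeneous page region, approximated by a bounding box (assumption)."""
    name: str
    x0: float
    y0: float
    x1: float
    y1: float
    children: List["Block"] = field(default_factory=list)

    def area(self) -> float:
        return (self.x1 - self.x0) * (self.y1 - self.y0)

    def contains(self, other: "Block") -> bool:
        # Containment test on bounding boxes; a block never contains itself.
        return (self is not other
                and self.x0 <= other.x0 and self.y0 <= other.y0
                and self.x1 >= other.x1 and self.y1 >= other.y1)


def build_tree(page: Block, blocks: List[Block]) -> Block:
    """Attach each block to the smallest strictly larger block containing it.

    Blocks with no containing block become direct children of the page.
    """
    for block in blocks:
        candidates = [b for b in blocks
                      if b.contains(block) and b.area() > block.area()]
        parent = min(candidates, key=Block.area, default=page)
        parent.children.append(block)
    return page


# Hypothetical page with one text column, a paragraph inside it, and a figure.
page = Block("page", 0, 0, 100, 100)
column = Block("column", 10, 10, 50, 90)
paragraph = Block("paragraph", 12, 12, 48, 40)
figure = Block("figure", 60, 10, 95, 50)
tree = build_tree(page, [column, paragraph, figure])
```

In this example the column and the figure become children of the page, and the paragraph becomes a child of the column. Requiring a strictly larger area for a parent keeps two identical boxes from becoming each other's parent.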
© (2013) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Aziza Satkhozhina, Ildus Ahmadullin, Jan P. Allebach, Qian Lin, Jerry Liu, Daniel Tretter, Eamonn O'Brien-Strain, and Andrew Hunter "Non-Manhattan layout extraction algorithm", Proc. SPIE 8664, Imaging and Printing in a Web 2.0 World IV, 86640A (21 March 2013); https://doi.org/10.1117/12.2009424
CITATIONS
Cited by 2 scholarly publications.
KEYWORDS
Image segmentation

Image processing algorithms and systems

Detection and tracking algorithms

Visualization

Algorithm development

Analytical research

Image processing
