research-article

Context-aware and content-based dynamic Voronoi page segmentation

Authors:

David Doermann

DAS '10: Proceedings of the 9th IAPR International Workshop on Document Analysis Systems

Pages 73 - 80

https://doi.org/10.1145/1815330.1815340

Published: 09 June 2010

Abstract

This paper presents a dynamic approach to document page segmentation based on inter-component relationships, local patterns, and context features. State-of-the-art page segmentation algorithms segment zones based on local properties of neighboring connected components, such as distance and orientation, and typically do not consider properties other than size. Our proposed approach uses a contextually aware and dynamically adaptive page segmentation scheme. The page is first over-segmented using a dynamically adaptive scheme of separation features based on [2] and adapted from [13]. The decision to form zones is then based on the context built from these local separation features and high-level content features. Zone-based evaluation was performed on sets of printed and handwritten documents in English and Arabic scripts with multiple font types and sizes, and we achieved an increase of 15% over the accuracy reported in [2].
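
The over-segment-then-merge idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract gives no implementation details, so everything below is an assumption made for illustration. Connected components are represented by bounding boxes (x, y, w, h), every component starts as its own zone, candidate neighbors are found by brute force rather than from an area Voronoi diagram as in [13], and two components are merged when the gap between them is small relative to their height and their heights are similar. The names box_gap and merge_zones and the parameters gap_factor and max_height_ratio are hypothetical.

```python
from itertools import combinations


def box_gap(a, b):
    """Euclidean gap between two axis-aligned boxes (x, y, w, h); 0 if they touch or overlap."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    dx = max(bx - (ax + aw), ax - (bx + bw), 0)
    dy = max(by - (ay + ah), ay - (by + bh), 0)
    return (dx * dx + dy * dy) ** 0.5


def merge_zones(boxes, gap_factor=1.5, max_height_ratio=2.0):
    """Group component boxes into zones; returns sorted lists of box indices.

    Assumed merge rule (illustrative only): the distance threshold scales
    with the smaller component height, so it adapts to local text size,
    and components of very different height are kept apart.
    Box heights must be positive.
    """
    # Over-segmentation: every component is initially its own zone.
    parent = list(range(len(boxes)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Brute-force pairs stand in for Voronoi neighbors in this sketch.
    for i, j in combinations(range(len(boxes)), 2):
        hi, hj = boxes[i][3], boxes[j][3]
        close = box_gap(boxes[i], boxes[j]) <= gap_factor * min(hi, hj)
        similar = max(hi, hj) / min(hi, hj) <= max_height_ratio
        if close and similar:
            parent[find(i)] = find(j)

    zones = {}
    for i in range(len(boxes)):
        zones.setdefault(find(i), []).append(i)
    return sorted(zones.values())


if __name__ == "__main__":
    # Two text lines far apart vertically: expect two zones.
    page = [(0, 0, 10, 10), (12, 0, 10, 10), (24, 0, 10, 10),
            (0, 100, 10, 10), (12, 100, 10, 10)]
    print(merge_zones(page))  # [[0, 1, 2], [3, 4]]
```

In this sketch the height-ratio test plays the role of a simple content feature: a large heading placed close to body text stays in its own zone even though the distance test alone would merge it.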

References

[1]

W. Abd Almageed, M. Agrawal, W. Seo, and D. Doermann. Document-zone classification using partial least squares and hybrid classifiers. Int'l Conf. on Patt. Reco., pages 1--4, 2008.

[2]

M. Agrawal and D. Doermann. Voronoi++: A dynamic page segmentation approach based on Voronoi and Docstrum features. In Proc. 10th Int'l Conf. on Doc. Analysis and Reco., pages 1011--1015, 2009.

[3]

A. Antonacopoulos and R. Ritchings. Flexible page segmentation using the background. In Proc. 12th Int'l Conf. on Patt. Reco., volume 2, pages 339--344, Oct 1994.

[4]

H. S. Baird. Background structure in document images. In Advances in Structural and Syntactic Pattern Recognition, pages 17--34. World Scientific, 1994.

[5]

T. M. Breuel. Two geometric algorithms for layout analysis. In Workshop on Document Analysis Systems, pages 188--199. Springer-Verlag, 2002.

[6]

V. Ferrari, L. Fevrier, F. Jurie, and C. Schmid. Groups of adjacent contour segments for object detection. IEEE Trans. Patt. Anal. Mach. Intell., 30(1):36--51, 2008.

[7]

L. A. Fletcher and R. Kasturi. A robust algorithm for text string separation from mixed text/graphics images. IEEE Trans. Patt. Anal. Mach. Intell., 10(6):910--918, 1988.

[8]

I. Guyon, R. M. Haralick, J. J. Hull, and I. T. Phillips. Data sets for OCR and document image understanding research. In Proc. of SPIE - Document Recognition IV, pages 779--799. World Scientific, 1997.

[9]

F. Hones and J. Lichter. Layout extraction of mixed-mode documents. Mach. Vision Appl., 7(4):237--246, 1994.

[10]

A. Jain and Y. Zhong. Page segmentation using texture analysis. Patt. Reco., 29(5):743--770, May 1996.

[11]

A. K. Jain and S. Bhattacharjee. Text segmentation using Gabor filters for automatic document processing. Mach. Vision Appl., 5(3):169--184, 1992.

[12]

N. Kato, M. Suzuki, S. Omachi, H. Aso, and Y. Nemoto. A handwritten character recognition system using directional element feature and asymmetric Mahalanobis distance. IEEE Trans. Patt. Anal. Mach. Intell., 21(3):258--262, 1999.

[13]

K. Kise, A. Sato, and M. Iwata. Segmentation of page images using the area Voronoi diagram. Comput. Vis. Image Underst., 70(3):370--382, 1998.

[14]

D. G. Lowe. Distinctive image features from scale-invariant keypoints. Int'l J. Comput. Vision, 60(2):91--110, 2004.

[15]

S. J. M. Roth and D. Doermann. GEDI: Ground truth editor and document interface. In Summit on Arabic and Chinese Handwriting Recognition, 2006.

[16]

S. Mao and T. Kanungo. Automatic training of page segmentation algorithms: An optimization approach. In Proc. of Int'l Conf. on Patt. Reco., pages 531--534, 2000.

[17]

G. Nagy, S. Seth, and M. Viswanathan. A prototype document image analysis system for technical journals. Computer, 25(7):10--22, 1992.

[18]

N. Normand and C. Viard-Gaudin. A background based adaptive page segmentation algorithm. In Proc. 3rd Int'l Conf. on Doc. Analysis and Reco., page 138, Washington, DC, USA, 1995. IEEE Computer Society.

[19]

L. O'Gorman. The document spectrum for page layout analysis. IEEE Trans. Patt. Anal. Mach. Intell., 15(11):1162--1173, 1993.

[20]

T. Ojala, M. Pietikäinen, and T. Mäenpää. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Patt. Anal. Mach. Intell., 24(7):971--987, 2002.

[21]

T. Pavlidis and J. Zhou. Page segmentation and classification. CVGIP: Graph. Models Image Process., 54(6):484--496, 1992.

[22]

I. Sekita, R. Mori, K. Yamamoto, H. Yamada, and K. Toraichi. Feature extraction of handwritten Japanese characters by spline functions for relaxation matching. Patt. Reco., 21(1):9--17, 1988.

[23]

W. Seo, M. Agrawal, and D. Doermann. Performance evaluation tools for zone segmentation and classification (PETS). Int'l Conf. on Patt. Reco., 2010.

[24]

F. Shafait, D. Keysers, and T. M. Breuel. Performance comparison of six algorithms for page segmentation. In 7th IAPR Workshop on Document Analysis Systems, pages 368--379. Springer, 2006.

[25]

Y. Wang, I. T. Phillips, and R. M. Haralick. Document zone content classification and its performance evaluation. Patt. Reco., 39(1):57--73, 2006.

[26]

K. Y. Wong, R. G. Casey, and F. M. Wahl. Document analysis system. IBM J. Res. Dev., 26(6):647--656, Nov. 1982.

Published In

DAS '10: Proceedings of the 9th IAPR International Workshop on Document Analysis Systems

June 2010

490 pages

ISBN:9781605587738

DOI:10.1145/1815330

General Chairs: David Doermann (University of Maryland, College Park), Venu Govindaraju (University at Buffalo, SUNY), Daniel Lopresti (Lehigh University), Prem Natarajan (Raytheon BBN Technologies)

Copyright © 2010 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

US Government
Defense Advanced Research Projects Agency

Conference

DAS '10

DAS '10: The 9th IAPR International Workshop on Document Analysis Systems

June 9 - 11, 2010

Boston, Massachusetts, USA
