Abstract
There is a significant need for performance evaluation of Layout Analysis methods. The greatest stumbling block is the lack of sufficient ground truth. In particular, there is currently no ground truth for evaluating the performance of page segmentation methods that deal with complex-shaped regions and documents with non-uniformly oriented regions. This paper describes a new, flexible ground-truthing tool. It is fast and easy to use because it first performs page segmentation to obtain an initial description of the regions. The ground-truthing system then allows each of the region outlines obtained from page segmentation to be edited (merged, split, or altered in shape). The resulting ground-truth regions are described in terms of isothetic polygons to ensure flexibility and wide applicability. The system also provides for the labelling of each ground-truth region according to the type of its content and its logical function. The former can be used to evaluate page classification, while the latter can be used in assessing logical layout structure extraction.
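As an illustration of the region description mentioned above, the following minimal Python sketch shows one way a ground-truth region could be held as an isothetic polygon (every edge horizontal or vertical) together with a content-type label and a logical-function label. The names `GroundTruthRegion`, `is_isothetic`, `polygon_area`, and the label strings are hypothetical and chosen only for this example; they are not taken from the tool described in the paper, whose actual storage format is not given here.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[int, int]


def is_isothetic(vertices: List[Point]) -> bool:
    """Return True if every edge of the closed polygon is horizontal or vertical."""
    if len(vertices) < 4:
        return False  # the simplest isothetic polygon is a rectangle
    for i, (x1, y1) in enumerate(vertices):
        x2, y2 = vertices[(i + 1) % len(vertices)]
        # Exactly one coordinate must change along an edge: this rejects
        # diagonal edges (both change) and zero-length edges (neither changes).
        if (x1 == x2) == (y1 == y2):
            return False
    return True


def polygon_area(vertices: List[Point]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    total = 0
    for i, (x1, y1) in enumerate(vertices):
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


@dataclass
class GroundTruthRegion:
    outline: List[Point]   # vertices in order around the region
    content_type: str      # hypothetical label, e.g. "text" or "graphics"
    logical_label: str     # hypothetical label, e.g. "heading" or "caption"

    def __post_init__(self) -> None:
        if not is_isothetic(self.outline):
            raise ValueError("outline must be an isothetic polygon")


# An L-shaped text region that a single bounding box could not describe tightly.
l_shape = [(0, 0), (6, 0), (6, 2), (3, 2), (3, 5), (0, 5)]
region = GroundTruthRegion(l_shape, content_type="text", logical_label="body")
```

A representation of this kind can follow complex-shaped regions more closely than a bounding rectangle while keeping the geometry simple, since every edge is axis-aligned.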
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Antonacopoulos, A., Meng, H. (2002). A Ground-Truthing Tool for Layout Analysis Performance Evaluation. In: Lopresti, D., Hu, J., Kashi, R. (eds) Document Analysis Systems V. DAS 2002. Lecture Notes in Computer Science, vol 2423. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45869-7_27
DOI: https://doi.org/10.1007/3-540-45869-7_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44068-0
Online ISBN: 978-3-540-45869-2
eBook Packages: Springer Book Archive