DOI: 10.1145/1860559.1860617
poster

Towards a common evaluation strategy for table structure recognition algorithms

Published: 21 September 2010

Abstract

A number of methods for evaluating table structure recognition systems have been proposed in the literature, which have been used successfully for automatic and manual optimization of their respective algorithms. Unfortunately, the lack of standard, ground-truthed datasets coupled with the ambiguous nature of how humans interpret tabular data has made it difficult to compare the obtained results between different systems developed by different research groups.
With reference to these approaches, we describe our experiences in comparing our algorithm for table detection and structure recognition to another recently published system using a freely available dataset of 75 PDF documents. Based on examples from this dataset, we define several classes of errors and propose how they can be treated consistently to eliminate ambiguities and ensure the repeatability of the results and their comparability between different systems from different research groups.

Cited By

  • (2023) GriTS: Grid Table Similarity Metric for Table Structure Recognition. Document Analysis and Recognition - ICDAR 2023, pp. 535-549. DOI: 10.1007/978-3-031-41734-4_33. Online publication date: 19-Aug-2023
  • (2013) Towards generic framework for tabular data extraction and management in documents. Proceedings of the sixth workshop on Ph.D. students in information and knowledge management, pp. 3-10. DOI: 10.1145/2513166.2513175. Online publication date: 1-Nov-2013
  • (2012) A methodology for evaluating algorithms for table understanding in PDF documents. Proceedings of the 2012 ACM symposium on Document engineering, pp. 45-48. DOI: 10.1145/2361354.2361365. Online publication date: 4-Sep-2012

Published In

DocEng '10: Proceedings of the 10th ACM symposium on Document engineering
September 2010
298 pages
ISBN:9781450302319
DOI:10.1145/1860559

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. evaluation
  2. ground truth
  3. precision
  4. recall
  5. table detection
  6. table recognition
  7. table structure recognition

Qualifiers

  • Poster

Conference

DocEng2010: ACM Symposium on Document Engineering
September 21 - 24, 2010
Manchester, United Kingdom

Acceptance Rates

DocEng '10 Paper Acceptance Rate 13 of 42 submissions, 31%;
Overall Acceptance Rate 194 of 564 submissions, 34%
