DOI:10.1145/1815330.1815391

Detecting and recognizing tables in spreadsheets

Published: 09 June 2010

Abstract

Detecting tables in a spreadsheet is the first step needed to make spreadsheet documents accessible to individuals with visual disabilities. Techniques to enable aural presentation and navigation of tables have been proposed, but they assume thorough knowledge of the table's structure. In a spreadsheet, however, the boundaries and structure of tables are not evident without visual exploration. This paper presents an algorithm for table recognition in spreadsheets. The algorithm is based on three types of cells: title cells, header cells, and data cells. Different attributes of a cell are used to identify its type within the spreadsheet. Hierarchical clustering then aggregates cells into the functional components of a table. The algorithm has been evaluated on a diverse set of benchmarks with very encouraging results.
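The two stages named in the abstract (typing each cell, then clustering cells into tables) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the paper's method: the abstract does not say which cell attributes or which linkage criterion are used, so the sketch types cells by value type and row occupancy only, and it uses single-linkage agglomerative clustering cut at a fixed cell distance (implemented with union-find). The names `classify_cells`, `cluster_cells`, and `max_gap` are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Optional, Tuple, Union

Cell = Optional[Union[str, int, float]]  # None marks an empty cell
Pos = Tuple[int, int]                    # (row, column)


def classify_cells(grid: List[List[Cell]]) -> Dict[Pos, str]:
    """Label non-empty cells as 'title', 'header', or 'data'.

    Assumed toy rules (not from the paper): numbers are data cells,
    a text cell alone in its row is a title cell, other text cells
    are header cells.
    """
    labels: Dict[Pos, str] = {}
    for r, row in enumerate(grid):
        filled = [c for c, value in enumerate(row) if value is not None]
        for c in filled:
            if isinstance(row[c], (int, float)):
                labels[(r, c)] = "data"
            elif len(filled) == 1:
                labels[(r, c)] = "title"
            else:
                labels[(r, c)] = "header"
    return labels


def cluster_cells(positions: List[Pos], max_gap: int = 1) -> List[List[Pos]]:
    """Single-linkage clustering of cell positions, cut at max_gap.

    Two cells are linked when their Chebyshev distance is at most
    max_gap; each connected group is one candidate table.
    """
    parent = {p: p for p in positions}

    def find(p: Pos) -> Pos:
        # Follow parent links to the cluster root, halving the path.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b in combinations(positions, 2):
        if max(abs(a[0] - b[0]), abs(a[1] - b[1])) <= max_gap:
            parent[find(a)] = find(b)

    groups: Dict[Pos, List[Pos]] = {}
    for p in positions:
        groups.setdefault(find(p), []).append(p)
    return sorted(sorted(group) for group in groups.values())


# A toy sheet holding two tables separated by two empty columns.
grid: List[List[Cell]] = [
    ["Sales 2009", None, None, None, None, None],
    ["Region", "Q1", None, None, "Item", "Qty"],
    ["North", 10, None, None, "Pens", 3],
    ["South", 12, None, None, "Ink", 5],
]
labels = classify_cells(grid)
clusters = cluster_cells(list(labels))
for cluster in clusters:
    print([(pos, labels[pos]) for pos in cluster])
```

On this toy sheet the cells fall into two groups, one per table, because the empty columns exceed `max_gap`. A real implementation would need richer attributes (formatting, formulas, merged cells) and a more careful stopping criterion than a fixed gap.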


Published In

DAS '10: Proceedings of the 9th IAPR International Workshop on Document Analysis Systems
June 2010
490 pages
ISBN:9781605587738
DOI:10.1145/1815330
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. spreadsheet analysis
  2. spreadsheets
  3. table navigation
  4. table recognition
  5. visual impairments

Qualifiers

  • Research-article

Conference

DAS '10

Cited By

  • (2024) Designing Unobtrusive Modulated Electrotactile Feedback on Fingertip Edge to Assist Blind and Low Vision (BLV) People in Comprehending Charts. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, pp. 1-20. DOI:10.1145/3613904.3642546. Online publication date: 11-May-2024
  • (2022) Table understanding: Problem overview. WIREs Data Mining and Knowledge Discovery, 13:1. DOI:10.1002/widm.1482. Online publication date: 21-Nov-2022
  • (2021) Extracting Tabular data for Question-Answering from Documents. Proceedings of the 3rd ACM India Joint International Conference on Data Science & Management of Data (8th ACM IKDD CODS & 26th COMAD), pp. 400-404. DOI:10.1145/3430984.3430992. Online publication date: 2-Jan-2021
  • (2021) LGPMA: Complicated Table Structure Recognition with Local and Global Pyramid Mask Alignment. Document Analysis and Recognition – ICDAR 2021, pp. 99-114. DOI:10.1007/978-3-030-86549-8_7. Online publication date: 2-Sep-2021
  • (2021) SpLyCI: Integrating Spreadsheets by Recognising and Solving Layout Constraints. Advances in Intelligent Data Analysis XIX, pp. 402-413. DOI:10.1007/978-3-030-74251-5_32. Online publication date: 13-Apr-2021
  • (2020) Algoritmos para el reconocimiento de estructuras de tablas. Ingenius, pp. 50-61. DOI:10.17163/ings.n25.2021.05. Online publication date: 31-Dec-2020
  • (2020) Table Header Correction Algorithm Based on Heuristics for Improving Spreadsheet Data Extraction. Information and Software Technologies, pp. 147-158. DOI:10.1007/978-3-030-59506-7_13. Online publication date: 8-Oct-2020
  • (2019) Rethinking Table Recognition using Graph Neural Networks. 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 142-147. DOI:10.1109/ICDAR.2019.00031. Online publication date: Sep-2019
  • (2015) Interpretation of Construction Patterns for Biodiversity Spreadsheets. Enterprise Information Systems, pp. 397-414. DOI:10.1007/978-3-319-22348-3_22. Online publication date: 31-Jul-2015
  • (2013) Teaching spreadsheets to visually-impaired students in an environment similar to a mainstream class. Proceedings of the 18th ACM conference on Innovation and technology in computer science education, pp. 99-104. DOI:10.1145/2462476.2462477. Online publication date: 1-Jul-2013
