WEATHERGOV+: A Table Recognition and Summarization Dataset to Bridge the Gap Between Document Image Analysis and Natural Language Generation

Authors:

Alexandra Branzan Albu

DocEng '23: Proceedings of the ACM Symposium on Document Engineering 2023

Article No.: 10, Pages 1 - 10

https://doi.org/10.1145/3573128.3604901

Published: 22 August 2023

Abstract

Tables, ubiquitous in data-oriented documents such as scientific papers and financial statements, organize and convey relational information. Automatic table recognition from document images, which involves detecting the table within the page, segmenting its structure into rows, columns, and cells, and extracting information from the cells, has been a popular research topic in document image analysis (DIA). With recent advances in natural language generation (NLG) based on deep neural networks, data-to-text generation, in particular table summarization, offers interesting solutions to time-intensive data analysis. In this paper, we aim to bridge the gap between efforts in DIA and NLG regarding tabular data: we propose WEATHERGOV+, a dataset that builds upon WEATHERGOV, the standard dataset for tabular data summarization techniques, and that allows for the training and testing of end-to-end methods that take document images as input and generate text summaries as output. WEATHERGOV+ contains images of tables created from the tabular data of WEATHERGOV using visual variations that cover various levels of difficulty, along with the corresponding human-generated table summaries of WEATHERGOV. We also propose an end-to-end pipeline that compares state-of-the-art table recognition methods for summarization purposes. We analyse the results of the proposed pipeline by evaluating it on WEATHERGOV+ at each stage to identify the effects of error propagation and the weaknesses of current methods, such as OCR errors. With this research (dataset and code are publicly available), we hope to encourage new research on the processing and management of inter- and intra-document collections.
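The stage-wise evaluation described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: every name here (`Sample`, `cell_accuracy`, `token_overlap`, `evaluate`, and the toy recognizer and summarizer) is hypothetical, and the two toy metrics are stand-ins for whatever measures the paper actually reports. The idea shown is only that scoring the summarizer once on ground-truth cells and once on recognized cells separates summarization errors from errors propagated by recognition.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Table = List[List[str]]  # rows of cell strings


@dataclass
class Sample:
    image: object        # placeholder for a table image (hypothetical)
    gold_table: Table    # ground-truth cell contents
    gold_summary: str    # reference summary


def cell_accuracy(pred: Table, gold: Table) -> float:
    """Fraction of gold cells reproduced exactly at the same position."""
    total = sum(len(row) for row in gold)
    if total == 0:
        return 1.0
    hits = 0
    for r, gold_row in enumerate(gold):
        for c, gold_cell in enumerate(gold_row):
            if r < len(pred) and c < len(pred[r]) and pred[r][c] == gold_cell:
                hits += 1
    return hits / total


def token_overlap(pred: str, gold: str) -> float:
    """Unigram recall of reference tokens; a toy stand-in for a summary metric."""
    gold_tokens = gold.lower().split()
    if not gold_tokens:
        return 1.0
    pred_tokens = set(pred.lower().split())
    return sum(tok in pred_tokens for tok in gold_tokens) / len(gold_tokens)


def evaluate(sample: Sample,
             recognize: Callable[[object], Table],
             summarize: Callable[[Table], str]) -> Dict[str, float]:
    """Score each stage, feeding the summarizer gold and predicted tables in turn."""
    pred_table = recognize(sample.image)
    return {
        "recognition": cell_accuracy(pred_table, sample.gold_table),
        "summary_from_gold": token_overlap(summarize(sample.gold_table),
                                           sample.gold_summary),
        "summary_end_to_end": token_overlap(summarize(pred_table),
                                            sample.gold_summary),
    }


# Toy stand-ins: a recognizer that makes one OCR-style error ("45" -> "4S")
# and a summarizer that simply concatenates the cells.
def toy_recognize(image: object) -> Table:
    return [["temperature", "4S"], ["wind", "10"]]


def toy_summarize(table: Table) -> str:
    return " ".join(cell for row in table for cell in row)


sample = Sample(image=None,
                gold_table=[["temperature", "45"], ["wind", "10"]],
                gold_summary="temperature 45 wind 10")
scores = evaluate(sample, toy_recognize, toy_summarize)
print(scores)
```

In this toy run the summarizer is perfect on ground-truth cells, so the drop in the end-to-end score is attributable entirely to the single recognition error.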

Index Terms

1. Applied computing
  1. Document management and text processing
2. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
    2. Natural language processing
      1. Information extraction
      2. Natural language generation


Published In

DocEng '23: Proceedings of the ACM Symposium on Document Engineering 2023

August 2023

187 pages

ISBN: 9798400700279

DOI: 10.1145/3573128

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Sponsors

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

DocEng '23

Sponsor:

SIGWEB

DocEng '23: ACM Symposium on Document Engineering 2023

August 22 - 25, 2023

Limerick, Ireland

Acceptance Rates

DocEng '23 Paper Acceptance Rate 9 of 27 submissions, 33%;

Overall Acceptance Rate 194 of 564 submissions, 34%
