Deep-learning and graph-based approach to table structure recognition

Lee, Eunji; Park, Jaewoo; Koo, Hyung Il; Cho, Nam Ik

doi:10.1007/s11042-021-11819-7

Published: 30 December 2021

Multimedia Tools and Applications, volume 81, pages 5827–5848 (2022)

Eunji Lee¹,
Jaewoo Park¹,
Hyung Il Koo² (ORCID: orcid.org/0000-0002-6955-8083) &
Nam Ik Cho¹,³

Abstract

Table structure recognition is a key component of document understanding. Many prior methods address this problem in three sequential steps: table detection, table component extraction, and structure analysis based on pairwise relations. However, these methods have limitations in handling complexly structured tables and/or practical scenarios (e.g., scanned documents). In this paper, we propose a novel graph-based table structure recognition framework. To handle complex tables, we formulate tables as planar graphs whose faces are cell regions. We then compute vertex (junction) confidence maps and line fields using heatmap regression networks with a small number of parameters (about 1M), and reconstruct tables by solving a constrained optimization problem. We demonstrate the robustness of the proposed system through experiments on the ICDAR 2019 dataset and on challenging table images. Experimental results show that the proposed method outperforms the conventional method in a range of scenarios and generalizes well.
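
The planar-graph formulation in the abstract can be illustrated with a small sketch. It is not the authors' implementation and does not cover the heatmap networks or the constrained optimization step. It only shows, under simplifying assumptions, how cells correspond to bounded faces of a graph whose vertices are junctions and whose edges are ruling-line segments. The function names (`grid_edges`, `count_components`, `count_cells`) and the integer junction coordinates are hypothetical choices made for this example. The cell count comes from Euler's formula for planar graphs: the number of bounded faces is E − V + C, where C is the number of connected components.

```python
from itertools import product


def grid_edges(n_cols, n_rows):
    """Edges of a fully ruled table with n_cols x n_rows cells.

    Junctions are integer points (x, y); each edge joins two adjacent junctions.
    """
    edges = set()
    for x, y in product(range(n_cols + 1), range(n_rows + 1)):
        if x < n_cols:
            edges.add(((x, y), (x + 1, y)))  # horizontal segment
        if y < n_rows:
            edges.add(((x, y), (x, y + 1)))  # vertical segment
    return edges


def count_components(vertices, edges):
    """Number of connected components, computed with a small union-find."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(v) for v in vertices})


def count_cells(edges):
    """Bounded faces of a planar graph via Euler's formula: E - V + C."""
    vertices = {v for edge in edges for v in edge}
    return len(edges) - len(vertices) + count_components(vertices, edges)


if __name__ == "__main__":
    full = grid_edges(2, 2)
    print(count_cells(full))    # 4 cells in a plain 2 x 2 table

    # Merge the two top cells by removing the segment that separates them.
    merged = full - {((1, 0), (1, 1))}
    print(count_cells(merged))  # 3 cells after the merge
```

Removing a single separating segment lowers the face count by one, which is how a spanning (merged) cell appears in this representation: the junctions stay in place, and only the edge set changes.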

Acknowledgements

This work was supported in part by the Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2021-0-01062, Development of personal information processing technology for collection/utilization of high-quality and trusted training data for autonomous driving), and in part by LG AI Research.

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, INMC, Seoul National University, Seoul, 08826, South Korea
Eunji Lee, Jaewoo Park & Nam Ik Cho
Department of Electrical and Computer Engineering, Ajou University, Suwon, 16499, South Korea
Hyung Il Koo
School of Data Science, Seoul National University, Seoul, 08826, South Korea
Nam Ik Cho

Corresponding author

Correspondence to Hyung Il Koo.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Lee, E., Park, J., Koo, H.I. et al. Deep-learning and graph-based approach to table structure recognition. Multimed Tools Appl 81, 5827–5848 (2022). https://doi.org/10.1007/s11042-021-11819-7

Received: 01 June 2021
Revised: 07 September 2021
Accepted: 14 December 2021
Published: 30 December 2021
Issue Date: February 2022
DOI: https://doi.org/10.1007/s11042-021-11819-7
