Deep-learning and graph-based approach to table structure recognition

Multimedia Tools and Applications

Abstract

Table structure recognition is a key component of document understanding. Many prior methods address this problem in three sequential steps: table detection, table component extraction, and structure analysis based on pairwise relations. However, they struggle with complexly structured tables and with practical scenarios such as scanned documents. In this paper, we propose a novel graph-based table structure recognition framework. To handle complex tables, we formulate tables as planar graphs whose faces are cell regions. We then compute vertex (junction) confidence maps and line fields with lightweight heatmap regression networks (about 1M parameters) and reconstruct tables by solving a constrained optimization problem. We demonstrate the robustness of the proposed system through experiments on the ICDAR 2019 dataset and on challenging table images. Experimental results show that the proposed method outperforms the conventional method in a range of scenarios and generalizes well.
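
The planar-graph formulation can be illustrated with a small sketch. It is not the authors' reconstruction procedure (the abstract describes a constrained optimization over learned confidence maps and line fields, which is not reproduced here). It only assumes that junctions lie on a regular grid and that the presence of each ruling segment is already known, and it recovers cell regions as the faces obtained by merging grid squares that no segment separates. The names `cell_regions`, `h_edges`, and `v_edges` are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (row_start, row_end, col_start, col_end), inclusive


def _find(parent: List[int], x: int) -> int:
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def cell_regions(h_edges: List[List[bool]], v_edges: List[List[bool]]) -> List[Box]:
    """Group unit grid squares into faces (cell regions).

    Assumed layout (an illustration, not taken from the paper):
      h_edges[r][c] -- horizontal segment from junction (r, c) to (r, c + 1)
      v_edges[r][c] -- vertical segment from junction (r, c) to (r + 1, c)
    Two neighbouring squares belong to the same face when the segment
    between them is absent. Faces are reported as bounding boxes, which
    is exact only when every face is rectangular.
    """
    rows, cols = len(v_edges), len(h_edges[0])
    parent = list(range(rows * cols))

    def union(a: int, b: int) -> None:
        ra, rb = _find(parent, a), _find(parent, b)
        if ra != rb:
            parent[rb] = ra

    for i in range(rows):
        for j in range(cols):
            # Right neighbour is separated by the vertical segment at column j + 1.
            if j + 1 < cols and not v_edges[i][j + 1]:
                union(i * cols + j, i * cols + j + 1)
            # Lower neighbour is separated by the horizontal segment at row i + 1.
            if i + 1 < rows and not h_edges[i + 1][j]:
                union(i * cols + j, (i + 1) * cols + j)

    boxes = {}
    for i in range(rows):
        for j in range(cols):
            root = _find(parent, i * cols + j)
            r0, r1, c0, c1 = boxes.get(root, (i, i, j, j))
            boxes[root] = (min(r0, i), max(r1, i), min(c0, j), max(c1, j))
    return sorted(boxes.values())


# A 2x2 grid whose top row is one merged header cell:
# the vertical segment between the two top squares is missing.
h = [[True, True], [True, True], [True, True]]
v = [[True, False, True], [True, True, True]]
print(cell_regions(h, v))
```

In this toy input the merged header appears as a single face spanning both columns, while the bottom row yields two separate faces, which is the sense in which faces of the graph correspond to cell regions.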

Acknowledgements

This work was supported in part by the Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2021-0-01062, Development of personal information processing technology for collection/utilization of high-quality and trusted training data for autonomous driving), and in part by LG AI Research.

Author information

Corresponding author

Correspondence to Hyung Il Koo.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Lee, E., Park, J., Koo, H.I. et al. Deep-learning and graph-based approach to table structure recognition. Multimed Tools Appl 81, 5827–5848 (2022). https://doi.org/10.1007/s11042-021-11819-7

  • DOI: https://doi.org/10.1007/s11042-021-11819-7
