DeepDoT: Deep Framework for Detection of Tables in Document Images

  • Conference paper
Computer Vision and Image Processing (CVIP 2020)

Abstract

Efficient table detection helps enterprises automate the analysis of digital documents. The task is challenging because document images exhibit low inter-class and high intra-class dissimilarities. In addition, the foreground-background class imbalance problem limits the performance of table detectors, especially single-stage ones. Existing table detectors rely on a bottom-up scheme that captures semantic features efficiently but fails to account for resolution-enriched features, which degrades overall detection performance. We propose an end-to-end trainable framework, DeepDoT, which effectively detects tables of different sizes over arbitrary scales in document images. DeepDoT combines a top-down and a bottom-up approach and uses focal loss to handle the pervasive class imbalance problem for accurate predictions. We consider multiple benchmark datasets for a thorough evaluation: ICDAR-2013, UNLV, ICDAR-2017 POD, and MARMOT. The proposed approach yields a better F1-score than state-of-the-art table detection approaches.
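
The role of focal loss in countering foreground-background imbalance can be illustrated with a minimal sketch. It assumes the standard binary form from Lin et al. (2017), FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). The function names and the defaults gamma = 2 and alpha = 0.25 are illustrative choices taken from that paper, not settings confirmed for DeepDoT.

```python
import math


def cross_entropy(p, y):
    """Plain binary cross-entropy for one anchor.

    p: predicted probability of the foreground (table) class, in (0, 1).
    y: ground-truth label, 1 for foreground, 0 for background.
    """
    p_t = p if y == 1 else 1.0 - p
    return -math.log(p_t)


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one anchor (gamma and alpha are assumed defaults)."""
    p_t = p if y == 1 else 1.0 - p               # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha   # class-balancing weight
    # (1 - p_t) ** gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical anchors: an easy background region and a hard, misclassified table.
easy_background = (0.05, 0)
hard_table = (0.10, 1)

for name, (p, y) in [("easy background", easy_background), ("hard table", hard_table)]:
    print(name, "CE:", cross_entropy(p, y), "FL:", focal_loss(p, y))
```

With gamma = 0 and alpha = 0.5 the expression reduces to half the cross-entropy. Larger gamma shifts the training signal away from the many easy background anchors and toward the few hard ones.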

Acknowledgement

This research is supported by IIT Ropar under ISIRD grant 9-231/2016/IIT-RPR/1395 and by DST under CSRI grant DST/CSRI/2018/234.

Author information

Corresponding author

Correspondence to Mandhatya Singh.

Copyright information

© 2021 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Singh, M., Goyal, P. (2021). DeepDoT: Deep Framework for Detection of Tables in Document Images. In: Singh, S.K., Roy, P., Raman, B., Nagabhushan, P. (eds) Computer Vision and Image Processing. CVIP 2020. Communications in Computer and Information Science, vol 1377. Springer, Singapore. https://doi.org/10.1007/978-981-16-1092-9_35

  • DOI: https://doi.org/10.1007/978-981-16-1092-9_35

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-16-1091-2

  • Online ISBN: 978-981-16-1092-9

  • eBook Packages: Computer Science, Computer Science (R0)
