Abstract
An efficient table detection process offers a solution for enterprises dealing with automated analysis of digital documents. Table detection is a challenging task due to low inter-class and high intra-class dissimilarities in document images. Further, the foreground-background class imbalance problem limits the performance of table detectors (especially single stage table detectors). The existing table detectors rely on a bottom-up scheme that efficiently captures the semantic features but fails in accounting for the resolution enriched features, thus, affecting the overall detection performance. We propose an end to end trainable framework (DeepDoT), which effectively detect the tables (of different sizes) over arbitrary scales in document images. The DeepDoT utilizes a top-down as well as a bottom-up approach, and additionally, it uses focal loss for handling the pervasive class imbalance problem for accurate predictions. We consider multiple benchmark datasets: ICDAR-2013, UNLV, ICDAR-2017 POD, and MARMOT for a thorough evaluation. The proposed approach yields comparatively better performance in terms of F1-score as compared to state-of-the-art table detection approaches.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Cesarini, F., Marinai, S., Sarti, L., Soda, G.: Trainable table location in document images. In: Object Recognition Supported by user Interaction for Service Robots, vol. 3, pp. 236–240. IEEE (2002)
e Silva, A.C.: Learning rich hidden markov models in document analysis: table location. In: 2009 10th International Conference on Document Analysis and Recognition, pp. 843–847. IEEE (2009)
Tsung-Yi L., Priya G., Ross, G., Kaiming, H., Dollár, P.: Focal loss for dense object detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2980–2988 (2017)
Hassan, T., Baumgartner, R.: Table recognition and understanding from pdf files. In: Ninth International Conference on Document Analysis and Recognition (ICDAR 2007), vol. 2, pp. 1143–1147. IEEE (2007)
Shigarov, A., Mikhailov, A., Altaev, A.: Configurable table structure recognition in untagged pdf documents. In: Proceedings of the 2016 ACM Symposium on Document Engineering, pp. 119–122 (2016)
Rastan, R., Paik, H.-Y., Shepherd, J.: Texus: a unified framework for extracting and understanding tables in pdf documents. Inf. Proc. Manage. 56(3), 895–918 (2019)
Hao, L., Gao, L., Yi, X., Tang, Z.: A table detection method for pdf documents based on convolutional neural networks. In: 2016 12th IAPR Workshop on Document Analysis Systems (DAS), pp. 287–292. IEEE (2016)
Schreiber, S., Agne, S., Wolf, I., Dengel, A., Ahmed, S.: Deepdesrt: deep learning for detection and structure recognition of tables in document images. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 1162–1167. IEEE (2017)
Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: Towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems, pp. 91–99 (2015)
Gao, L., Yi, X., Jiang, Z., Hao, L., Tang, Z.: ICDAR 2017 competition on page object detection. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 1417–1422. IEEE (2017)
Gilani, A., Qasim, S.R., Malik, M.I., Shafait, F.: Table detection using deep learning. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 01, pp. 771–776 (2017)
Kavasidis, I.: A saliency-based convolutional neural network for table and chart detection in digitized documents. arXiv preprint arXiv:1804.06236 (2018)
Yang, X., Yumer, E., Asente, P., Kraley, M., Kifer, D., Lee Giles, C.: Learning to extract semantic structure from documents using multimodal fully convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5315–5324 (2017)
Siddiqui, S.A., Malik, M.I., Agne, S., Dengel, A., Ahmed, S.: Decnt: deep deformable CNN for table detection. IEEE Access 6, 74151–74161 (2018)
Gobel, M., Hassan, T., Oro, E., Orsi, G.: ICDAR 2013 table competition. In: 2013 12th International Conference on Document Analysis and Recognition, pp. 1449–1453. IEEE (2013)
Tran, D.N., Tran, T.A., Oh, A., Kim, S.H., Na, I.S.: Table detection from document image using vertical arrangement of text blocks. Int. J. Contents 11(4), 77–85 (2015)
Silva, A.C.: Parts that add up to a whole: a framework for the analysis of tables. Edinburgh University, UK (2010)
Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 580–587 (2014)
Liu, W., et al.: SSD: single shot multibox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 21–37. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0_2
Acknowledgement
This research is supported by the IIT Ropar under ISIRD grant 9-231/2016/IIT-RPR/1395 and by the DST under CSRI grant DST/CSRI/2018/234.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2021 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Singh, M., Goyal, P. (2021). DeepDoT: Deep Framework for Detection of Tables in Document Images. In: Singh, S.K., Roy, P., Raman, B., Nagabhushan, P. (eds) Computer Vision and Image Processing. CVIP 2020. Communications in Computer and Information Science, vol 1377. Springer, Singapore. https://doi.org/10.1007/978-981-16-1092-9_35
Download citation
DOI: https://doi.org/10.1007/978-981-16-1092-9_35
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-1091-2
Online ISBN: 978-981-16-1092-9
eBook Packages: Computer ScienceComputer Science (R0)