Abstract
Due to the idiosyncrasies of historical document images (HDI), growing attention has been paid over the last decades to proposing robust HDI analysis solutions. Many research studies have shown that Gabor filters are among the low-level descriptors that best characterize texture information in HDI. On the other hand, deep neural networks (DNN) have been successfully used for HDI segmentation. Consequently, we propose in this paper an HDI segmentation method that combines Gabor features and DNN. The segmentation method classifies each document image pixel as graphic, text, or background. The novelty of the proposed method lies mainly in feeding a DNN with a Gabor-filtered image (obtained by applying specific multichannel Gabor filters) instead of the original image. The proposed method is decomposed into three steps: a) filtered image generation using Gabor filters, b) feature learning with a stacked autoencoder, and c) image segmentation with a 2D U-Net. To evaluate its performance, experiments are conducted on two different datasets. The results are reported and compared with those of a recent state-of-the-art method.
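To make step a) more concrete, the following is a minimal sketch of how a multichannel Gabor-filtered image could be generated with NumPy. It is an illustration only, not the authors' implementation: the abstract does not state the filter parameters, so the kernel size, orientations, wavelengths, the choice `sigma = 0.5 * wavelength`, and the helper names `gabor_kernel`, `filter_same`, and `gabor_features` are all assumptions. Steps b) and c) are not sketched here.

```python
import numpy as np


def gabor_kernel(size, wavelength, theta, sigma, gamma=1.0):
    """Real (even-symmetric) Gabor kernel of shape (size, size); size must be odd."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate coordinates so the sinusoidal carrier varies along direction theta.
    x_r = x * np.cos(theta) + y * np.sin(theta)
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_r / wavelength)
    return envelope * carrier


def filter_same(image, kernel):
    """FFT-based 2D convolution with reflect padding; output has the image's shape."""
    h, w = image.shape
    kh, kw = kernel.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="reflect")
    shape = (padded.shape[0] + kh - 1, padded.shape[1] + kw - 1)
    spectrum = np.fft.rfft2(padded, shape) * np.fft.rfft2(kernel, shape)
    full = np.fft.irfft2(spectrum, shape)
    # Keep only the part where the kernel fully overlaps the padded image.
    return full[kh - 1:kh - 1 + h, kw - 1:kw - 1 + w]


def gabor_features(image, thetas, wavelengths, size=9):
    """Stack one filtered channel per (theta, wavelength) pair: (channels, H, W)."""
    channels = []
    for theta in thetas:
        for wavelength in wavelengths:
            # Assumed rule of thumb: Gaussian width proportional to the wavelength.
            kernel = gabor_kernel(size, wavelength, theta, sigma=0.5 * wavelength)
            channels.append(filter_same(image, kernel))
    return np.stack(channels)


# Toy input: vertical stripes with a 4-pixel period (intensity varies along x only).
cols = np.arange(32)
image = np.tile(np.cos(2.0 * np.pi * cols / 4.0), (32, 1))

features = gabor_features(image, thetas=(0.0, np.pi / 2), wavelengths=(4.0,))
energy = np.abs(features).mean(axis=(1, 2))  # mean response magnitude per channel
```

In this toy run, the channel whose carrier varies along x (theta = 0) responds far more strongly to the vertical stripes than the channel oriented at 90 degrees, which is the orientation and frequency selectivity that makes Gabor responses useful as texture descriptors. A stack like `features` would then play the role of the filtered image passed to the later learning stages.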
Notes
- 2. The annotations of the SynDoc12k dataset are available at this url.
Acknowledgments
This work was funded by the Tunisian Ministry of Higher Education and Scientific Research under grant agreement number “19PEJC-08-02”, which is gratefully acknowledged.
The authors would also like to thank IDMC-Institut des sciences du Digital, Management & Cognition for supporting this research work.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Mehri, M., Sellami, A., Tabbone, S. (2023). Historical Document Image Segmentation Combining Deep Learning and Gabor Features. In: Fink, G.A., Jain, R., Kise, K., Zanibbi, R. (eds) Document Analysis and Recognition - ICDAR 2023. ICDAR 2023. Lecture Notes in Computer Science, vol 14190. Springer, Cham. https://doi.org/10.1007/978-3-031-41685-9_25
DOI: https://doi.org/10.1007/978-3-031-41685-9_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-41684-2
Online ISBN: 978-3-031-41685-9
eBook Packages: Computer Science (R0)