Historical Document Image Segmentation Combining Deep Learning and Gabor Features

Mehri, Maroua; Sellami, Akrem; Tabbone, Salvatore

doi:10.1007/978-3-031-41685-9_25

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14190)

Included in the following conference series:

International Conference on Document Analysis and Recognition

Abstract

Due to the idiosyncrasies of historical document images (HDI), growing attention has been paid over the last decades to proposing robust HDI analysis solutions. Many research studies have shown that Gabor filters are among the low-level descriptors that best characterize texture information in HDI. On the other hand, deep neural networks (DNN) have been successfully used for HDI segmentation. Consequently, we propose in this paper an HDI segmentation method based on combining Gabor features and DNN. The segmentation method classifies each document image pixel as graphic, text, or background. The novelty of the proposed method lies mainly in feeding a DNN with a Gabor-filtered image (obtained by applying specific multichannel Gabor filters) instead of the original image. The proposed method is decomposed into three steps: a) filtered image generation using Gabor filters, b) feature learning with a stacked autoencoder, and c) image segmentation with a 2D U-Net. To evaluate its performance, experiments are conducted on two different datasets. The results are reported and compared with those of a recent state-of-the-art method.
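As an illustration of step a) only, the following minimal sketch builds a small bank of real (even-symmetric) Gabor kernels and applies each one to a grayscale image, producing one response map per frequency and orientation. It is not the authors' implementation: the function names, the frequencies, the number of orientations, the Gaussian width, and the kernel size are all assumptions chosen for readability, and the abstract does not specify the filter parameters actually used. The code is written in pure Python to stay dependency-free; a practical pipeline would use an optimized array library.

```python
import math


def gabor_kernel(frequency, theta, sigma, half_size):
    """Real Gabor kernel: isotropic Gaussian envelope times a cosine carrier.

    All parameters are illustrative assumptions, not values from the paper.
    """
    kernel = []
    for y in range(-half_size, half_size + 1):
        row = []
        for x in range(-half_size, half_size + 1):
            # Rotate coordinates so the carrier oscillates along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + yr * yr) / (2.0 * sigma * sigma))
            row.append(envelope * math.cos(2.0 * math.pi * frequency * xr))
        kernel.append(row)
    return kernel


def filter_image(image, kernel):
    """Correlate a 2D image (list of rows) with a kernel, replicating edges.

    The kernel above is point-symmetric, so correlation equals convolution.
    """
    h, w = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(-half, half + 1):
                ii = min(max(i + di, 0), h - 1)  # clamp row index at borders
                for dj in range(-half, half + 1):
                    jj = min(max(j + dj, 0), w - 1)  # clamp column index
                    acc += image[ii][jj] * kernel[di + half][dj + half]
            out[i][j] = acc
    return out


def gabor_feature_stack(image, frequencies=(0.1, 0.25), n_orientations=4,
                        sigma=2.0, half_size=4):
    """Return one response map per (frequency, orientation) pair.

    Maps are ordered with frequency as the outer loop and orientation inner.
    """
    stack = []
    for f in frequencies:
        for k in range(n_orientations):
            theta = k * math.pi / n_orientations
            kernel = gabor_kernel(f, theta, sigma, half_size)
            stack.append(filter_image(image, kernel))
    return stack


def mean_abs(response):
    """Mean absolute response, a crude per-channel texture energy."""
    total = sum(abs(v) for row in response for v in row)
    return total / (len(response) * len(response[0]))


if __name__ == "__main__":
    # Synthetic texture: vertical stripes with a period of 4 pixels.
    stripes = [[1.0 if (j % 4) < 2 else 0.0 for j in range(16)]
               for _ in range(16)]
    for idx, response in enumerate(gabor_feature_stack(stripes)):
        print(idx, round(mean_abs(response), 3))
```

On the striped test image, the channel tuned to the stripe frequency and direction responds far more strongly than the channel at the same frequency rotated by 90 degrees, which is the orientation and frequency selectivity that makes such a multichannel stack a plausible texture input for the later learning stages. Steps b) and c) are not sketched here because the abstract gives no architectural details for them.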

Notes

1. https://github.com/monniert/docExtractor.
2. The annotations of the SynDoc12k dataset are available at this url.
3. http://icdar2017hba.litislab.eu/.
4. https://gallica.bnf.fr/ark:/12148/bpt6k840383d/f1.planchecontact.r.

Acknowledgments

This work was funded by the Tunisian Ministry of Higher Education and Scientific Research under grant agreement number “19PEJC-08-02”, which is gratefully acknowledged.

The authors would also like to thank IDMC-Institut des sciences du Digital, Management & Cognition for supporting this research work.

Author information

Authors and Affiliations

Université de Sousse, Ecole Nationale d’Ingénieurs de Sousse, LATIS-Laboratory of Advanced Technology and Intelligent Systems, 4023, Sousse, Tunisie
Maroua Mehri
Université de Lorraine, IDMC-Institut des sciences du Digital, Management & Cognition, Pôle Herbert Simon, 13 Rue Michel Ney, 54000, Nancy, France
Maroua Mehri & Salvatore Tabbone
Université de Lille, CNRS, UMR 9189 CRIStAL, Campus scientifique, Bâtiment ESPRIT, Avenue Henri Poincaré, 59655, Villeneuve d’Ascq, France
Akrem Sellami
Université de Lorraine, CNRS, LORIA, UMR 7503, Campus Scientifique, 615 Rue du Jardin-Botanique, 54506, Vandœuvre-lès-Nancy, France
Salvatore Tabbone

Corresponding author

Correspondence to Maroua Mehri.

Editor information

Editors and Affiliations

TU Dortmund University, Dortmund, Germany
Gernot A. Fink
Adobe, College Park, MD, USA
Rajiv Jain
Osaka Metropolitan University, Osaka, Japan
Koichi Kise
Rochester Institute of Technology, Rochester, NY, USA
Richard Zanibbi

Cite this paper

Mehri, M., Sellami, A., Tabbone, S. (2023). Historical Document Image Segmentation Combining Deep Learning and Gabor Features. In: Fink, G.A., Jain, R., Kise, K., Zanibbi, R. (eds) Document Analysis and Recognition - ICDAR 2023. ICDAR 2023. Lecture Notes in Computer Science, vol 14190. Springer, Cham. https://doi.org/10.1007/978-3-031-41685-9_25

DOI: https://doi.org/10.1007/978-3-031-41685-9_25
Published: 19 August 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-41684-2
Online ISBN: 978-3-031-41685-9
eBook Packages: Computer Science, Computer Science (R0)
