Historical Document Image Segmentation Combining Deep Learning and Gabor Features

  • Conference paper
Document Analysis and Recognition - ICDAR 2023 (ICDAR 2023)

Abstract

Owing to the idiosyncrasies of historical document images (HDI), growing attention has been paid over the last decades to proposing robust HDI analysis solutions. Many research studies have shown that Gabor filters are among the low-level descriptors that best characterize texture information in HDI. Meanwhile, deep neural networks (DNN) have been used successfully for HDI segmentation. We therefore propose in this paper an HDI segmentation method that combines Gabor features and a DNN. The method classifies each pixel of a document image as graphic, text, or background. Its novelty lies mainly in feeding the DNN with a Gabor-filtered image (obtained by applying specific multichannel Gabor filters) instead of the original image. The proposed method consists of three steps: a) filtered image generation using Gabor filters, b) feature learning with a stacked autoencoder, and c) image segmentation with a 2D U-Net. To evaluate its performance, experiments are conducted on two different datasets. The results are reported and compared with those of a recent state-of-the-art method.
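Step a) can be illustrated with a minimal NumPy sketch of a multichannel Gabor filter bank. The abstract does not specify the filters, so the frequencies, the number of orientations, the kernel size, the Gaussian width, and the function names `gabor_kernel` and `gabor_stack` below are all illustrative assumptions rather than the authors' settings.

```python
import numpy as np

def gabor_kernel(frequency, theta, sigma=3.0, size=15):
    """Real part of a 2D Gabor kernel with an isotropic Gaussian envelope.

    All default values are illustrative assumptions, not the paper's settings.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate coordinates so the sinusoidal carrier runs along direction theta.
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    carrier = np.cos(2.0 * np.pi * frequency * x_rot)
    kernel = envelope * carrier
    # Subtract the mean so that flat regions produce a near-zero response.
    return kernel - kernel.mean()

def gabor_stack(image, frequencies=(0.1, 0.2, 0.4), n_orientations=4):
    """Filter a grayscale image with every (frequency, orientation) pair.

    Returns an array of shape (H, W, len(frequencies) * n_orientations)
    holding the response magnitude of each channel.
    """
    image = np.asarray(image, dtype=np.float64)
    h, w = image.shape
    image_ft = np.fft.rfft2(image)
    channels = []
    for f in frequencies:
        for k in range(n_orientations):
            kern = gabor_kernel(f, theta=k * np.pi / n_orientations)
            # Circular convolution via FFT; the kernel is zero-padded to (h, w).
            resp = np.fft.irfft2(image_ft * np.fft.rfft2(kern, s=(h, w)), s=(h, w))
            channels.append(np.abs(resp))
    return np.stack(channels, axis=-1)
```

In a pipeline of the kind the abstract describes, a stack like this would take the place of the raw image as the network input. The FFT-based convolution here wraps around at the image borders and shifts the output by half the kernel size, which is acceptable for a sketch but would need padding and cropping in practice.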

Notes

  1. https://github.com/monniert/docExtractor.

  2. The annotations of the SynDoc12k dataset are available at this url.

  3. http://icdar2017hba.litislab.eu/.

  4. https://gallica.bnf.fr/ark:/12148/bpt6k840383d/f1.planchecontact.r.

Acknowledgments

This work was funded by the Tunisian Ministry of Higher Education and Scientific Research under grant agreement number “19PEJC-08-02”, which is gratefully acknowledged.

The authors would also like to thank IDMC-Institut des sciences du Digital, Management & Cognition for supporting this research work.

Author information

Corresponding author

Correspondence to Maroua Mehri.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Mehri, M., Sellami, A., Tabbone, S. (2023). Historical Document Image Segmentation Combining Deep Learning and Gabor Features. In: Fink, G.A., Jain, R., Kise, K., Zanibbi, R. (eds) Document Analysis and Recognition - ICDAR 2023. ICDAR 2023. Lecture Notes in Computer Science, vol 14190. Springer, Cham. https://doi.org/10.1007/978-3-031-41685-9_25

  • DOI: https://doi.org/10.1007/978-3-031-41685-9_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-41684-2

  • Online ISBN: 978-3-031-41685-9

  • eBook Packages: Computer Science, Computer Science (R0)
