Insights on the Use of Convolutional Neural Networks for Document Image Binarization

  • Conference paper
Advances in Computational Intelligence (IWANN 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9095)

Abstract

Convolutional Neural Networks have consistently shown good performance in Computer Vision and in Handwritten Text Recognition tasks. This paper proposes using these models for document image binarization. The main idea is to classify each pixel of the image as foreground or background, using a sliding window centered at the pixel to be classified. An experimental analysis of the effect of sensitive parameters is presented, and some working topologies are proposed, using two corpora with very different properties: DIBCO and Santgall.
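The sliding-window idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the window size, and in particular `toy_classifier` are assumptions made for illustration. The toy rule (centre pixel darker than the window mean) only stands in for a trained convolutional network, which would take the same window as input and output a foreground/background decision.

```python
import numpy as np


def extract_windows(image, half):
    """Return an (H, W, k, k) array with one k x k window per pixel, k = 2*half + 1.

    The image is edge-padded so that border pixels also get a full window.
    """
    padded = np.pad(image, half, mode="edge")
    size = 2 * half + 1
    return np.lib.stride_tricks.sliding_window_view(padded, (size, size))


def toy_classifier(windows):
    """Hypothetical stand-in for a trained CNN (assumption, not from the paper).

    Labels a pixel as foreground (1) when the centre of its window is darker
    than the window mean, and as background (0) otherwise.
    """
    half = windows.shape[-1] // 2
    centre = windows[..., half, half]
    return (centre < windows.mean(axis=(-2, -1))).astype(np.uint8)


def binarize(image, half=2, classify=toy_classifier):
    """Classify every pixel from the window centred on it."""
    windows = extract_windows(image.astype(np.float64), half)
    return classify(windows)


# Tiny synthetic "document": light background with a dark horizontal stroke.
image = np.full((5, 5), 200, dtype=np.uint8)
image[2, 1:4] = 30
mask = binarize(image, half=1)
```

In this sketch `mask` has the same shape as `image`, with 1 marking pixels labelled as foreground. Swapping `toy_classifier` for a learned model keeps the rest of the pipeline unchanged, which is the design point of a per-pixel, window-based formulation.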

Corresponding author

Correspondence to J. Pastor-Pellicer.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Pastor-Pellicer, J., España-Boquera, S., Zamora-Martínez, F., Afzal, M.Z., Castro-Bleda, M. (2015). Insights on the Use of Convolutional Neural Networks for Document Image Binarization. In: Rojas, I., Joya, G., Catala, A. (eds) Advances in Computational Intelligence. IWANN 2015. Lecture Notes in Computer Science, vol 9095. Springer, Cham. https://doi.org/10.1007/978-3-319-19222-2_10

  • DOI: https://doi.org/10.1007/978-3-319-19222-2_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-19221-5

  • Online ISBN: 978-3-319-19222-2

  • eBook Packages: Computer Science, Computer Science (R0)
