Combination of Two Fully Convolutional Neural Networks for Robust Binarization

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11363)

Abstract

Processing historical documents often requires first binarizing the image, that is, separating foreground from background, before any further analysis. Historical documents are challenging to binarize because of the many degradations they suffer, such as bleed-through, illumination variations, background degradation and ink drops. In this paper we present a new approach to this task based on a combination of two neural networks. The DIBCO binarization competition has recently seen growing interest in supervised methods for binarizing challenging images. Inspired by the winner of the DIBCO 2017 competition, which uses a fully convolutional neural network (FCN), we propose a combination of two FCNs to obtain better performance. Although the two FCNs share the same architecture, they are trained on different representations of the input image. The first uses a downscaled image to capture the global context and the object locations. The second works on patches at native resolution and captures the local context, which helps define the character boundaries precisely. The final prediction is obtained by combining the results of the two FCNs. Our experiments show that this strategy outperforms the winner of the DIBCO 2017 competition.
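
The two-branch idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the network architecture, the downscaling factor, the patch size or the rule used to combine the two outputs, so everything below is an assumption made for illustration: the function names are hypothetical, the two "models" are arbitrary callables that map a grayscale array in [0, 1] to a foreground-probability map of the same shape, and the combination is a plain average followed by a threshold.

```python
import numpy as np

def downscale(img, factor):
    """Block-average downscaling by an integer factor (any remainder is cropped)."""
    h2, w2 = img.shape[0] // factor, img.shape[1] // factor
    cropped = img[:h2 * factor, :w2 * factor]
    return cropped.reshape(h2, factor, w2, factor).mean(axis=(1, 3))

def upscale(prob, shape):
    """Nearest-neighbour upscaling of a probability map to the target shape."""
    rows = np.arange(shape[0]) * prob.shape[0] // shape[0]
    cols = np.arange(shape[1]) * prob.shape[1] // shape[1]
    return prob[np.ix_(rows, cols)]

def predict_patches(model, img, patch):
    """Run the model on non-overlapping native-resolution patches and stitch the outputs."""
    out = np.zeros(img.shape, dtype=float)
    for y in range(0, img.shape[0], patch):
        for x in range(0, img.shape[1], patch):
            out[y:y + patch, x:x + patch] = model(img[y:y + patch, x:x + patch])
    return out

def combine(global_model, local_model, img, factor=4, patch=64, threshold=0.5):
    """Binarize img from a global (downscaled) branch and a local (patch) branch."""
    p_global = upscale(global_model(downscale(img, factor)), img.shape)
    p_local = predict_patches(local_model, img, patch)
    # Assumption: simple averaging; the paper's actual combination rule may differ.
    return 0.5 * (p_global + p_local) > threshold

def toy_fcn(x):
    """Stand-in for a trained FCN: treats dark pixels as likely foreground."""
    return 1.0 - x
```

With trained networks in place of `toy_fcn`, a call such as `combine(fcn_global, fcn_local, page)` would return a boolean foreground mask with the same shape as `page`. Averaging is only one plausible way to merge the two probability maps; the sketch is meant to show how the global and local views are brought back to a common resolution, not to reproduce the authors' method.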

Author information

Corresponding author

Correspondence to Abdel Belaïd.

Copyright information

© 2019 Springer Nature Switzerland AG

Cite this paper

Karpinski, R., Belaïd, A. (2019). Combination of Two Fully Convolutional Neural Networks for Robust Binarization. In: Jawahar, C., Li, H., Mori, G., Schindler, K. (eds) Computer Vision – ACCV 2018. Lecture Notes in Computer Science, vol 11363. Springer, Cham. https://doi.org/10.1007/978-3-030-20893-6_32

  • DOI: https://doi.org/10.1007/978-3-030-20893-6_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-20892-9

  • Online ISBN: 978-3-030-20893-6

  • eBook Packages: Computer Science, Computer Science (R0)
