Combination of Two Fully Convolutional Neural Networks for Robust Binarization

Karpinski, Romain; Belaïd, Abdel

doi:10.1007/978-3-030-20893-6_32

Conference paper
First Online: 29 May 2019

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11363)

Abstract

Processing historical documents often requires first binarizing the image, that is, separating foreground from background, before any further processing is applied. Historical documents are challenging to binarize because of the many degradations they suffer, such as bleed-through, illuminations, background degradations, and ink drops. In this paper, we present a new approach to this task based on a combination of two neural networks. Recently, the DIBCO binarization competition has seen growing interest in supervised methods for binarizing challenging images. Inspired by the winner of the DIBCO 2017 competition, which uses a fully convolutional neural network (FCN), we propose a combination of two FCNs to obtain better performance. While the two FCNs share the same architecture, they are trained on different representations of the input image. The first uses a downscaled image to capture the global context and the object locations. The second works on patches at native resolution and captures the local context, which helps define the character boundaries precisely. The final prediction is obtained by combining the results of the two FCNs. Our experiments show that this strategy provides better results and outperforms the winner of the DIBCO 2017 competition.
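The two-branch idea described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not specify the downscaling factor, the patch size, or how the two outputs are combined, so everything concrete here is an assumption made for illustration: the function names, the block-average downscaling, the non-overlapping patches, the simple averaging of the two probability maps, and `toy_model`, which stands in for a trained FCN.

```python
import numpy as np

def downscale(img, factor):
    """Block-average downscaling by an integer factor (crops any remainder)."""
    h2, w2 = img.shape[0] // factor, img.shape[1] // factor
    cropped = img[:h2 * factor, :w2 * factor]
    return cropped.reshape(h2, factor, w2, factor).mean(axis=(1, 3))

def upscale(prob, shape, factor):
    """Nearest-neighbour upscaling back to `shape`, edge-padding cropped borders."""
    up = np.repeat(np.repeat(prob, factor, axis=0), factor, axis=1)
    pad_h, pad_w = shape[0] - up.shape[0], shape[1] - up.shape[1]
    return np.pad(up, ((0, pad_h), (0, pad_w)), mode="edge")

def predict_patches(img, model, patch):
    """Run `model` on non-overlapping native-resolution patches and stitch the outputs."""
    out = np.zeros(img.shape, dtype=float)
    for y in range(0, img.shape[0], patch):
        for x in range(0, img.shape[1], patch):
            out[y:y + patch, x:x + patch] = model(img[y:y + patch, x:x + patch])
    return out

def binarize(img, global_model, local_model, factor=4, patch=64, threshold=0.5):
    """Combine a global (downscaled) and a local (patch-wise) foreground probability map."""
    # Global branch: the whole page at reduced resolution.
    p_global = upscale(global_model(downscale(img, factor)), img.shape, factor)
    # Local branch: native-resolution patches.
    p_local = predict_patches(img, local_model, patch)
    # Assumed combination rule: plain average of the two maps.
    p = 0.5 * (p_global + p_local)
    return (p >= threshold).astype(np.uint8)  # 1 = foreground (ink)

def toy_model(x):
    """Hypothetical stand-in for a trained FCN: darker pixels are more likely ink."""
    return 1.0 - x
```

In this sketch both branches take a grayscale image with values in [0, 1] and return a per-pixel foreground probability of the same shape as their input; in practice each callable would wrap a separately trained network.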

Author information

Authors and Affiliations

Université de Lorraine-CNRS-LORIA, Campus scientifique, 54500, Vandoeuvre-Lès-Nancy, France
Romain Karpinski & Abdel Belaïd

Authors

Romain Karpinski
Abdel Belaïd

Corresponding author

Correspondence to Abdel Belaïd.

Editor information

Editors and Affiliations

IIIT Hyderabad, Hyderabad, India
C. V. Jawahar
ANU, Canberra, ACT, Australia
Hongdong Li
Simon Fraser University, Burnaby, BC, Canada
Greg Mori
ETH Zurich, Zurich, Zürich, Switzerland
Konrad Schindler

About this paper

Cite this paper

Karpinski, R., Belaïd, A. (2019). Combination of Two Fully Convolutional Neural Networks for Robust Binarization. In: Jawahar, C., Li, H., Mori, G., Schindler, K. (eds) Computer Vision – ACCV 2018. ACCV 2018. Lecture Notes in Computer Science, vol 11363. Springer, Cham. https://doi.org/10.1007/978-3-030-20893-6_32

DOI: https://doi.org/10.1007/978-3-030-20893-6_32
Published: 29 May 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20892-9
Online ISBN: 978-3-030-20893-6
eBook Packages: Computer Science, Computer Science (R0)
