Test-Time Augmentation for Document Image Binarization

  • Conference paper
Pattern Recognition and Image Analysis (IbPRIA 2023)

Abstract

Document binarization is a well-studied task in the document image analysis literature that aims to separate the ink from the background. Current solutions rely on deep learning, which requires large amounts of annotated data to train robust models. Data augmentation is known to reduce such annotation requirements, and it can be applied at two stages: during training and during prediction. The latter is known as Test-Time Augmentation (TTA), which has been applied successfully to general classification tasks. In this work, we study the application of TTA to binarization, a more complex and specific task. We focus on cases with a severe scarcity of annotated data across five existing binarization benchmarks. Although the results show some improvement, the gains are rather limited. This suggests that existing TTA strategies are not sufficient for binarization, which points to interesting lines of future work to further boost performance.
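
The general idea of TTA for a pixel-wise task can be illustrated with a minimal sketch. It is not the method evaluated in the paper: the set of transforms (identity, flips, 180-degree rotation), the averaging rule, the 0.5 threshold, and the names `tta_binarize`, `predict_ink_prob`, and `toy_model` are all assumptions made for illustration. Each augmented view is passed through the model, the prediction is mapped back to the original pixel grid with the inverse transform, and the per-pixel ink probabilities are averaged before thresholding.

```python
import numpy as np

# Each entry pairs a geometric transform with its inverse, so that every
# prediction can be mapped back onto the original pixel grid before averaging.
# This particular set of transforms is an assumption, not the paper's choice.
TRANSFORMS = [
    (lambda x: x, lambda y: y),                             # identity
    (np.fliplr, np.fliplr),                                 # horizontal flip
    (np.flipud, np.flipud),                                 # vertical flip
    (lambda x: np.rot90(x, 2), lambda y: np.rot90(y, -2)),  # 180-degree rotation
]


def tta_binarize(image, predict_ink_prob, threshold=0.5):
    """Average per-pixel ink probabilities over augmented views, then threshold.

    `predict_ink_prob` is a hypothetical callable standing in for a trained
    model: it takes a 2-D grayscale array and returns an array of the same
    shape with values in [0, 1].
    """
    probs = []
    for forward, inverse in TRANSFORMS:
        view = forward(image)
        prob = predict_ink_prob(view)
        probs.append(inverse(prob))  # undo the transform on the prediction
    mean_prob = np.mean(probs, axis=0)
    return (mean_prob >= threshold).astype(np.uint8)  # 1 = ink, 0 = background


def toy_model(gray):
    """Stand-in for a trained network: darker pixels get higher ink probability."""
    return 1.0 - gray.astype(np.float64) / 255.0


if __name__ == "__main__":
    page = np.array([[0, 255], [255, 0]], dtype=np.uint8)
    print(tta_binarize(page, toy_model))
```

Because `toy_model` acts on each pixel independently, TTA cannot change its output here; any benefit would only appear with a real model whose predictions differ across views.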

This work was supported by the I+D+i project TED2021-132103A-I00 (DOREMI), funded by MCIN/AEI/10.13039/501100011033.

Notes

  1. http://www.e-codices.unifr.ch/en/sbe/0611/.

  2. https://cantus.simssa.ca/manuscript/133/.

Author information

Corresponding author

Correspondence to Antonio Javier Gallego.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Rosello, A., Castellanos, F.J., Martinez-Esteso, J.P., Gallego, A.J., Calvo-Zaragoza, J. (2023). Test-Time Augmentation for Document Image Binarization. In: Pertusa, A., Gallego, A.J., Sánchez, J.A., Domingues, I. (eds) Pattern Recognition and Image Analysis. IbPRIA 2023. Lecture Notes in Computer Science, vol 14062. Springer, Cham. https://doi.org/10.1007/978-3-031-36616-1_13

  • DOI: https://doi.org/10.1007/978-3-031-36616-1_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-36615-4

  • Online ISBN: 978-3-031-36616-1

  • eBook Packages: Computer Science, Computer Science (R0)
