ISauvola: Improved Sauvola’s Algorithm for Document Image Binarization

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9730)

Abstract

Binarization of historical documents is difficult and remains an open area of research. In this paper, a new binarization technique for document images is presented. The proposed technique is based on the most commonly used binarization method, Sauvola’s, which performs relatively well on classical documents. However, three main defects remain: the window parameter of Sauvola’s formula does not adapt automatically to the image content, the method is not robust to low contrast, and it is not invariant to contrast inversion. Thus, on documents such as magazines, the content may not be retrieved correctly. In this paper, we use the image contrast, defined by the local image minimum and maximum, in combination with Sauvola’s binarization step to guarantee good-quality binarization for both low-contrast and correctly contrasted objects inside a single document, without manually adjusting the user-defined parameters to the document content.
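The two ingredients named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Sauvola threshold is the standard T = m * (1 + k * (s / R - 1)) with the commonly used k = 0.5 and R = 128, and the contrast formula (max - min) / (max + min + eps) is an assumed normalisation of the local minimum and maximum, since the abstract does not give the exact definition. The function names, the window half-width `half`, and the list-of-lists image layout are hypothetical. How the paper combines the two maps is not described in this excerpt and is not reproduced here.

```python
from math import sqrt

def window_values(img, r, c, half):
    """Grey levels in the square window centred on (r, c), clipped at the borders."""
    rows, cols = len(img), len(img[0])
    return [img[i][j]
            for i in range(max(0, r - half), min(rows, r + half + 1))
            for j in range(max(0, c - half), min(cols, c + half + 1))]

def sauvola_binarize(img, half=1, k=0.5, R=128.0):
    """Mark a pixel as text (1) when it is at or below the local Sauvola threshold."""
    out = []
    for r in range(len(img)):
        row = []
        for c in range(len(img[0])):
            vals = window_values(img, r, c, half)
            mean = sum(vals) / len(vals)
            std = sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
            threshold = mean * (1.0 + k * (std / R - 1.0))
            row.append(1 if img[r][c] <= threshold else 0)
        out.append(row)
    return out

def local_contrast(img, half=1, eps=1e-6):
    """Assumed contrast measure built from the local minimum and maximum."""
    out = []
    for r in range(len(img)):
        row = []
        for c in range(len(img[0])):
            vals = window_values(img, r, c, half)
            lo, hi = min(vals), max(vals)
            row.append((hi - lo) / (hi + lo + eps))
        out.append(row)
    return out

if __name__ == "__main__":
    # Bright page (200) with one dark stroke pixel (20) and, separately,
    # one faint stroke pixel (180) to illustrate the low-contrast case.
    dark = [[200] * 5 for _ in range(5)]
    dark[2][2] = 20
    faint = [[200] * 5 for _ in range(5)]
    faint[2][2] = 180
    print(sauvola_binarize(dark))
    print(sauvola_binarize(faint))
    print(local_contrast(dark)[2][2])
```

With these default parameters, the dark pixel is detected as text while the faint one is not, which is the low-contrast weakness the abstract refers to.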

Notes

  1. Numerically, this is done using the Matlab function bwareaopen.
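Matlab's bwareaopen removes from a binary image every connected component with fewer than a given number of pixels, using 8-connectivity by default in 2-D. For readers without Matlab, a rough stdlib-only Python equivalent might look like the sketch below. The function name `remove_small_components` and the list-of-lists mask layout are hypothetical, and the excerpt does not say which size threshold the authors used.

```python
from collections import deque

def remove_small_components(mask, min_size):
    """Zero out 8-connected components of `mask` with fewer than `min_size` pixels."""
    rows, cols = len(mask), len(mask[0])
    out = [row[:] for row in mask]          # work on a copy, leave the input intact
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first search collects one connected component.
            component = [(r, c)]
            seen[r][c] = True
            queue = deque(component)
            while queue:
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                            component.append((ny, nx))
            if len(component) < min_size:
                for y, x in component:
                    out[y][x] = 0
    return out
```

For example, calling `remove_small_components(mask, 2)` on a mask that holds a three-pixel blob and one isolated pixel keeps the blob and clears the isolated pixel.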

Author information

Corresponding author

Correspondence to Zineb Hadjadj.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Hadjadj, Z., Meziane, A., Cherfa, Y., Cheriet, M., Setitra, I. (2016). ISauvola: Improved Sauvola’s Algorithm for Document Image Binarization. In: Campilho, A., Karray, F. (eds) Image Analysis and Recognition. ICIAR 2016. Lecture Notes in Computer Science, vol 9730. Springer, Cham. https://doi.org/10.1007/978-3-319-41501-7_82

  • DOI: https://doi.org/10.1007/978-3-319-41501-7_82

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-41500-0

  • Online ISBN: 978-3-319-41501-7

  • eBook Packages: Computer Science, Computer Science (R0)