A new efficient binarization method: application to degraded historical document images

  • Original Paper
  • Signal, Image and Video Processing

Abstract

Binarization is an important step in reading text documents automatically through optical character recognition. Old document images often suffer from degradations that make their binarization a challenging task. In this paper, a new binarization technique for degraded document images is presented. The proposed technique is based on active contours evolving according to intrinsic geometric measures of the document image. The image contrast that is defined by the local image maximum and minimum is used to automatically generate the initialization map of our active contour model; an average thresholding is also used to produce the final delineation and binarization. The proposed implementation benefits from the level set framework, which allows the simultaneous application of a large variety of forces at the stroke–background interface. Our binarization method involves the combination of those forces in a specific way. The efficiency of the proposed method is shown on both recent and historical document images of the Document Image Binarization Contest (DIBCO) datasets that include different types of degradations. The results are compared to a number of known techniques from the literature.
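As a rough illustration of a contrast map "defined by the local image maximum and minimum", the sketch below computes one common normalized form, C = (max - min) / (max + min + eps), over a small sliding window. The abstract does not give the exact formula, the window size, or the way the initialization map is derived from the contrast, so the function names, the `radius` and `eps` parameters, and the mean-based cut used to form a seed mask are all assumptions made for this example and are not the authors' procedure.

```python
def local_contrast(img, radius=1, eps=1e-8):
    """Contrast per pixel from the local max and min in a square window.

    Assumed form: (max - min) / (max + min + eps); the paper may differ.
    `img` is a list of rows of grey levels (0 = black, 255 = white).
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Collect the window, clipped at the image border.
            window = [
                img[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            lo, hi = min(window), max(window)
            out[y][x] = (hi - lo) / (hi + lo + eps)
    return out


def above_mean(values):
    """Hypothetical seed mask: 1 where a value exceeds the global mean."""
    flat = [v for row in values for v in row]
    threshold = sum(flat) / len(flat)
    return [[1 if v > threshold else 0 for v in row] for row in values]


# Toy page: bright background (200) with one dark vertical stroke (20).
page = [[200, 200, 20, 200, 200] for _ in range(5)]
contrast = local_contrast(page)
seed_mask = above_mean(contrast)
```

On this toy page the contrast is high only in the three columns whose window straddles the stroke edge, which is the kind of region a contour initialization would be expected to start from.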

Notes

  1. Numerically, this is done by using the function bwdist of MATLAB.

  2. http://users.iit.demokritos.gr/~bgat/DIBCO2009/benchmark/.

  3. http://utopia.duth.gr/~ipratika/DIBCO2011/benchmark/.

  4. http://utopia.duth.gr/~ipratika/HDIBCO2012/benchmark/.

  5. Numerically, this is done by using the function bwdist of MATLAB.
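For readers without MATLAB, the quantity that `bwdist` returns in its default mode (the Euclidean distance from each pixel to the nearest nonzero pixel) can be sketched in plain Python as below. This is a brute-force illustration for small masks only; the function name is hypothetical, and the notes do not say how the resulting distances are used in the method, so no further processing is shown.

```python
import math


def distance_to_nearest_foreground(mask):
    """Euclidean distance from every pixel to the nearest nonzero pixel.

    Mirrors the default behaviour of MATLAB's bwdist on a binary mask:
    foreground pixels get 0, and an all-zero mask yields infinity everywhere.
    Brute force, O(pixels * foreground pixels), so only for tiny inputs.
    """
    foreground = [
        (y, x)
        for y, row in enumerate(mask)
        for x, value in enumerate(row)
        if value
    ]
    return [
        [
            min(
                (math.hypot(y - fy, x - fx) for fy, fx in foreground),
                default=math.inf,
            )
            for x in range(len(row))
        ]
        for y, row in enumerate(mask)
    ]


# A single foreground pixel in the centre of a 3x3 mask.
mask = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
dist = distance_to_nearest_foreground(mask)
```

For real images a linear-time distance transform from an image-processing library would be the practical choice; the loop above only makes the definition concrete.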

Author information

Corresponding author

Correspondence to Zineb Hadjadj.

About this article

Cite this article

Hadjadj, Z., Cheriet, M., Meziane, A. et al. A new efficient binarization method: application to degraded historical document images. SIViP 11, 1155–1162 (2017). https://doi.org/10.1007/s11760-017-1070-2

  • DOI: https://doi.org/10.1007/s11760-017-1070-2