Abstract
Binarization is an important step in reading text documents automatically through optical character recognition. Old document images often suffer from degradations that make their binarization a challenging task. In this paper, a new binarization technique for degraded document images is presented. The proposed technique is based on active contours evolving according to intrinsic geometric measures of the document image. The image contrast, defined by the local image maximum and minimum, is used to automatically generate the initialization map of our active contour model; average thresholding is also used to produce the final delineation and binarization. The proposed implementation benefits from the level set framework, which allows a large variety of forces to be applied simultaneously at the stroke–background interface. Our binarization method combines those forces in a specific way. The efficiency of the proposed method is shown on both recent and historical document images from the Document Image Binarization Contest (DIBCO) datasets, which include different types of degradations. The results are compared with a number of known techniques from the literature.
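As a rough illustration of a contrast measure built from the local image maximum and minimum, the sketch below computes one common formulation, C = (max - min) / (max + min + eps), over a small window and then marks pixels whose contrast exceeds the global mean. The function names, the window radius, the eps term, and the use of a global mean as the threshold are assumptions made for this example; the abstract does not specify them, and this is not presented as the paper's implementation.

```python
def local_contrast(image, radius=1, eps=1e-6):
    """Per-pixel contrast (max - min) / (max + min + eps) in a square window.

    `radius` and `eps` are illustrative choices, not values from the paper.
    """
    rows, cols = len(image), len(image[0])
    contrast = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect the window around (r, c), clipped at the image border.
            window = [
                image[i][j]
                for i in range(max(0, r - radius), min(rows, r + radius + 1))
                for j in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            hi, lo = max(window), min(window)
            contrast[r][c] = (hi - lo) / (hi + lo + eps)
    return contrast


def mean_threshold(values):
    """Mark entries strictly above the global mean (a simple average threshold)."""
    flat = [v for row in values for v in row]
    mean = sum(flat) / len(flat)
    return [[1 if v > mean else 0 for v in row] for row in values]


# Toy grayscale patch: bright background (200) with a dark vertical stroke (50).
patch = [
    [200, 200, 50, 200, 200],
    [200, 200, 50, 200, 200],
    [200, 200, 50, 200, 200],
    [200, 200, 50, 200, 200],
]

contrast_map = local_contrast(patch)
init_map = mean_threshold(contrast_map)  # high-contrast band around the stroke
```

In this toy patch, the high-contrast band covers the stroke and its immediate neighbours, which is the kind of region an active contour could be initialized from before it is evolved toward the stroke boundary.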
Notes
Numerically, this is done by using the function bwdist of MATLAB.
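For readers without MATLAB, bwdist computes the Euclidean distance transform of a binary image: for each pixel, the distance to the nearest nonzero pixel. A minimal brute-force sketch of that operation follows. The function name is hypothetical, the quadratic-time search is chosen for clarity rather than speed, and returning infinity for an all-zero mask is a choice made in this sketch.

```python
import math


def euclidean_distance_transform(mask):
    """Distance from every pixel to the nearest nonzero pixel of `mask`.

    Brute-force illustration; production code would use a linear-time algorithm.
    """
    rows, cols = len(mask), len(mask[0])
    seeds = [(i, j) for i in range(rows) for j in range(cols) if mask[i][j]]
    if not seeds:
        # No nonzero pixel to measure against (choice made in this sketch).
        return [[math.inf] * cols for _ in range(rows)]
    return [
        [min(math.hypot(r - i, c - j) for i, j in seeds) for c in range(cols)]
        for r in range(rows)
    ]


# A 3x3 mask with a single nonzero pixel at the centre.
mask = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
dist = euclidean_distance_transform(mask)
```

Here the centre pixel has distance 0, its four edge neighbours have distance 1, and the corners have distance sqrt(2).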
Cite this article
Hadjadj, Z., Cheriet, M., Meziane, A. et al. A new efficient binarization method: application to degraded historical document images. SIViP 11, 1155–1162 (2017). https://doi.org/10.1007/s11760-017-1070-2
DOI: https://doi.org/10.1007/s11760-017-1070-2