Abstract
In this article, our goal is to describe, mathematically and experimentally, the gray-intensity distributions of the fore- and background of handwritten historical documents. We propose a local pixel model to explain the observed asymmetrical gray-intensity histograms of the fore- and background. Our pixel model states that, locally, the gray-intensity histogram is the mixture of the gray-intensity distributions of three pixel classes. Following our model, we empirically describe the smoothness of the background for different types of images. We show that our model has potential application in binarization. Assuming that the parameters of the gray-intensity distributions are correctly estimated, we show that thresholding methods based on mixtures of lognormal distributions outperform thresholding methods based on mixtures of normal distributions. Our model is supported by experimental tests conducted on images extracted from the DIBCO 2009 and H-DIBCO 2010 benchmarks. We also report results for all four DIBCO benchmarks.
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10032-013-0212-5/MediaObjects/10032_2013_212_Fig1_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10032-013-0212-5/MediaObjects/10032_2013_212_Fig2_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10032-013-0212-5/MediaObjects/10032_2013_212_Fig3_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10032-013-0212-5/MediaObjects/10032_2013_212_Fig4_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10032-013-0212-5/MediaObjects/10032_2013_212_Fig5_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10032-013-0212-5/MediaObjects/10032_2013_212_Fig6_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10032-013-0212-5/MediaObjects/10032_2013_212_Fig7_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10032-013-0212-5/MediaObjects/10032_2013_212_Fig8_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10032-013-0212-5/MediaObjects/10032_2013_212_Fig9_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10032-013-0212-5/MediaObjects/10032_2013_212_Fig10_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10032-013-0212-5/MediaObjects/10032_2013_212_Fig11_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10032-013-0212-5/MediaObjects/10032_2013_212_Fig12_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10032-013-0212-5/MediaObjects/10032_2013_212_Fig13_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10032-013-0212-5/MediaObjects/10032_2013_212_Fig14_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10032-013-0212-5/MediaObjects/10032_2013_212_Fig15_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10032-013-0212-5/MediaObjects/10032_2013_212_Fig16_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs10032-013-0212-5/MediaObjects/10032_2013_212_Fig17_HTML.gif)
Notes
An inverted lognormal is a lognormal distribution that is reflected about a constant; see the formal definition in Appendix 2.
Acknowledgments
We would like to thank the Asociación Mexicana de Cultura A.C. We are also grateful to the editor and all the reviewers for their constructive and meticulous comments.
Appendices
Appendix 1: Frontier pixel convergence
Let \(X\sim N(\mu _1,\sigma _1),\,Y\sim N(\mu _2,\sigma _2)\) and \(U\sim \hbox {Unif}(0,1)\), all independent. In this section we prove that if \(W\) is defined as \(W:=UX+(1-U)Y\), then
1. If \(U\) is a degenerate random variable with value \(u\), we have that \(W\sim N(u\mu _1+(1-u)\mu _2, \sqrt{u^2\sigma _1^2+(1-u)^2\sigma _2^2})\).
2. If \(\sigma _1=\sigma _2=\sigma \), then \(W\) has lighter tails than a random variable that is normally distributed with standard deviation \(\sqrt{3}\sigma \).
3. As \(\sigma _1,\sigma _2 \rightarrow 0\), \(W\) tends to a uniformly distributed random variable.
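Proof of 1: From the properties of moment generating functions (MGFs) and the independence of \(X\) and \(Y\), we have:
$$\begin{aligned} M_W(t) = M_X(ut)\,M_Y((1-u)t), \end{aligned}$$ (15)
where \(M_W(t):=E(\exp \{tW\})\).
Since the MGF of a normal random variable with parameters \((\mu , \sigma )\) is
$$\begin{aligned} M(t) = \exp \left\{ \mu t + \frac{\sigma ^2 t^2}{2} \right\} , \end{aligned}$$
then (15) is equal to
$$\begin{aligned} \exp \left\{ t\left( u\mu _1+(1-u)\mu _2\right) + \frac{t^2}{2}\left( u^2\sigma _1^2+(1-u)^2\sigma _2^2\right) \right\} , \end{aligned}$$
which corresponds to a random variable that is normally distributed with the specified parameters.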
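As a numerical sanity check of statement 1, the following minimal sketch compares the stated mean and standard deviation with sample estimates for a fixed weight \(u\). The helper name `mixed_normal_params` and all numeric values are illustrative assumptions, not taken from the article.

```python
import numpy as np

def mixed_normal_params(u, mu1, sigma1, mu2, sigma2):
    """Mean and standard deviation of W = u*X + (1-u)*Y for a fixed weight u."""
    mean = u * mu1 + (1.0 - u) * mu2
    std = np.sqrt(u**2 * sigma1**2 + (1.0 - u)**2 * sigma2**2)
    return mean, std

# Illustrative gray-intensity values (assumed, not from the article).
u, mu1, sigma1, mu2, sigma2 = 0.3, 60.0, 8.0, 200.0, 12.0

rng = np.random.default_rng(0)
x = rng.normal(mu1, sigma1, size=200_000)
y = rng.normal(mu2, sigma2, size=200_000)
w = u * x + (1.0 - u) * y  # samples of W for the fixed weight u

mean, std = mixed_normal_params(u, mu1, sigma1, mu2, sigma2)
print("closed form:", mean, std)
print("sampled    :", w.mean(), w.std())
```

With a sample of this size, the two printed pairs should agree to within sampling error.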
Proof of 2: From the properties of MGF’s, we have:
where the function \(M_{W|u}(t)\) denotes the moment generating function of \(W\) with respect to the conditional density \(f_{X,Y|U}(x,y|u)\).
Then the conditional MGF in (19) is equal to
Without loss of generality, assume that \(\mu _1>\mu _2\). To obtain the unconditional MGF of \(W\), we integrate the last expression over the domain of \(U\) as follows:
where
If \(t>0\), then (23) is smaller than or equal to
Similarly, if \(t<0\) then (23) is smaller than or equal to
Since in both cases the moment generating function is dominated by the MGF of a normally distributed random variable with variance \(3\sigma ^2\), the conclusion follows.
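Proof of 3: Based on (20) we have that
$$\begin{aligned} M_{W|u}(t) \rightarrow \exp \left\{ t\left( u\mu _1+(1-u)\mu _2\right) \right\} \end{aligned}$$
as \(\sigma _1,\sigma _2 \rightarrow 0\).
To obtain the MGF of \(W\) we integrate this limit of \(M_{W|u}(t)\) over the domain of \(U\) as
$$\begin{aligned} \int _0^1 \exp \left\{ t\left( u\mu _1+(1-u)\mu _2\right) \right\} \,du = \frac{\exp \{t\mu _1\}-\exp \{t\mu _2\}}{t(\mu _1-\mu _2)}, \end{aligned}$$
which, for \(\mu _1\ne \mu _2\), is the MGF of a uniformly distributed random variable with lower and upper limits equal to \(\min \{\mu _1,\mu _2\}\) and \(\max \{\mu _1,\mu _2\}\), respectively.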
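Statement 3 can also be illustrated by simulation. The sketch below draws samples of \(W\) for decreasing values of a common standard deviation and measures the largest gap between the empirical distribution function and that of a uniform distribution on \([\min \{\mu _1,\mu _2\},\max \{\mu _1,\mu _2\}]\). The function names, sample sizes, and parameter values are assumptions made for illustration only.

```python
import numpy as np

def sample_frontier(mu1, mu2, sigma, n, rng):
    """Draw n samples of W = U*X + (1-U)*Y with X~N(mu1, sigma),
    Y~N(mu2, sigma) and U~Unif(0, 1), all independent."""
    u = rng.uniform(0.0, 1.0, size=n)
    x = rng.normal(mu1, sigma, size=n)
    y = rng.normal(mu2, sigma, size=n)
    return u * x + (1.0 - u) * y

def max_cdf_gap_to_uniform(w, lo, hi):
    """Largest gap between the empirical CDF of w and the Unif(lo, hi) CDF."""
    w = np.sort(w)
    ecdf = np.arange(1, w.size + 1) / w.size
    ucdf = np.clip((w - lo) / (hi - lo), 0.0, 1.0)
    return float(np.max(np.abs(ecdf - ucdf)))

rng = np.random.default_rng(1)
mu1, mu2 = 200.0, 60.0  # illustrative class means (assumed)

gaps = {}
for sigma in (20.0, 5.0, 0.5):
    w = sample_frontier(mu1, mu2, sigma, 100_000, rng)
    gaps[sigma] = max_cdf_gap_to_uniform(w, min(mu1, mu2), max(mu1, mu2))
    print(f"sigma={sigma:5.1f}  max CDF gap={gaps[sigma]:.4f}")
```

The gap should shrink as the standard deviation decreases, until only sampling noise remains.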
Appendix 2: Quasi-thresholding methods
To simplify our notation, the subindexes \(f,\,b,\,if,\,of,\,ib\), and \(ob\) abbreviate the foreground, background, inner foreground, outer foreground, inner background, and outer background sets, respectively. Furthermore, we also simplify our notation of the means and variances of gray intensities of a set \(\mathcal{A }\) by
1.1 Quasi-threshold \(LI\)
The mixture \(LI\) models the gray-intensity histogram as the mixture of two distributions: lognormal for the foreground and inverted lognormal for the background. Formally, its threshold is defined by Bayes' rule as the value \(x\) that satisfies:
such that \(\hat{\mu }_{f} < x < \hat{\mu }_{b}\), where
\(\lambda (x;\tilde{\mu }_{f},\tilde{\sigma }_{f})\) and \(\tilde{\lambda }(x;c_{b},\tilde{\tilde{\mu }}_{b},\tilde{\sigma }_{b})\) denote the probability density functions of the lognormal and inverted lognormal distributions. These functions are given by:
(1) Lognormal:
$$\begin{aligned} \lambda (x;\tilde{\mu }_{f},\tilde{\sigma }_{f}) = \frac{1}{x\tilde{\sigma }_{f}\sqrt{2\pi }} \exp \left( -\frac{ (\ln (x) - \tilde{\mu }_{f} )^{2} }{2\tilde{\sigma }^{2}_{f}} \right) , \end{aligned}$$ (36)
where
$$\begin{aligned} \tilde{\mu }_{f}&= \ln (\hat{\mu }_{f}) - \frac{1}{2}\ln \left( 1 + \frac{\hat{\sigma }^{2}_{f}}{\hat{\mu }^{2}_{f}} \right) \quad \text { and } \end{aligned}$$ (37)
$$\begin{aligned} \tilde{\sigma }^{2}_{f}&= \ln \left( 1 + \frac{\hat{\sigma }^{2}_{f}}{\hat{\mu }^{2}_{f}} \right) . \end{aligned}$$ (38)
(2) Inverted lognormal:
$$\begin{aligned} \tilde{\lambda }(x;c_{b},\tilde{\tilde{\mu }}_{b},\tilde{\sigma }_{b}) = \lambda (c_{b} - x;\tilde{\tilde{\mu }}_{b},\tilde{\sigma }_{b}), \end{aligned}$$ (39)
where \(\tilde{\sigma }_{b}\) is computed in a manner analogous to \(\tilde{\sigma }_{f}\), using the mean \(c_{b} - \hat{\mu }_{b}\) and the variance \(\hat{\sigma }^{2}_{b}\),
$$\begin{aligned} \tilde{\tilde{\mu }}_{b}&= \ln (c_{b} - \hat{\mu }_{b}) - \frac{1}{2}\ln \left( 1 + \frac{\hat{\sigma }^{2}_{b}}{[c_{b} - \hat{\mu }_{b}]^{2}} \right) , \quad \text { and } \end{aligned}$$ (40)
$$\begin{aligned} c_{b}&= \underset{ \varvec{p} \in \mathcal{B } }{\max } \, \left( I(\varvec{p} ) \right) + 1. \end{aligned}$$ (41)
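The parameter conversion and the threshold search for \(LI\) can be sketched as follows. The sketch uses the standard moment-matching identities for a lognormal distribution with a given mean and variance, and it assumes that Bayes' rule amounts to equating the two densities weighted by class priors `w_f` and `w_b`; these priors, the bisection search, the function names, and the numeric values are illustrative assumptions rather than the article's stated procedure.

```python
import math

def lognormal_params(mean, var):
    """Moment matching: log-scale (mu, sigma) of a lognormal
    distribution with the given mean and variance."""
    s2 = math.log(1.0 + var / mean**2)
    return math.log(mean) - 0.5 * s2, math.sqrt(s2)

def lognormal_pdf(x, mu, sigma):
    """Lognormal density; zero outside the support x > 0."""
    if x <= 0.0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

def inverted_lognormal_pdf(x, c, mu, sigma):
    """Lognormal reflected about the constant c: c - x is lognormal."""
    return lognormal_pdf(c - x, mu, sigma)

def li_threshold(mean_f, var_f, w_f, mean_b, var_b, w_b, c_b, iters=80):
    """Bisection for w_f*lambda(x) = w_b*lambda~(x) on (mean_f, mean_b).
    The priors w_f and w_b are an assumption of this sketch."""
    mu_f, s_f = lognormal_params(mean_f, var_f)
    mu_b, s_b = lognormal_params(c_b - mean_b, var_b)

    def g(x):
        return (w_f * lognormal_pdf(x, mu_f, s_f)
                - w_b * inverted_lognormal_pdf(x, c_b, mu_b, s_b))

    lo, hi = mean_f, mean_b
    if g(lo) <= 0.0 or g(hi) >= 0.0:
        raise ValueError("no sign change between the class means")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid   # foreground still dominates: move right
        else:
            hi = mid   # background dominates: move left
    return 0.5 * (lo + hi)

# Illustrative 8-bit example: dark ink, bright paper, c_b = 255 + 1.
t = li_threshold(60.0, 400.0, 0.1, 200.0, 225.0, 0.9, 256.0)
print("LI threshold:", t)
```

Bisection is used here only because the crossing point has no closed form for this mixture; any bracketing root finder would serve equally well.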
1.2 Quasi-thresholding \(NLIN\)
The mixture \(NLIN\) models the gray-intensity histogram as the mixture of the gray-intensity distributions of the inner foreground, outer foreground, inner background, and outer background. Such sets are estimated as:
where \(\mathcal{E }\) denotes the set of 8-edge pixels:
Once the frontier pixels are estimated, the gray-intensity distribution of the foreground is modeled as the mixture of a normal distribution (corresponding to the inner foreground) and a lognormal distribution (corresponding to the outer foreground). On the other hand, the gray-intensity distribution of the background is modeled as the mixture of a normal distribution (corresponding to the inner background) and an inverted lognormal distribution (corresponding to the outer background).
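The partition into inner and outer sets can be sketched as follows, given a binary estimate of the foreground. The sketch assumes that a pixel is a frontier (outer) pixel when at least one of its 8-neighbours belongs to the opposite class; this is one plausible reading of the 8-edge set, not necessarily the exact definition used in the article. The function name and dictionary keys are hypothetical.

```python
import numpy as np

def split_inner_outer(fg):
    """Split a boolean foreground mask into the four sets if, of, ib, ob.

    Assumption of this sketch: a pixel is 'outer' when at least one of
    its 8-neighbours belongs to the opposite class.
    """
    fg = np.asarray(fg, dtype=bool)
    h, w = fg.shape
    # Replicate the border so the image boundary creates no false edges.
    padded = np.pad(fg, 1, mode="edge")
    near_fg = np.zeros_like(fg)  # some 8-neighbour is foreground
    near_bg = np.zeros_like(fg)  # some 8-neighbour is background
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            nb = padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
            near_fg |= nb
            near_bg |= ~nb
    return {
        "of": fg & near_bg,     # outer foreground
        "if": fg & ~near_bg,    # inner foreground
        "ob": ~fg & near_fg,    # outer background
        "ib": ~fg & ~near_fg,   # inner background
    }

# Toy example: a 3x3 stroke in the middle of a 7x7 image.
mask = np.zeros((7, 7), dtype=bool)
mask[2:5, 2:5] = True
sets = split_inner_outer(mask)
print({name: int(m.sum()) for name, m in sets.items()})
```

The sample mean and variance of the gray intensities inside each of the four masks would then supply the parameters of the corresponding mixture components.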
Formally, the threshold of \(NLIN\) is defined by Bayes rule as the value \(x\) that satisfies:
such that \(\hat{\mu }_{f} < x < \hat{\mu }_{b}\), where
The functions \(\lambda (x;\tilde{\mu }_{of},\tilde{\sigma }_{of})\) and \(\tilde{\lambda }(x;c_{ob},\tilde{\tilde{\mu }}_{ob},\tilde{\sigma }_{ob})\) are defined in a similar manner as in the section “Quasi-threshold \(LI\)” of this appendix; \(\phi (x;\hat{\mu }_{if},\hat{\sigma }_{if})\) denotes the probability density function of a normal distribution given by:
The function \(\phi (x;\hat{\mu }_{ib},\hat{\sigma }_{ib})\) is defined in a similar manner.
1.3 Quasi-thresholding methods based on normal distributions
We implemented two mixtures based on normal distributions: \(NN\) and \(NNNN\). The former mixes two normal distributions to approximate the gray-intensity distribution, while the latter mixes four normal distributions. Their parameters are estimated in a similar manner as in the previous subsections.
The threshold of \(NN\) is defined by Bayes rule as the value \(x\) that satisfies:
such that \(\hat{\mu }_{f} < x < \hat{\mu }_{b}\). Likewise, the threshold of \(NNNN\) is defined as:
such that \(\hat{\mu }_{f} < x < \hat{\mu }_{b}\), where
and
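For the two-component case \(NN\), equating two weighted normal densities and taking logarithms yields a quadratic equation in \(x\), so the threshold can be computed in closed form. The sketch below assumes class priors `w_f` and `w_b` as weights and returns the root that lies between the two class means; the function name, the priors, and the numeric values are illustrative assumptions.

```python
import math

def nn_threshold(mu_f, sigma_f, w_f, mu_b, sigma_b, w_b):
    """Solve w_f*phi(x; mu_f, sigma_f) = w_b*phi(x; mu_b, sigma_b)
    for x in (mu_f, mu_b). Returns None if no such root exists."""
    # Log of both sides gives a*x^2 + b*x + c = 0.
    a = 1.0 / (2.0 * sigma_b**2) - 1.0 / (2.0 * sigma_f**2)
    b = mu_f / sigma_f**2 - mu_b / sigma_b**2
    c = (mu_b**2 / (2.0 * sigma_b**2) - mu_f**2 / (2.0 * sigma_f**2)
         + math.log((w_f * sigma_b) / (w_b * sigma_f)))
    if abs(a) < 1e-12:
        # Equal variances: the equation is linear.
        roots = [-c / b]
    else:
        disc = b * b - 4.0 * a * c
        if disc < 0.0:
            return None
        r = math.sqrt(disc)
        roots = [(-b - r) / (2.0 * a), (-b + r) / (2.0 * a)]
    inside = [x for x in roots if mu_f < x < mu_b]
    return inside[0] if inside else None

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Illustrative values: dark ink with low prior, bright paper with high prior.
t_nn = nn_threshold(60.0, 20.0, 0.1, 200.0, 15.0, 0.9)
print("NN threshold:", t_nn)
```

With equal variances and equal priors the threshold reduces to the midpoint of the two means, which gives a quick check of the algebra.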
Ramírez-Ortegón, M.A., Ramírez-Ramírez, L.L., Messaoud, I.B. et al. A model for the gray-intensity distribution of historical handwritten documents and its application for binarization. IJDAR 17, 139–160 (2014). https://doi.org/10.1007/s10032-013-0212-5
DOI: https://doi.org/10.1007/s10032-013-0212-5