Abstract
Binarization of highly degraded document images is a key image preprocessing step that influences the results of subsequent text recognition and document analysis. Because the contamination visible on such documents is usually local, the most popular fast global thresholding methods should not be applied directly to such images. On the other hand, typical adaptive methods based on analysing the neighbourhood of each pixel are time consuming and do not always lead to satisfactory results. To bridge the gap between these two approaches, this paper proposes region based modifications of selected histogram based thresholding methods. The approach has been verified for the well-known Otsu, Rosin and Kapur algorithms using challenging images from the Bickley Diary dataset. The results obtained for the region based Otsu and Kapur methods are superior to those of the corresponding global methods and may form the basis for further research towards combined region based binarization of degraded document images.
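The general idea of region based thresholding can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes non-overlapping square regions, an 8-bit greyscale input, and a hypothetical `block` size parameter, and it computes a separate Otsu threshold from the histogram of each region. The function names `otsu_threshold` and `region_otsu` are chosen here for illustration only.

```python
import numpy as np


def otsu_threshold(gray):
    """Return the Otsu threshold (0..255) for a uint8 array.

    Pixels with value <= threshold form the dark class.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    levels = np.arange(256)
    w0 = np.cumsum(hist)             # number of pixels with value <= t
    w1 = hist.sum() - w0             # number of pixels with value > t
    s0 = np.cumsum(hist * levels)    # intensity sum of the dark class
    valid = (w0 > 0) & (w1 > 0)      # both classes must be non-empty
    if not valid.any():
        # Flat region with a single grey level: no meaningful split exists.
        # Returning that level is an arbitrary choice made for this sketch.
        return int(gray.flat[0])
    mu0 = s0[valid] / w0[valid]
    mu1 = (s0[-1] - s0[valid]) / w1[valid]
    between = w0[valid] * w1[valid] * (mu0 - mu1) ** 2
    return int(levels[valid][np.argmax(between)])


def region_otsu(gray, block=64):
    """Binarize a uint8 image with one Otsu threshold per block x block region.

    Assumption: regions are non-overlapping tiles; edge tiles may be smaller.
    """
    out = np.zeros_like(gray)
    height, width = gray.shape
    for y in range(0, height, block):
        for x in range(0, width, block):
            tile = gray[y:y + block, x:x + block]
            t = otsu_threshold(tile)
            # Pixels brighter than the local threshold become white (255).
            out[y:y + block, x:x + block] = (tile > t).astype(np.uint8) * 255
    return out
```

Calling `region_otsu(img, block=64)` on a page whose left and right halves have different background brightness thresholds each half against its own histogram, which is the situation where a single global threshold tends to fail. The region size is a design choice: smaller regions follow local contamination more closely but give less reliable histograms.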
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Michalak, H., Okarma, K. (2019). Region Based Approach for Binarization of Degraded Document Images. In: Pejaś, J., El Fray, I., Hyla, T., Kacprzyk, J. (eds) Advances in Soft and Hard Computing. ACS 2018. Advances in Intelligent Systems and Computing, vol 889. Springer, Cham. https://doi.org/10.1007/978-3-030-03314-9_37
DOI: https://doi.org/10.1007/978-3-030-03314-9_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03313-2
Online ISBN: 978-3-030-03314-9
eBook Packages: Intelligent Technologies and Robotics (R0)