Abstract
In this article, we address the problem of denoising and enhancing color archival handwritten document images by separating noise from text and background. Archival document images obtained by scanning or photographing paper documents are mostly digitized in full color, so an enhancement or denoising method should preserve and exploit the color information. Our work therefore focuses on modeling a color image as a hyperspace, formed by the image pixels and built from both the topological and the color spaces. The novelty of our work lies in exploiting this hyperspace to cluster the extracted low-level features (topological and color) and then to separate noise from text and background. By combining the hyperspace with an adapted kernel-based intuitionistic fuzzy c-means (KIFCM) algorithm, we propose a novel hyper-KIFCM (HKIFCM) method for denoising color historical document images. To illustrate the effectiveness of the HKIFCM method, we first conducted a thorough experimental study, with qualitative and quantitative observations, on color archival handwritten document images collected from the Tunisian national archives and from two datasets provided in the context of open competitions at the ICDAR and ICFHR conferences. We then compared the results with those obtained using state-of-the-art methods.
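To make the idea of clustering pixels in a joint topological and color space more concrete, the following is a minimal sketch and not the HKIFCM method itself. It assumes that the hyperspace can be approximated by concatenating normalized pixel coordinates with normalized RGB values, that the kernel is Gaussian, and that the intuitionistic hesitation degree comes from a Sugeno-type complement. The function names (`pixel_features`, `kernel_ifcm`) and the parameters (`sigma`, `lam`, `m`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def pixel_features(image):
    """Return one 5-D row per pixel: normalized (row, col) plus normalized RGB.

    Assumption: this concatenation stands in for the article's hyperspace.
    """
    h, w, _ = image.shape
    rows, cols = np.mgrid[0:h, 0:w]
    coords = np.stack([rows / max(h - 1, 1), cols / max(w - 1, 1)], axis=-1)
    colors = image.astype(np.float64) / 255.0
    return np.concatenate([coords, colors], axis=-1).reshape(h * w, 5)


def kernel_ifcm(X, n_clusters=3, m=2.0, sigma=1.0, lam=2.0, n_iter=50, seed=0):
    """Gaussian-kernel fuzzy c-means with an intuitionistic hesitation term.

    Assumptions: Gaussian kernel of width sigma, Sugeno-type complement with
    parameter lam, fuzzifier m. Returns (memberships, centers).
    """
    rng = np.random.default_rng(seed)
    # Initialize the centers with randomly chosen, distinct data points.
    centers = X[rng.choice(len(X), size=n_clusters, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance of every point to every center.
        sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        K = np.exp(-sq_dist / sigma ** 2)
        # Kernel-induced distance is proportional to 1 - K(x, v).
        d = np.maximum(1.0 - K, 1e-12)
        u = d ** (-1.0 / (m - 1.0))
        u /= u.sum(axis=1, keepdims=True)
        # Hesitation degree: 1 - membership - non-membership (Sugeno complement).
        hesitation = 1.0 - u - (1.0 - u) / (1.0 + lam * u)
        u_star = u + hesitation
        # Kernel-weighted update of the cluster centers.
        weights = (u_star ** m) * K
        centers = (weights.T @ X) / weights.sum(axis=0)[:, None]
    return u_star, centers


# Usage on a tiny synthetic image: dark left half, light right half.
image = np.zeros((6, 6, 3), dtype=np.uint8)
image[:, 3:] = 255
features = pixel_features(image)
memberships, centers = kernel_ifcm(features, n_clusters=2)
labels = memberships.argmax(axis=1).reshape(image.shape[:2])
```

In a denoising setting, each pixel would then be assigned to the cluster with the highest membership, and the clusters would be interpreted as text, background, or noise. How the clusters are labeled and how the result is mapped back to a cleaned image are left open here, since the abstract does not specify them.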











Notes
Some of the image samples used in our experiments are temporarily available at https://drive.google.com/open?id=1X-SDB2CmT3cfkB8dTdS3mEWR5-KLkwsa and on request, subject to the agreement of the Tunisian national archives (ANT).
Acknowledgements
The authors would like to acknowledge the Tunisian national archives for providing access to their digital collections.
Cite this article
Elhedda, W., Mehri, M. & Mahjoub, M.A. Hyperkernel-based intuitionistic fuzzy c-means for denoising color archival document images. IJDAR 23, 161–181 (2020). https://doi.org/10.1007/s10032-020-00352-2