Dynamic hidden feature space detection of noisy image set by weight binarization

  • Original Paper
  • Published in: Signal, Image and Video Processing

Abstract

Historical documents mostly exist in printed form, and preserving and restoring them is costly because of the storage space and physical inspection they require. Scanners can convert these materials into electronic form, but the resulting images are polluted with noise, which increases storage demand and lowers OCR precision. Noise reduction is the most appropriate way to address this. Converting the image to low-resolution grayscale and then binarizing it reduces the size of the input data. The hidden feature space is then extracted from the binary images by the KF-CM method, based on binary pixel quantization. In the preprocessing stage, the local-minimal points in binarized image segments define 33 variables. After preprocessing, the KF-CM method groups the pixels of the scanned document image into noise, text, and background categories based on their characteristics, so noise reduction and binarization are completed at the same time. The proposed approach binarizes the bit planes of a noisy image by choosing local thresholds. It is evaluated on document image datasets and compared with widely used binarization-based feature extraction methods, and it is reported to outperform all of the compared methods.
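
The steps named in the abstract (bit-plane decomposition, local thresholding, and grouping pixels into text, noise, and background) can be illustrated with a minimal NumPy sketch. This is not the authors' KF-CM method, whose details are not given here: the clustering step below is a plain 1-D k-means stand-in, and the function names, the block size, and the mapping of the darkest cluster to text are all assumptions made for illustration.

```python
import numpy as np


def bit_planes(gray):
    """Split a uint8 grayscale image into 8 binary bit planes (LSB first)."""
    gray = np.asarray(gray, dtype=np.uint8)
    return [(gray >> k) & 1 for k in range(8)]


def local_mean_threshold(gray, block=8, offset=0.0):
    """Binarize with one threshold per block (hypothetical block size).

    A pixel becomes 1 (ink) when it is darker than its block mean minus offset.
    """
    gray = np.asarray(gray, dtype=np.float64)
    out = np.zeros(gray.shape, dtype=np.uint8)
    h, w = gray.shape
    for r in range(0, h, block):
        for c in range(0, w, block):
            tile = gray[r:r + block, c:c + block]
            out[r:r + block, c:c + block] = tile < (tile.mean() - offset)
    return out


def kmeans_1d(values, k=3, iters=20):
    """Plain 1-D k-means on intensities; a stand-in, NOT the paper's KF-CM."""
    values = np.asarray(values, dtype=np.float64).ravel()
    # Evenly spaced, sorted starting centers keep cluster indices ordered
    # from darkest (0) to brightest (k - 1).
    centers = np.linspace(values.min(), values.max(), k)
    for _ in range(iters):
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = values[labels == j].mean()
    labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
    return labels, centers


def classify_pixels(gray):
    """Label each pixel 0, 1, or 2 by intensity cluster.

    Assumption: 0 (darkest) = text, 1 (middle) = noise, 2 (brightest) = background.
    """
    gray = np.asarray(gray)
    labels, _ = kmeans_1d(gray, k=3)
    return labels.reshape(gray.shape)


def denoise_and_binarize(gray):
    """Keep only text-labeled pixels, so both steps happen in one pass."""
    return (classify_pixels(gray) == 0).astype(np.uint8)
```

On a synthetic page with dark strokes and mid-gray specks, `local_mean_threshold` alone marks the specks as ink, because they are darker than their block mean, whereas `denoise_and_binarize` drops them. That difference is the motivation for a three-way grouping rather than a two-way threshold.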

Funding

The authors received no specific funding for this study.

Author information

Corresponding author

Correspondence to K. S. Umadevi.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

The manuscript has not been submitted to more than one journal for simultaneous consideration. The manuscript has not been published previously. The research does not involve human participants and/or animals.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Umadevi, K.S., Thakare, K.S., Patil, S. et al. Dynamic hidden feature space detection of noisy image set by weight binarization. SIViP 17, 761–768 (2023). https://doi.org/10.1007/s11760-022-02284-2

  • DOI: https://doi.org/10.1007/s11760-022-02284-2
