A New Deep Fuzzy Based MSER Model for Multiple Document Images Classification

Biswas, Kunal; Shivakumara, Palaiahnakote; Sivanthi, Sittravell; Pal, Umapada; Lu, Yue; Liu, Cheng-Lin; Ayub, Mohamad Nizam Bin

doi:10.1007/978-3-031-09037-0_30

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13363)

Included in the following conference series:

International Conference on Pattern Recognition and Artificial Intelligence

Abstract

Understanding document images uploaded to social media is challenging because they span multiple types, such as handwritten, printed, and scene text images. This study presents a new model, called Deep Fuzzy based MSER, for classifying such document images. The proposed model detects candidate components that represent dominant information, irrespective of the document type, by combining fuzzy logic and MSER (maximally stable extremal regions) in a novel way. For every candidate component, the model extracts distance-based features, which yield a proximity matrix (feature matrix). A deep learning model then performs the classification, taking the input image and the feature matrix as input. To evaluate the proposed model, we create our own dataset, and to show its effectiveness, we also test it on standard datasets. The results show that the proposed approach outperforms existing methods in terms of average classification rate.
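
To make the idea of a proximity matrix built from distance-based features concrete, the following is a minimal sketch rather than the authors' method. The abstract does not say which points or which distance are used, so the sketch assumes that each candidate component is summarised by its centroid and that proximity is the Euclidean distance between centroids. The function name `proximity_matrix` and the sample coordinates are hypothetical.

```python
import math


def proximity_matrix(centroids):
    """Return the symmetric matrix of pairwise Euclidean distances.

    Assumption: each candidate component is represented by one (x, y)
    centroid. The abstract does not specify this representation.
    """
    n = len(centroids)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(centroids[i], centroids[j])
            # Distance is symmetric, so fill both triangles at once.
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


# Hypothetical centroids of three candidate components.
centroids = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
P = proximity_matrix(centroids)
for row in P:
    print(row)
```

In a pipeline like the one the abstract outlines, a matrix of this kind would be supplied to the classifier alongside the input image. How the two inputs are combined is not described in the abstract and is left open here.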

Acknowledgement

Yue Lu's work is supported by the National Key Research and Development Program of China under Grant No. 2020AAA0107903, the National Natural Science Foundation of China under Grant No. 62176091, and the Shanghai Natural Science Foundation under Grant No. 19ZR1415900. The work was also partially supported by TIH, ISI, Kolkata.

Author information

Authors and Affiliations

Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, India
Kunal Biswas, Umapada Pal & Mohamad Nizam Bin Ayub
Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia
Palaiahnakote Shivakumara & Sittravell Sivanthi
Shanghai Key Laboratory of Multidimensional Information Processing, East China Normal University, Shanghai, China
Yue Lu
National Laboratory of Pattern Recognition, Institute of Automation of Chinese Academy of Sciences, Beijing, China
Cheng-Lin Liu
School of Artificial Intelligence, University of Chinese Academy, Beijing, China
Cheng-Lin Liu

Corresponding author

Correspondence to Palaiahnakote Shivakumara.

Editor information

Editors and Affiliations

Télécom SudParis, Palaiseau, France
Mounîm El Yacoubi
École de Technologie Supérieure, Montreal, QC, Canada
Eric Granger
Hong Kong Baptist University, Kowloon, Hong Kong
Pong Chi Yuen
Indian Statistical Institute, Kolkata, India
Umapada Pal
Université Paris Cité, Paris, France
Nicole Vincent

Cite this paper

Biswas, K. et al. (2022). A New Deep Fuzzy Based MSER Model for Multiple Document Images Classification. In: El Yacoubi, M., Granger, E., Yuen, P.C., Pal, U., Vincent, N. (eds) Pattern Recognition and Artificial Intelligence. ICPRAI 2022. Lecture Notes in Computer Science, vol 13363. Springer, Cham. https://doi.org/10.1007/978-3-031-09037-0_30

DOI: https://doi.org/10.1007/978-3-031-09037-0_30
Published: 02 June 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-09036-3
Online ISBN: 978-3-031-09037-0
eBook Packages: Computer Science, Computer Science (R0)
