Abstract
Understanding document images uploaded to social media is challenging because they span multiple types, such as handwritten, printed and scene text images. This study presents a new model, called Deep Fuzzy based MSER, for classifying these document image types. The proposed model detects candidate components that represent the dominant information, irrespective of the type of document image, by combining fuzzy logic with Maximally Stable Extremal Regions (MSER) in a novel way. For every candidate component, the model extracts distance-based features, which yield a proximity matrix (the feature matrix). A deep learning model then performs the classification, taking both the input image and the feature matrix as input. We evaluate the proposed model on a dataset we created and, to demonstrate its effectiveness, on standard datasets. The results show that the proposed approach outperforms existing methods in terms of average classification rate.
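The abstract does not say which distances make up the proximity matrix. As a rough illustration only, the sketch below assumes that each candidate component is summarized by its centroid and that the matrix holds pairwise Euclidean distances between centroids. The function name `proximity_matrix` and the sample centroids are hypothetical and are not taken from the paper.

```python
import numpy as np


def proximity_matrix(centroids):
    """Return the pairwise Euclidean distances between component centroids.

    Assumption: one (x, y) centroid per candidate component. The paper's
    actual distance-based features may differ.
    """
    pts = np.asarray(centroids, dtype=float)   # shape (n, 2)
    diff = pts[:, None, :] - pts[None, :, :]   # shape (n, n, 2) via broadcasting
    return np.sqrt((diff ** 2).sum(axis=-1))   # shape (n, n)


# Hypothetical centroids of three candidate components.
centroids = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
P = proximity_matrix(centroids)
print(P)
```

Under these assumptions the matrix is symmetric with a zero diagonal, so a classifier could receive it as a fixed-size feature map once the number of components is padded or truncated to a common size.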
Acknowledgement
Yue Lu's work is supported by the National Key Research and Development Program of China under Grant No. 2020AAA0107903, the National Natural Science Foundation of China under Grant No. 62176091, and the Shanghai Natural Science Foundation under Grant No. 19ZR1415900. This work is also partially supported by TIH, ISI, Kolkata.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Biswas, K. et al. (2022). A New Deep Fuzzy Based MSER Model for Multiple Document Images Classification. In: El Yacoubi, M., Granger, E., Yuen, P.C., Pal, U., Vincent, N. (eds) Pattern Recognition and Artificial Intelligence. ICPRAI 2022. Lecture Notes in Computer Science, vol 13363. Springer, Cham. https://doi.org/10.1007/978-3-031-09037-0_30
DOI: https://doi.org/10.1007/978-3-031-09037-0_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-09036-3
Online ISBN: 978-3-031-09037-0
eBook Packages: Computer Science, Computer Science (R0)