Ensemble deep learning model for optical character recognition

Shetty, Ashish; Sharma, Sanjeev

doi:10.1007/s11042-023-16018-0

Published: 28 June 2023

Volume 83, pages 11411–11431 (2024)

Multimedia Tools and Applications

Abstract

In modern deep learning, character recognition in images is an important field of study because it has many real-life applications. The goal of this paper is to create a state-of-the-art character recognition model using a stacking ensemble of convolutional neural networks (CNNs). To develop the proposed ensemble model, we evaluated several CNN models on the Chars74k dataset, which contains 74,103 images divided into 62 classes with labels [A-Z], [a-z], and [0-9]. The results report the accuracy for each of the dataset's subgroups (uppercase, lowercase, and digit). The proposed ensemble model achieves state-of-the-art performance, with a maximum accuracy of 92.31% on the complete dataset, 99.22% on uppercase letters, 98.66% on lowercase letters, 99.77% on digits, and 91.97% on uppercase and lowercase letters combined. To demonstrate the effectiveness of the proposed work, the paper also compares the proposed model with existing approaches on both the complete and partial datasets.
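The abstract does not say which base CNNs or which meta-learner the stacking ensemble uses, so the following is only a minimal sketch of stacking in general, not the authors' implementation. It assumes two hypothetical base models (`MODEL_A`, `MODEL_B`) whose held-out class probabilities are hard-coded, a toy problem with three classes instead of the 62 in Chars74k, and a softmax-regression meta-learner trained with plain gradient descent. All function names are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def stack_features(per_model_probs):
    """Concatenate each base model's class-probability vector into one meta-feature vector."""
    return [p for probs in per_model_probs for p in probs]


def meta_scores(weights, biases, x):
    """Linear score per class for one meta-feature vector."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def train_meta_learner(features, labels, n_classes, epochs=1000, lr=0.5):
    """Fit a softmax-regression meta-learner with full-batch gradient descent."""
    n_features = len(features[0])
    n_samples = len(features)
    weights = [[0.0] * n_features for _ in range(n_classes)]
    biases = [0.0] * n_classes
    for _ in range(epochs):
        grad_w = [[0.0] * n_features for _ in range(n_classes)]
        grad_b = [0.0] * n_classes
        for x, target in zip(features, labels):
            probs = softmax(meta_scores(weights, biases, x))
            for c in range(n_classes):
                err = probs[c] - (1.0 if c == target else 0.0)
                grad_b[c] += err
                for j in range(n_features):
                    grad_w[c][j] += err * x[j]
        for c in range(n_classes):
            biases[c] -= lr * grad_b[c] / n_samples
            for j in range(n_features):
                weights[c][j] -= lr * grad_w[c][j] / n_samples
    return weights, biases


def predict(weights, biases, x):
    """Return the index of the highest-scoring class."""
    scores = meta_scores(weights, biases, x)
    return scores.index(max(scores))


def accuracy(predictions, labels):
    return sum(p == t for p, t in zip(predictions, labels)) / len(labels)


# Hypothetical held-out outputs of two base models, keyed by the true class.
# In practice these would come from trained CNNs run on data they were not fit on.
MODEL_A = {0: [0.8, 0.1, 0.1], 1: [0.1, 0.8, 0.1], 2: [0.1, 0.8, 0.1]}  # mistakes class 2 for 1
MODEL_B = {0: [0.8, 0.1, 0.1], 1: [0.8, 0.1, 0.1], 2: [0.1, 0.1, 0.8]}  # mistakes class 1 for 0

labels = [0, 1, 2]
features = [stack_features([MODEL_A[t], MODEL_B[t]]) for t in labels]
weights, biases = train_meta_learner(features, labels, n_classes=3)

acc_a = accuracy([MODEL_A[t].index(max(MODEL_A[t])) for t in labels], labels)
acc_b = accuracy([MODEL_B[t].index(max(MODEL_B[t])) for t in labels], labels)
# Toy check on the same samples the meta-learner was fit on; a real study
# would score the ensemble on a separate test split.
acc_stacked = accuracy([predict(weights, biases, x) for x in features], labels)
print(acc_a, acc_b, acc_stacked)
```

Each toy base model confuses one pair of classes, but their mistakes differ, so the concatenated probability vectors remain distinguishable and the meta-learner can separate all three classes. This is the general motivation for stacking models whose errors are not identical.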



Funding

There has been no significant financial support for this work that could have influenced its outcome.

Author information

Authors and Affiliations

Indian Institute of Information Technology, Pune, India
Ashish Shetty & Sanjeev Sharma

Corresponding author

Correspondence to Ashish Shetty.

Ethics declarations

Conflict of interest

The authors confirm that there are no known conflicts of interest associated with this publication.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Shetty, A., Sharma, S. Ensemble deep learning model for optical character recognition. Multimed Tools Appl 83, 11411–11431 (2024). https://doi.org/10.1007/s11042-023-16018-0

Received: 14 March 2022
Revised: 29 April 2023
Accepted: 11 June 2023
Published: 28 June 2023
Issue Date: January 2024
DOI: https://doi.org/10.1007/s11042-023-16018-0

