Ensemble deep learning model for optical character recognition

Multimedia Tools and Applications

Abstract

In modern deep learning, character recognition in images is an important field of study because it has many real-life applications. The goal of this paper is to create a state-of-the-art character recognition model using a stacking ensemble of convolutional neural networks (CNNs). To develop the proposed ensemble model, we evaluated several CNN models, judging them by their performance on the Chars74k dataset. The dataset contains 74,103 images divided into 62 classes with labels [A-Z], [a-z], and [0-9]. The results report the accuracy distribution across the dataset's subgroups (uppercase, lowercase, and digit). The proposed ensemble model achieves state-of-the-art performance, with a maximum accuracy of 92.31% on the complete dataset, 99.22% on uppercase letters, 98.66% on lowercase letters, 99.77% on digits, and 91.97% on uppercase and lowercase letters combined. To demonstrate the effectiveness of the proposed work, the paper also compares the proposed model with existing approaches on both the complete and partial datasets.
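The abstract does not specify how the stacking ensemble is assembled, so the following is only a minimal sketch of the general stacking idea, not the authors' architecture. It assumes the meta-learner is a plain softmax regression trained on the concatenated class probabilities of the base models. The two lookup tables `MODEL_A` and `MODEL_B` are hypothetical stand-ins for trained CNNs on a toy 3-class problem, and all function names are illustrative.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    """Index of the first maximal value."""
    return max(range(len(values)), key=values.__getitem__)

def stack_features(base_probs):
    """Concatenate the probability vectors of all base models for one sample."""
    return [p for probs in base_probs for p in probs]

def train_meta_learner(features, labels, n_classes, lr=0.5, epochs=300):
    """Fit softmax regression (no bias) with batch gradient descent."""
    n_features = len(features[0])
    weights = [[0.0] * n_features for _ in range(n_classes)]
    for _ in range(epochs):
        grads = [[0.0] * n_features for _ in range(n_classes)]
        for x, y in zip(features, labels):
            probs = softmax([sum(w * v for w, v in zip(row, x)) for row in weights])
            for c in range(n_classes):
                err = probs[c] - (1.0 if c == y else 0.0)
                for j, v in enumerate(x):
                    grads[c][j] += err * v
        for c in range(n_classes):
            for j in range(n_features):
                weights[c][j] -= lr * grads[c][j] / len(features)
    return weights

def predict(weights, x):
    """Predicted class index for one stacked feature vector."""
    return argmax([sum(w * v for w, v in zip(row, x)) for row in weights])

def accuracy(preds, labels):
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

# Hypothetical base models: each maps the true class to the probabilities it
# would output. Model A cannot separate classes 1 and 2; model B cannot
# separate classes 0 and 1. Real base models would be trained CNNs.
MODEL_A = {0: [1.0, 0.0, 0.0], 1: [0.0, 0.5, 0.5], 2: [0.0, 0.5, 0.5]}
MODEL_B = {0: [0.5, 0.5, 0.0], 1: [0.5, 0.5, 0.0], 2: [0.0, 0.0, 1.0]}

labels = [0, 1, 2] * 10
features = [stack_features([MODEL_A[y], MODEL_B[y]]) for y in labels]

# Assumption for brevity: the meta-learner is fitted and scored on the same
# toy samples. In practice it should be fitted on held-out (out-of-fold)
# base-model predictions to avoid leakage.
weights = train_meta_learner(features, labels, n_classes=3)

base_a_acc = accuracy([argmax(MODEL_A[y]) for y in labels], labels)
base_b_acc = accuracy([argmax(MODEL_B[y]) for y in labels], labels)
stacked_acc = accuracy([predict(weights, x) for x in features], labels)
print(base_a_acc, base_b_acc, stacked_acc)
```

The toy data is built so that each base model is ambiguous on a different pair of classes. The meta-learner can combine their outputs because the ambiguities do not overlap, which is the intuition behind stacking models with complementary errors.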

Funding

There has been no significant financial support for this work that could have influenced its outcome.

Author information

Corresponding author

Correspondence to Ashish Shetty.

Ethics declarations

Conflict of interest

The authors confirm that there are no known conflicts of interest associated with this publication.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Shetty, A., Sharma, S. Ensemble deep learning model for optical character recognition. Multimed Tools Appl 83, 11411–11431 (2024). https://doi.org/10.1007/s11042-023-16018-0

  • DOI: https://doi.org/10.1007/s11042-023-16018-0
