Abstract
In this work, a convolutional neural network (CNN) based architecture is proposed for recognizing isolated handwritten Bangla characters and numerals on a low-memory GPU. The merit of the proposed architecture is that it has fewer trainable parameters than standard deep architectures, which makes it possible to train it on a low-memory GPU. Features from several layers of the CNN are fused to handle the multi-scale nature of a character. Spatial pyramid pooling applied to the fused features produces a fixed-size feature vector, which helps reduce the number of parameters in the proposed model. Extensive experiments have been conducted on several versions of the publicly available Bangla character dataset CMATERdb. The proposed architecture yields results competitive with fine-tuned standard deep architectures such as AlexNet, VGGNet, and GoogLeNet.
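To illustrate how spatial pyramid pooling can turn a feature map of any spatial size into a fixed-size vector, here is a minimal NumPy sketch. It is not the authors' implementation: the function name `spatial_pyramid_pool`, the pyramid levels `(1, 2, 4)`, the use of max pooling, and the random arrays standing in for fused feature maps are all assumptions made for illustration.

```python
import numpy as np


def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a (C, H, W) feature map over an n x n grid for each n in levels.

    Returns a 1-D vector of length C * sum(n * n for n in levels),
    independent of H and W. The levels are an assumed choice.
    """
    channels, height, width = feature_map.shape
    pooled = []
    for n in levels:
        for i in range(n):
            # Bin rows: floor of the start, ceiling of the end, so no bin is empty.
            r0 = (i * height) // n
            r1 = -(-((i + 1) * height) // n)
            for j in range(n):
                c0 = (j * width) // n
                c1 = -(-((j + 1) * width) // n)
                # One maximum per channel for this spatial bin.
                pooled.append(feature_map[:, r0:r1, c0:c1].max(axis=(1, 2)))
    return np.concatenate(pooled)


# Random arrays stand in for fused feature maps of two different spatial sizes.
rng = np.random.default_rng(0)
small = spatial_pyramid_pool(rng.standard_normal((8, 13, 17)))
large = spatial_pyramid_pool(rng.standard_normal((8, 32, 32)))
print(small.shape, large.shape)  # (168,) (168,)
```

Both inputs have 8 channels, so both outputs have 8 × (1 + 4 + 16) = 168 entries. Because the output length does not depend on the input's height or width, the layer that follows needs a weight matrix sized for 168 inputs rather than for a flattened full-resolution map, which is one way such pooling can keep the parameter count small.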
Ethics declarations
Conflict of interest
The authors declared that they have no conflict of interest.
About this article
Cite this article
Keserwani, P., Ali, T. & Roy, P.P. Handwritten Bangla character and numeral recognition using convolutional neural network for low-memory GPU. Int. J. Mach. Learn. & Cyber. 10, 3485–3497 (2019). https://doi.org/10.1007/s13042-019-00938-1
DOI: https://doi.org/10.1007/s13042-019-00938-1