Abstract
Text in natural scenes carries rich semantic information and is therefore widely used in applications that interpret image content and retrieve visual information. This semantic information helps humans understand the surrounding environment. However, text in natural images varies widely in appearance under unconstrained conditions, which makes text detection and character recognition challenging. This framework therefore uses a weighted naïve Bayes classifier (WNBC)-based deep learning process to detect text and recognize characters in scene images. Because natural scene images often contain noise, a guided image filter is applied at the pre-processing stage to remove it. Features useful for classification are extracted using the Gabor transform and the stroke width transform. With these extracted features, text detection and character recognition are performed by the WNBC and a deep neural network with adaptive galactic swarm optimization. Finally, accuracy, precision, recall, F1-score, mean absolute error and mean square error are evaluated to assess the effectiveness of the proposed method.
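As an illustration of the classification step named in the abstract, the following is a minimal sketch of a weighted naïve Bayes decision rule, in which each feature's log-likelihood is scaled by a weight before being added to the class log-prior. The abstract does not specify the likelihood model, the weighting scheme, or the feature layout, so the Gaussian likelihoods, the weight values, the function names and the toy two-dimensional features below are all assumptions made for illustration only, not the authors' implementation.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate per-class priors and per-feature Gaussian parameters."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # Small variance floor (assumed value) avoids division by zero.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def log_gaussian(x, mean, var):
    """Log-density of a univariate Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def predict_weighted(model, weights, x):
    """Return the class maximizing log P(c) + sum_i w_i * log p(x_i | c)."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for w, xi, m, v in zip(weights, x, means, variances):
            score += w * log_gaussian(xi, m, v)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy feature vectors (made-up values): [stroke-width variance, Gabor energy].
X = [[0.1, 0.9], [0.2, 0.8], [0.15, 0.85], [0.9, 0.2], [0.8, 0.3], [0.85, 0.1]]
y = ["text", "text", "text", "background", "background", "background"]

model = fit_gaussian_nb(X, y)
weights = [1.0, 0.5]  # hypothetical per-feature weights

print(predict_weighted(model, weights, [0.12, 0.88]))
print(predict_weighted(model, weights, [0.88, 0.15]))
```

Setting every weight to 1.0 recovers the ordinary naïve Bayes rule; unequal weights let more discriminative features contribute more to the decision. How such weights are chosen in the proposed method is not stated in the abstract.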
Funding
No funding was provided for the preparation of this manuscript.
Ethics declarations
Conflict of interest
The authors Digvijay Pandey, Binay Kumar Pandey and Subodh Wairya declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Communicated by V. Loia.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Pandey, D., Pandey, B.K. & Wairya, S. Hybrid deep neural network with adaptive galactic swarm optimization for text extraction from scene images. Soft Comput 25, 1563–1580 (2021). https://doi.org/10.1007/s00500-020-05245-4
DOI: https://doi.org/10.1007/s00500-020-05245-4