Abstract
Speech is a distinctive human capability that conveys a speaker's emotional state to others. Speech emotion recognition (SER) identifies the speaker's emotion from the speech signal. SER now plays a vital role in real-time applications such as human–machine interfaces, lie detection, virtual reality, security, and audio mining. However, filtering noise and extracting emotional features from speech is complex, and incorporating digital filters increases the cost and complexity of the system. To address this, a novel hybrid firefly-based recurrent neural speech recognition (FbRNSR) model was developed, with a preprocessing module and a feature analysis module, to classify human emotions from speech input. The features produced by the feature extraction module are used to train the model to classify emotions as happy, sad, or average. Incorporating the firefly fitness function improves the classification rate. The presented model is implemented in Python, and its performance is analyzed using the confusion matrix. The designed model achieved a high true positive rate of 99.34%, a true negative rate of 99.12%, a false positive rate of 99.21%, and a false negative rate of 99.07%. The designed model achieved 99.2% accuracy, 98.9% recall, and precision value for the speech signal dataset. Finally, the effectiveness and robustness of the proposed approach are demonstrated by comparison with existing techniques. The method is therefore applicable in sectors such as medicine and security to identify people's emotional states.
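
The confusion-matrix metrics named above can be illustrated with a minimal sketch. It assumes a three-class confusion matrix whose rows are true classes and whose columns are predicted classes; the function name `per_class_rates` and the example counts are hypothetical and are not taken from the paper's experiments.

```python
import numpy as np

def per_class_rates(cm):
    """Return one-vs-rest rates per class and the overall accuracy.

    cm[i, j] counts samples whose true class is i and predicted class is j.
    Assumes every class appears at least once as a true and a predicted label.
    """
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    tp = np.diag(cm)                 # correctly classified samples per class
    fn = cm.sum(axis=1) - tp         # samples of the class that were missed
    fp = cm.sum(axis=0) - tp         # samples wrongly assigned to the class
    tn = total - tp - fn - fp        # everything else
    return {
        "tpr": tp / (tp + fn),       # true positive rate (recall)
        "tnr": tn / (tn + fp),       # true negative rate
        "fpr": fp / (fp + tn),       # false positive rate = 1 - TNR
        "fnr": fn / (fn + tp),       # false negative rate = 1 - TPR
        "precision": tp / (tp + fp),
        "accuracy": tp.sum() / total,
    }

# Hypothetical counts for the classes (happy, sad, average); not the paper's data.
example_cm = [[48, 1, 1],
              [2, 47, 1],
              [0, 2, 48]]
rates = per_class_rates(example_cm)
print(rates["accuracy"], rates["tpr"], rates["precision"])
```

In this one-vs-rest formulation each class is treated in turn as the positive class, so the rates are arrays with one entry per emotion, while accuracy is a single number.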













Abbreviations
- SER: Speech emotion recognition
- NN: Neural network
- ASI: Automatic speech identification
- SVM: Support vector machine
- RNN: Recurrent neural network
- FFA: Firefly optimization algorithm
- FbRNSR: Firefly-based recurrent neural speech recognition
- DWT: Discrete wavelet transform
Acknowledgements
None.
Ethics declarations
Conflict of interest
The authors declare that they have no potential conflict of interest.
Ethical Approval
All applicable institutional and/or national guidelines for the care and use of animals were followed.
Informed Consent
For this type of analysis formal consent is not needed.
About this article
Cite this article
Koppula, N., Rao, K.S., Nabi, S.A. et al. A Novel Optimized Recurrent Network-Based Automatic System for Speech Emotion Identification. Wireless Pers Commun 128, 2217–2243 (2023). https://doi.org/10.1007/s11277-022-10040-5
DOI: https://doi.org/10.1007/s11277-022-10040-5