Abstract
To enable humanoid robots to share our social space, technology must be developed for natural interaction with robots through multiple modes of communication, such as speech and gestures, and for sharing emotions with them. This research addresses the core problem of emotion recognition with a model that requires fewer computational resources and far fewer network parameters, making it more suitable for on-board, real-time computation on social robots, which have limited capacity for run-time actions and decisions. In the present investigation, an Inception-based Convolutional Neural Network (CNN) architecture is proposed to improve emotion prediction. The proposed model achieves an accuracy improvement of up to 6% over the existing network architecture for emotion classification and was tested on seven different datasets to verify its robustness. It also reduces the number of trainable parameters by 94% compared with a vanilla CNN model, which indicates its suitability for real-time applications such as human-robot interaction. Rigorous experiments validate the methodology, which is sufficiently robust and achieves a high level of accuracy. Finally, the model is integrated into the humanoid robot NAO, which exhibits the corresponding social behavior in real time. Averaged over all emotions, the response time is reduced by 60% and 61%, and the prediction rate is improved by 42% and 21%, when compared in a real-time environment with the vanilla CNN and a state-of-the-art model, respectively.
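For readers who want a concrete picture of the architectural idea, the following is a minimal sketch of an Inception-style block in Python (TensorFlow/Keras). The filter counts, the 48x48 grayscale input, and the seven-class softmax head are illustrative assumptions for a facial-expression setting, not the authors' exact configuration.

from tensorflow.keras import layers, models

def inception_block(x, f1, f3, f5, fp):
    # Four parallel branches; 1x1 convolutions reduce channels before
    # the more expensive 3x3 and 5x5 kernels, keeping parameters low.
    b1 = layers.Conv2D(f1, 1, padding="same", activation="relu")(x)
    b3 = layers.Conv2D(f3, 1, padding="same", activation="relu")(x)
    b3 = layers.Conv2D(f3, 3, padding="same", activation="relu")(b3)
    b5 = layers.Conv2D(f5, 1, padding="same", activation="relu")(x)
    b5 = layers.Conv2D(f5, 5, padding="same", activation="relu")(b5)
    bp = layers.MaxPooling2D(3, strides=1, padding="same")(x)
    bp = layers.Conv2D(fp, 1, padding="same", activation="relu")(bp)
    return layers.Concatenate()([b1, b3, b5, bp])  # stack along channels

# Illustrative model: 48x48 grayscale face crop in, 7 emotion classes out.
inputs = layers.Input(shape=(48, 48, 1))
x = layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
x = inception_block(x, f1=16, f3=24, f5=8, fp=8)
x = layers.GlobalAveragePooling2D()(x)  # avoids large dense layers
outputs = layers.Dense(7, activation="softmax")(x)
model = models.Model(inputs, outputs)
model.summary()  # compare trainable-parameter count with a vanilla CNN

Because the 1x1 convolutions compress the channel dimension before the larger kernels, and global average pooling replaces wide fully connected layers, such a block typically has far fewer trainable parameters than a plain stack of convolutional and dense layers; this is the kind of effect the abstract quantifies as a 94% reduction.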
Acknowledgements
The authors thank all the research scholars of the robotics and machine intelligence laboratory of our institute who gave their consent and helped in data collection and in carrying out the experiments. This research was improved by the suggestions given by reviewers of the CVPR conference, where a part of this work was presented as a poster at the Women in Computer Vision Workshop of CVPR 2019.
Ethics declarations
Conflict of Interests
The authors declare that they have no conflict of interest.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Jaiswal, S., Nandi, G.C. Optimized, robust, real-time emotion prediction for human-robot interactions using deep learning. Multimed Tools Appl 82, 5495–5519 (2023). https://doi.org/10.1007/s11042-022-12794-3