Abstract
Monitoring systems for the elderly benefit from estimating the user's state during an interaction. In the field of elderly welfare, quality of life (QOL) is a useful indicator because it comprehensively covers physical suffering as well as mental health and social well-being. In this study, we propose a new approach to QOL estimation that integrates facial expressions, head fluctuations, and eye movements captured during interactions with a communication agent. To this end, we implemented a communication agent and constructed a database from interpersonal experiments with 14 participants. We then implemented a multimodal learning estimator and trained it on extracted facial-expression, head-fluctuation, and gaze features. Our results show that integration learning over all of these features yielded lower error than single-modal learning based on facial expression alone. From these experimental results, we conclude that the proposed system can serve adequately as a QOL estimator.
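The late-fusion idea described in the abstract — extracting per-modality features and integrating them before regressing a QOL score — can be sketched as follows. This is a minimal illustrative example, not the paper's architecture: the feature dimensions, the mean-pooling extractor, and the ridge regressor are all assumptions standing in for the multimodal deep-learning estimator the authors actually use.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-modality feature dimensions (illustrative, not from the paper).
DIMS = {"face": 8, "head": 4, "gaze": 4}

def extract_features(n_frames, dim, rng):
    # Stand-in for a real extractor: mean-pool random per-frame features.
    frames = rng.normal(size=(n_frames, dim))
    return frames.mean(axis=0)

def fuse(features):
    # Late fusion: concatenate the per-modality feature vectors.
    return np.concatenate([features[m] for m in ("face", "head", "gaze")])

def fit_ridge(X, y, lam=1.0):
    # Closed-form ridge regression from fused features to a scalar QOL score.
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Toy dataset: 14 "participants", each with one fused feature vector
# and a scalar QOL label on a 0-100 scale.
X = np.stack([fuse({m: extract_features(30, d, rng) for m, d in DIMS.items()})
              for _ in range(14)])
y = rng.uniform(40, 90, size=14)

w = fit_ridge(X, y)
pred = X @ w
mae = np.abs(pred - y).mean()
print(f"training MAE: {mae:.2f}")
```

Dropping one modality from `fuse` reproduces the single-modal baseline the abstract compares against, which is what lets the fused and facial-expression-only errors be contrasted directly.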
Acknowledgements
This research was supported by the KDDI Foundation Research Grant Program 2019 and Graduate Program for Social ICT Global Creative Leaders (GCL) of The University of Tokyo by MEXT (Ministry of Education, Culture, Sports, Science and Technology).
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Nakagawa, S., Yonekura, S., Kanazawa, H., Nishikawa, S., Kuniyoshi, Y. (2021). New Approach to Estimating Mental Health Score Using a Communication Agent. In: Yada, K., et al. Advances in Artificial Intelligence. JSAI 2020. Advances in Intelligent Systems and Computing, vol 1357. Springer, Cham. https://doi.org/10.1007/978-3-030-73113-7_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-73112-0
Online ISBN: 978-3-030-73113-7
eBook Packages: Intelligent Technologies and Robotics (R0)