Abstract
We present the Human Comfort Classifier (HCC), a framework for classifying human discomfort from video. Many of us recognize comfort and discomfort in social interactions without having to think about it. However, identifying discomfort in others can be a challenge for individuals with social skills deficits, who often become socially isolated. Social isolation can lead to many negative outcomes for individuals and is recognized by the CDC and WHO as a priority public health problem. In this work, we propose HCC to detect discomfort in videos, which can be used in training for individuals with social skills deficits. HCC combines pose estimation, facial landmarks, and natural language processing in a multi-modal approach to determine comfort in real time. We use an explainable rule-based model to categorize behavior and achieve approximately 78% prediction accuracy on an interview dataset.
W. Valentine and M. Webb: Equal contribution.
Acknowledgments
This material is based upon work supported by the National Science Foundation under Grant #IIS-2150394. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Valentine, W., Webb, M., Collum, C., Feil-Seifer, D., Hand, E. (2025). HCC: An Explainable Framework for Classifying Discomfort from Video. In: Bebis, G., et al. Advances in Visual Computing. ISVC 2024. Lecture Notes in Computer Science, vol 15047. Springer, Cham. https://doi.org/10.1007/978-3-031-77389-1_23
DOI: https://doi.org/10.1007/978-3-031-77389-1_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-77388-4
Online ISBN: 978-3-031-77389-1
eBook Packages: Computer Science, Computer Science (R0)