Abstract
Understanding and modeling people’s behavior in social interactions is an important problem in Social Computing. In this work, we automatically predict the communication skill of a person in two kinds of interview-based social interactions, namely interface-based interviews (without an interviewer) and traditional face-to-face interviews. We investigate the differences in behavior perception and in the automatic prediction of communication skill when the same participant gives both interviews. Automated video interview platforms, which allow interviews to be conducted anywhere and at any time, are gaining increasing attention. Until recently, interviews were conducted face-to-face, whether for screening or for automatic assessment. Our dataset consists of 100 dual interviews in which the same participant takes part in both settings. External observers rate the interviews by answering several behavior-based assessment questions (manually annotated attributes). Multimodal features related to lexical, acoustic and visual behavior are extracted automatically and used to train supervised learning models such as Support Vector Machines (SVM) and Logistic Regression. We make an extensive study of the verbal behavior of the participant using the spoken responses obtained from manual transcriptions and from an Automatic Speech Recognition (ASR) tool. We also explore early and late fusion of modalities for better prediction. Our best results indicate that automatic assessment can be done with interface-based interviews.
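The difference between early and late fusion mentioned above can be illustrated with a minimal, generic Python sketch. It is not the authors' implementation: the function names `early_fusion` and `late_fusion`, the weighted-average rule for combining scores, and all numeric values are assumptions made for illustration only. Early fusion concatenates the per-modality feature vectors before a single classifier is trained, whereas late fusion combines the scores of classifiers trained separately on each modality.

```python
from typing import Dict, List, Sequence


def early_fusion(features: Dict[str, Sequence[float]],
                 order: Sequence[str]) -> List[float]:
    """Feature-level fusion: concatenate per-modality vectors in a fixed order."""
    fused: List[float] = []
    for modality in order:
        fused.extend(features[modality])
    return fused


def late_fusion(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Decision-level fusion: weighted average of per-modality classifier scores.

    The weighted average is an assumed combination rule, chosen for simplicity.
    """
    total = sum(weights[m] for m in scores)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[m] * scores[m] for m in scores) / total


# Hypothetical toy values, not taken from the paper's dataset.
features = {
    "lexical": [0.2, 0.7],
    "acoustic": [0.5],
    "visual": [0.1, 0.9, 0.4],
}
# One vector that a single classifier (e.g. an SVM) would be trained on.
fused_vector = early_fusion(features, order=["lexical", "acoustic", "visual"])

# Scores that three separately trained per-modality classifiers might output.
scores = {"lexical": 0.8, "acoustic": 0.6, "visual": 0.4}
weights = {"lexical": 1.0, "acoustic": 1.0, "visual": 1.0}
fused_score = late_fusion(scores, weights)
```

With equal weights the late-fusion score is simply the mean of the per-modality scores; unequal weights would let a more reliable modality dominate the final decision.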
Acknowledgments
This work was funded by the SERB Young Scientist grant of Dr. Jayagopi (Grant No. YSS2015001074). We thank Pooja Rao S. B. for help with the extraction of verbal features and for comments on the manuscript. We would also like to thank all the participants who agreed to the use of their data for research purposes.
Cite this article
Rasipuram, S., Jayagopi, D.B. Automatic assessment of communication skill in interview-based interactions. Multimed Tools Appl 77, 18709–18739 (2018). https://doi.org/10.1007/s11042-018-5654-9
DOI: https://doi.org/10.1007/s11042-018-5654-9