Abstract
Soft skills are the interpersonal skills that characterize an individual's personality. They shape a person's social interactions and are expressed through verbal and non-verbal factors such as posture, gesture, and vocal tone. Many attempts have been made in the literature to evaluate soft skills with computer-based algorithms, but most focus on only one of the two factors, i.e., either verbal or non-verbal evaluation. Non-verbal evaluation algorithms mainly rely on lateral body posture evaluation or on studies of kinematic behavior with specialized hardware, whereas verbal evaluation typically involves in-person interviews. Hence, a computerized system is needed that evaluates both factors automatically. This study proposes a way to evaluate soft skills using automated frontal posture evaluation and vocal assessment of an individual. The proposed methodology estimates a non-verbal and a verbal confidence score. The non-verbal confidence score is estimated by combining the MoveNet Thunder skeleton estimation algorithm with a posture angle evaluation system, which assesses the frontal posture of an individual and produces a confidence score from posture angles such as shoulder alignment, neck bend angle, and arm abduction. The verbal confidence score is estimated using pause detection, filler word detection, and continuous word repetition estimation models. The confidence scores generated by these two pipelines are combined into an overall soft skills confidence score, which is compared with a threshold value to validate the results. The threshold value represents the average, natural soft skills confidence score of an individual.
It was estimated using the natural-speech portion of the standard LibriSpeech dataset and biomechanical studies of posture. The proposed model supports only single-user visual input; it could be improved through cloud deployment and real-time input handling.
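To make the posture angle evaluation concrete, the sketch below shows how angles such as shoulder alignment and neck bend can be derived from 2D skeleton keypoints of the kind MoveNet Thunder outputs. The keypoint layout, (x, y) normalized image coordinates, and function names here are illustrative assumptions, not the paper's exact implementation.

```python
import math

# Assumed keypoint format: (x, y) in normalized image coordinates,
# with y increasing downward (typical for pose-estimation outputs).

def shoulder_alignment_deg(left_shoulder, right_shoulder):
    """Angle of the shoulder line relative to horizontal (0 = level shoulders)."""
    dx = right_shoulder[0] - left_shoulder[0]
    dy = right_shoulder[1] - left_shoulder[1]
    return abs(math.degrees(math.atan2(dy, dx)))

def neck_bend_deg(nose, left_shoulder, right_shoulder):
    """Deviation of the mid-shoulder-to-nose vector from vertical (0 = upright neck)."""
    mid = ((left_shoulder[0] + right_shoulder[0]) / 2,
           (left_shoulder[1] + right_shoulder[1]) / 2)
    dx = nose[0] - mid[0]
    dy = mid[1] - nose[1]  # flip sign so "up" in the image is positive
    return abs(math.degrees(math.atan2(dx, dy)))
```

A frontal-posture confidence score could then penalize deviations of these angles from upright reference values, e.g. the biomechanical norms cited in the abstract.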
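For the verbal pipeline, a minimal pause detector can be built by thresholding frame-level RMS energy and counting silent runs longer than a minimum duration. This is a simplified sketch of the idea, assuming raw mono audio samples; the frame length, energy threshold, and minimum pause duration are illustrative defaults, not the paper's tuned values.

```python
import numpy as np

def detect_pauses(y, sr, frame_len=0.025, rms_thresh=0.01, min_pause=0.3):
    """Return a list of (start_time_s, duration_s) for pauses in signal y.

    A pause is a run of consecutive low-energy frames whose total
    duration is at least min_pause seconds.
    """
    n = int(sr * frame_len)
    frames = [y[i:i + n] for i in range(0, len(y) - n + 1, n)]
    rms = np.array([np.sqrt(np.mean(f ** 2)) for f in frames])
    silent = rms < rms_thresh

    pauses, start = [], None
    for i, s in enumerate(silent):
        if s and start is None:
            start = i                      # silent run begins
        elif not s and start is not None:
            dur = (i - start) * frame_len  # silent run ends
            if dur >= min_pause:
                pauses.append((start * frame_len, dur))
            start = None
    if start is not None:                  # trailing silence
        dur = (len(silent) - start) * frame_len
        if dur >= min_pause:
            pauses.append((start * frame_len, dur))
    return pauses
```

The pause count and durations, together with filler-word and repetition counts, could feed a verbal confidence score calibrated against the natural-speech statistics of LibriSpeech.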
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest. This research did not involve human or animal participants. All authors have reviewed and approved the submission.
Supplementary Information
Below is the link to the electronic supplementary material.
Supplementary file1 (MP4 128908 KB)
Supplementary file2 (MP4 6905 KB)
Cite this article
Gulati, V., Dwivedi, S., Kumar, D. et al. An automated framework to evaluate soft skills using posture and disfluency detection. Machine Vision and Applications 34, 94 (2023). https://doi.org/10.1007/s00138-023-01431-0