An automated framework to evaluate soft skills using posture and disfluency detection

  • ORIGINAL PAPER
Machine Vision and Applications

Abstract

Soft skills are interpersonal skills that define an individual's personality traits. They influence a person's social interactions and are described using non-verbal and verbal factors such as posture, gesture, and vocal tone. Many attempts have been made in the literature to evaluate an individual's soft skills using computer-based algorithms. Most of these attempts focus on only one of the two factors, i.e., either verbal or non-verbal evaluation. Non-verbal evaluation algorithms mainly consist of lateral body posture evaluation or the study of kinematic behavior using specialized hardware, whereas verbal evaluation typically relies on in-person interviews. Hence, a computerized system that automatically evaluates an individual's soft skills is required. This study proposes a way to evaluate soft skills using automated frontal posture evaluation and vocal assessment. The proposed methodology estimates an individual's non-verbal and verbal confidence scores. The non-verbal confidence score is estimated by combining the MoveNet-thunder skeleton estimation algorithm with a posture angle evaluation system. This system evaluates the frontal posture of an individual and provides a confidence score based on posture angles such as shoulder alignment, neck bend angle, and arm abduction. The verbal confidence score is estimated using pause detection, filler word detection, and continuous word repetition estimation models. The confidence scores from these two pipelines are combined to form the individual's overall soft skills confidence score. This score is compared with a threshold value to validate the results. The threshold value represents the average, natural soft skills confidence score of an individual. It was estimated using the natural speech part of the standard LibriSpeech dataset and biomechanical studies of posture. The proposed model supports only single-user visual input, and it could be improved through cloud implementation and support for real-time data input.
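
The two pipelines described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the filler word list, the 20-degree tilt limit, the linear scoring rules, and the equal weighting are all assumptions made for illustration, and every function name is hypothetical. Keypoints are assumed to arrive as (x, y) image coordinates from a pose estimator, and pause detection is left out because it needs audio input.

```python
import math
import re

# Hypothetical filler vocabulary; the abstract does not list the words used.
FILLERS = {"um", "uh", "er", "hmm"}


def shoulder_tilt_deg(left_shoulder, right_shoulder):
    """Angle in degrees between the shoulder line and the horizontal.

    Each argument is an (x, y) keypoint. The result lies in [0, 90] and
    does not depend on which shoulder appears on which side of the image.
    """
    dx = right_shoulder[0] - left_shoulder[0]
    dy = right_shoulder[1] - left_shoulder[1]
    angle = abs(math.degrees(math.atan2(dy, dx)))
    return min(angle, 180.0 - angle)


def disfluency_counts(transcript):
    """Return (word count, filler count, immediate word repetitions)."""
    words = re.findall(r"[a-z']+", transcript.lower())
    fillers = sum(1 for w in words if w in FILLERS)
    repeats = sum(1 for a, b in zip(words, words[1:]) if a == b)
    return len(words), fillers, repeats


def nonverbal_score(tilt_deg, max_tilt_deg=20.0):
    """Assumed linear rule: 1.0 for level shoulders, 0.0 at max_tilt_deg or more."""
    return max(0.0, 1.0 - tilt_deg / max_tilt_deg)


def verbal_score(transcript):
    """Assumed rule: share of words that are neither fillers nor repetitions."""
    total, fillers, repeats = disfluency_counts(transcript)
    if total == 0:
        return 0.0
    return max(0.0, 1.0 - (fillers + repeats) / total)


def overall_score(nonverbal, verbal, weight=0.5):
    """Weighted mean of the two scores; equal weighting is an assumption."""
    return weight * nonverbal + (1.0 - weight) * verbal
```

A caller would compute `overall_score(nonverbal_score(tilt), verbal_score(text))` and compare the result with a threshold chosen separately, in the way the abstract describes deriving one from natural speech and biomechanical reference data.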

Author information

Corresponding author

Correspondence to Deepika Kumar.

Ethics declarations

Conflict of interest

The authors declare that they do not have any conflict of interest. This research did not involve any human or animal participation. All authors have checked and agreed on the submission.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (MP4 128908 KB)

Supplementary file2 (MP4 6905 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Gulati, V., Dwivedi, S., Kumar, D. et al. An automated framework to evaluate soft skills using posture and disfluency detection. Machine Vision and Applications 34, 94 (2023). https://doi.org/10.1007/s00138-023-01431-0

  • DOI: https://doi.org/10.1007/s00138-023-01431-0
