Novelty Detection in Human-Machine Interaction Through a Multimodal Approach

  • Conference paper
Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications (CIARP 2023)

Abstract

As interest in robots continues to grow across domains such as healthcare, construction and education, improving user experience and fostering seamless interaction become crucial. These human-machine interactions (HMI) are often impersonal. Our proposal, built upon previous work in the field, uses biometric data to detect whether a person has been encountered before. Because many of the models involved depend on a set of thresholds, we propose optimizing those thresholds with a genetic algorithm. Novelty detection is performed through a multimodal approach that combines voice and facial images of each individual, although unimodal approaches using each cue on its own were also tested. To assess the effectiveness of the proposed system, we conducted comprehensive experiments on three diverse datasets, namely VoxCeleb, Mobio and AveRobot, each with distinct characteristics and complexities. Examining the impact of data quality on model performance gave us valuable insights into the effectiveness of the proposed solution. Our approach outperformed several conventional novelty detection methods, yielding promising results.
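To make the idea in the abstract concrete, the following is a minimal sketch of threshold-based novelty detection with a genetic-algorithm search for the threshold. It is not the authors' implementation. The abstract does not specify the fusion rule, the fitness function or the genetic operators, so everything here is an assumption made for illustration: cosine similarity against a gallery of known people, score-level fusion with a weight `w`, a single threshold, balanced accuracy as fitness, and all function names and parameter values.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_match(probe, gallery):
    """Similarity of the probe to its closest known embedding."""
    return max(cosine(probe, g) for g in gallery)


def fused_score(face, voice, face_gallery, voice_gallery, w=0.5):
    """Assumed score-level fusion: weighted mean of the per-modality scores."""
    return (w * best_match(face, face_gallery)
            + (1 - w) * best_match(voice, voice_gallery))


def balanced_accuracy(threshold, scores, is_novel):
    """Fitness: mean of the novel and known detection rates.

    A sample is flagged as novel when its score falls below the threshold.
    Both classes must be present in is_novel.
    """
    flagged = [s < threshold for s in scores]
    tp = sum(f and n for f, n in zip(flagged, is_novel))
    tn = sum((not f) and (not n) for f, n in zip(flagged, is_novel))
    n_novel = sum(is_novel)
    n_known = len(is_novel) - n_novel
    return 0.5 * (tp / n_novel + tn / n_known)


def tune_threshold(scores, is_novel, pop_size=30, generations=40, seed=0):
    """Toy genetic algorithm over a single threshold (hypothetical settings)."""
    rng = random.Random(seed)
    lo, hi = min(scores), max(scores)

    def fitness(t):
        return balanced_accuracy(t, scores, is_novel)

    population = [rng.uniform(lo, hi) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection with elitism: the better half survives unchanged.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = 0.5 * (a + b)                      # arithmetic crossover
            child += rng.gauss(0.0, 0.05 * (hi - lo))  # Gaussian mutation
            children.append(min(hi, max(lo, child)))   # keep within range
        population = parents + children
    return max(population, key=fitness)
```

In use, `fused_score` would be computed for each validation sample from its face and voice embeddings, `tune_threshold` would be run on those scores and their known/novel labels, and a new person would be flagged as novel when the fused score falls below the tuned threshold. A real system would likely tune several thresholds jointly and evaluate on held-out data.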

Acknowledgments

This work is partially funded by the Spanish Ministry of Science and Innovation under project PID2021-122402OB-C22 and by the ACIISI-Gobierno de Canarias and European FEDER funds under project ULPGC Facilities Net and Grant EIS 2021 04. It is also supported by “Programa Investigo”, reference code 32/39/2022-0923131539, of Servicio Canario de Empleo (“Fondos del Plan de Recuperación, Transformación y Resiliencia - Next Generation EU”).

Author information

Corresponding author

Correspondence to José Salas-Cáceres.

Copyright information

© 2024 Springer Nature Switzerland AG

About this paper

Cite this paper

Salas-Cáceres, J., Lorenzo-Navarro, J., Freire-Obregón, D., Castrillón-Santana, M. (2024). Novelty Detection in Human-Machine Interaction Through a Multimodal Approach. In: Vasconcelos, V., Domingues, I., Paredes, S. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2023. Lecture Notes in Computer Science, vol 14469. Springer, Cham. https://doi.org/10.1007/978-3-031-49018-7_33

  • DOI: https://doi.org/10.1007/978-3-031-49018-7_33

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-49017-0

  • Online ISBN: 978-3-031-49018-7

  • eBook Packages: Computer Science, Computer Science (R0)
