Comparison of ML Solutions for HRIR Individualization Design in Binaural Audio

  • Conference paper
Advanced Information Networking and Applications (AINA 2023)

Part of the book series: Lecture Notes in Networks and Systems ((LNNS,volume 655))

Abstract

This paper presents the use of Machine Learning (ML) solutions to obtain individualized Head Related Impulse Responses (HRIRs) without measuring them directly on the individuals concerned. Several regression models are explored to find the one that most accurately predicts the HRIR samples, given a set of anthropometric features and a specific virtual source position.
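As a rough illustration of the regression setup described in the abstract, the sketch below maps anthropometric features and a source direction to HRIR samples, then compares two candidate predictors by test error. It is not the authors' method: the data are synthetic, the feature count, HRIR length, sin/cos angle encoding, ridge model, and mean baseline are all assumptions chosen only to make the example runnable.

```python
import numpy as np

rng = np.random.default_rng(0)

N_ANTHRO = 5      # hypothetical number of anthropometric features
N_TAPS = 64       # hypothetical HRIR length in samples
N_EXAMPLES = 400  # synthetic (subject, source position) pairs

# Synthetic stand-in data, NOT real HRIR measurements.
anthro = rng.normal(size=(N_EXAMPLES, N_ANTHRO))
azimuth = rng.uniform(-np.pi, np.pi, size=(N_EXAMPLES, 1))
elevation = rng.uniform(-np.pi / 4, np.pi / 2, size=(N_EXAMPLES, 1))

# Inputs: anthropometry plus a sin/cos encoding of the source direction.
X = np.hstack([anthro, np.sin(azimuth), np.cos(azimuth),
               np.sin(elevation), np.cos(elevation)])

# Targets: one row of HRIR samples per example (linear toy model plus noise).
true_weights = rng.normal(size=(X.shape[1], N_TAPS))
Y = X @ true_weights + 0.1 * rng.normal(size=(N_EXAMPLES, N_TAPS))

# Simple hold-out split: first 300 examples for training, the rest for testing.
X_train, X_test = X[:300], X[300:]
Y_train, Y_test = Y[:300], Y[300:]


def add_bias(features):
    """Append a constant column so the model can learn an offset."""
    return np.hstack([features, np.ones((features.shape[0], 1))])


def fit_ridge(features, targets, alpha=1e-3):
    """Closed-form multi-output ridge regression.

    For simplicity the bias column is regularized as well.
    """
    xb = add_bias(features)
    gram = xb.T @ xb + alpha * np.eye(xb.shape[1])
    return np.linalg.solve(gram, xb.T @ targets)


def predict_ridge(weights, features):
    return add_bias(features) @ weights


def rmse(pred, target):
    return float(np.sqrt(np.mean((pred - target) ** 2)))


# Candidate 1: ridge regression on all input features.
weights = fit_ridge(X_train, Y_train)
Y_pred = predict_ridge(weights, X_test)

# Candidate 2: always predict the average training HRIR.
Y_mean = np.tile(Y_train.mean(axis=0), (X_test.shape[0], 1))

scores = {
    "mean_baseline": rmse(Y_mean, Y_test),
    "ridge": rmse(Y_pred, Y_test),
}
best_model = min(scores, key=scores.get)
print(scores, "-> most accurate:", best_model)
```

Because the synthetic targets are generated by a linear model, ridge regression wins here by construction; with measured HRIRs the ranking of models would have to be established empirically.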

Acknowledgement

This work was supported by the Italian Government under MiSE “Programma di supporto tecnologie emergenti - Asse I (Casa delle Tecnologie Emergenti) Progetto SICURA” - CUP C19C20000520004.

Corresponding author

Correspondence to Simone Angelucci.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Angelucci, S., Rinaldi, C., Franchi, F., Graziosi, F. (2023). Comparison of ML Solutions for HRIR Individualization Design in Binaural Audio. In: Barolli, L. (eds) Advanced Information Networking and Applications. AINA 2023. Lecture Notes in Networks and Systems, vol 655. Springer, Cham. https://doi.org/10.1007/978-3-031-28694-0_25
