Abstract
This paper explores Machine Learning (ML) solutions for obtaining individualized Head-Related Impulse Responses (HRIRs) without measuring them directly on the individuals concerned. Several regression models are compared to find the one that most accurately predicts the HRIR samples, given a set of anthropometric features and a specific virtual source position.
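As a rough illustration of the kind of comparison described above, the following sketch fits a few multi-output regressors that map anthropometric features and a source direction to HRIR samples, then ranks them by held-out error. It is not the authors' pipeline: the abstract does not name the models or the dataset, so the data here is synthetic, and the feature encoding, the choice of models (a mean baseline, ordinary least squares and ridge regression) and all function names are assumptions made for illustration.

```python
import numpy as np


def make_synthetic_dataset(n_rows=400, n_anthro=8, n_taps=32, seed=0):
    """Synthetic stand-in for an HRIR dataset (NOT real measurements)."""
    rng = np.random.default_rng(seed)
    # Hypothetical anthropometric features (e.g. head width, pinna height).
    anthro = rng.normal(size=(n_rows, n_anthro))
    # Hypothetical virtual source position, given as azimuth and elevation.
    azimuth = rng.uniform(-np.pi, np.pi, size=n_rows)
    elevation = rng.uniform(-np.pi / 4, np.pi / 2, size=n_rows)
    X = np.column_stack(
        [anthro, np.sin(azimuth), np.cos(azimuth), np.sin(elevation)]
    )
    # Assumed linear ground truth plus noise, one output column per HRIR tap.
    W = rng.normal(size=(X.shape[1], n_taps))
    Y = X @ W + 0.1 * rng.normal(size=(n_rows, n_taps))
    return X, Y


def fit_ridge(X, Y, alpha):
    """Closed-form multi-output ridge regression with an unpenalised intercept.

    alpha = 0 reduces to ordinary least squares.
    """
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - x_mean, Y - y_mean
    A = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    coef = np.linalg.solve(A, Xc.T @ Yc)
    intercept = y_mean - x_mean @ coef
    return coef, intercept


def mse(Y_true, Y_pred):
    """Mean squared error averaged over rows and HRIR taps."""
    return float(np.mean((Y_true - Y_pred) ** 2))


def compare_models(X, Y, n_train=300):
    """Return the held-out MSE of each candidate model."""
    X_tr, X_te = X[:n_train], X[n_train:]
    Y_tr, Y_te = Y[:n_train], Y[n_train:]
    baseline = np.broadcast_to(Y_tr.mean(axis=0), Y_te.shape)
    scores = {"mean_baseline": mse(Y_te, baseline)}
    for name, alpha in [("least_squares", 0.0), ("ridge_alpha_10", 10.0)]:
        coef, intercept = fit_ridge(X_tr, Y_tr, alpha)
        scores[name] = mse(Y_te, X_te @ coef + intercept)
    return scores


X, Y = make_synthetic_dataset()
scores = compare_models(X, Y)
best = min(scores, key=scores.get)  # model with the lowest held-out error
print(scores, best)
```

With measured HRIRs, the train/test split would normally be made per subject rather than per row, so that the score reflects prediction for individuals not seen during training.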
Acknowledgement
This work was supported by the Italian Government under MiSE “Programma di supporto tecnologie emergenti - Asse I (Casa delle Tecnologie Emergenti) Progetto SICURA” - CUP C19C20000520004.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Angelucci, S., Rinaldi, C., Franchi, F., Graziosi, F. (2023). Comparison of ML Solutions for HRIR Individualization Design in Binaural Audio. In: Barolli, L. (eds) Advanced Information Networking and Applications. AINA 2023. Lecture Notes in Networks and Systems, vol 655. Springer, Cham. https://doi.org/10.1007/978-3-031-28694-0_25
DOI: https://doi.org/10.1007/978-3-031-28694-0_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-28693-3
Online ISBN: 978-3-031-28694-0
eBook Packages: Intelligent Technologies and Robotics (R0)