A Sound Image Reproduction Model Based on Personalized Weight Vectors

Zheng, Jiaxi; Tu, Weiping; Zhang, Xiong; Yang, Wanzhao; Zhai, Shuangxing; Shen, Chen

doi:10.1007/978-3-030-00767-6_56

Conference paper
First Online: 19 September 2018

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11165)

Abstract

Many perceptual models for audio reconstruction have been proposed to create virtual sound, but the direction of the virtual sound may deviate from the desired direction because the binaural cues are distorted. In this paper, an equation relating the binaural cues of a real sound to those of a virtual sound reproduced by two loudspeakers is established, and weight vectors are derived from it based on the head-related transfer function (HRTF). After being filtered by the weight vectors, the sound signals emitted from the loudspeakers can deliver an accurate spatial impression to the listener. However, HRTFs differ from listener to listener, so the calculated weight vectors also vary from person to person. Therefore, a radial basis function neural network (RBFNN) is designed to personalize the weight vectors for each specific listener. Compared with three existing methods, namely vector base amplitude panning (VBAP), HRTF-based panning (HP), and band-based panning (BP), the proposed method reproduces binaural cues more accurately, and a subjective test also indicates that there is no significant difference in perception between the real sound and the virtual sound produced by the proposed method.
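The abstract does not give the binaural-cue equation itself, so the following is only a minimal sketch of one plausible reading, not the authors' formulation. It assumes that, for each frequency bin, the ear signals produced by two weighted loudspeakers should equal the ear signals of the real source, which gives a 2x2 linear system in the two loudspeaker weights. The function names, array layouts, and the random stand-in "HRTFs" are all hypothetical.

```python
import numpy as np

def solve_weights(H_spk, h_target):
    """Solve H_spk[f] @ w[f] = h_target[f] for every frequency bin f.

    H_spk    : (F, 2, 2) complex, H_spk[f, ear, speaker] (assumed layout)
    h_target : (F, 2) complex, HRTF pair of the real source direction
    returns  : (F, 2) complex weights, one per loudspeaker and bin
    """
    return np.linalg.solve(H_spk, h_target[..., None])[..., 0]

def ild_db(ears):
    """Interaural level difference in dB (left over right) per bin."""
    return 20.0 * np.log10(np.abs(ears[:, 0]) / np.abs(ears[:, 1]))

def ipd_rad(ears):
    """Interaural phase difference in radians (left minus right) per bin."""
    return np.angle(ears[:, 0] * np.conj(ears[:, 1]))

# Synthetic stand-in data; a real experiment would load measured HRTFs.
rng = np.random.default_rng(0)
F = 8
H_spk = rng.normal(size=(F, 2, 2)) + 1j * rng.normal(size=(F, 2, 2))
H_spk += 3.0 * np.eye(2)  # keep the toy systems well conditioned
h_target = rng.normal(size=(F, 2)) + 1j * rng.normal(size=(F, 2))

w = solve_weights(H_spk, h_target)

# Ear signals produced by the two weighted loudspeakers.
virtual = np.einsum("fes,fs->fe", H_spk, w)
```

Under this assumption the virtual source reproduces the target ear spectra exactly, so level and phase differences between the ears match those of the real source in every bin. Because `H_spk` and `h_target` depend on the listener, the solved weights do too, which is the dependence the personalization step is meant to address.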

This work is supported by the National Natural Science Foundation of China (No. 61671335) and the Technological Innovation Major Project of Hubei Province (No. 2017AAA123).

Author information

Authors and Affiliations

National Engineering Research Center for Multimedia Software, Wuhan University, Wuhan, 430072, China
Jiaxi Zheng, Weiping Tu, Xiong Zhang, Wanzhao Yang, Shuangxing Zhai & Chen Shen
School of Computer Science, Wuhan University, Wuhan, 430072, China
Jiaxi Zheng, Weiping Tu, Xiong Zhang, Wanzhao Yang, Shuangxing Zhai & Chen Shen

Corresponding author

Correspondence to Weiping Tu.

Editor information

Editors and Affiliations

Hefei University of Technology, Hefei, China
Richang Hong
National Chiao Tung University, Hsinchu, Taiwan
Wen-Huang Cheng
University of Tokyo, Tokyo, Japan
Toshihiko Yamasaki
Hefei University of Technology, Hefei, China
Meng Wang
City University of Hong Kong, Hong Kong, Hong Kong
Chong-Wah Ngo

About this paper

Cite this paper

Zheng, J., Tu, W., Zhang, X., Yang, W., Zhai, S., Shen, C. (2018). A Sound Image Reproduction Model Based on Personalized Weight Vectors. In: Hong, R., Cheng, WH., Yamasaki, T., Wang, M., Ngo, CW. (eds) Advances in Multimedia Information Processing – PCM 2018. PCM 2018. Lecture Notes in Computer Science, vol 11165. Springer, Cham. https://doi.org/10.1007/978-3-030-00767-6_56

DOI: https://doi.org/10.1007/978-3-030-00767-6_56
Published: 19 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00766-9
Online ISBN: 978-3-030-00767-6
eBook Packages: Computer Science, Computer Science (R0)