
A Sound Image Reproduction Model Based on Personalized Weight Vectors

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11165)

Abstract

Many perceptual models for audio reconstruction have been proposed to create virtual sound, but the direction of the virtual sound may deviate from the desired direction because the binaural cues are distorted. In this paper, an equation relating the binaural cues of a real sound to those of a virtual sound reproduced by two loudspeakers is established, and weight vectors are derived from it using the head-related transfer function (HRTF). After being filtered by the weight vectors, the signals emitted from the loudspeakers deliver an accurate spatial impression to the listener. However, HRTFs differ between listeners, so the resulting weight vectors also vary from person to person. Therefore, a radial basis function neural network (RBFNN) is designed to personalize the weight vectors for each listener. Compared with three existing methods, vector base amplitude panning (VBAP), HRTF-based panning (HP), and band-based panning (BP), the proposed method reproduces binaural cues more accurately, and a subjective test indicates no significant perceptual difference between real sound and the virtual sound produced by the proposed method.
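The idea of deriving loudspeaker weights from HRTFs can be sketched as follows. This is only an illustrative reading of the abstract, not the paper's actual formulation: it assumes that, in each frequency bin, the two weighted loudspeaker signals should reproduce at both ears the spectrum that a real source in the target direction would produce, which gives one 2x2 complex linear system per bin. The function names `solve_weights` and `ild_db` and the random stand-in HRTF data are hypothetical.

```python
import numpy as np


def solve_weights(h_spk1, h_spk2, h_target):
    """Solve one 2x2 system per frequency bin for two loudspeaker weights.

    Each argument is a complex array of shape (n_bins, 2) whose columns hold
    the left-ear and right-ear HRTF of one direction (assumed layout).
    Returns complex weights of shape (n_bins, 2), one column per loudspeaker.
    """
    # a[k] has the loudspeaker HRTFs as columns, so a[k] @ w[k] = h_target[k].
    a = np.stack([h_spk1, h_spk2], axis=-1)  # (n_bins, ears, speakers)
    return np.linalg.solve(a, h_target[..., None])[..., 0]


def ild_db(h):
    """Interaural level difference in dB for ear spectra of shape (n_bins, 2)."""
    return 20.0 * np.log10(np.abs(h[:, 0]) / np.abs(h[:, 1]))


if __name__ == "__main__":
    # Random complex spectra stand in for measured HRTFs (assumption).
    rng = np.random.default_rng(0)

    def fake_hrtf(n_bins):
        return rng.standard_normal((n_bins, 2)) + 1j * rng.standard_normal((n_bins, 2))

    h1, h2, ht = fake_hrtf(8), fake_hrtf(8), fake_hrtf(8)
    w = solve_weights(h1, h2, ht)
    # Ear spectra produced by the two weighted loudspeakers.
    virtual = w[:, :1] * h1 + w[:, 1:] * h2
    print(np.allclose(ild_db(virtual), ild_db(ht)))
```

Under this assumption, matching both ear spectra also matches the interaural level and phase differences. Since HRTFs differ between listeners, the weights would have to be recomputed, or predicted by a model, for each listener.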

This work is supported by the National Natural Science Foundation of China (No. 61671335) and the Technological Innovation Major Project of Hubei Province (No. 2017AAA123).




Corresponding author

Correspondence to Weiping Tu.


Copyright information

© 2018 Springer Nature Switzerland AG

About this paper


Cite this paper

Zheng, J., Tu, W., Zhang, X., Yang, W., Zhai, S., Shen, C. (2018). A Sound Image Reproduction Model Based on Personalized Weight Vectors. In: Hong, R., Cheng, WH., Yamasaki, T., Wang, M., Ngo, CW. (eds) Advances in Multimedia Information Processing – PCM 2018. PCM 2018. Lecture Notes in Computer Science, vol 11165. Springer, Cham. https://doi.org/10.1007/978-3-030-00767-6_56


  • DOI: https://doi.org/10.1007/978-3-030-00767-6_56


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-00766-9

  • Online ISBN: 978-3-030-00767-6

  • eBook Packages: Computer Science, Computer Science (R0)
