Sound image externalization for headphone based real-time 3D audio

  • Research Article
  • Frontiers of Computer Science

Abstract

3D audio effects can provide an immersive auditory experience, but headphone sound reproduction often suffers from the so-called in-head localization (IHL) problem. To address this problem, we propose an effective sound image externalization approach. Specifically, we consider several important factors related to sound propagation, including image-source-model-based early reflections with distance decay, wall absorption and air absorption, late reverberation, and dynamic factors such as head movement. We apply the approach to a headphone-based real-time 3D audio system. Subjective listening tests show that sound image externalization is significantly improved while the sound source direction is preserved. An A/B preference test further shows that, compared with a recent popular approach, the proposed approach is mostly preferred by the listeners.
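
To make the early-reflection idea concrete, the following is a minimal sketch of a first-order image-source computation for a rectangular room. It is not the authors' implementation: the function name `early_reflections`, the single broadband `reflection_coeff` standing in for wall absorption, and the frequency-independent `air_atten` term (in nepers per metre) are all assumptions made for illustration, and a real system would typically use frequency-dependent filters, higher-order images, and HRTF filtering per tap.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def early_reflections(source, listener, room, reflection_coeff=0.8, air_atten=0.0):
    """Return sorted (delay_s, gain) taps: direct path plus six first-order images.

    source, listener: (x, y, z) positions in metres inside the room.
    room: (Lx, Ly, Lz) dimensions of a box with one corner at the origin.
    reflection_coeff: hypothetical broadband wall reflection factor (0..1).
    air_atten: hypothetical frequency-independent air attenuation (Np/m).
    """
    # The direct path has no wall bounce, so its wall gain is 1.
    images = [(tuple(source), 1.0)]
    for axis in range(3):
        for wall in (0.0, room[axis]):
            image = list(source)
            # Mirror the source across the wall plane on this axis.
            image[axis] = 2.0 * wall - source[axis]
            images.append((tuple(image), reflection_coeff))

    taps = []
    for position, wall_gain in images:
        distance = math.dist(position, listener)
        # 1/r distance decay combined with wall and air attenuation.
        gain = wall_gain * math.exp(-air_atten * distance) / max(distance, 1e-6)
        taps.append((distance / SPEED_OF_SOUND, gain))
    return sorted(taps)


if __name__ == "__main__":
    for delay, gain in early_reflections((1.0, 1.0, 1.0), (3.0, 1.0, 1.0), (5.0, 4.0, 3.0)):
        print(f"delay = {delay * 1000:.2f} ms, gain = {gain:.3f}")
```

Each tap gives the arrival time and amplitude of one path; convolving a dry signal with these taps, after direction-dependent filtering, is one common way to approximate the early part of a room response.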



Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant No. 61571363), the Aeronautical Science Foundation of China (Grant Nos. 20155553038 and 20155553040), and the Science and Technology on Avionics Integration Laboratory.

Author information

Corresponding author

Correspondence to Lei Xie.

Additional information

Yougen Yuan received the BE degree in computer science and technology in 2014 from Chongqing University, China. He is currently a PhD student in the School of Computer Science at Northwestern Polytechnical University, China. His current research interests include audio signal processing, automatic speech recognition, and deep learning.

Lei Xie received the PhD degree in computer science from Northwestern Polytechnical University (NPU), China in 2004. He is currently a professor with the School of Computer Science, NPU. From 2001 to 2002, he was with the Department of Electronics and Information Processing, Vrije Universiteit Brussel (VUB), Belgium, as a visiting scientist. From 2004 to 2006, he was a senior research associate in the Center for Media Technology (RCMT), School of Creative Media, City University of Hong Kong, China. From 2006 to 2007, he was a postdoctoral fellow in the Human-Computer Communications Laboratory (HCCL), Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, China. He has published more than 120 papers in major journals and conference proceedings, such as the IEEE Transactions on Audio, Speech and Language Processing, IEEE Transactions on Multimedia, Information Sciences, Pattern Recognition, ACL, ACM Multimedia, Interspeech, ICPR, ICME and ICASSP. He has served as program chair and organizing chair for various conferences. He is a senior member of IEEE. His current research interests include speech and language processing, multimedia, and human-computer interaction.

Zhong-Hua Fu received the MS degree in computer science and applications from Northwestern Polytechnical University (NPU), China in 2000, and the PhD degree in computer application technology from NPU in 2004. During his PhD studies, he worked on robust speaker recognition. After that, he joined the Speech and Image Processing Key Laboratory of Shaanxi Province of China and worked on single- and multiple-channel speech enhancement, acoustic echo cancellation, etc. He is currently an associate professor at NPU. His research interests include speech and audio signal processing, microphone array, and virtual auditory.

Ming Xu received the master's degree from the School of Computer Science at Northwestern Polytechnical University, China in 2014. He is currently working at the China National Aeronautical Radio Electronics Research Institute. His research interests include speech and audio signal processing, automatic speech recognition, and three-dimensional virtual sound.

Qi Cong received the master's degree from the School of Computer Science at Northwestern Polytechnical University, China in 2015. He is currently working at TP-LINK, China. His research interest is audio signal processing.

About this article

Cite this article

Yuan, Y., Xie, L., Fu, ZH. et al. Sound image externalization for headphone based real-time 3D audio. Front. Comput. Sci. 11, 419–428 (2017). https://doi.org/10.1007/s11704-016-6182-2

  • DOI: https://doi.org/10.1007/s11704-016-6182-2
