Improved Sound Source Localization and Front-Back Disambiguation for Humanoid Robots with Two Ears

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7906)

Abstract

An improved sound source localization (SSL) method has been developed that is based on the generalized cross-correlation (GCC) method weighted by the phase transform (PHAT) for use with humanoid robots equipped with two microphones inside artificial pinnae. The conventional SSL method based on the GCC-PHAT method has two main problems when used on a humanoid robot platform: 1) diffraction of sound waves with multipath interference caused by the shape of the robot head and 2) front-back ambiguity. The diffraction problem was overcome by incorporating a new time delay factor into the GCC-PHAT method under the assumption of a spherical robot head. The ambiguity problem was overcome by utilizing the amplification effect of the pinnae for localization over the entire azimuth. Experiments conducted using a humanoid robot showed that localization errors were reduced by 9.9° on average with the improved method and that the success rate for front-back disambiguation was 32.2% better on average over the entire azimuth than with a conventional HRTF-based method.
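The abstract does not give the paper's modified time delay factor, so the following is only a minimal sketch of the two ingredients it names: a conventional GCC-PHAT delay estimate between the two ear signals, and a spherical-head delay model used to map that delay to an azimuth. The spherical-head mapping here is Woodworth's classic approximation, swapped in for illustration; it is not claimed to be the authors' formulation. The function names, the head radius of 0.09 m, and the speed of sound of 343 m/s are all assumptions.

```python
import numpy as np

def gcc_phat(x_left, x_right, fs, max_tau=None):
    """Delay (seconds) of x_right relative to x_left, estimated with GCC-PHAT."""
    n = len(x_left) + len(x_right)          # zero-pad to avoid circular wrap-around
    X_l = np.fft.rfft(x_left, n=n)
    X_r = np.fft.rfft(x_right, n=n)
    cross = X_r * np.conj(X_l)              # cross-power spectrum
    cross /= np.abs(cross) + 1e-12          # PHAT weighting: keep the phase only
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))
    # Reorder so that index 0 corresponds to lag -max_shift
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(cc)) - max_shift
    return shift / fs

def woodworth_itd(theta, radius=0.09, c=343.0):
    """Spherical-head delay model (Woodworth approximation, assumed here).

    theta is the azimuth in radians, restricted to [-pi/2, pi/2].
    """
    return (radius / c) * (theta + np.sin(theta))

def azimuth_from_delay(tau, radius=0.09, c=343.0):
    """Invert the spherical-head model numerically on a dense azimuth grid."""
    grid = np.linspace(-np.pi / 2, np.pi / 2, 721)
    itd = woodworth_itd(grid, radius, c)    # monotonically increasing in theta
    return float(np.interp(tau, itd, grid))

# Usage: a synthetic noise source that reaches the right ear 5 samples late
fs = 16000
rng = np.random.default_rng(0)
s = rng.standard_normal(4096 + 5)
left, right = s[5:], s[:4096]
tau = gcc_phat(left, right, fs, max_tau=1e-3)
theta = azimuth_from_delay(tau)
```

A delay-only estimate of this kind restricts the answer to the frontal half-plane: a source at azimuth θ and its mirror image behind the head produce the same delay under a symmetric head model. This is the front-back ambiguity that the abstract says is resolved with the amplification effect of the pinnae, which this sketch does not attempt to model.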

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kim, U.H., Nakadai, K., Okuno, H.G. (2013). Improved Sound Source Localization and Front-Back Disambiguation for Humanoid Robots with Two Ears. In: Ali, M., Bosse, T., Hindriks, K.V., Hoogendoorn, M., Jonker, C.M., Treur, J. (eds) Recent Trends in Applied Artificial Intelligence. IEA/AIE 2013. Lecture Notes in Computer Science, vol 7906. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38577-3_29

  • DOI: https://doi.org/10.1007/978-3-642-38577-3_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-38576-6

  • Online ISBN: 978-3-642-38577-3

  • eBook Packages: Computer Science, Computer Science (R0)