Improved Sound Source Localization and Front-Back Disambiguation for Humanoid Robots with Two Ears

Kim, Ui-Hyun; Nakadai, Kazuhiro; Okuno, Hiroshi G.

doi:10.1007/978-3-642-38577-3_29

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7906)

Abstract

An improved sound source localization (SSL) method based on generalized cross-correlation (GCC) weighted by the phase transform (PHAT) has been developed for humanoid robots equipped with two microphones inside artificial pinnae. Conventional GCC-PHAT-based SSL has two main problems on a humanoid robot platform: 1) diffraction of sound waves with multipath interference caused by the shape of the robot head and 2) front-back ambiguity. The diffraction problem was overcome by incorporating a new time delay factor into the GCC-PHAT method under the assumption of a spherical robot head. The ambiguity problem was overcome by exploiting the amplification effect of the pinnae, which enables localization over the entire azimuth. Experiments conducted using a humanoid robot showed that the improved method reduced localization errors by 9.9° on average and that the success rate for front-back disambiguation was 32.2% better on average over the entire azimuth than with a conventional HRTF-based method.
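To make the two ingredients named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It shows a textbook GCC-PHAT delay estimator and two ways of mapping the estimated delay to an azimuth: the free-field model and the classic Woodworth spherical-head approximation. The abstract does not give the paper's actual time delay factor, so the Woodworth formula is only a stand-in. The function names, the head radius of 0.09 m, the microphone spacing of 0.18 m, and the speed of sound of 343 m/s are all assumptions chosen for illustration.

```python
import numpy as np


def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay (in seconds) of `sig` relative to `ref` via GCC-PHAT."""
    n = len(sig) + len(ref)              # zero-pad so the correlation is linear
    sig_f = np.fft.rfft(sig, n=n)
    ref_f = np.fft.rfft(ref, n=n)
    cross = sig_f * np.conj(ref_f)       # cross-power spectrum
    cross /= np.abs(cross) + 1e-12       # PHAT weighting: keep only the phase
    cc = np.fft.irfft(cross, n=n)        # generalized cross-correlation
    max_shift = n // 2
    if max_tau is not None:              # restrict search to physical delays
        max_shift = max(1, min(int(fs * max_tau), max_shift))
    # Reorder so that index `max_shift` corresponds to zero lag.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(cc)) - max_shift
    return shift / fs


def azimuth_free_field(tau, mic_dist, c=343.0):
    """Free-field model: tau = (mic_dist / c) * sin(theta). Returns degrees."""
    return float(np.degrees(np.arcsin(np.clip(tau * c / mic_dist, -1.0, 1.0))))


def azimuth_spherical_head(tau, radius, c=343.0):
    """Woodworth model (assumed stand-in): tau = (radius / c) * (theta + sin(theta)).

    The relation is monotonic on [-90, 90] degrees, so it is inverted here
    by linear interpolation on a dense grid. Returns degrees.
    """
    thetas = np.linspace(-np.pi / 2, np.pi / 2, 1801)
    taus = radius / c * (thetas + np.sin(thetas))
    return float(np.degrees(np.interp(tau, taus, thetas)))


if __name__ == "__main__":
    fs = 16000                           # sampling rate in Hz (assumed)
    rng = np.random.default_rng(0)
    ref = rng.standard_normal(1024)      # stand-in for the reference channel
    delay = 5                            # true delay in samples
    sig = np.concatenate((np.zeros(delay), ref[:-delay]))
    tau = gcc_phat(sig, ref, fs, max_tau=0.001)
    print("estimated delay [s]:", tau)
    print("free-field azimuth [deg]:", azimuth_free_field(tau, mic_dist=0.18))
    print("spherical-head azimuth [deg]:", azimuth_spherical_head(tau, radius=0.09))
```

For the same measured delay, the spherical-head mapping returns a smaller azimuth than the free-field mapping, because sound travelling around the head covers a longer path than the straight line between the microphones. Both mappings only cover the frontal half-plane, which is why a delay alone cannot resolve front-back ambiguity and an additional cue, such as the pinna amplification mentioned in the abstract, is needed.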

Author information

Authors and Affiliations

Dept. of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University, Kyoto-shi, Japan
Ui-Hyun Kim & Hiroshi G. Okuno
Honda Research Institute Japan Co., Ltd., Wako-shi, Japan
Kazuhiro Nakadai

Editor information

Editors and Affiliations

Department of Computer Science, Texas State University, 78666, San Marcos, TX, USA
Moonis Ali
Agent Systems Research Group, Department of Computer Science, Faculty of Sciences, VU University Amsterdam, De Boelelaan 1081, 1081 HV, Amsterdam, The Netherlands
Tibor Bosse
Interactive Intelligence Group, Department of Intelligent Systems, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Mekelweg 4, 2628 CD, Delft, The Netherlands
Koen V. Hindriks & Catholijn M. Jonker
Computational Intelligence Group, Department of Computer Science, Faculty of Sciences, VU University Amsterdam, De Boelelaan 1081, 1081 HV, Amsterdam, The Netherlands
Mark Hoogendoorn
Agent Systems Research Group, Department of Computer Science, Faculty of Sciences, VU University Amsterdam, De Boelelaan 1081, 1081 HV, Amsterdam, The Netherlands
Jan Treur

About this paper

Cite this paper

Kim, UH., Nakadai, K., Okuno, H.G. (2013). Improved Sound Source Localization and Front-Back Disambiguation for Humanoid Robots with Two Ears. In: Ali, M., Bosse, T., Hindriks, K.V., Hoogendoorn, M., Jonker, C.M., Treur, J. (eds) Recent Trends in Applied Artificial Intelligence. IEA/AIE 2013. Lecture Notes in Computer Science, vol 7906. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38577-3_29

DOI: https://doi.org/10.1007/978-3-642-38577-3_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38576-6
Online ISBN: 978-3-642-38577-3
eBook Packages: Computer Science, Computer Science (R0)
