Abstract
An improved sound source localization (SSL) method has been developed that is based on the generalized cross-correlation (GCC) method weighted by the phase transform (PHAT) for use with humanoid robots equipped with two microphones inside artificial pinnae. The conventional SSL method based on the GCC-PHAT method has two main problems when used on a humanoid robot platform: 1) diffraction of sound waves with multipath interference caused by the shape of the robot head and 2) front-back ambiguity. The diffraction problem was overcome by incorporating a new time delay factor into the GCC-PHAT method under the assumption of a spherical robot head. The ambiguity problem was overcome by utilizing the amplification effect of the pinnae for localization over the entire azimuth. Experiments conducted using a humanoid robot showed that localization errors were reduced by 9.9° on average with the improved method and that the success rate for front-back disambiguation was 32.2% better on average over the entire azimuth than with a conventional HRTF-based method.
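The delay-based part of the method can be illustrated with a minimal sketch. It assumes a plain GCC-PHAT estimator followed by two delay-to-azimuth mappings: the free-field relation tau = (d/c) sin(theta), and Woodworth's classic spherical-head relation tau = (r/c)(theta + sin(theta)). The Woodworth relation is used here only as a stand-in for a spherical-head delay model; the abstract does not give the paper's own time delay factor, so this is not the authors' formulation. The function names, the naive DFT, the sound speed of 343 m/s, and the head dimensions in the demo are all illustrative assumptions.

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (stdlib only, short frames)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def gcc_phat_delay(left, right, max_lag):
    """Return the lag in samples by which `right` trails `left` (hypothetical helper)."""
    n = len(left) + len(right)  # zero-pad so the circular correlation does not wrap
    spec_l = dft(list(left) + [0.0] * (n - len(left)))
    spec_r = dft(list(right) + [0.0] * (n - len(right)))
    weighted = []
    for a, b in zip(spec_l, spec_r):
        cross = b * a.conjugate()  # cross-power spectrum
        mag = abs(cross)
        # PHAT weighting: discard magnitude, keep only the phase
        weighted.append(cross / mag if mag > 1e-12 else 0j)
    cc = [v.real for v in dft(weighted, inverse=True)]
    return max(range(-max_lag, max_lag + 1), key=lambda lag: cc[lag % n])


def azimuth_free_field(tau, mic_distance, c=343.0):
    """Invert tau = (d/c) sin(theta); ignores diffraction around the head."""
    x = max(-1.0, min(1.0, c * tau / mic_distance))
    return math.asin(x)


def azimuth_spherical_head(tau, head_radius, c=343.0):
    """Invert Woodworth's tau = (r/c)(theta + sin(theta)) by bisection.

    Assumption: a stand-in spherical-head model, not the paper's delay factor.
    """
    target = min(abs(tau) * c / head_radius, math.pi / 2 + 1.0)
    lo, hi = 0.0, math.pi / 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if mid + math.sin(mid) < target:
            lo = mid
        else:
            hi = mid
    return math.copysign((lo + hi) / 2, tau)


if __name__ == "__main__":
    rng = random.Random(0)
    burst = [rng.uniform(-1.0, 1.0) for _ in range(32)]
    left = burst + [0.0] * 3   # reference channel
    right = [0.0] * 3 + burst  # same burst, three samples later
    lag = gcc_phat_delay(left, right, max_lag=10)
    tau = lag / 16000.0        # assumed 16 kHz sampling rate
    print("lag (samples):", lag)
    print("free-field azimuth (deg):", math.degrees(azimuth_free_field(tau, 0.18)))
    print("spherical-head azimuth (deg):", math.degrees(azimuth_spherical_head(tau, 0.09)))
```

For the same measured delay, the spherical-head mapping gives a smaller azimuth than the free-field one, because the path around the head is longer than the straight line between the microphones. Neither mapping separates front from back: a source at theta and one at 180° minus theta produce the same delay, which is why the abstract turns to an additional cue from the pinnae.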
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kim, UH., Nakadai, K., Okuno, H.G. (2013). Improved Sound Source Localization and Front-Back Disambiguation for Humanoid Robots with Two Ears. In: Ali, M., Bosse, T., Hindriks, K.V., Hoogendoorn, M., Jonker, C.M., Treur, J. (eds) Recent Trends in Applied Artificial Intelligence. IEA/AIE 2013. Lecture Notes in Computer Science, vol 7906. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38577-3_29
DOI: https://doi.org/10.1007/978-3-642-38577-3_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38576-6
Online ISBN: 978-3-642-38577-3
eBook Packages: Computer Science, Computer Science (R0)