Active audition using the parameter-less self-organising map

Berglund, Erik; Sitte, Joaquin; Wyeth, Gordon

doi:10.1007/s10514-008-9084-9

Published: 31 January 2008

Volume 24, pages 401–417 (2008)

Autonomous Robots

Erik Berglund¹,
Joaquin Sitte² &
Gordon Wyeth¹

Abstract

This paper presents a novel method for enabling a robot to determine the position of a sound source in three dimensions using just two microphones and interaction with its environment. The method uses the Parameter-Less Self-Organising Map (PLSOM) algorithm and Reinforcement Learning (RL) to achieve rapid, accurate response. We also introduce a method for directional filtering using the PLSOM. The presented system is compared to a similar system to evaluate its performance.
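As a rough illustration of the kind of map the abstract refers to, the following is a minimal sketch of a PLSOM-style update on a one-dimensional chain of nodes. It is not the authors' implementation. It assumes the commonly described form of the algorithm, in which the learning rate and neighbourhood size are both derived from the winning node's fitting error, normalised by the largest error seen so far. The function name `plsom_step` and the parameters `beta` and `theta_min` are hypothetical, and the exact normalisation and neighbourhood scaling in the cited PLSOM papers may differ.

```python
import math
import random


def plsom_step(weights, x, r_prev, beta=3.0, theta_min=1.0):
    """One PLSOM-style update on a 1-D chain of nodes (hypothetical helper).

    weights: list of weight vectors (lists of floats), updated in place.
    x: input vector.
    r_prev: running maximum of the winner's error, or None on the first step.
    Returns the updated running maximum.
    """
    # Squared Euclidean distance from the input to every node.
    d2 = [sum((xi - wi) ** 2 for xi, wi in zip(x, w)) for w in weights]
    c = min(range(len(weights)), key=d2.__getitem__)  # index of the winner
    err = math.sqrt(d2[c])
    r = err if r_prev is None else max(err, r_prev)
    # Normalised fitting error in [0, 1]; it replaces an annealed learning rate.
    eps = err / r if r > 0 else 0.0
    # Assumed linear scaling: neighbourhood size grows with the error.
    theta = (beta - theta_min) * eps + theta_min
    for i, w in enumerate(weights):
        h = math.exp(-((i - c) ** 2) / theta ** 2)  # Gaussian neighbourhood
        for k in range(len(w)):
            w[k] += eps * h * (x[k] - w[k])
    return r


if __name__ == "__main__":
    random.seed(0)
    # Ten nodes with random 2-D weights, trained on uniform random inputs.
    nodes = [[random.random(), random.random()] for _ in range(10)]
    r = None
    for _ in range(1000):
        r = plsom_step(nodes, [random.random(), random.random()], r)
    print(nodes[0], nodes[-1])
```

Because the step size depends only on how badly the map currently fits the input, no learning-rate or neighbourhood annealing schedule needs to be chosen, which is the sense in which the map is "parameter-less".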

Author information

Authors and Affiliations

School of Information Technology and Electrical Engineering, University of Queensland, Brisbane, Australia
Erik Berglund & Gordon Wyeth
School of Software Engineering and Data Communication, Queensland University of Technology, Brisbane, Australia
Joaquin Sitte

Corresponding author

Correspondence to Erik Berglund.

About this article

Cite this article

Berglund, E., Sitte, J. & Wyeth, G. Active audition using the parameter-less self-organising map. Auton Robot 24, 401–417 (2008). https://doi.org/10.1007/s10514-008-9084-9

Received: 18 November 2005
Accepted: 03 January 2008
Published: 31 January 2008
Issue Date: May 2008
DOI: https://doi.org/10.1007/s10514-008-9084-9
