Loudness measurement of human utterance to a robot in noisy environment

Published: 12 March 2008

Abstract

In order to understand utterance-based human-robot interaction, and to develop such a system, this paper first analyzes how loudly humans speak in a noisy environment. Experiments were conducted to measure how loudly humans speak under 1) different noise levels, 2) different numbers of sound sources, 3) different types of sound sources, and 4) different distances to a robot. Synchronized sound sources add noise to the auditory scene, and the resulting utterances are recorded and compared with a previously recorded noiseless utterance. The experiments show that humans produce essentially the same sound pressure level at their own location, irrespective of distance to the robot and background noise. More precisely, the level varies within a band that depends on the distance and on the type of sound source, including spoken language.
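The quantity measured throughout the paper is sound pressure level (SPL). As a reminder of how such a measurement is computed, here is a minimal sketch, assuming calibrated pressure samples in pascals; the function name is illustrative and not from the paper:

```python
import numpy as np

def spl_db(pressure, ref=20e-6):
    """Sound pressure level in dB re 20 uPa from calibrated pressure samples (Pa)."""
    rms = np.sqrt(np.mean(np.square(pressure)))
    return 20.0 * np.log10(rms / ref)

# A constant 20 uPa signal has an RMS equal to the reference pressure,
# so its SPL is 0 dB; a 0.2 Pa signal sits 80 dB higher.
```

A level difference of 18 dB, as reported below for the recognition system, corresponds to roughly an 8x ratio in pressure amplitude.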
Based on this understanding, we developed an online spoken-command recognition system for a mobile robot. The system consists of two key components: 1) a low side-lobe microphone array that works as an omni-directional telescopic microphone, and 2) delay-and-sum beamforming (DSBF) combined with the frequency band selection (FBS) method for sound source localization and segmentation. The caller's location and a segmented sound stream are computed, and the segmented stream is then sent to a voice recognition system. The system works with at most five simultaneous sound sources whose sound pressure levels differ by up to about 18 dB. Experimental results with the mobile robot are also shown.
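At its core, the DSBF component mentioned in the abstract is delay-and-sum beamforming: each microphone channel is shifted by the propagation delay expected for a candidate source direction and the channels are averaged, so sound from that direction adds coherently while off-axis noise does not. A minimal sketch, assuming integer-sample steering delays and a fixed sampling rate (names and API are illustrative, not the paper's implementation):

```python
import numpy as np

def delay_and_sum(channels, delays, fs):
    """Align each microphone channel by its steering delay (seconds) and average.

    channels: array of shape (num_mics, num_samples)
    delays:   per-channel steering delays in seconds
    fs:       sampling rate in Hz
    """
    out = np.zeros(channels.shape[1])
    for ch, d in zip(channels, delays):
        # Advance each channel by its (integer-sample) steering delay.
        out += np.roll(ch, -int(round(d * fs)))
    return out / len(channels)
```

Scanning candidate delay sets over a grid of directions and picking the one with maximum output power yields a simple source localizer; the paper's FBS step additionally selects the frequency bands in which the beamformed signal dominates, to segment the target stream from background sources.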


Cited By

  • (2025) Noise Suppression Method With Low-Complexity Noise Estimation Model and Heuristic Noise-Masking Algorithm for Real-Time Processing of Robot Vacuum Cleaners. IEEE Access, 13, 789-801. DOI: 10.1109/ACCESS.2024.3522937. Online publication date: 2025.
  • (2013) The false dichotomy between accessibility and usability. Proceedings of the 10th International Cross-Disciplinary Conference on Web Accessibility, 1-2. DOI: 10.1145/2461121.2461146. Online publication date: 13-May-2013.
  • (2009) Evaluating the utility of auditory perspective-taking in robot speech presentations. Proceedings of the 6th International Conference on Auditory Display, 266-286. DOI: 10.1007/978-3-642-12439-6_14. Online publication date: 18-May-2009.


Published In

HRI '08: Proceedings of the 3rd ACM/IEEE international conference on Human robot interaction
March 2008
402 pages
ISBN:9781605580173
DOI:10.1145/1349822

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. human utterance
  2. sound localization
  3. sound pressure level

Qualifiers

  • Research-article

Conference

HRI '08
HRI '08: International Conference on Human Robot Interaction
March 12 - 15, 2008
Amsterdam, The Netherlands

Acceptance Rates

Overall Acceptance Rate 268 of 1,124 submissions, 24%
