ABSTRACT
The Lombard effect is the involuntary tendency of speakers to increase their vocal effort when speaking in loud noise in order to make their voice more audible. This effect causes a problem in telecommunication: a speaker talks louder than necessary for the conversation partner at a remote location. This paper proposes a volume model for automatically adjusting the volume of an operator's voice during remote communication via a telepresence robot, and develops LombaBot, an optimal volume control system that runs the model on a telepresence robot. The volume model uses the measured noise level around the robot and the distance between a conversation partner and the robot to adjust the volume of the operator's voice. It provides two types of volume adjustment: comfortable volume and secret talk volume. LombaBot enables people at the remote site to listen comfortably to the voice of the robot operator. Moreover, the operator can speak in a low voice when they want to talk privately with nearby people. We confirmed that LombaBot properly adjusted the volume of an operator's voice in a noisy remote location.
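The abstract does not give the equations of the volume model, so the following is only a minimal sketch of how a playback level could be derived from ambient noise and listener distance. Everything in it is an assumption made for illustration and not the authors' method: the function name `output_level_db`, the 10 dB signal-to-noise margin, the 15 dB reduction for the secret talk mode, the free-field spreading loss of 20·log10(d), and the clamping limits.

```python
import math


def output_level_db(noise_db, distance_m, mode="comfortable",
                    snr_db=10.0, secret_offset_db=-15.0,
                    ref_distance_m=1.0, min_db=30.0, max_db=90.0):
    """Return a loudspeaker level in dB, referenced to ref_distance_m.

    Hypothetical model: all default parameters are illustrative
    assumptions, not values taken from the paper.
    """
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    if mode not in ("comfortable", "secret"):
        raise ValueError("mode must be 'comfortable' or 'secret'")

    # Level wanted at the listener's position: ambient noise plus a margin.
    target_at_listener = noise_db + snr_db
    if mode == "secret":
        # Assumed: secret talk is played well below the comfortable level.
        target_at_listener += secret_offset_db

    # Assumed free-field spreading loss: about 6 dB per doubling of distance.
    spreading_loss = 20.0 * math.log10(distance_m / ref_distance_m)

    # Compensate for the loss, then clamp to the speaker's usable range.
    level = target_at_listener + spreading_loss
    return max(min_db, min(max_db, level))


if __name__ == "__main__":
    for d in (0.5, 1.0, 2.0, 4.0):
        print(d, round(output_level_db(60.0, d), 1),
              round(output_level_db(60.0, d, mode="secret"), 1))
```

In this sketch the comfortable level rises with both noise and distance, while the secret talk level stays a fixed amount below it, so it is audible mainly to listeners close to the robot. A real system would need measured room acoustics rather than the free-field assumption used here.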
Index Terms
- Volume adaptation and visualization by modeling the volume level in noisy environments for telepresence system