ABSTRACT
Interactive robots are increasingly being deployed in public spaces whose context may differ from moment to moment. One important aspect of this context is the soundscape of the environment shared by the robot and the human, such as an airport that is noisy during a weekend rush hour yet quiet on a weekday evening. Just as humans are adept at adapting their speech to their environment, robots should adjust their speech characteristics (e.g., speech rate, volume) to their context. We studied the effect of a shared auditory soundscape on the perceived ideal speech rate of an artificial agent. We asked raters to listen to combinations of text-to-speech (TTS) samples at different speech rates and soundscape samples from freesound.org, and to evaluate the appropriateness of each combination and their social perception of the artificial speech. Contrary to our expectations, raters did not prefer faster artificial speech in louder environments or slower speech in quieter environments. This suggests that further research is needed into how exactly artificial speech should be adapted to background noise.
Index Terms
- Perceptual Effects of Ambient Sound on an Artificial Agent's Rate of Speech