ABSTRACT
This paper proposes a method that shifts an agent's personality during speech interaction in order to reduce the negative impressions users form of speech recognition systems when recognition fails. Recognition failures make users uncomfortable, and rephrasing a command imposes a high cognitive load. The proposed method aims to dispel users' negative impression of the agent by giving it multiple personalities: when recognition fails, the active personality accepts responsibility for the failure and is removed from the task. The system hardware remains the same, and the user continues the interaction with another of the agent's personalities. The personality shift is conveyed by a change in voice tone and LED color. Experimental results suggest that the proposed method reduces users' negative impressions by improving communication between users and the agent.
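The abstract does not describe an implementation, so the following is only a minimal sketch of the personality-shifting idea under stated assumptions. The class names (`Personality`, `Agent`), the method `handle_recognition_failure`, the spoken phrases, and the fallback behavior when only one personality remains are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Personality:
    """One persona of the agent (all fields are illustrative assumptions)."""
    name: str
    voice_tone: str   # e.g. a label passed to a speech synthesizer
    led_color: str    # e.g. a color shown on the device LED


@dataclass
class Agent:
    """A single device that hosts several personalities."""
    personalities: list
    active_index: int = 0
    retired: list = field(default_factory=list)

    @property
    def active(self) -> Personality:
        return self.personalities[self.active_index]

    def handle_recognition_failure(self) -> str:
        """Let the active personality take the blame, then switch to another."""
        # Assumed fallback: with one personality left, simply re-prompt.
        if len(self.personalities) < 2:
            return f"{self.active.name}: Sorry, could you say that again?"
        # The personality responsible for the failure is removed from the task.
        failed = self.personalities.pop(self.active_index)
        self.retired.append(failed)
        # Wrap around so the index always points at a remaining personality.
        self.active_index %= len(self.personalities)
        successor = self.active
        return (f"{failed.name}: Sorry, that was my mistake. "
                f"{successor.name} will take over.")
```

After a failure, a caller would read `agent.active.voice_tone` and `agent.active.led_color` to update the synthesized voice and the LED, which corresponds to the way the abstract says the shift is signaled to the user while the hardware stays the same.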
Index Terms
- Designing Personality Shifting Agent for Speech Recognition Failure