Online Evaluation of Text to Speech Systems for Three Social Robots

Alonso-Martín, Fernando; Malfaz, María; Castro-González, Álvaro; Castillo, José Carlos; Salichs, Miguel A.

doi:10.1007/978-3-030-35888-4_15

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11876)

Included in the following conference series:

International Conference on Social Robotics

Abstract

The success of social robots depends largely on their capacity to interact with people. Verbal and non-verbal communication skills are therefore essential for achieving natural human-robot interaction. This paper focuses on verbal communication, since the majority of social robots implement a Text to Speech system. We present a comparative study of eight off-the-shelf systems used in social robots, in which 125 participants evaluated the performance of the systems. The results show that, in general, participants detect differences between the Text to Speech systems and can determine which are the more intelligible, expressive, and artificial ones. Participants also found that some systems are more suitable than others depending on the physical appearance of the robot.

Notes

1. https://developer.nuance.com/public/index.php?task=mix
2. https://store.google.com/es/product/google_home
3. https://www.acapela-group.com
4. http://monarch-fp7.eu
5. http://www.wizzardsoftware.com/text-to-speech-sdk.php
6. http://espeak.sourceforge.net
7. https://cloud.google.com/text-to-speech
8. https://azure.microsoft.com/es-es/services/cognitive-services/text-to-speech
9. https://www.ivona.com
10. https://www.nuance.com/es-es/omni-channel-customer-engagement/support/loquendo.html
11. https://www.nuance.com/es-es/omni-channel-customer-engagement/voice-and-ivr/text-to-speech/vocalizer.html
12. https://github.com/naggety/picotts
13. https://www.verbio.com
14. Online questionnaires (in Spanish): Mini: http://bit.ly/2K3a4I6; Mbot: http://bit.ly/2XwlAyC; and Maggie: http://bit.ly/2WoUIUS

Acknowledgement

The research leading to these results has received funding from the projects: Development of social robots to help seniors with cognitive impairment (ROBSEN), funded by the Ministerio de Economia y Competitividad; and RoboCity2030-DIH-CM, funded by Comunidad de Madrid and co-funded by Structural Funds of the EU.

Author information

Authors and Affiliations

Department of Systems Engineering and Automation, Universidad Carlos III de Madrid, Leganés, Spain
Fernando Alonso-Martín, María Malfaz, Álvaro Castro-González, José Carlos Castillo & Miguel A. Salichs

Corresponding author

Correspondence to Fernando Alonso-Martín.

Editor information

Editors and Affiliations

Robotics Lab, Universidad Carlos III de Madrid, Leganés, Madrid, Spain
Miguel A. Salichs
The National University of Singapore, Singapore, Singapore
Shuzhi Sam Ge
Faculty of Industrial Design, Eindhoven University of Technology, Eindhoven, Noord-Brabant, The Netherlands
Emilia Ivanova Barakova
Mechanical & Industrial, Qatar University, Doha, Qatar
John-John Cabibihan
Department of Aerospace Engineering, The Pennsylvania State University, University Park, PA, USA
Alan R. Wagner
Robotics Lab - Department of Systems Engineering and Automation, Universidad Carlos III de Madrid, Leganés, Madrid, Spain
Álvaro Castro-González
Wichita State University, Wichita, KS, USA
Hongsheng He

About this paper

Cite this paper

Alonso-Martín, F., Malfaz, M., Castro-González, Á., Castillo, J.C., Salichs, M.A. (2019). Online Evaluation of Text to Speech Systems for Three Social Robots. In: Salichs, M., et al. Social Robotics. ICSR 2019. Lecture Notes in Computer Science, vol 11876. Springer, Cham. https://doi.org/10.1007/978-3-030-35888-4_15

DOI: https://doi.org/10.1007/978-3-030-35888-4_15
Published: 17 November 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-35887-7
Online ISBN: 978-3-030-35888-4
eBook Packages: Computer Science, Computer Science (R0)