Using Text-to-Speech to Prototype Game Dialog

Published: 12 November 2018

Abstract

Voice acting is common in computer games across many genres. Recording and processing voice acting is time-consuming and involves, for instance, voice actors, directors, audio engineers, and game writers. Changes to a game's script after the voice acting has been recorded are expensive. At the same time, playtests of games without voice acting may give different results than playtests where it is present. As a result, improvements identified through playtesting are either ignored or lead to extensive re-recording of voice acting. This article presents a design science research project in which text-to-speech (TTS) synthesis is used as a substitute for recorded voice acting in the early stages of game production. We propose a set of design principles that have been evaluated in a real-world game production. Our results indicate several benefits of using TTS as a prototyping tool: it can be a source of inspiration for game writers, it gives good estimates of the timing and pacing of the game, and it allows for early tests of how the dialog will be perceived by players. The quality and characteristics of the voices provided by the TTS system play an important role in this process. The rapid development of speech technology opens many future possibilities.

Published In

Computers in Entertainment   Volume 16, Issue 4
FINAL EDITION
November 2018
82 pages
EISSN:1544-3574
DOI:10.1145/3292146
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 November 2018
Accepted: 01 July 2018
Received: 01 May 2018
Published in CIE Volume 16, Issue 4

Author Tags

  1. Game development
  2. design science research
  3. game audio
  4. game writing
  5. speech technology
  6. text-to-speech

Qualifiers

  • Research-article
  • Research
  • Refereed
