Abstract
This study evaluated a set of voice interaction models for interactive television, supported by a hands-free solution activated by a wake-up word, a mobile app, and a TV remote control with a microphone, to identify the most appropriate solution. The research addressed issues associated with natural language systems, such as usability, interaction and privacy perception, and aimed to analyze the strengths and limitations of each voice interaction model. The first evaluation used a prototype based on a Wizard-of-Oz methodology, while the second used a functional prototype. The preferred interaction model was the hands-free solution activated by a wake-up word, because it was easy to use and raised the fewest difficulties in task execution. Despite this result, the other two models are not ruled out for a future voice interaction system for television. The TV remote control was the most natural way of interacting for the study’s participants. The control afforded by the remote and by the app made participants feel that these models grant more privacy. Participants considered that a voice-operated TV system would be very useful, and almost all were receptive to having such a system at home. Lastly, based on commercial standards and guidelines, solutions to the issues that participants identified in the visual interface of the TV system were proposed and considered for the next phase of prototype development; these may also benefit other research in the field.
Notes
All the TV images used in this paper come from our partner and TV operator Altice Labs.
The original interfaces were designed in Portuguese for testing purposes. However, to improve the understanding of this section, new interfaces are presented with placeholders in English.
Acknowledgements
This paper is a result of the CHIC – Cooperative Holistic for Internet and Content project (grant agreement number 24498), funded by COMPETE 2020 and Portugal 2020 through the European Regional Development Fund (FEDER).
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Santos, R., Abreu, J., Beça, P. et al. Voice interaction on TV: analysis of natural language interaction models and recommendations for voice user interfaces. Multimed Tools Appl 79, 35689–35716 (2020). https://doi.org/10.1007/s11042-020-08710-2
DOI: https://doi.org/10.1007/s11042-020-08710-2