ABSTRACT
A recently introduced product of Comcast, a large cable company in the United States, is a "voice remote" that accepts spoken queries from viewers. We present an analysis of a large query log from this service to answer the question: "What do viewers say to their TVs?" In addition to a descriptive characterization of queries and sessions, we describe two complementary types of analyses to support query understanding. First, we propose a domain-specific intent taxonomy to characterize viewer behavior: as expected, most intents revolve around watching programs (both direct navigation and browsing), but there is a non-trivial fraction of non-viewing intents as well. Second, we propose a domain-specific tagging scheme for labeling query tokens, which, when combined with intent and program prediction, provides a multi-faceted approach to understanding voice queries directed at entertainment systems.