DOI: 10.1145/3491102.3517432

What Could Possibly Go Wrong When Interacting with Proactive Smart Speakers? A Case Study Using an ESM Application

Published: 29 April 2022

ABSTRACT

Voice user interfaces (VUIs) have made their way into people’s daily lives, from voice assistants to smart speakers. Although VUIs typically react only to direct user commands, they increasingly incorporate proactive behaviors. Proactive smart speakers, in particular, have potential applications ranging from healthcare to entertainment; however, their usability in everyday life is subject to interaction errors. To systematically investigate the nature of these errors, we designed a voice-based Experience Sampling Method (ESM) application that runs on proactive speakers. We captured 1,213 user interactions in a 3-week field deployment in 13 participants’ homes. Through auxiliary audio recordings and logs, we identify substantial interaction errors and the strategies users apply to overcome them. We further analyze interaction timings and provide insights into the time cost of errors. We find that, even for answering simple ESMs, interaction errors occur frequently and can hamper both the usability of proactive speakers and the user experience. Our work also identifies multiple facets of VUIs that can be improved with respect to the timing of speech.
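The timing analysis the abstract mentions can be illustrated with a minimal sketch: given timestamped interaction logs, compare the duration of error-free exchanges against exchanges that required retries. The record fields (`prompt_ts`, `response_ts`, `retries`) and the specific metric are assumptions for illustration, not the paper's actual logging schema or analysis code.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical log record for one proactive ESM prompt-response exchange.
# Field names are illustrative assumptions, not the study's real schema.
@dataclass
class Interaction:
    prompt_ts: float    # when the speaker proactively asked the ESM question
    response_ts: float  # when a final, recognized answer was captured
    retries: int        # repeats/reformulations needed before success

    @property
    def duration(self) -> float:
        return self.response_ts - self.prompt_ts

def error_time_cost(log: list[Interaction]) -> float:
    """Average extra seconds spent on interactions that needed retries,
    relative to the mean duration of error-free interactions."""
    clean = [i.duration for i in log if i.retries == 0]
    errored = [i.duration for i in log if i.retries > 0]
    if not clean or not errored:
        return 0.0
    return mean(errored) - mean(clean)

# Illustrative timings (seconds), not data from the study:
log = [
    Interaction(0.0, 6.0, 0),
    Interaction(100.0, 108.0, 0),
    Interaction(200.0, 221.0, 2),  # two retries before a recognized answer
]
print(error_time_cost(log))  # → 14.0
```

A sketch like this makes the abstract's claim concrete: each failed recognition adds measurable seconds to an otherwise short exchange, which compounds quickly over repeated daily ESM prompts.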


Supplemental Material

3491102.3517432-talk-video.mp4 (mp4, 81.6 MB)


Published in

CHI '22: Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems
April 2022, 10459 pages
ISBN: 9781450391573
DOI: 10.1145/3491102

        Copyright © 2022 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Qualifiers

        • research-article
        • Research
        • Refereed limited

Acceptance Rates

Overall Acceptance Rate: 6,199 of 26,314 submissions, 24%

