DOI: 10.1145/3411764.3445312

Heuristic Evaluation of Conversational Agents

Published: 07 May 2021

Abstract

Conversational interfaces have risen in popularity as businesses and users adopt a range of conversational agents, including chatbots and voice assistants. Although guidelines have been proposed, there is not yet an established set of usability heuristics to guide and evaluate conversational agent design. In this paper, we propose a set of heuristics for conversational agents adapted from Nielsen’s heuristics and based on expert feedback. We then validate the heuristics through two rounds of evaluations conducted by participants on two conversational agents, one chatbot and one voice-based personal assistant. We find that, when using our heuristics to evaluate both interfaces, evaluators were able to identify more usability issues than when using Nielsen’s heuristics. We propose that our heuristics successfully identify issues related to dialogue content, interaction design, help and guidance, human-like characteristics, and data privacy.

    Published In

    CHI '21: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems
    May 2021
    10862 pages
    ISBN:9781450380966
    DOI:10.1145/3411764
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Badges

    • Best Paper

    Author Tags

    1. conversational agents
    2. heuristic evaluation
    3. user interface design

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    CHI '21

    Acceptance Rates

    Overall Acceptance Rate 6,199 of 26,314 submissions, 24%

    Article Metrics

    • Downloads (Last 12 months): 798
    • Downloads (Last 6 weeks): 79
    Reflects downloads up to 20 Jan 2025

    Cited By

    • (2024) Understanding the Determinants of Using Government AI-Chatbots by Citizens in Saudi Arabia. International Journal of Electronic Government Research 20:1 (1-20). DOI: 10.4018/IJEGR.349733. Online publication date: 7-Aug-2024
    • (2024) Person-based design and evaluation of MIA, a digital medical interview assistant for radiology. Frontiers in Artificial Intelligence 7. DOI: 10.3389/frai.2024.1431156. Online publication date: 16-Aug-2024
    • (2024) Designing multi-model conversational AI financial systems: understanding sensitive values of women entrepreneurs in Brazil. Proceedings of the 2024 ACM International Conference on Interactive Media Experiences Workshops (11-18). DOI: 10.1145/3672406.3672409. Online publication date: 12-Jun-2024
    • (2024) Body Language for VUIs: Exploring Gestures to Enhance Interactions with Voice User Interfaces. Proceedings of the 2024 ACM Designing Interactive Systems Conference (133-150). DOI: 10.1145/3643834.3660691. Online publication date: 1-Jul-2024
    • (2024) HASI: A Model for Human-Agent Speech Interaction. Proceedings of the 6th ACM Conference on Conversational User Interfaces (1-8). DOI: 10.1145/3640794.3665885. Online publication date: 8-Jul-2024
    • (2024) Beyond Functionality: Unveiling Dimensions of User Experience in Embodied Conversational Agents for Customer Service. Proceedings of the 6th ACM Conference on Conversational User Interfaces (1-11). DOI: 10.1145/3640794.3665544. Online publication date: 8-Jul-2024
    • (2024) A Case Study Exploring the Applicability of Heuristic Evaluation in Smart Home Systems. Extended Abstracts of the CHI Conference on Human Factors in Computing Systems (1-7). DOI: 10.1145/3613905.3637131. Online publication date: 11-May-2024
    • (2024) Conversational Voice Interfaces: Translating Research Into Actionable Design. Extended Abstracts of the CHI Conference on Human Factors in Computing Systems (1-3). DOI: 10.1145/3613905.3636277. Online publication date: 11-May-2024
    • (2024) Voice Assistive Technology for Activities of Daily Living: Developing an Alexa Telehealth Training for Adults with Cognitive-Communication Disorders. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (1-15). DOI: 10.1145/3613904.3642788. Online publication date: 11-May-2024
    • (2024) The Promise and Peril of ChatGPT in Higher Education: Opportunities, Challenges, and Design Implications. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (1-21). DOI: 10.1145/3613904.3642785. Online publication date: 11-May-2024
