DOI: 10.1145/3196709.3196735
research-article

Evaluating and Informing the Design of Chatbots

Published: 08 June 2018

Abstract

Text messaging-based conversational agents (CAs), popularly called chatbots, have received significant attention in the last two years. However, chatbots are still at a nascent stage: they have a low penetration rate, as 84% of Internet users have not yet used a chatbot. Hence, understanding the usage patterns of first-time users can potentially inform and guide the design of future chatbots. In this paper, we report the findings of a study with 16 first-time chatbot users interacting with eight chatbots over multiple sessions on the Facebook Messenger platform. Analysis of chat logs and user interviews revealed that users preferred chatbots that provided either a 'human-like' natural language conversation ability or an engaging experience that exploited the benefits of the familiar turn-based messaging interface. We conclude with implications for evolving the design of chatbots, such as: clarify chatbot capabilities, sustain conversation context, handle dialog failures, and end conversations gracefully.

    Published In

    DIS '18: Proceedings of the 2018 Designing Interactive Systems Conference
    June 2018
    1418 pages
    ISBN:9781450351980
    DOI:10.1145/3196709
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. chatbot
    2. conversational agent
    3. evaluation
    4. messenger

    Qualifiers

    • Research-article

    Conference

    DIS '18

    Acceptance Rates

    DIS '18 Paper Acceptance Rate 107 of 487 submissions, 22%;
    Overall Acceptance Rate 1,158 of 4,684 submissions, 25%
