DOI: 10.1145/3461778.3462147
Research Article

Chatbot or Chat-Blocker: Predicting Chatbot Popularity before Deployment

Published: 28 June 2021

ABSTRACT

Chatbots are widely employed across a variety of scenarios. However, given the high cost of chatbot development and chatbots’ substantial social influence, chatbot failures can lead to significant economic loss. Previous chatbot evaluation frameworks rely heavily on human evaluation and offer little support for automatic, early-stage examination of chatbots prior to deployment. To reduce the risk of such losses, we propose a computational approach that extracts features and trains models to predict a chatbot’s popularity a priori, using popularity as an indicator of general chatbot performance. The extracted features cover chatbot Intent, Conversation Flow, and Response Design. We studied 1,050 customer-service chatbots on one of the most popular chatbot service platforms. Our model achieves 77.36% prediction accuracy in distinguishing very popular from very unpopular chatbots, taking a first step toward computational feedback before chatbot deployment. Our evaluation results also reveal the key design features associated with chatbot popularity and offer guidance for chatbot design.
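
The abstract describes extracting Intent, Conversation Flow, and Response Design features from each chatbot and training a model to classify it as popular or unpopular. The following is a minimal sketch of such a pipeline, not the authors’ code: the feature table chatbot_features.csv, the feature names (num_intents, max_flow_depth, avg_response_length, readability_score, emoji_ratio), and the choice of a random-forest classifier are all illustrative assumptions.

# Sketch of a popularity-prediction pipeline under the assumptions above.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import balanced_accuracy_score

# Hypothetical per-chatbot feature table: one row per chatbot, with
# Intent, Conversation Flow, and Response Design features plus a
# binary label (1 = very popular, 0 = very unpopular).
df = pd.read_csv("chatbot_features.csv")  # assumed file; not provided by the paper

feature_cols = [
    "num_intents",          # Intent: number of intents the bot handles (assumed name)
    "max_flow_depth",       # Conversation Flow: deepest branch in the dialogue tree
    "avg_response_length",  # Response Design: mean words per response
    "readability_score",    # Response Design: e.g., a Flesch-Kincaid-style score
    "emoji_ratio",          # Response Design: fraction of responses containing emoji
]
X = df[feature_cols]
y = df["is_popular"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

clf = RandomForestClassifier(n_estimators=200, random_state=42)
clf.fit(X_train, y_train)

# Balanced accuracy guards against imbalance between the popular and unpopular groups.
print("Balanced accuracy:", balanced_accuracy_score(y_test, clf.predict(X_test)))

# Feature importances hint at which design features are most associated with
# popularity, analogous in spirit to the paper's design-guidance analysis.
for name, imp in sorted(zip(feature_cols, clf.feature_importances_), key=lambda t: -t[1]):
    print(f"{name}: {imp:.3f}")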

  • Published in

    DIS '21: Proceedings of the 2021 ACM Designing Interactive Systems Conference
    June 2021
    2082 pages
    ISBN: 9781450384766
    DOI: 10.1145/3461778

    Copyright © 2021 ACM


    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate: 1,158 of 4,684 submissions, 25%
