Testing Chatbots with Charm

Bravo-Santos, Sergio; Guerra, Esther; de Lara, Juan

doi:10.1007/978-3-030-58793-2_34

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1266)

Included in the following conference series:

International Conference on the Quality of Information and Communications Technology

Abstract

Chatbots are software programs with a conversational user interface, typically embedded in websites or messaging systems such as Slack, Facebook Messenger or Telegram. Many companies are investing in chatbots to improve their customer support. This has led to a proliferation of chatbot creation platforms (e.g., Dialogflow, Lex, Watson). However, there is currently little support for testing chatbots, which may affect their final quality.

To alleviate this problem, we propose a methodology that automates the generation of coherence, sturdiness and precision tests for chatbots, and uses the test results to improve the chatbot's precision. The methodology is supported by a tool called Charm, which uses Botium as the backend for automated test execution. We also report on experiments aimed at improving Dialogflow chatbots built by third parties.

Acknowledgments

We would like to thank the anonymous reviewers for their comments. This work has been partially funded by the Spanish Ministry of Science (project MASSIVE, RTI2018-095255-B-I00) and the R&D programme of Madrid (project FORTE, P2018/TCS-4314).

Author information

Authors and Affiliations

Modelling and Software Engineering Research Group, Computer Science Department, Universidad Autónoma de Madrid, Madrid, Spain
Sergio Bravo-Santos, Esther Guerra & Juan de Lara

Corresponding author

Correspondence to Juan de Lara.

Editor information

Editors and Affiliations

Brunel University, London, UK
Martin Shepperd
Lisbon University Institute, Lisbon, Portugal
Fernando Brito e Abreu
University of Lisbon, Lisbon, Portugal
Alberto Rodrigues da Silva
University of Castilla-La Mancha, Talavera de la Reina, Spain
Ricardo Pérez-Castillo

About this paper

Cite this paper

Bravo-Santos, S., Guerra, E., de Lara, J. (2020). Testing Chatbots with Charm. In: Shepperd, M., Brito e Abreu, F., Rodrigues da Silva, A., Pérez-Castillo, R. (eds) Quality of Information and Communications Technology. QUATIC 2020. Communications in Computer and Information Science, vol 1266. Springer, Cham. https://doi.org/10.1007/978-3-030-58793-2_34

DOI: https://doi.org/10.1007/978-3-030-58793-2_34
Published: 31 August 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58792-5
Online ISBN: 978-3-030-58793-2
eBook Packages: Computer Science, Computer Science (R0)
