Abstract
Most recent open-domain dialogue systems, or chatbots, are trained with deep learning on large collections of human conversations from the internet. They can generate more natural and diverse responses than task-oriented or retrieval-based systems. However, their response generation is difficult to control, and they can learn and produce unsuitable or even unsafe responses. In this paper, we investigate whether deep learning based chatbots produce unsafe medical advice when end users ask them for medical advice. We introduce a new benchmark for training a detector of medical context in chatbot messages. We then conduct experiments to assess the safety of the answers that two well-known chatbots give to medical advice requests, and we discuss the limitations of the proposed method. Our study demonstrates that popular neural network based chatbot models have a significant propensity to produce unsafe medical advice.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Omri, S., Abdelkader, M., Hamdi, M., Kim, TH. (2023). Safety Issues Investigation in Deep Learning Based Chatbots Answers to Medical Advice Requests. In: Tanveer, M., Agarwal, S., Ozawa, S., Ekbal, A., Jatowt, A. (eds) Neural Information Processing. ICONIP 2022. Communications in Computer and Information Science, vol 1792. Springer, Singapore. https://doi.org/10.1007/978-981-99-1642-9_51
DOI: https://doi.org/10.1007/978-981-99-1642-9_51
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-1641-2
Online ISBN: 978-981-99-1642-9
eBook Packages: Computer Science (R0)