AI Unreliable Answers: A Case Study on ChatGPT

  • Conference paper
  • Artificial Intelligence in HCI (HCII 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14051)

Abstract

ChatGPT is a general-domain chatbot that has attracted great attention, stimulating worldwide discussion on the power and consequences of the diffusion of Artificial Intelligence across fields ranging from education, research, and music to software development, health care, cultural heritage, and entertainment.

In this paper, we investigate whether and when the answers provided by ChatGPT are unreliable, and how this is perceived by expert users, such as Computer Science students. To this aim, we first analyze the reliability of ChatGPT's answers by testing its narrative, problem-solving, searching, and logic capabilities, and report examples of its answers. We then conducted a user study in which 15 participants who already knew the chatbot submitted a set of predetermined queries, generating both correct and incorrect answers, after which we collected their satisfaction ratings. Results revealed that even though the present version of ChatGPT is sometimes unreliable, people still plan to use it. We therefore recommend that the present version of ChatGPT always be used with the support of human verification and interpretation.


Notes

  1. https://chat.openai.com/chat.


Author information

Corresponding author

Correspondence to Rita Francese.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Amaro, I., Della Greca, A., Francese, R., Tortora, G., Tucci, C. (2023). AI Unreliable Answers: A Case Study on ChatGPT. In: Degen, H., Ntoa, S. (eds) Artificial Intelligence in HCI. HCII 2023. Lecture Notes in Computer Science, vol 14051. Springer, Cham. https://doi.org/10.1007/978-3-031-35894-4_2

  • DOI: https://doi.org/10.1007/978-3-031-35894-4_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-35893-7

  • Online ISBN: 978-3-031-35894-4

  • eBook Packages: Computer Science, Computer Science (R0)
