Abstract
This study investigates biases in large language models (LLMs) applied to narrative tasks, specifically game story generation and story ending classification. We use popular LLMs, GPT-3.5, GPT-4, and Llama 2, to generate game stories and to classify their endings into three categories: positive, negative, and neutral. Our analysis reveals a notable bias towards positive-ending stories in the LLMs examined. Moreover, we observe that GPT-4 and Llama 2 tend to classify stories into categories they were not instructed to use, underscoring the importance of carefully designing downstream systems that consume LLM-generated outputs. These findings lay the groundwork for systems that incorporate LLMs in game story generation and classification, and they emphasize the need to remain vigilant about biases and system performance. By acknowledging and correcting these biases, we can build fairer and more accurate applications of LLMs in narrative-based tasks.
Notes
1. Converting a raw text string into a key-value object in memory.
2. As the temperature increases, the output from the model becomes more stochastic. The valid range for ChatGPT is 0 to 2, where 1 is the default value.
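The two notes above describe plumbing the experiment depends on: converting a model's raw text reply into a key-value object in memory, and the temperature parameter's 0 to 2 range. A minimal Python sketch of both (the function names, the brace-extraction heuristic, and the `None` fallback are illustrative assumptions, not details from the paper):

```python
import json


def build_request(prompt: str, temperature: float = 1.0) -> dict:
    """Assemble ChatGPT-style request parameters.

    Temperature is clamped to the API's valid range of 0-2 (1 is the
    default; higher values make the output more stochastic).
    """
    return {
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": max(0.0, min(2.0, temperature)),
    }


def parse_reply(raw: str):
    """Convert a raw text string into a key-value object in memory.

    LLM replies often wrap JSON in extra prose, so extract the outermost
    braces before parsing; returns None when no valid object is found.
    """
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        return json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return None
```

For example, `parse_reply('Sure! {"ending": "positive"}')` yields `{"ending": "positive"}`, while a reply containing no JSON object yields `None`, which a downstream classifier would need to handle explicitly.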
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Taveekitworachai, P. et al. (2023). What Is Waiting for Us at the End? Inherent Biases of Game Story Endings in Large Language Models. In: Holloway-Attaway, L., Murray, J.T. (eds) Interactive Storytelling. ICIDS 2023. Lecture Notes in Computer Science, vol 14384. Springer, Cham. https://doi.org/10.1007/978-3-031-47658-7_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-47657-0
Online ISBN: 978-3-031-47658-7
eBook Packages: Computer Science; Computer Science (R0)