DOI: 10.1145/3652988.3673960
research-article

Exploring Theory of Mind in Large Language Models through Multimodal Negotiation

Published: 26 December 2024

Abstract

Large Language Models (LLMs) are increasingly used as the backend for interactive virtual agents and assistants. A critical social skill for such agents is Theory of Mind (ToM): the ability to model and reason about other agents. Prior research has investigated ToM in LLMs using standard, modified, and extended versions of false-belief tasks. These tests explicitly prompt LLMs to answer questions about other agents. In real situations, however, people must use ToM unprompted to navigate social life, and they often have to rely on nonverbal cues such as facial expressions. This work addresses that gap by studying implicit ToM in LLMs in a negotiation task, where agents must implicitly reason about one another to reach the best possible mutually agreed deal. We conducted the negotiation experiment by prompting different LLMs to role-play as characters and pitting them against rule-based agents that may respond with different facial expressions. We measure and compare the negotiation outcomes across models. Our results show that strong LLMs such as GPT-4 Turbo and Claude 3 Opus perform decently and adjust their offers when given access to facial expression information, whereas weaker models lag far behind. Our work contributes to the understanding of LLMs’ capabilities and limitations as intelligent, interactive agents.


Information & Contributors

Information

Published In

IVA '24: Proceedings of the 24th ACM International Conference on Intelligent Virtual Agents
September 2024
337 pages
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Facial expressions
  2. Large Language Models
  3. Negotiation
  4. Theory of Mind

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • UK Research & Innovation

Conference

IVA '24
IVA '24: ACM International Conference on Intelligent Virtual Agents
September 16 - 19, 2024
Glasgow, United Kingdom

Acceptance Rates

Overall Acceptance Rate 53 of 196 submissions, 27%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Total Citations: 0
  • Total Downloads: 70
  • Downloads (Last 12 months): 70
  • Downloads (Last 6 weeks): 35
Reflects downloads up to 02 Mar 2025
