Enhancing Virtual Human Interactions by Designing a Real-Time Dialog Filter for Mitigating Nonsensical Responses

Authors:

Alexandre Gomes de Siqueira,

Sarah Bloch-Elkouby,

Olivia Lawrence,

Giuseppe Sarli,

Megan L. Rogers,

Rohith Venkatakrishnan,

Roshan Venkatakrishnan,

Benjamin Lok

SVR '24: Proceedings of the 26th Symposium on Virtual and Augmented Reality

Pages 51 - 60

https://doi.org/10.1145/3691573.3691597

Published: 30 September 2024

Abstract

Virtual Humans (VHs) are crucial in facilitating discussions on sensitive topics and in training for interpersonal interactions. However, conversational errors, such as nonsensical responses, undermine the effectiveness of VH simulations. This paper explores real-time dialog filters to detect such undesired exchanges. We iteratively apply a five-step prompt design process and leverage OpenAI’s GPT large language model to demonstrate feasibility. Our filter distinguishes meaningful from nonsensical responses generated by a rule-based system, achieving an F1 score of 0.84 and an accuracy of 0.78. Comparison with human-expert classifications validates its efficacy. Filtering out nonsensical responses keeps interactions coherent and relevant, which improves the effectiveness of the simulation. This study underscores how large language models can refine existing VH systems and improve virtual human dialogues.
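As a rough illustration of the idea, the sketch below shows how a response filter might sit in front of a rule-based VH and how its decisions could be scored against expert labels using F1 and accuracy. It is not the authors' implementation and does not reproduce their prompt or any GPT call. The names `is_nonsensical`, `toy_classifier`, `filter_response`, `evaluate_filter`, and the fallback line are all hypothetical, and treating "nonsensical" as the positive class is an assumption.

```python
from typing import Callable, Dict, Sequence

# Hypothetical fallback line; the abstract does not say what the VH does instead.
FALLBACK = "Sorry, could you rephrase that?"


def _content_words(text: str) -> set:
    """Lowercase the text, drop punctuation, and keep words longer than 3 characters."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return {word for word in cleaned.split() if len(word) > 3}


def toy_classifier(question: str, response: str) -> bool:
    """Placeholder for the LLM-based judgment (assumption, not the paper's method).

    Flags a response as nonsensical when it shares no content word with the question.
    """
    return not (_content_words(question) & _content_words(response))


def filter_response(
    question: str,
    response: str,
    is_nonsensical: Callable[[str, str], bool],
    fallback: str = FALLBACK,
) -> str:
    """Return the VH response unless the classifier flags it, then return the fallback."""
    return fallback if is_nonsensical(question, response) else response


def evaluate_filter(predicted: Sequence[bool], expert: Sequence[bool]) -> Dict[str, float]:
    """Score filter decisions against expert labels (True means 'nonsensical')."""
    if len(predicted) != len(expert):
        raise ValueError("predicted and expert must have the same length")
    pairs = list(zip(predicted, expert))
    tp = sum(1 for p, e in pairs if p and e)          # flagged, and experts agree
    fp = sum(1 for p, e in pairs if p and not e)      # flagged, but experts disagree
    fn = sum(1 for p, e in pairs if not p and e)      # missed nonsensical response
    tn = sum(1 for p, e in pairs if not p and not e)  # correctly passed through
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


if __name__ == "__main__":
    question = "How have you been sleeping lately?"
    print(filter_response(question, "I have not been sleeping well.", toy_classifier))
    print(filter_response(question, "My favorite color is blue.", toy_classifier))
    print(evaluate_filter([True, True, False, False, True],
                          [True, False, False, True, True]))
```

In a real deployment the `is_nonsensical` callable would wrap the language-model prompt; keeping it as a parameter means the scoring code stays the same whichever classifier is plugged in.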

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Distributed artificial intelligence
      1. Intelligent agents
    2. Natural language processing

Published In

SVR '24: Proceedings of the 26th Symposium on Virtual and Augmented Reality

September 2024

346 pages

ISBN:9798400709791

DOI:10.1145/3691573

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

National Institutes of Health
American Foundation for Suicide Prevention

Conference

SVR 2024

SVR 2024: Symposium on Virtual and Augmented Reality

September 30 - October 3, 2024

Manaus, Brazil
