Authors:
Sara Mirabi 1; Bahadorreza Ofoghi 1; John Yearwood 1; Diego Molla-Aliod 2 and Vicky Mak-Hau 1
Affiliations:
1 School of Information Technology, Deakin University, Melbourne, Australia
2 School of Computing, Macquarie University, Sydney, Australia
Keyword(s):
Dialogue Systems, Multi-Agent Conversational Systems, Noisy Answers, Answer Validation, Error Detection, Linear Programming, Optimization.
Abstract:
Goal-oriented conversational systems based on large language models (LLMs) offer the potential to gather the requirements needed to solve tasks or develop solutions. In real-world scenarios, however, non-expert users may respond incorrectly to dialogue questions, which can impede the system's ability to elicit accurate information. This paper presents a novel approach to detecting and categorizing noisy answers in goal-oriented conversations, with a focus on modeling linear programming problems. Using a current LLM, Gemini, we develop multi-agent synthetic conversations based on problem statements from the benchmark optimization modeling dataset NL4Opt, generating dialogues that also contain noisy answers. Our experiments show that the LLM is not sufficiently equipped to detect noisy answers: in almost 59% of the cases involving a noisy answer, it continues the conversation without attempting to resolve the noise. We therefore propose a two-step answer validation method for identifying and classifying noisy answers. Our findings demonstrate that while some LLM- and non-LLM-based models perform well in detecting answer inaccuracies, further improvements are needed in classifying noisy answers into fine-grained stress types.