Abstract
Large language models (LLMs) can be augmented with external tools and knowledge bases, allowing them to overcome known limitations, such as lacking access to up-to-date information or struggling with math problems, and thereby go beyond the knowledge and capabilities obtained during pre-training. Recent prompting techniques enable tool-augmented LLMs to combine reasoning and action when solving complex problems, which is essential for strategically deciding when and how to call tools so as to improve decision-making and final outputs. However, current prompting techniques either rely on a single reasoning path or offer only limited ability to adjust plans within that path, which can adversely affect the performance of tool-augmented LLMs. In this paper, we introduce a novel prompting method whereby an LLM agent selects and executes one among multiple candidate strategies. We assess the effectiveness of our method on three question answering datasets, on which it outperforms state-of-the-art methods such as ReWOO while also being a competitive and more cost-efficient alternative to ReAct. We also investigate the impact of selecting a reasoning trajectory from strategy pools of different sizes, further highlighting the risks of considering only a single strategy.
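The full paper details the method, but the abstract already describes the core loop: sample several candidate reasoning trajectories (plans of tool calls), have the model select the most promising one, and then execute only that plan. The sketch below is a minimal illustration of that idea under stated assumptions, not the authors' implementation: `call_llm` is a stand-in for the actual model call (the paper uses GPT-3.5-Turbo), and the `Tool[input]` plan syntax, the toy `Search`/`Calculator` tools, and the selection prompt are all invented here for illustration.

```python
import re

# Toy tool registry; the paper's actual tool set is not given in the abstract,
# so these stand-ins are assumptions.
TOOLS = {
    "Search": lambda q: f"[stub search result for: {q}]",
    "Calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),  # toy use only
}

def call_llm(prompt: str, temperature: float = 0.7) -> str:
    """Stand-in for a real model call (the paper uses GPT-3.5-Turbo).
    Canned replies keep the sketch runnable without an API key."""
    if "Candidate plans" in prompt:
        return "1"  # the selector picks the first strategy
    return "Search[capital of Sweden]\nCalculator[2 * 9]"

def sample_plans(question: str, n: int = 3) -> list[str]:
    """Sample n candidate reasoning trajectories (plans of tool calls)."""
    prompt = (f"Plan the tool calls needed to answer the question, "
              f"one per line, written as Tool[input].\nQuestion: {question}")
    return [call_llm(prompt, temperature=0.7) for _ in range(n)]

def select_plan(question: str, plans: list[str]) -> str:
    """Ask the model to pick the most promising candidate strategy."""
    listing = "\n\n".join(f"{i + 1}.\n{p}" for i, p in enumerate(plans))
    reply = call_llm(
        f"Question: {question}\n\nCandidate plans:\n{listing}\n\n"
        "Answer with the number of the best plan.",
        temperature=0.0,
    )
    idx = int(re.search(r"\d+", reply).group()) - 1
    return plans[max(0, min(idx, len(plans) - 1))]

def execute(plan: str) -> list[str]:
    """Run each Tool[input] step of the chosen plan and collect observations."""
    observations = []
    for step in plan.splitlines():
        match = re.match(r"(\w+)\[(.*)\]", step.strip())
        if match and match.group(1) in TOOLS:
            observations.append(TOOLS[match.group(1)](match.group(2)))
    return observations

question = "What is twice the number of letters in Sweden's capital?"
best = select_plan(question, sample_plans(question))
print(execute(best))  # observations the model would use to compose its answer
```

Committing to one plan before acting keeps the token cost close to ReWOO-style one-shot planning while still exploring multiple trajectories up front, which is consistent with the cost-efficiency claim made in the abstract.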
Notes
- 1.
In this paper, we use the terms reasoning trajectory and strategy interchangeably.
- 2.
GPT-3.5-Turbo is used in our experiments.
- 7.
https://platform.openai.com/docs/models/gpt-3-5. Accessed 1 Sep 2023.
References
Beatty, I.D., Gerace, W.J., Leonard, W.J., Dufresne, R.J.: Designing effective questions for classroom response system teaching. Am. J. Phys. 74(1), 31–39 (2006)
Cobbe, K., et al.: Training verifiers to solve math word problems. arXiv preprint arXiv:2110.14168 (2021)
Hao, S., Gu, Y., Ma, H., Hong, J.J., Wang, Z., Wang, D.Z., Hu, Z.: Reasoning with language model is planning with world model. arXiv preprint arXiv:2305.14992 (2023)
Hosseini-Asl, E., McCann, B., Wu, C.S., Yavuz, S., Socher, R.: A simple language model for task-oriented dialogue. Adv. Neural. Inf. Process. Syst. 33, 20179–20191 (2020)
Ji, Z., et al.: Survey of hallucination in natural language generation. ACM Comput. Surv. 55(12), 1–38 (2023)
Komeili, M., Shuster, K., Weston, J.: Internet-augmented dialogue generation. arXiv preprint arXiv:2107.07566 (2021)
Lewis, P., et al.: Retrieval-augmented generation for knowledge-intensive NLP tasks. Adv. Neural. Inf. Process. Syst. 33, 9459–9474 (2020)
Nakano, R., et al.: WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021)
OpenAI: GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023)
Rae, J.W., et al.: Scaling language models: Methods, analysis & insights from training gopher. arXiv preprint arXiv:2112.11446 (2021)
Shinn, N., Cassano, F., Labash, B., Gopinath, A., Narasimhan, K., Yao, S.: Reflexion: language agents with verbal reinforcement learning. arXiv preprint arXiv:2303.11366 (2023)
Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., Zhou, D.: Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171 (2022)
Wei, J., et al.: Chain-of-thought prompting elicits reasoning in large language models. Adv. Neural. Inf. Process. Syst. 35, 24824–24837 (2022)
Xu, B., Peng, Z., Lei, B., Mukherjee, S., Liu, Y., Xu, D.: ReWOO: Decoupling reasoning from observations for efficient augmented language models. arXiv preprint arXiv:2305.18323 (2023)
Yang, Z., et al.: HotpotQA: a dataset for diverse, explainable multi-hop question answering. arXiv preprint arXiv:1809.09600 (2018)
Yao, S., et al.: Tree of thoughts: Deliberate problem solving with large language models. arXiv preprint arXiv:2305.10601 (2023)
Yao, S., et al.: ReAct: Synergizing reasoning and acting in language models. arXiv preprint arXiv:2210.03629 (2022)
Acknowledgement
The computations and data handling were enabled by resources provided by the National Academic Infrastructure for Supercomputing in Sweden (NAISS), partially funded by the Swedish Research Council through grant agreement no. 2022-06725. Acknowledgement is also extended to the ReWOO project (https://github.com/billxbf/ReWOO) for providing the code base used to conduct the experiments.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Wu, Y., Henriksson, A. (2024). Selecting from Multiple Strategies Improves the Foreseeable Reasoning of Tool-Augmented Large Language Models. In: Bifet, A., Davis, J., Krilavičius, T., Kull, M., Ntoutsi, E., Žliobaitė, I. (eds) Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD 2024. Lecture Notes in Computer Science, vol 14943. Springer, Cham. https://doi.org/10.1007/978-3-031-70352-2_12
DOI: https://doi.org/10.1007/978-3-031-70352-2_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70351-5
Online ISBN: 978-3-031-70352-2
eBook Packages: Computer Science (R0)