Abstract
Multi-hop complex question answering (QA) requires reasoning over multiple pieces of information to answer a single question. Although the combination of Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) performs well on many natural language understanding tasks, it still struggles with multi-hop complex QA. The prevailing approach decomposes a complex question into several sub-questions and then uses the RAG framework to retrieve the k most semantically similar documents as context for the LLM. This method has two main limitations. First, LLMs tend to over-decompose the original multi-hop question while neglecting its background, and thus fail to focus on the information that is crucial for answering it. Second, current RAG pipelines typically employ a static top-k retrieval strategy, which cannot accommodate the varying information demands of different questions: retrieving too few documents omits essential knowledge, while retrieving too many introduces substantial noise that complicates information extraction. To overcome these challenges, this paper introduces a novel solution. We first generate pseudo-documents for the original complex question and integrate them with the question to guide decomposition, which prevents the LLM from over-decomposing the question without regard to its background. In addition, to mitigate the impact of static retrieval parameters, we incorporate an adaptive retrieval strategy that assesses the relevance of retrieved documents in real time, ensuring that key information is not overlooked when solving complex multi-hop questions. Extensive comparative experiments and ablation studies demonstrate the effectiveness of our method, which achieves state-of-the-art (SOTA) performance in the zero-shot setting for multi-hop complex QA.
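To make the two ideas in the abstract concrete, the following is a minimal Python sketch of (1) pseudo-document-guided question decomposition and (2) adaptive, relevance-thresholded retrieval in place of a static top-k. This is an illustrative reconstruction, not the authors' implementation: the `generate` callable stands in for an arbitrary LLM API, and the prompts, threshold, and document scores are assumptions for demonstration.

```python
from typing import Callable, List, Tuple

def decompose_with_pseudo_document(question: str,
                                   generate: Callable[[str], str]) -> List[str]:
    """Generate a pseudo-document for the question, then decompose the
    question conditioned on that background so sub-questions stay focused.
    `generate` is a hypothetical LLM-call wrapper (prompt in, text out)."""
    pseudo_doc = generate(f"Write a short background passage that answers: {question}")
    prompt = (
        f"Background: {pseudo_doc}\n"
        f"Question: {question}\n"
        "Using the background, split the question into the minimal set of "
        "sub-questions needed to answer it, one per line."
    )
    return [line.strip() for line in generate(prompt).splitlines() if line.strip()]

def adaptive_retrieve(scored_docs: List[Tuple[str, float]],
                      threshold: float = 0.7,
                      min_k: int = 1,
                      max_k: int = 10) -> List[str]:
    """Keep every document whose relevance score clears the threshold,
    rather than a fixed top-k; the bounds guard against retrieving too few
    documents (missing knowledge) or too many (noise)."""
    ranked = sorted(scored_docs, key=lambda d: d[1], reverse=True)
    kept = [doc for doc, score in ranked if score >= threshold]
    if len(kept) < min_k:                 # too few passed: back off to min_k
        kept = [doc for doc, _ in ranked[:min_k]]
    return kept[:max_k]                   # too many passed: cap to limit noise

if __name__ == "__main__":
    # Toy relevance scores; in practice these would come from a retriever.
    scored = [("Paris is the capital of France.", 0.91),
              ("France is in Europe.", 0.74),
              ("Bananas are yellow.", 0.40)]
    print(adaptive_retrieve(scored))      # keeps only the two docs above 0.7
```

The key design point is that the number of retrieved documents becomes a function of per-document relevance rather than a fixed hyperparameter, so easy single-fact sub-questions pull in little context while information-hungry hops can pull in more.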
Acknowledgement
This work was supported in part by grants from the National Natural Science Foundation of China (Nos. 62222213 and 62072423).
© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Zhou, J., Zheng, Z., Lyu, Y., Xu, T. (2025). Enhancing Complex Question Answering via LLM Pseudo-Document and Adaptive Retrieval. In: Barhamgi, M., Wang, H., Wang, X. (eds.) Web Information Systems Engineering – WISE 2024. Lecture Notes in Computer Science, vol. 15436. Springer, Singapore. https://doi.org/10.1007/978-981-96-0579-8_19