DynamicRetriever: A Pre-trained Model-based IR System Without an Explicit Index

Research Article · Machine Intelligence Research

Abstract

Web search provides a promising way for people to obtain information and has been extensively studied. With the surge of deep learning and large-scale pre-training techniques, various neural information retrieval models have been proposed and have demonstrated their power to improve search quality, especially ranking quality. All these existing search methods follow a common paradigm, index-retrieve-rerank: they first build an index of all documents based on document terms (a sparse inverted index) or representation vectors (a dense vector index), then retrieve candidate documents and rerank them with ranking models according to the similarity between the query and the documents. In this paper, we explore a new paradigm of information retrieval that uses no explicit index, only a pre-trained model: all the knowledge of the documents is encoded into the model parameters, which can be regarded as a differentiable indexer and optimized in an end-to-end manner. Specifically, we propose a pre-trained model-based information retrieval (IR) system called DynamicRetriever, which directly returns document identifiers for a given query. Under this framework, we implement two variants to explore how to train the model from scratch and how to combine the advantages of dense retrieval models. Compared with existing search methods, the model-based IR system parameterizes the traditional static index with a pre-trained model, converting the document semantic mapping into a dynamic and updatable process. Extensive experiments on the public search benchmark Microsoft machine reading comprehension (MS MARCO) verify the effectiveness and potential of the proposed paradigm.
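
To make the contrast with index-retrieve-rerank concrete, the sketch below shows what "retrieval without an explicit index" amounts to in code: the index is replaced by learned parameters (here, a linear head over document identifiers on top of a BERT-style encoder), and retrieval reduces to a forward pass plus a top-k. This is a hedged reconstruction from the abstract alone, not the authors' implementation; the encoder choice, the classification head, and the training setup are illustrative assumptions.

```python
# Minimal sketch of "model-based" retrieval as described in the abstract:
# the index is replaced by learnable parameters, and the model maps a query
# directly to document identifiers. Illustrative reconstruction only, NOT
# the authors' implementation; the BERT encoder, linear docid head, and
# training setup below are all assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer


class ModelBasedRetriever(nn.Module):
    def __init__(self, num_docs: int, encoder_name: str = "bert-base-uncased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        # The "index" is parameterized: one learnable row per document identifier.
        self.docid_head = nn.Linear(self.encoder.config.hidden_size, num_docs)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        query_vec = out.last_hidden_state[:, 0]  # [CLS] token representation
        return self.docid_head(query_vec)        # one logit per document identifier


tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = ModelBasedRetriever(num_docs=100_000)    # assumed corpus size

# Retrieval is a single forward pass plus a top-k over docid logits;
# no inverted index or nearest-neighbor search is involved.
batch = tokenizer(["what is dense retrieval"], return_tensors="pt")
logits = model(batch["input_ids"], batch["attention_mask"])
top_docids = torch.topk(logits, k=10, dim=-1).indices

# Training sketch: cross-entropy against a gold docid makes the whole
# "indexer" differentiable and updatable end to end.
gold = torch.tensor([42])                        # hypothetical (query, docid) label
loss = nn.CrossEntropyLoss()(logits, gold)
loss.backward()
```

DynamicRetriever's actual variants are richer than this sketch (the paper also explores training from scratch and combining the advantages of dense retrieval models), but the key property of the paradigm is visible here: ranking knowledge lives in updatable parameters rather than in a static index.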

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 61872370 and 61832017), the Beijing Outstanding Young Scientist Program (No. BJJWZYJH012019100020098), the Beijing Academy of Artificial Intelligence (BAAI), the Outstanding Innovative Talents Cultivation Funded Programs 2021 of Renmin University of China, and the Intelligent Social Governance Platform, Major Innovation & Planning Interdisciplinary Platform for the "Double-First Class" Initiative, Renmin University of China.

Author information

Corresponding author

Correspondence to Zhi-Cheng Dou.

Additional information

Yu-Jia Zhou received the B. Eng. degree in computer science and technology from the School of Information, Renmin University of China, China in 2019. He is currently a Ph.D. candidate in computer science at the School of Information, Renmin University of China. He won the best student paper award at CCIR 2018. He has served as an invited reviewer for the international conferences SIGIR, KDD, and WSDM.

His research interests include information retrieval, personalized search, deep learning, and data mining.

Jing Yao received the B. Eng. degree in computer science and technology from the School of Information, Renmin University of China, China in 2019, and the M. Sc. degree in computer application technology from the School of Information, Renmin University of China, China in 2022. She has served as an invited reviewer for the international conferences SIGIR and WSDM. She is currently a researcher at Microsoft Research Asia.

Her research interests include information retrieval, personalized search, and explainable search/recommendation.

Zhi-Cheng Dou received the B. Sc. and Ph. D. degrees in computer science and technology from Nankai University, China in 2003 and 2008, respectively. He is an associate professor at the School of Information, Renmin University of China. He worked at Microsoft Research as a researcher from July 2008 to September 2014. He is a member of the IEEE.

His research interests include information retrieval, data mining, and big data analytics.

Ledell Wu received the B. Sc. degree in mathematics from Peking University, China in 2009, and the M. Sc. degree in computer science from the University of Toronto, Canada in 2011. She is currently a research scientist manager at the Beijing Academy of Artificial Intelligence (BAAI), China. She worked as a research engineer at Facebook AI Research from 2013 to 2021. At Facebook, she worked on several research projects with broader impact, including a general-purpose embedding system, a large-scale graph embedding system, mono- and multilingual entity linking systems, and a dense passage retrieval system. She also studies fairness and biases in machine learning and NLP models.

Her research interests include approximation algorithms, the hardness of approximation, privacy, and machine learning.

Ji-Rong Wen received the B. Sc. and M. Sc. degrees in computer science from Renmin University of China, China in 1994 and 1996, respectively, and the Ph. D. degree in computer science from the Chinese Academy of Sciences, China in 1999. He is a professor at Renmin University of China. He was a senior researcher and research manager with Microsoft Research from 2000 to 2014. He is a senior member of the IEEE.

His research interests include web data management, information retrieval (especially web IR), and data mining.

About this article

Cite this article

Zhou, YJ., Yao, J., Dou, ZC. et al. DynamicRetriever: A Pre-trained Model-based IR System Without an Explicit Index. Mach. Intell. Res. 20, 276–288 (2023). https://doi.org/10.1007/s11633-022-1373-9
