Abstract
This paper proposes an automatic query expansion method that combines document re-ranking with standard Rocchio relevance feedback. The re-ranking method orders the top retrieved documents according to the intrinsic manifold structure collectively revealed by a large amount of data, using a semi-supervised learning algorithm to integrate pseudo-relevant documents with the documents to be re-ranked. Given an initial ranked list of retrieved documents, the approach picks a set of documents from the top of the list (together with the query itself) as pseudo-relevant documents. Because no pseudo-irrelevant documents are available, the semi-supervised learning algorithm determines the intrinsic relationship between every retrieved document and the pseudo-relevant documents alone. All retrieved documents are then re-ranked according to this relationship. Evaluation on benchmark corpora shows that the approach performs much better than standard Rocchio relevance feedback and better than other related approaches.
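The pipeline outlined in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact algorithm, so everything below is an assumption: the semi-supervised step is taken to be the manifold ranking iteration f ← αSf + (1 − α)y described by Zhou et al. in "Ranking on Data Manifolds", affinities are plain cosine similarities between sparse term-weight vectors, and the function names (manifold_rerank, rocchio_expand) and parameter values are hypothetical. Rocchio is applied with positive feedback only, since there are no pseudo-irrelevant documents.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def manifold_rerank(query, docs, n_pseudo=2, alpha=0.9, iters=200):
    """Re-rank docs (given in their initial ranked order).

    Assumption: the query and the top n_pseudo documents are the
    pseudo-relevant seeds; scores are spread over a cosine affinity
    graph with the iteration f <- alpha * S f + (1 - alpha) * y.
    Returns document indices, best first.
    """
    nodes = [query] + docs
    n = len(nodes)
    # Affinity matrix with a zero diagonal (no self-reinforcement).
    W = [[0.0 if i == j else cosine(nodes[i], nodes[j]) for j in range(n)]
         for i in range(n)]
    deg = [sum(row) for row in W]
    # Symmetric normalisation S = D^(-1/2) W D^(-1/2).
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) if deg[i] and deg[j] else 0.0
          for j in range(n)] for i in range(n)]
    # Label vector: 1 for the query (index 0) and the seed documents.
    y = [1.0 if i <= n_pseudo else 0.0 for i in range(n)]
    f = y[:]
    for _ in range(iters):
        f = [alpha * sum(S[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
             for i in range(n)]
    scores = f[1:]  # drop the query node
    return sorted(range(len(docs)), key=lambda i: -scores[i])


def rocchio_expand(query, feedback_docs, alpha=1.0, beta=0.75):
    """Rocchio update with positive feedback only:
    q' = alpha * q + beta * centroid(feedback_docs)."""
    expanded = {t: alpha * w for t, w in query.items()}
    for d in feedback_docs:
        for t, w in d.items():
            expanded[t] = expanded.get(t, 0.0) + beta * w / len(feedback_docs)
    return expanded


def expand_query(query, docs, n_pseudo=2, n_feedback=2):
    """Assumed combination: re-rank first, then feed the top
    re-ranked documents to Rocchio."""
    order = manifold_rerank(query, docs, n_pseudo=n_pseudo)
    return rocchio_expand(query, [docs[i] for i in order[:n_feedback]])
```

In this sketch a document that shares no terms with any other node receives a score of zero and drops to the bottom of the list, so it is less likely to contribute expansion terms. A real system would use the retrieval model's own term weighting and tuned parameters.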
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yang, L., Ji, D., Nie, Y., He, T. (2006). Automatic Query Expansion Using Data Manifold. In: Ng, H.T., Leong, MK., Kan, MY., Ji, D. (eds) Information Retrieval Technology. AIRS 2006. Lecture Notes in Computer Science, vol 4182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11880592_59
DOI: https://doi.org/10.1007/11880592_59
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45780-0
Online ISBN: 978-3-540-46237-8
eBook Packages: Computer Science, Computer Science (R0)