Abstract
In this paper, we propose a document re-ranking method for Chinese information retrieval in which the query is a short natural-language description. The method is based on term distribution: each term is weighted by its local and global distribution, including document frequency, document position and term length. The weighting scheme eases the concern that very few relevant documents appear among the top retrieved documents, and makes it possible to take a larger, arbitrarily chosen portion of the retrieved documents as relevance feedback. It also helps to improve the performance of the maximal marginal relevance (MMR) model in document re-ranking. Experiments show that our method achieves significant improvement over standard baselines and consistently outperforms related methods.
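For readers unfamiliar with MMR, the following is a minimal sketch of the generic greedy MMR selection rule, not the specific variant evaluated in this paper. It assumes that query-document and document-document similarity scores are already available (for example, computed from weighted terms). The function name `mmr_rerank`, the parameter `lam`, and the toy scores are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Sequence


def mmr_rerank(
    query_sim: Sequence[float],
    doc_sim: Sequence[Sequence[float]],
    lam: float = 0.7,
    k: Optional[int] = None,
) -> List[int]:
    """Greedy MMR ordering of document indices.

    query_sim[i]  : similarity of document i to the query (assumed given)
    doc_sim[i][j] : similarity between documents i and j (assumed given)
    lam           : trade-off; 1.0 means pure relevance, 0.0 pure novelty
    k             : number of documents to return (default: all)
    """
    n = len(query_sim)
    k = n if k is None else min(k, n)
    selected: List[int] = []
    remaining = list(range(n))

    while remaining and len(selected) < k:
        def score(i: int) -> float:
            # Redundancy is the highest similarity to any document picked so far.
            redundancy = max((doc_sim[i][j] for j in selected), default=0.0)
            return lam * query_sim[i] - (1.0 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)

    return selected


# Toy data: documents 0 and 1 are near-duplicates, document 2 is distinct.
query_sim = [0.9, 0.85, 0.5]
doc_sim = [
    [1.0, 0.95, 0.1],
    [0.95, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
diverse_order = mmr_rerank(query_sim, doc_sim, lam=0.5)
relevance_order = mmr_rerank(query_sim, doc_sim, lam=1.0)
```

With `lam=0.5` the near-duplicate document 1 is pushed below the distinct document 2, whereas `lam=1.0` reduces to plain relevance ordering. How the similarity scores are derived from the term weights is left open here, since the abstract does not specify it.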
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yang, L., Ji, D., Leong, M. (2005). Chinese Document Re-ranking Based on Term Distribution and Maximal Marginal Relevance. In: Lee, G.G., Yamada, A., Meng, H., Myaeng, S.H. (eds) Information Retrieval Technology. AIRS 2005. Lecture Notes in Computer Science, vol 3689. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11562382_23
DOI: https://doi.org/10.1007/11562382_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29186-2
Online ISBN: 978-3-540-32001-2
eBook Packages: Computer Science (R0)