Chinese Document Re-ranking Based on Term Distribution and Maximal Marginal Relevance

  • Conference paper
Information Retrieval Technology (AIRS 2005)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3689)

Abstract

In this paper, we propose a document re-ranking method for Chinese information retrieval in which the query is a short natural-language description. The method is based on term distribution: each term is weighted by its local and global distribution, including document frequency, document position, and term length. This weighting scheme eases the concern that very few relevant documents appear among the top retrieved documents, and it allows a larger portion of the retrieved documents to be chosen arbitrarily as relevance feedback. It also helps improve the performance of the maximal marginal relevance (MMR) model in document re-ranking. Experiments show that our method achieves significant improvements over standard baselines and consistently outperforms related methods.
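
For readers unfamiliar with MMR, the following is a minimal sketch of the textbook greedy MMR re-ranking procedure, not the authors' implementation. The abstract does not give the paper's term-weighting formula or its MMR variant, so everything here is an assumption made for illustration: documents and the query are sparse term-weight dictionaries with placeholder weights, similarity is cosine, and the names `cosine`, `mmr_rerank`, and `lam` are hypothetical. At each step, the candidate with the best trade-off between relevance to the query and redundancy with already selected documents is picked.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_rerank(query, docs, lam=0.7, k=None):
    """Greedy MMR re-ranking (standard formulation, assumed here).

    query: dict mapping term -> weight
    docs:  dict mapping doc_id -> dict of term -> weight
    lam:   trade-off; 1.0 is pure relevance, lower values favor novelty
    k:     number of documents to return (default: all)
    """
    k = len(docs) if k is None else min(k, len(docs))
    relevance = {d: cosine(query, vec) for d, vec in docs.items()}
    selected = []
    remaining = list(docs)

    while remaining and len(selected) < k:
        def score(d):
            # Redundancy: highest similarity to any already selected document.
            redundancy = max(
                (cosine(docs[d], docs[s]) for s in selected), default=0.0
            )
            return lam * relevance[d] - (1.0 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)

    return selected


# Toy data with placeholder weights (hypothetical, not from the paper):
# d2 duplicates d1, while d3 is less relevant but covers a new term.
query = {"a": 1.0, "b": 1.0}
docs = {
    "d1": {"a": 1.0, "b": 1.0},
    "d2": {"a": 1.0, "b": 1.0},
    "d3": {"a": 1.0, "c": 1.0},
}

print(mmr_rerank(query, docs, lam=1.0))
print(mmr_rerank(query, docs, lam=0.3))
```

With `lam=1.0` the ranking follows relevance alone, so the duplicate `d2` stays directly behind `d1`. With `lam=0.3` the redundancy penalty moves the novel document `d3` ahead of `d2`. In the method described in the abstract, the vectors would presumably carry the proposed distribution-based term weights rather than the placeholder values used here.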

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yang, L., Ji, D., Leong, M. (2005). Chinese Document Re-ranking Based on Term Distribution and Maximal Marginal Relevance. In: Lee, G.G., Yamada, A., Meng, H., Myaeng, S.H. (eds) Information Retrieval Technology. AIRS 2005. Lecture Notes in Computer Science, vol 3689. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11562382_23

  • DOI: https://doi.org/10.1007/11562382_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29186-2

  • Online ISBN: 978-3-540-32001-2

  • eBook Packages: Computer Science, Computer Science (R0)