LSM: Language Sense Model for Information Retrieval

  • Conference paper
Advances in Web-Age Information Management (WAIM 2006)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4016)

Abstract

Much work has tried to bring word senses into retrieval to address word sense ambiguity, but most attempts have reported negative results. In this paper, we first implement a word sense disambiguation (WSD) system for nouns and verbs, and then propose the language sense model (LSM) for information retrieval. The LSM combines the terms and senses of a document seamlessly through an EM algorithm. Retrieval experiments on TREC collections show that the LSM significantly outperforms both the vector space model (BM25) and the traditional language model for medium and long queries (7.53%-16.90%). Based on these experiments, we also conclude empirically that fine-grained senses improve retrieval performance when they are used properly.
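The abstract does not give the LSM formulas, so the following is only a generic illustration of the kind of step it describes: using EM to estimate an interpolation weight between two component models, here standing in for a term-based and a sense-based document model. The function names, the two-component mixture form, and the toy probabilities are all assumptions made for this sketch and are not taken from the paper.

```python
import math


def mixture_log_likelihood(p_term, p_sense, lam):
    """Log-likelihood of the observations under lam*p_term + (1-lam)*p_sense."""
    return sum(math.log(lam * a + (1.0 - lam) * b)
               for a, b in zip(p_term, p_sense))


def em_mixture_weight(p_term, p_sense, lam=0.5, iters=100, tol=1e-9):
    """Estimate the interpolation weight lam with EM (hypothetical helper).

    p_term[i] and p_sense[i] are the probabilities that the two fixed
    component models assign to the i-th observation.
    """
    if not p_term or len(p_term) != len(p_sense):
        raise ValueError("need two non-empty lists of equal length")
    for _ in range(iters):
        # E-step: posterior probability that each observation came
        # from the term component.
        resp = []
        for a, b in zip(p_term, p_sense):
            denom = lam * a + (1.0 - lam) * b
            # If neither component explains the observation, it carries
            # no information, so keep the current weight for it.
            resp.append(lam * a / denom if denom > 0.0 else lam)
        # M-step: the new weight is the average responsibility.
        new_lam = sum(resp) / len(resp)
        converged = abs(new_lam - lam) < tol
        lam = new_lam
        if converged:
            break
    return lam


if __name__ == "__main__":
    # Toy probabilities, invented for illustration only.
    term_probs = [0.020, 0.001, 0.015, 0.002]
    sense_probs = [0.005, 0.010, 0.004, 0.012]
    weight = em_mixture_weight(term_probs, sense_probs)
    print(f"estimated weight on the term model: {weight:.3f}")
```

Each EM iteration cannot decrease the log-likelihood of the observations, which is why the weight can be estimated this way without a separate line search. How the paper actually parameterizes and smooths its term and sense components is not stated in the abstract.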

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bao, S., Zhang, L., Chen, E., Long, M., Li, R., Yu, Y. (2006). LSM: Language Sense Model for Information Retrieval. In: Yu, J.X., Kitsuregawa, M., Leong, H.V. (eds) Advances in Web-Age Information Management. WAIM 2006. Lecture Notes in Computer Science, vol 4016. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11775300_9

  • DOI: https://doi.org/10.1007/11775300_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-35225-9

  • Online ISBN: 978-3-540-35226-6

  • eBook Packages: Computer Science, Computer Science (R0)
