Skip to main content

Query Expansion for the Language Modelling Framework Using the Naïve Bayes Assumption

  • Conference paper
  • 2441 Accesses

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 5012))

Abstract

Language modelling is new form of information retrieval that is rapidly becoming the preferred choice over probabilistic and vector space models, due to the intuitiveness of the model formulation and its effectiveness. The language model assumes that all terms are independent, therefore the majority of the documents returned to the ser will be those that contain the query terms. By making this assumption, related documents that do not contain the query terms will never be found, unless the related terms are introduced into the query using a query expansion technique. Unfortunately, recent attempts at performing a query expansion using a language model have not been in-line with the language model, being complex and not intuitive to the user. In this article, we introduce a simple method of query expansion using the naïve Bayes assumption, that is in-line with the language model since it is derived from the language model. We show how to derive the query expansion term relationships using probabilistic latent semantic analysis (PLSA). Through experimentation, we show that using PLSA query expansion within the language model framework, we can provide a significant increase in precision.

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   129.00
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   169.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Buckley, C., Salton, G., Allan, J., Singhal, A.: Automatic query expansion using SMART: TREC 3. In: Harman, D. (ed.) The Third Text REtrieval Conference (TREC-3), Gaithersburg, Md. 20899, National Institute of Standards and Technology Special Publication 500-226, pp. 69–80 (1994)

    Google Scholar 

  2. Robertson, S.E., Walker, S.: Okapi/keenbow at TREC-8. In: Voorhees, E.M., Harman, D.K. (eds.) The Eighth Text REtrieval Conference (TREC-8), Gaithersburg, Md. 20899, National Institute of Standards and Technology Special Publication 500-246, Department of Commerce, National Institute of Standards and Technology, pp. 151–162 (1999)

    Google Scholar 

  3. Ponte, J.M., Croft, W.B.: A language modeling approach to information retrieval. In: SIGIR 1998: Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, pp. 275–281. ACM Press, New York (1998)

    Chapter  Google Scholar 

  4. Cao, G., Nie, J.Y., Bai, J.: Integrating word relationships into language models. In: SIGIR 2005: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, pp. 298–305. ACM Press, New York (2005)

    Google Scholar 

  5. Bai, J., Song, D., Bruza, P., Nie, J.Y., Cao, G.: Query expansion using term relationships in language models for information retrieval. In: CIKM 2005: Proceedings of the 14th ACM international conference on Information and knowledge management, pp. 688–695. ACM Press, New York (2005)

    Chapter  Google Scholar 

  6. Hofmann, T.: Probabilistic latent semantic indexing. In: Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, pp. 50–57. ACM Press, New York (1999)

    Chapter  Google Scholar 

  7. Park, L.A.F., Ramamohanarao, K.: Query expansion using a collection dependent probabilistic latent semantic thesaurus. In: Zhou, Z.-H., Li, H., Yang, Q. (eds.) PAKDD 2007. LNCS (LNAI), vol. 4426, pp. 224–235. Springer, Heidelberg (2007)

    Chapter  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Takashi Washio Einoshin Suzuki Kai Ming Ting Akihiro Inokuchi

Rights and permissions

Reprints and permissions

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Park, L.A.F., Ramamohanarao, K. (2008). Query Expansion for the Language Modelling Framework Using the Naïve Bayes Assumption. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2008. Lecture Notes in Computer Science(), vol 5012. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68125-0_64

Download citation

  • DOI: https://doi.org/10.1007/978-3-540-68125-0_64

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68124-3

  • Online ISBN: 978-3-540-68125-0

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics