Robust Word Similarity Estimation Using Perturbation Kernels

  • Conference paper
Advances in Information Retrieval Theory (ICTIR 2009)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5766)

Abstract

We introduce perturbation kernels, a new class of similarity measures for information retrieval that casts word similarity in terms of multi-task learning. Perturbation kernels model uncertainty in the user’s query by choosing a small number of variations in the relative weights of the query terms to build a more complete picture of the query context, which is then used to compute a form of expected distance between words. Our approach has a principled mathematical foundation and a simple analytical form, and it makes few assumptions about the underlying retrieval model, so it is easy to apply to a broad family of existing query expansion and model estimation algorithms.
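The idea in the abstract can be illustrated with a small toy sketch. It is not the paper's formulation: the leave-one-out weighting scheme, the count-based stand-in for a retrieval model, the root-mean-square distance, and all function and variable names below are assumptions made purely for illustration. Each word is represented by its probability under every perturbed version of the query, and two words are compared by their average squared difference across those perturbations.

```python
import math


def query_variants(query_terms):
    """Yield the original term weights, then one variant per term with that term left out.

    Leave-one-out is an assumed perturbation scheme, chosen only for illustration.
    """
    base = {term: 1.0 for term in query_terms}
    yield base
    for term in query_terms:
        variant = dict(base)
        variant[term] = 0.0  # perturb the relative weights by dropping one term
        yield variant


def word_distribution(weights, docs):
    """Toy stand-in for a retrieval model (hypothetical, not from the paper).

    Scores each document by its weighted query-term count, then returns a
    score-weighted probability distribution over the words in the documents.
    """
    mass = {}
    for doc in docs:
        score = sum(w * doc.count(term) for term, w in weights.items())
        for word in doc:
            mass[word] = mass.get(word, 0.0) + score
    total = sum(mass.values())
    return {word: m / total for word, m in mass.items()} if total > 0 else {}


def perturbation_features(vocabulary, query_terms, docs):
    """Map each word to its probabilities under every query variant."""
    dists = [word_distribution(v, docs) for v in query_variants(query_terms)]
    return {word: [d.get(word, 0.0) for d in dists] for word in vocabulary}


def perturbation_distance(u, v):
    """Root-mean-square difference between two feature vectors, used here as a
    simple empirical analogue of an expected distance over perturbations."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)) / len(u))


# Tiny made-up corpus and query, for illustration only.
DOCS = [
    ["jaguar", "car", "engine"],
    ["jaguar", "cat", "jungle"],
    ["car", "engine", "repair"],
]
QUERY = ["jaguar", "car"]
FEATURES = perturbation_features(["car", "engine", "cat"], QUERY, DOCS)

if __name__ == "__main__":
    print(perturbation_distance(FEATURES["engine"], FEATURES["car"]))
    print(perturbation_distance(FEATURES["engine"], FEATURES["cat"]))
```

In this toy corpus, "engine" and "car" respond identically to every query variant and so end up closer to each other than "engine" is to "cat". Because the sketch only needs a word distribution per query variant, the toy scoring function could in principle be swapped for any retrieval model that produces one.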

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Collins-Thompson, K. (2009). Robust Word Similarity Estimation Using Perturbation Kernels. In: Azzopardi, L., et al. Advances in Information Retrieval Theory. ICTIR 2009. Lecture Notes in Computer Science, vol 5766. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04417-5_25

  • DOI: https://doi.org/10.1007/978-3-642-04417-5_25

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04416-8

  • Online ISBN: 978-3-642-04417-5

  • eBook Packages: Computer Science, Computer Science (R0)
