Hidden Markov Model for Term Weighting in Verbose Queries

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7488)

Abstract

Short queries have been observed to generally perform better than their corresponding long versions when run against the same IR model. This is mainly because most current models do not distinguish the importance of different terms in the query. Observing that sentence-like queries encode information about term importance in their grammatical structure, we propose a Hidden Markov Model (HMM) based method that extracts this information for term weighting. The choice of HMM is motivated by its successful application to capturing the relationship between adjacent terms in the NLP field. Since we are dealing with queries in natural-language form, we expect that an HMM can also capture the dependence between term weights and grammatical structure. Our experiments show that this assumption is reasonable and that such information, when utilized properly, can greatly improve retrieval performance.
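The general idea of decoding hidden importance labels from a query's grammatical structure can be illustrated with a minimal sketch. The abstract does not specify the model's states, observations, or parameters, so everything below is an assumption made for illustration: the two hidden states (`high`, `low`), the use of part-of-speech tags as observations, and all probability values are hypothetical, and standard Viterbi decoding is used as a stand-in rather than as the authors' method.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations."""
    # best[t][s] = (log-probability of the best path ending in s at step t,
    #               predecessor state on that path)
    first = observations[0]
    best = [{s: (math.log(start_p[s]) + math.log(emit_p[s][first]), None)
             for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        column = {}
        for s in states:
            # Pick the predecessor r that maximizes the path score into s.
            score, back = max(
                (prev[r][0] + math.log(trans_p[r][s]), r) for r in states
            )
            column[s] = (score + math.log(emit_p[s][obs]), back)
        best.append(column)
    # Backtrack from the best final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    path.reverse()
    return path


# Hypothetical parameters: hidden states are term-importance classes,
# observations are part-of-speech tags. None of these values come from the paper.
STATES = ["high", "low"]
START_P = {"high": 0.4, "low": 0.6}
TRANS_P = {
    "high": {"high": 0.4, "low": 0.6},
    "low": {"high": 0.5, "low": 0.5},
}
EMIT_P = {
    "high": {"NOUN": 0.6, "ADJ": 0.2, "VERB": 0.15, "DET": 0.025, "ADP": 0.025},
    "low": {"NOUN": 0.1, "ADJ": 0.1, "VERB": 0.2, "DET": 0.3, "ADP": 0.3},
}

# Tags for an illustrative query such as "the economic impact of tourism".
tags = ["DET", "ADJ", "NOUN", "ADP", "NOUN"]
labels = viterbi(tags, STATES, START_P, TRANS_P, EMIT_P)
print(list(zip(tags, labels)))
```

In a sketch like this, the decoded labels could then be mapped to numeric term weights before the query is passed to a retrieval model. How the paper actually derives weights from its HMM is not stated in the abstract.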

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yan, X., Gao, G., Su, X., Wei, H., Zhang, X., Lu, Q. (2012). Hidden Markov Model for Term Weighting in Verbose Queries. In: Catarci, T., Forner, P., Hiemstra, D., Peñas, A., Santucci, G. (eds) Information Access Evaluation. Multilinguality, Multimodality, and Visual Analytics. CLEF 2012. Lecture Notes in Computer Science, vol 7488. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33247-0_10

  • DOI: https://doi.org/10.1007/978-3-642-33247-0_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-33246-3

  • Online ISBN: 978-3-642-33247-0

  • eBook Packages: Computer Science, Computer Science (R0)
