
Co-occurrence and Semantic Similarity Based Hybrid Approach for Improving Automatic Query Expansion in Information Retrieval

  • Conference paper
Distributed Computing and Internet Technology (ICDCIT 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8956)

Abstract

Pseudo relevance feedback (PRF) based query expansion approaches assume that the top-ranked retrieved documents are relevant. This assumption does not always hold: a PRF document may cover several topics, some of which are not relevant to the query terms even when the document itself is judged relevant. In this paper we address this limitation of PRF-based query expansion and propose a hybrid method that improves its performance by combining corpus-based term co-occurrence information with semantic information about terms. First, we use a corpus-based term co-occurrence approach to select an optimal combination of query terms from a pool of terms obtained using PRF-based query expansion. Second, we use a semantic similarity approach to rank the query expansion terms obtained from the top feedback documents. Third, we combine co-occurrence and semantic similarity: the query expansion terms obtained in the first step are ranked on the basis of semantic similarity. The experiments were performed on the FIRE ad hoc and TREC-3 benchmark datasets for information retrieval. The results show significant improvement in precision, recall, and mean average precision (MAP). These experiments show that an intelligent combination of the two techniques retains the strengths of both. As this is the first attempt in this direction, there is considerable scope for improving these techniques.
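The pipeline outlined in the abstract can be illustrated with a minimal sketch. The abstract does not specify the formulas, so everything concrete here is an assumption: the Jaccard coefficient stands in for the co-occurrence measure, a hand-written lookup table (`TOY_SIMILARITY`) stands in for a real semantic similarity measure such as one computed over WordNet, and the function names, the `pool_size` and `k` parameters, and the toy corpus are all hypothetical.

```python
from typing import Callable, List, Set


def jaccard_cooccurrence(term_a: str, term_b: str, docs: List[Set[str]]) -> float:
    """Jaccard coefficient over the sets of documents containing each term."""
    docs_a = {i for i, d in enumerate(docs) if term_a in d}
    docs_b = {i for i, d in enumerate(docs) if term_b in d}
    union = docs_a | docs_b
    return len(docs_a & docs_b) / len(union) if union else 0.0


def hybrid_expansion(
    query: List[str],
    feedback_docs: List[Set[str]],
    corpus: List[Set[str]],
    semantic_sim: Callable[[str, str], float],
    pool_size: int = 3,
    k: int = 2,
) -> List[str]:
    """Select expansion terms by co-occurrence, then rerank them semantically."""
    if not query:
        return []
    # Candidate terms come from the top-ranked (pseudo-relevant) documents.
    candidates = set().union(*feedback_docs) - set(query)

    def mean_cooccurrence(term: str) -> float:
        return sum(jaccard_cooccurrence(term, q, corpus) for q in query) / len(query)

    def mean_similarity(term: str) -> float:
        return sum(semantic_sim(term, q) for q in query) / len(query)

    # Step 1: keep the candidates that co-occur most with the query in the corpus.
    pool = sorted(candidates, key=lambda t: (-mean_cooccurrence(t), t))[:pool_size]
    # Step 2: rerank the surviving pool by semantic similarity to the query.
    return sorted(pool, key=lambda t: (-mean_similarity(t), t))[:k]


# Toy stand-in for a real semantic similarity measure (values are made up).
TOY_SIMILARITY = {
    frozenset(("car", "engine")): 0.9,
    frozenset(("car", "fuel")): 0.6,
    frozenset(("car", "road")): 0.5,
    frozenset(("car", "weather")): 0.1,
}


def toy_semantic_sim(a: str, b: str) -> float:
    return TOY_SIMILARITY.get(frozenset((a, b)), 0.0)


corpus = [
    {"car", "engine", "fuel"},
    {"car", "engine", "road"},
    {"car", "road", "weather"},
    {"weather", "rain", "fuel"},
    {"car", "fuel", "engine"},
]
query = ["car"]
feedback_docs = corpus[:3]  # pretend these are the top-ranked documents

expanded = hybrid_expansion(query, feedback_docs, corpus, toy_semantic_sim)
print(expanded)  # ['engine', 'fuel']
```

In this toy run the co-occurrence step drops "weather", which appears in a feedback document but rarely alongside the query term in the corpus, and the semantic step then moves "fuel" ahead of "road". This only mirrors the general idea of the hybrid method; the measures and selection strategy used in the paper may differ.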




Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Singh, J., Sharan, A. (2015). Co-occurrence and Semantic Similarity Based Hybrid Approach for Improving Automatic Query Expansion in Information Retrieval. In: Natarajan, R., Barua, G., Patra, M.R. (eds) Distributed Computing and Internet Technology. ICDCIT 2015. Lecture Notes in Computer Science, vol 8956. Springer, Cham. https://doi.org/10.1007/978-3-319-14977-6_45


  • DOI: https://doi.org/10.1007/978-3-319-14977-6_45

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-14976-9

  • Online ISBN: 978-3-319-14977-6

  • eBook Packages: Computer Science, Computer Science (R0)
