Integrating Semantic Term Relations into Information Retrieval Systems Based on Language Models

  • Conference paper
Information Retrieval Technology (AIRS 2014)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8870)

Abstract

Most information retrieval systems rely on strict equality between document terms and query terms to retrieve documents relevant to a given query. The term mismatch problem arises when users and document authors use different terms to express the same meaning. Statistical translation models have been proposed as an effective way to adapt language models and mitigate the term mismatch problem by exploiting semantic relations between terms. However, estimating translation probabilities has been shown to be both crucial and difficult in statistical translation models. We therefore present an alternative to statistical translation models that formally incorporates semantic relations between indexing terms into language models. Experiments on several CLEF corpora from the medical domain show a statistically significant improvement over ordinary language models and, in most cases, better retrieval performance than translation models. The improvement is related to the rate of general terms and their distribution inside the queries.
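To make the term mismatch problem concrete, the following is a minimal sketch of the baseline the abstract argues against: query-likelihood scoring with and without a statistical translation model. It does not reproduce the alternative approach of this paper, whose details are not given in the abstract. The function names, the toy documents, the translation table, and the Dirichlet smoothing setting (`mu=1000`) are all assumptions chosen for illustration.

```python
import math
from collections import Counter

FLOOR = 1e-9  # assumed collection probability for terms unseen in the collection


def collection_probs(docs):
    """Relative frequency of each term over the whole (toy) collection."""
    counts = Counter(t for d in docs for t in d)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


def query_log_likelihood(query, doc, coll_prob, translate=None, mu=1000.0):
    """Dirichlet-smoothed log P(query | doc).

    With translate=None only exact term matches contribute.
    Otherwise translate[q][u] is a hypothetical P_t(q | u): the probability
    that document term u "translates" into query term q.
    """
    tf = Counter(doc)
    score = 0.0
    for q in query:
        if translate is None:
            pseudo_count = tf.get(q, 0)
        else:
            # sum_u P_t(q | u) * tf(u, doc): semantically related document
            # terms contribute a fractional count to the query term
            pseudo_count = sum(p * tf.get(u, 0)
                               for u, p in translate.get(q, {q: 1.0}).items())
        p = (pseudo_count + mu * coll_prob.get(q, FLOOR)) / (len(doc) + mu)
        score += math.log(p)
    return score


# Toy data (invented): the query says "cancer", the relevant document says "tumor".
docs = [["tumor", "treatment"], ["weather", "report"]]
coll = collection_probs(docs)
query = ["cancer"]
toy_translate = {"cancer": {"cancer": 0.7, "tumor": 0.3}}

exact = [query_log_likelihood(query, d, coll) for d in docs]
translated = [query_log_likelihood(query, d, coll, toy_translate) for d in docs]
```

With exact matching, neither toy document contains the query term, so both receive the same score. With the translation table, the document containing "tumor" is ranked above the unrelated one. The quality of that ranking depends entirely on the values in `toy_translate`, which is the estimation difficulty the abstract points to.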

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

ALMasri, M., Tan, K., Berrut, C., Chevallet, JP., Mulhem, P. (2014). Integrating Semantic Term Relations into Information Retrieval Systems Based on Language Models. In: Jaafar, A., et al. Information Retrieval Technology. AIRS 2014. Lecture Notes in Computer Science, vol 8870. Springer, Cham. https://doi.org/10.1007/978-3-319-12844-3_12

  • DOI: https://doi.org/10.1007/978-3-319-12844-3_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-12843-6

  • Online ISBN: 978-3-319-12844-3

  • eBook Packages: Computer Science, Computer Science (R0)
