Abstract
In this paper, we present an empirical study of the runs submitted to the TREC Genomics Track, a community evaluation forum for information retrieval (IR) research in biomedicine. Based on the evaluation criteria provided by the track, we investigate how much relevant information is typically lost from a run, how well the relevant nominees are ranked with respect to their level of relevancy, and how they are distributed among the irrelevant ones in a run. We also examine whether relevancy or level of relevancy plays the more important role in performance evaluation. Answering these questions may give us insight into, and help us improve, current IR technologies. The study reveals that recognizing relevancy is more important than recognizing the level of relevancy. It indicates that, on average, more than 60% of relevant information is lost from each run, with respect to either the amount of relevant information or the number of aspects (subtopics, novelty, or diversity), which suggests substantial room for performance improvement. The study also shows that the runs submitted by different groups are quite complementary, which implies that ensembles of IR systems could significantly improve retrieval performance. Finally, the experiments illustrate that a run performs "good" or "bad" mainly due to its top 10% of rankings; the rest of the run contributes only marginally to overall performance.
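The two central quantities in the abstract, the relevant information lost from a single run and the complementarity of runs from different groups, can be sketched with simple set operations. The following toy example uses hypothetical data and function names (not the track's official measures or qrels):

```python
# Toy sketch (hypothetical data and metric names, not the track's official
# measures): estimate how much judged-relevant material a run misses, and
# how complementary several runs are when their retrieved items are pooled.

def missed_fraction(run, relevant):
    """Fraction of judged-relevant items that never appear in the run."""
    return len(relevant - set(run)) / len(relevant)

def union_recall(runs, relevant):
    """Recall achieved by pooling (unioning) the retrieved items of all runs."""
    pooled = set().union(*map(set, runs))
    return len(pooled & relevant) / len(relevant)

relevant = {"d1", "d2", "d3", "d4", "d5"}  # judged-relevant documents
run_a = ["d1", "d2", "x1"]                 # group A's ranked list
run_b = ["d3", "x2", "d4"]                 # group B's ranked list

print(missed_fraction(run_a, relevant))        # 0.6 -> 60% of relevant docs lost
print(union_recall([run_a, run_b], relevant))  # 0.8 -> pooling recovers more
```

When the pooled recall substantially exceeds any single run's recall, as in this toy case, the runs are complementary in the sense the paper describes, which is the premise behind ensemble retrieval.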
© 2014 Springer-Verlag Berlin Heidelberg
An, X., Cercone, N. (2014). How Complementary Are Different Information Retrieval Techniques? A Study in Biomedicine Domain. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2014. Lecture Notes in Computer Science, vol 8404. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54903-8_31
Print ISBN: 978-3-642-54902-1
Online ISBN: 978-3-642-54903-8