Abstract
Passage retrieval is important for users of the biomedical literature, but extracting a passage from a natural paragraph is a challenging problem. In this paper, we analyze the gold standard of the TREC 2006 Genomics Track and simulate the distributions of its standard passages. Based on this analysis, we present an efficient dynamic window based algorithm with a WordSentenceParsed method for extracting passages. The algorithm has two important characteristics. First, its criteria for passage extraction are learned from the gold standard and then studied comprehensively on the 2006 and 2007 Genomics datasets. Second, the algorithm applies these criteria dynamically, so that it can adjust to the length of each passage. We find that the proposed dynamic algorithm with the WordSentenceParsed method significantly boosts passage-level retrieval performance on the 2006 and 2007 Genomics datasets.
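The abstract does not give the algorithm's details, so the following is only a minimal sketch of how a sentence-level dynamic window could work, not the authors' method. It assumes a paragraph is split into sentences, each sentence is scored by how many query terms it contains, and a window grows outward from the best sentence while neighbouring sentences still match the query and a word budget is respected. The function names, the term-count scoring, and the `max_words` parameter are all hypothetical; in the paper the criteria are learned from the gold standard rather than fixed by hand.

```python
import re


def split_sentences(paragraph):
    """Split a paragraph into sentences on '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [p for p in parts if p]


def tokenize(text):
    """Lowercase the text and return its alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_passage(paragraph, query_terms, max_words=60):
    """Grow a sentence window around the best-matching sentence.

    Assumption: `max_words` stands in for a length criterion; here it is a
    hand-picked constant, not a value learned from any gold standard.
    """
    sentences = split_sentences(paragraph)
    if not sentences:
        return ""
    terms = {t.lower() for t in query_terms}
    tokens = [tokenize(s) for s in sentences]
    # Score = number of tokens in the sentence that are query terms.
    scores = [sum(1 for w in toks if w in terms) for toks in tokens]

    best = max(range(len(sentences)), key=lambda i: scores[i])
    if scores[best] == 0:
        return ""  # no sentence mentions any query term

    left = right = best
    length = len(tokens[best])
    while True:
        candidates = []
        # A neighbour qualifies if it matches the query and fits the budget.
        if left > 0 and scores[left - 1] > 0:
            if length + len(tokens[left - 1]) <= max_words:
                candidates.append((scores[left - 1], "L"))
        if right < len(sentences) - 1 and scores[right + 1] > 0:
            if length + len(tokens[right + 1]) <= max_words:
                candidates.append((scores[right + 1], "R"))
        if not candidates:
            break
        # Extend toward the higher-scoring neighbour (right wins ties).
        _, side = max(candidates)
        if side == "L":
            left -= 1
            length += len(tokens[left])
        else:
            right += 1
            length += len(tokens[right])
    return " ".join(sentences[left:right + 1])


if __name__ == "__main__":
    text = (
        "The weather was mild. BRCA1 mutations raise cancer risk. "
        "BRCA1 interacts with RAD51 in repair. Funding came from a grant."
    )
    print(extract_passage(text, ["BRCA1", "cancer", "RAD51"]))
```

In this sketch the window size is not fixed in advance: it depends on where query terms occur and on the length budget, which is the sense in which the window is dynamic here.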
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hu, Q., Huang, X. (2008). A Dynamic Window Based Passage Extraction Algorithm for Genomics Information Retrieval. In: An, A., Matwin, S., Raś, Z.W., Ślęzak, D. (eds) Foundations of Intelligent Systems. ISMIS 2008. Lecture Notes in Computer Science, vol 4994. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68123-6_47
DOI: https://doi.org/10.1007/978-3-540-68123-6_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68122-9
Online ISBN: 978-3-540-68123-6
eBook Packages: Computer Science, Computer Science (R0)