Abstract
Passage retrieval has been regarded as an alternative way to resolve the length-normalization problem, since passages are more uniform in length and topic than documents. An important issue in passage retrieval is determining the type of passage. Among the several passage types, the arbitrary passage, which varies dynamically according to the query, has shown the best performance. However, the previous arbitrary passage type has not been fully explored, since it still imposes a fixed-length restriction such as n consecutive words. This paper proposes a new type of passage, the completely-arbitrary passage, obtained by eliminating all restrictions on both the length and the starting position of a passage, thereby fully relaxing the original arbitrary passage type. The main advantage of completely-arbitrary passages is that passage retrieval can support the proximity of query terms well, which non-completely-arbitrary passages cannot clearly do. Extensive experimental results on six standard TREC test collections show that, in the context of language modeling approaches, passage retrieval using completely-arbitrary passages significantly improves on document retrieval as well as on passage retrieval using the previous non-completely-arbitrary passages.
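To make the idea of a completely-arbitrary passage concrete, the following is a minimal brute-force sketch, not the authors' implementation. It enumerates every passage of a tokenized document (any starting position, any length) and keeps the one with the highest query likelihood. The abstract does not specify the scoring function or smoothing method, so the Dirichlet-smoothed scorer, the value of `mu`, the floor probability for unseen terms, and all function and variable names here are assumptions made for illustration.

```python
import math
from collections import Counter


def passage_score(passage, query, coll_prob, mu=10.0):
    """Log query likelihood of a passage under a Dirichlet-smoothed
    unigram model (an assumed scorer, not taken from the paper)."""
    counts = Counter(passage)
    length = len(passage)
    score = 0.0
    for term in query:
        # Hypothetical floor probability for terms missing from coll_prob.
        p_coll = coll_prob.get(term, 1e-6)
        score += math.log((counts[term] + mu * p_coll) / (length + mu))
    return score


def best_arbitrary_passage(doc, query, coll_prob, mu=10.0):
    """Return (score, start, end) of the best passage doc[start:end],
    with no restriction on starting position or length."""
    best = (float("-inf"), 0, 0)
    for start in range(len(doc)):
        for end in range(start + 1, len(doc) + 1):
            score = passage_score(doc[start:end], query, coll_prob, mu)
            if score > best[0]:
                best = (score, start, end)
    return best


# Toy usage with made-up collection probabilities.
doc = ("passage retrieval can reward documents where query terms "
       "occur close together").split()
query = ["query", "terms"]
coll_prob = {"query": 0.01, "terms": 0.01}
score, start, end = best_arbitrary_passage(doc, query, coll_prob)
print(doc[start:end], round(score, 3))
```

Because the passage length appears in the denominator of the smoothed estimate, a short span in which the query terms sit close together scores higher than a longer span containing the same terms, which is one way to see how unrestricted passages can capture term proximity. The exhaustive enumeration is quadratic in the number of passages and is meant only to illustrate the definition; a practical system would need a more efficient search.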
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Na, SH., Kang, IS., Lee, YH., Lee, JH. (2008). Completely-Arbitrary Passage Retrieval in Language Modeling Approach. In: Li, H., Liu, T., Ma, WY., Sakai, T., Wong, KF., Zhou, G. (eds) Information Retrieval Technology. AIRS 2008. Lecture Notes in Computer Science, vol 4993. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68636-1_3
DOI: https://doi.org/10.1007/978-3-540-68636-1_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68633-0
Online ISBN: 978-3-540-68636-1
eBook Packages: Computer Science (R0)