Passage Retrieval Based on Density Distributions of Terms and Its Applications to Document Retrieval and Question Answering

  • Chapter

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2956))

Abstract

The huge volume of electronic documents has created a demand for intelligent access to the information they contain. Document retrieval has been investigated as a fundamental tool for meeting this demand. However, it is not satisfactory for two reasons: (1) retrieving long documents with short queries (a few terms) is inaccurate, and (2) users bear the burden of finding the relevant parts of the long documents retrieved. In this paper, we apply a passage retrieval method called “density distributions” (DD) to tackle these problems. For the first problem, we show experimentally that a passage-based method outperforms conventional document retrieval methods when long documents are retrieved with short queries. For the second problem, we apply DD to the question answering task: locating short passages in response to natural language queries that seek facts. Preliminary experiments show that correct answers can be located within a window of 50 terms for about half of such queries.
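The abstract does not spell out how a density distribution is computed, so the following is only a minimal sketch of the general idea under stated assumptions: each occurrence of a query term contributes a weight at its position in the document, these weights are smoothed with a raised-cosine (Hann-style) window, and the position of highest density marks the centre of the passage to return. The window shape, the per-term weights, and the function names `density_distribution` and `best_passage` are illustrative assumptions, not details taken from the chapter. Only the 50-term default width echoes the window size mentioned in the abstract.

```python
import math


def density_distribution(doc_terms, query_weights, width=50):
    """Smooth query-term occurrences over positions in the document.

    doc_terms: list of tokens in document order.
    query_weights: dict mapping query term -> weight (assumed, e.g. idf).
    width: window width in terms (assumed smoothing parameter).
    """
    half = width // 2
    density = [0.0] * len(doc_terms)
    for i, term in enumerate(doc_terms):
        weight = query_weights.get(term, 0.0)
        if weight == 0.0:
            continue  # not a query term, contributes nothing
        # Spread this occurrence over neighbouring positions.
        for offset in range(-half, half + 1):
            pos = i + offset
            if 0 <= pos < len(doc_terms):
                # Raised-cosine window: 1.0 at the occurrence, falling off with distance.
                w = 0.5 * (1.0 + math.cos(2.0 * math.pi * offset / width))
                density[pos] += weight * w
    return density


def best_passage(doc_terms, query_weights, width=50):
    """Return (peak position, peak density, passage) or None for an empty document."""
    density = density_distribution(doc_terms, query_weights, width)
    if not density:
        return None
    peak = max(range(len(density)), key=density.__getitem__)
    start = max(0, peak - width // 2)
    end = min(len(doc_terms), start + width)
    return peak, density[peak], doc_terms[start:end]


# Hypothetical toy document: a cluster of query terms and one isolated occurrence.
doc = ["a"] * 20 + ["tokyo", "tower", "height"] + ["b"] * 20 + ["tokyo"] + ["c"] * 20
query = {"tokyo": 1.0, "tower": 1.0, "height": 1.0}
peak, score, passage = best_passage(doc, query, width=10)
```

In this toy run the peak falls inside the cluster of three query terms rather than at the isolated occurrence, which is the behaviour a density-based passage locator is meant to provide. One plausible way to use the same quantity for document retrieval would be to score each document by its peak density, though that choice is likewise an assumption here.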

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Kise, K., Junker, M., Dengel, A., Matsumoto, K. (2004). Passage Retrieval Based on Density Distributions of Terms and Its Applications to Document Retrieval and Question Answering. In: Dengel, A., Junker, M., Weisbecker, A. (eds) Reading and Learning. Lecture Notes in Computer Science, vol 2956. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24642-8_17

  • DOI: https://doi.org/10.1007/978-3-540-24642-8_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-21904-0

  • Online ISBN: 978-3-540-24642-8

  • eBook Packages: Springer Book Archive
