Effective Retrieval of Structured Documents

  • Conference paper
SIGIR ’94

Abstract

Information systems usually retrieve whole documents as answers to queries. However, it may in some circumstances be more appropriate to retrieve parts of documents. We consider formulas for retrieving whole documents and parts of documents from a large structured document collection. We consider what information is needed to retrieve effectively and show that knowledge of the structure of documents can lead to improved retrieval performance.

Copyright information

© 1994 Springer-Verlag London Limited

About this paper

Cite this paper

Wilkinson, R. (1994). Effective Retrieval of Structured Documents. In: Croft, B.W., van Rijsbergen, C.J. (eds) SIGIR ’94. Springer, London. https://doi.org/10.1007/978-1-4471-2099-5_32

  • DOI: https://doi.org/10.1007/978-1-4471-2099-5_32

  • Publisher Name: Springer, London

  • Print ISBN: 978-3-540-19889-5

  • Online ISBN: 978-1-4471-2099-5

  • eBook Packages: Springer Book Archive
