
Information Retrieval and Structured Documents

Lectures on Information Retrieval (ESSIR 2000)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1980)

Abstract

Standard Information Retrieval considers documents as atomic units of information that are indexed and retrieved as a whole. The evolution of document design and storage has long since introduced more elaborate representations of documents; standards such as SGML, then HTML, and now XML are major contributions in this domain. These standards underlie today's evolution towards modern electronic documents. In this context, retrieving structured documents means indexing and retrieving information according to a given document structure. Documents are therefore no longer considered as atomic entities, but as aggregates of interrelated objects that can be retrieved separately: given a retrieval query, one may retrieve the set of document components that are most relevant to this query.
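As an illustration of retrieving components rather than whole documents, the following minimal sketch ranks every node of a small document tree against a query. It is not the chapter's model: the `Component` class, the `rank_components` function, and the term-density score are hypothetical choices made only for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Component:
    """A node of a hypothetical document tree (document, section, paragraph...)."""
    label: str
    text: str = ""
    children: list[Component] = field(default_factory=list)


def full_text(node: Component) -> str:
    """Text of a component: its own text plus the text of all its descendants."""
    parts = [node.text] + [full_text(child) for child in node.children]
    return " ".join(part for part in parts if part)


def score(node: Component, query_terms: set[str]) -> float:
    """Toy score (an assumption): fraction of the component's tokens that are query terms."""
    tokens = full_text(node).lower().split()
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in query_terms)
    return hits / len(tokens)


def rank_components(root: Component, query: str) -> list[tuple[float, str]]:
    """Score every component of the tree and return (score, label) pairs, best first."""
    terms = set(query.lower().split())
    ranked = []
    stack = [root]
    while stack:
        node = stack.pop()
        ranked.append((score(node, terms), node.label))
        stack.extend(node.children)
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return ranked
```

With a density score of this kind, a focused section can outrank the document that contains it, which is the behaviour the aggregate view of documents is meant to allow.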

In this chapter we shall first emphasise some aspects which, in our opinion, relate the explicit use of document structure to interactive retrieval performance, such as efficiency while browsing or querying information. We shall then investigate two classes of implementation approaches for indexing and retrieving structured documents: passage retrieval and the explicit use of the hierarchical structure of documents.
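To make the first of these two approaches concrete, here is a minimal sketch of passage retrieval using overlapping fixed-size token windows. The windowing scheme, the parameter names `size` and `step`, and the raw term-count score are assumptions for illustration; the passage definitions actually studied in the chapter are not given in this abstract.

```python
def passages(tokens, size=5, step=3):
    """Yield (start, window) pairs of overlapping fixed-size token windows."""
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    start = 0
    while True:
        yield start, tokens[start:start + size]
        # Stop once the current window has reached the end of the text.
        if start + size >= len(tokens):
            break
        start += step


def best_passage(text, query, size=5, step=3):
    """Return (score, start, passage) for the window with the most query-term hits.

    Ties are resolved in favour of the earliest window.
    """
    tokens = text.lower().split()
    terms = set(query.lower().split())
    candidates = [
        (sum(1 for token in window if token in terms), start, " ".join(window))
        for start, window in passages(tokens, size, step)
    ]
    return max(candidates, key=lambda candidate: candidate[0])
```

Overlapping windows are used so that a run of query terms falling across a window boundary is still captured whole by a neighbouring window.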




Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Chiaramella, Y. (2000). Information Retrieval and Structured Documents. In: Agosti, M., Crestani, F., Pasi, G. (eds) Lectures on Information Retrieval. ESSIR 2000. Lecture Notes in Computer Science, vol 1980. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45368-7_12


  • DOI: https://doi.org/10.1007/3-540-45368-7_12


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41933-4

  • Online ISBN: 978-3-540-45368-0

  • eBook Packages: Springer Book Archive
