Integrated DB and IR Approaches

Encyclopedia of Database Systems

Synonyms

Using efficient database technology (DB) for effective information retrieval (IR) of semi-structured text

Definition

Integrated DB&IR semi-structured text retrieval combines IR-style scoring and ranking methods for effective search with indexing techniques and processing algorithms from the database world for efficient query evaluation.
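As a rough illustration of this combination, the sketch below pairs a tf-idf-style score (the IR side) with a precomputed inverted index keyed by element tag and term, plus a heap-based top-k selection (the DB side). It is a minimal toy under stated assumptions, not the method of any particular system: the corpus `ELEMENTS`, the functions `build_index` and `top_k`, and the exact scoring formula are all hypothetical choices made for this example.

```python
import heapq
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus: (element_id, tag, text) triples standing in for XML elements.
ELEMENTS = [
    ("d1/sec[1]", "sec", "xml retrieval with ranking"),
    ("d1/sec[2]", "sec", "relational storage of xml data"),
    ("d2/abs[1]", "abs", "ranking and retrieval of text"),
    ("d2/sec[1]", "sec", "query languages for structured data"),
]


def build_index(elements):
    """Build an inverted index keyed by (tag, term) with precomputed scores."""
    term_counts = {}          # (element_id, tag) -> Counter of terms
    elem_freq = Counter()     # (tag, term) -> number of elements containing term
    for elem_id, tag, text in elements:
        counts = Counter(text.split())
        term_counts[(elem_id, tag)] = counts
        for term in counts:
            elem_freq[(tag, term)] += 1
    tag_sizes = Counter(tag for _, tag, _ in elements)

    index = defaultdict(list)
    for (elem_id, tag), counts in term_counts.items():
        for term, freq in counts.items():
            # Assumed tf-idf variant, computed per tag: tf * log(1 + N_tag / ef).
            idf = math.log(1 + tag_sizes[tag] / elem_freq[(tag, term)])
            index[(tag, term)].append((elem_id, freq * idf))
    return index


def top_k(index, tag, terms, k):
    """Return the k best-scoring elements with the given tag for the query terms."""
    scores = defaultdict(float)
    for term in terms:
        # Only the index lists for the requested tag are read, so the
        # structural condition is evaluated through the index itself.
        for elem_id, score in index.get((tag, term), []):
            scores[elem_id] += score
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


INDEX = build_index(ELEMENTS)
```

A call such as `top_k(INDEX, "sec", ["xml", "ranking"], 2)` ranks only `sec` elements, so the structural condition and the content score are evaluated together over the same index rather than in separate passes.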

Historical Background

Database research has traditionally focused on semi-structured documents that represent structured data with a well-defined schema and only a small amount of unstructured, textual content (also known as “data-centric” XML). Typical examples of such documents are invoices, purchase orders, or even complete bibliographies.

Early work in the field concentrated on “classical” data management problems for XML: storing XML data in relational or native XML systems, defining query languages that integrate conditions on the structure and the content of results (like SQL for relational data), efficiently processing these queries on huge collections of...

Recommended Reading

  1. Abiteboul S, Quass D, McHugh J, Widom J, Wiener JL. The Lorel query language for semistructured data. Int J Digit Libr. 1997;1(1):68–88.
  2. Amer-Yahia S, Botev C, Shanmugasundaram J. TeXQuery: a full-text search extension to XQuery. In: Proceedings of the 12th International World Wide Web Conference; 2004. p. 583–94.
  3. Amer-Yahia S, Cho S, Srivastava D. Tree pattern relaxation. In: Advances in Database Technology, Proceedings of the 8th International Conference on Extending Database Technology; 2002. p. 496–513.
  4. Amer-Yahia S, Lakshmanan LVS, Pandit S. FleXPath: flexible structure and full-text querying for XML. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2004. p. 83–94.
  5. Cohen S, Mamou J, Kanza Y, Sagiv Y. XSEarch: a semantic search engine for XML. In: Proceedings of the 29th International Conference on Very Large Data Bases; 2003. p. 45–56.
  6. Fuhr N, Großjohann K. XIRQL: a query language for information retrieval in XML documents. In: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval; 2001. p. 172–80.
  7. Guo L, Shao F, Botev C, Shanmugasundaram J. XRANK: ranked keyword search over XML documents. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2003. p. 16–27.
  8. Hiemstra D, Rode H, Van Os R, Flokstra J. PF/Tijah: text search in an XML database system. In: Proceedings of the 2nd International Workshop on Open Source Information Retrieval; 2006.
  9. Hristidis V, Papakonstantinou Y, Balmin A. Keyword proximity search on XML graphs. In: Proceedings of the 19th International Conference on Data Engineering; 2003. p. 367–78.
  10. Marian A, Amer-Yahia S, Koudas N, Srivastava D. Adaptive processing of Top-k queries in XML. In: Proceedings of the 21st International Conference on Data Engineering; 2005. p. 162–73.
  11. Schlieder T, Meuss H. Querying and ranking XML documents. J Am Soc Inf Sci Tech. 2002;53(6):489–503.
  12. Theobald M, Bast H, Majumdar D, Schenkel R, Weikum G. TopX: efficient and versatile top-k query processing for semistructured data. VLDB J. 2008;17(1):81–115.
  13. Theobald A, Weikum G. Adding relevance to XML. In: Proceedings of the 3rd International Workshop on the World Wide Web and Databases; 2000. p. 105–24.
  14. Xu Y, Papakonstantinou Y. Efficient keyword search for smallest LCAs in XML databases. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2005. p. 537–8.

Corresponding author

Correspondence to Ralf Schenkel.

Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

Cite this entry

Schenkel, R., Theobald, M. (2018). Integrated DB and IR Approaches. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_206
