On the Difference between Navigating Semi-structured Data and Querying It

  • Conference paper
Research Issues in Structured and Semistructured Database Programming (DBPL 1999)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1949)

Abstract

Currently, there is tremendous interest in semi-structured (SS) data management. This is spurred by data sources, such as the ACeDB [29], that are inherently less rigidly structured than traditional DBMS, by WWW documents where no hard rules or constraints are imposed and “anything goes,” and by integration of information coming from disparate sources exhibiting considerable differences in the way they structure information. Significant strides have been made in the development of data models and query languages [2, 11, 17, 6, 7], and to some extent, the theory of queries on semi-structured data [1, 23, 3, 13, 9]. The OEM model of the Stanford TSIMMIS project [2] (equivalently, its variant, independently developed at U. Penn. [11]) has emerged as the de facto standard model for semi-structured data. OEM is a light-weight object model, which, unlike the ODMG model that it extends, does not impose the latter’s rigid type constraints. Both OEM and the Penn model essentially correspond to labeled digraphs. A main theme emerging from the popular query languages such as Lorel [2], UnQL [11], StruQL [17], WebOQL [6], and the Ulixes/Penelope pair of the ADM model [7], is that navigation is considered an integral and essential part of querying. Indeed, given the lack of rigid schema of semi-structured data, navigation brings many benefits, including the ability to retrieve data regardless of the depth at which it resides in a tree (e.g., see [4]). This is achieved with programming primitives such as regular path expressions and wildcards. A second, somewhat subtle, aspect of the emerging trend is that query expressions are often dependent on the particular instance they are applied to. This is not surprising, given the lack of rigid structure and the absence of the notion of a predefined schema for semi-structured data. In fact, it has been argued [4] that it is unreasonable to impose a predefined schema.
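
To make the navigation primitive concrete, the following is a minimal sketch (not taken from the paper) of how a labeled digraph in the spirit of OEM can be traversed to retrieve data at any depth. The instance `GRAPH`, its node and label names, and the function `descendants_by_label` are hypothetical and chosen only for illustration; the function evaluates one simple regular path expression, "any sequence of edges followed by an edge with a given label".

```python
from collections import deque

# Hypothetical OEM-style instance: node -> list of (edge_label, target_node).
# All node identifiers and labels here are illustrative assumptions.
GRAPH = {
    "root": [("bib", "b")],
    "b": [("paper", "p1"), ("book", "k1")],
    "p1": [("title", "t1"), ("author", "a1")],
    "k1": [("chapter", "c1")],
    "c1": [("title", "t2")],
    "t1": [],
    "a1": [],
    "t2": [],
}


def descendants_by_label(graph, start, label):
    """Return the targets of all edges named `label` that are reachable
    from `start` through any number of arbitrary (wildcard) edges."""
    seen = {start}          # guards against cycles in the digraph
    queue = deque([start])  # breadth-first traversal
    hits = []
    while queue:
        node = queue.popleft()
        for edge_label, target in graph.get(node, []):
            if edge_label == label and target not in hits:
                hits.append(target)
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return hits


titles = descendants_by_label(GRAPH, "root", "title")
```

In this sketch the two `title` nodes sit at different depths (under a paper and under a book chapter), yet a single wildcard-style expression reaches both without the query naming the intermediate labels.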

References

  1. Abiteboul, S., and V. Vianu. Queries and computation on the Web. ICDT’ 97.

  2. Abiteboul, S. et al. The Lorel query language for semistructured data. Intl. J. on Digital Libraries, 1:68–88, 1997.

  3. Abiteboul, S., and V. Vianu. Regular path queries with constraints. PODS’ 97.

  4. Abiteboul, S. Querying Semi-Structured Data. ICDT’ 97.

  5. Andries, M., and J. Paredaens. On instance-completeness for database query languages involving object creation. J. Comp. Syst. Sci., 52(2):357–373, 1996.

  6. Arocena, G., and A. O. Mendelzon. WebOQL: Restructuring documents, databases and webs. ICDE’ 98.

  7. Atzeni, P., G. Mecca, and P. Merialdo. To weave the web. VLDB’ 97.

  8. Bancilhon, F. On the Completeness of Query Languages for Relational Data Bases. MFOCS’ 78.

  9. Beeri, C., and T. Milo. Schemas for Integration and Translation of Structured and Semi-Structured data. ICDT’ 99.

  10. Bray, T., J. Paoli, and C. M. Sperberg-McQueen. Extensible Markup Language (XML) 1.0. http://www.w3.org/TR/REC-xml

  11. Buneman, P. et al. A query language and optimization techniques for unstructured data. SIGMOD’ 96.

  12. Buneman, P., and A. Ohori. Polymorphism and Type Inference in Database Programming. ACM TODS, 21:30–76, 1996.

  13. Buneman, P. et al. Adding structure to unstructured data. ICDT’ 97.

  14. Bussche, J. Van den et al. On the Completeness of Object-Creating Database Transformation Languages. JACM, 44(2):272–319.

  15. Bussche, J. Van den, and E. Waller. Type inference in the polymorphic relational algebra. PODS’ 99.

  16. Chandra, A. K., and D. Harel. Computable queries for relational data bases. J. Comp. Syst. Sci., 21:156–178, 1980.

  17. Fernandez, M., D. Florescu, A. Levy, and D. Suciu. A query language for a web-site management system. SIGMOD Record, 26(3):4–11, September 1997.

  18. Florescu, D., A. Levy, and D. Suciu. Query containment for conjunctive queries with regular expressions. PODS’ 98.

  19. Grahne, G., and L.V.S. Lakshmanan. On the Difference between Navigating Semi-structured Data and Querying It. Technical report, Concordia University, Montreal, Canada, August 1999, in preparation.

  20. Gyssens, M., L.V.S. Lakshmanan, and I.N. Subramanian. Tables as a paradigm for querying and restructuring. PODS’ 96.

  21. Hull, R. Managing Semantic Heterogeneity in Databases: A Theoretical Perspective. PODS’ 97.

  22. Manna, Z. Mathematical Theory of Computation. McGraw-Hill, New York, Montreal, 1974.

  23. Mendelzon, A. O., and T. Milo. Formal models of web queries. PODS’ 97.

  24. Lerat, N., and W. Lipski. Nonapplicable nulls. TCS, 46:67–82, 1986.

  25. Ley, M. Database Systems and Logic Programming Bibliography Server. http://www.informatik.uni-trier.de/~ley/db/.

  26. Nestorov, S., S. Abiteboul, and R. Motwani. Extracting Schema from Semistructured Data. SIGMOD’ 98.

  27. Paredaens, J. On the expressive power of the relational algebra. IPL, 7:107–110, 1978.

  28. Rajaraman, A., and J.D. Ullman. Integrating Information by Outerjoins and Full Disjunctions. PODS’ 96.

  29. Thierry-Mieg, J., and R. Durbin. A C. elegans database: syntactic definitions for the ACEDB data base manager, 1992.

  30. Ullman, J. D. Database Theory-Past and Future. PODS’ 87.

Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Grahne, G., Lakshmanan, L.V.S. (2000). On the Difference between Navigating Semi-structured Data and Querying It. In: Connor, R., Mendelzon, A. (eds) Research Issues in Structured and Semistructured Database Programming. DBPL 1999. Lecture Notes in Computer Science, vol 1949. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44543-9_17

  • DOI: https://doi.org/10.1007/3-540-44543-9_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41481-0

  • Online ISBN: 978-3-540-44543-2

  • eBook Packages: Springer Book Archive