On the Difference between Navigating Semi-structured Data and Querying It

Grahne, Gösta; Lakshmanan, Laks V. S.

doi:10.1007/3-540-44543-9_17


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1949)

Included in the following conference series:

International Symposium on Database Programming Languages


Abstract

Currently, there is tremendous interest in semi-structured (SS) data management. This is spurred by data sources, such as the ACeDB [29], that are inherently less rigidly structured than traditional DBMS, by WWW documents where no hard rules or constraints are imposed and “anything goes,” and by integration of information coming from disparate sources exhibiting considerable differences in the way they structure information. Significant strides have been made in the development of data models and query languages [2, 11, 17, 6, 7], and to some extent, the theory of queries on semi-structured data [1, 23, 3, 13, 9]. The OEM model of the Stanford TSIMMIS project [2] (equivalently, its variant, independently developed at U. Penn. [11]) has emerged as the de facto standard model for semi-structured data. OEM is a light-weight object model, which unlike the ODMG model that it extends, does not impose the latter’s rigid type constraints. Both OEM and the Penn model essentially correspond to labeled digraphs. A main theme emerging from the popular query languages such as Lorel [2], UnQL [11], StruQL [17], WebOQL [6], and the Ulixes/Penelope pair of the ADM model [7], is that navigation is considered an integral and essential part of querying. Indeed, given the lack of rigid schema of semi-structured data, navigation brings many benefits, including the ability to retrieve data regardless of the depth at which it resides in a tree (e.g., see [4]). This is achieved with programming primitives such as regular path expressions and wildcards. A second, somewhat subtle, aspect of the emerging trend is that query expressions are often dependent on the particular instance they are applied to. This is not surprising, given the lack of rigid structure and the absence of the notion of a predefined schema for semi-structured data. In fact, it has been argued [4] that it is unreasonable to impose a predefined schema.
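To make the notions of a labeled digraph and a wildcard path expression concrete, here is a minimal Python sketch. It is not taken from the paper or from any of the cited languages: the toy instance `GRAPH`, the symbols `ANY_LABEL` and `ANY_PATH`, and the function `evaluate` are all hypothetical names chosen for illustration, and the path syntax is a simplified stand-in for regular path expressions.

```python
from collections import deque

# Hypothetical toy instance: an OEM-style labeled digraph.
# Each node maps to a list of (edge_label, target_node) pairs.
GRAPH = {
    "root": [("bib", "b")],
    "b": [("paper", "p1"), ("book", "k1")],
    "p1": [("title", "t1"), ("author", "a1")],
    "k1": [("chapter", "c1")],
    "c1": [("title", "t2"), ("in", "k1")],  # "in" closes a cycle k1 -> c1 -> k1
    "t1": [],
    "t2": [],
    "a1": [],
}

ANY_LABEL = "_"  # assumed wildcard: exactly one edge, any label
ANY_PATH = "*"   # assumed wildcard: any sequence of edges, possibly empty


def evaluate(graph, start, steps):
    """Return the nodes reachable from `start` along a path matching `steps`."""
    results = set()
    seen = {(start, 0)}          # visited (node, step index) states
    queue = deque(seen)
    while queue:
        node, i = queue.popleft()
        if i == len(steps):      # every step consumed: node is an answer
            results.add(node)
            continue
        step = steps[i]
        successors = []
        if step == ANY_PATH:
            successors.append((node, i + 1))  # match zero edges
            successors.extend((t, i) for _, t in graph.get(node, []))
        else:
            successors.extend(
                (t, i + 1)
                for label, t in graph.get(node, [])
                if step == ANY_LABEL or step == label
            )
        for state in successors:
            if state not in seen:  # guards against looping on cycles
                seen.add(state)
                queue.append(state)
    return results


# Titles at any depth versus titles at a fixed depth.
titles_anywhere = evaluate(GRAPH, "root", [ANY_PATH, "title"])
titles_depth_3 = evaluate(GRAPH, "root", ["bib", ANY_LABEL, "title"])
```

In this sketch the first query finds both title nodes regardless of how deeply they are nested, while the second finds only the title directly under an entry of the bibliography. The second query also illustrates instance dependence: it is only useful if the writer already knows that titles sit exactly three edges below the root in this particular graph.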


References

1. Abiteboul, S., and V. Vianu. Queries and computation on the Web. ICDT ’97.
2. Abiteboul, S. et al. The Lorel query language for semistructured data. Intl. J. on Digital Libraries, 1:68–88, 1997.
3. Abiteboul, S., and V. Vianu. Regular path queries with constraints. PODS ’97.
4. Abiteboul, S. Querying Semi-Structured Data. ICDT ’97.
5. Andries, M., and J. Paredaens. On instance-completeness for database query languages involving object creation. J. Comp. Syst. Sci., 52(2):357–373, 1996.
6. Arocena, G., and A. O. Mendelzon. WebOQL: Restructuring documents, databases and webs. ICDE ’98.
7. Atzeni, P., G. Mecca, and P. Merialdo. To weave the web. VLDB ’97.
8. Bancilhon, F. On the Completeness of Query Languages for Relational Data Bases. MFOCS ’78.
9. Beeri, C., and T. Milo. Schemas for Integration and Translation of Structured and Semi-Structured Data. ICDT ’99.
10. Bray, T., J. Paoli, and C. M. Sperberg-McQueen. Extensible Markup Language (XML) 1.0. http://www.w3.org/TR/REC-xml
11. Buneman, P. et al. A query language and optimization techniques for unstructured data. SIGMOD ’96.
12. Buneman, P., and A. Ohori. Polymorphism and Type Inference in Database Programming. ACM TODS, 21:30–76, 1996.
13. Buneman, P. et al. Adding structure to unstructured data. ICDT ’97.
14. Bussche, J. Van den et al. On the Completeness of Object-Creating Database Transformation Languages. JACM, 44(2):272–319.
15. Bussche, J. Van den, and E. Waller. Type inference in the polymorphic relational algebra. PODS ’99.
16. Chandra, A. K., and D. Harel. Computable queries for relational data bases. J. Comp. Syst. Sci., 21:156–178, 1980.
17. Fernandez, M., D. Florescu, A. Levy, and D. Suciu. A query language for a web-site management system. SIGMOD Record, 26(3):4–11, September 1997.
18. Florescu, D., A. Levy, and D. Suciu. Query containment for conjunctive queries with regular expressions. PODS ’98.
19. Grahne, G., and L. V. S. Lakshmanan. On the Difference between Navigating Semi-structured Data and Querying It. Technical report, Concordia University, Montreal, Canada, August 1999. In preparation.
20. Gyssens, M., L. V. S. Lakshmanan, and I. N. Subramanian. Tables as a paradigm for querying and restructuring. PODS ’96.
21. Hull, R. Managing Semantic Heterogeneity in Databases: A Theoretical Perspective. PODS ’97.
22. Manna, Z. Mathematical Theory of Computation. McGraw-Hill, New York, Montreal, 1974.
23. Mendelzon, A. O., and T. Milo. Formal models of web queries. PODS ’97.
24. Lerat, N., and W. Lipski. Nonapplicable nulls. TCS, 46:67–82, 1986.
25. Ley, M. Database Systems and Logic Programming Bibliography Server. http://www.informatik.uni-trier.de/~ley/db/
26. Nestorov, S., S. Abiteboul, and R. Motwani. Extracting Schema from Semistructured Data. SIGMOD ’98.
27. Paredaens, J. On the expressive power of the relational algebra. IPL, 7:107–110, 1978.
28. Rajaraman, A., and J. D. Ullman. Integrating Information by Outerjoins and Full Disjunctions. PODS ’96.
29. Thierry-Mieg, J., and R. Durbin. A C. elegans database: syntactic definitions for the ACEDB data base manager, 1992.
30. Ullman, J. D. Database Theory-Past and Future. PODS ’87.


Author information

Authors and Affiliations

Department of Computer Science, Concordia University, H3G 1M8, Montreal, Canada
Gösta Grahne & Laks V. S. Lakshmanan
Computer Science and Engineering Department, I.I.T., Powai, 400076, Mumbai
Laks V. S. Lakshmanan


Editor information

Editors and Affiliations

Department of Computer Science, University of Strathclyde, G1 1XH, Glasgow, Scotland, UK
Richard Connor
Department of Computer Science, University of Toronto, 6 King’s College Road, M5S 3H5, Ontario, Toronto, Canada
Alberto Mendelzon


About this paper

Cite this paper

Grahne, G., Lakshmanan, L.V.S. (2000). On the Difference between Navigating Semi-structured Data and Querying It. In: Connor, R., Mendelzon, A. (eds) Research Issues in Structured and Semistructured Database Programming. DBPL 1999. Lecture Notes in Computer Science, vol 1949. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44543-9_17


DOI: https://doi.org/10.1007/3-540-44543-9_17
Published: 13 April 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41481-0
Online ISBN: 978-3-540-44543-2
eBook Packages: Springer Book Archive
