Partition Based Path Join Algorithms for XML Data

  • Conference paper
Database and Expert Systems Applications (DEXA 2003)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2736)

Abstract

Path expressions are an important component of XML queries. The extended preorder numbering scheme makes it possible to quickly determine the ancestor-descendant relationship between elements in the hierarchy of XML data. Using this numbering scheme, a path expression can be evaluated by join operations, avoiding the potentially high cost of tree traversals. In this paper, we first formulate XML path queries as range-point join queries. We then discuss partition based algorithms that exploit the range containment property to process range-point join queries efficiently. Under the partition based framework, we propose three algorithms, namely Descendant partition join, Segment-tree partition join and Ancestor Link partition join, which a query optimizer can choose among according to the characteristics of the input data. The experimental results show that the partition based algorithms make better use of buffer memory than sort-merge algorithms, and that the proposed Ancestor Link join algorithm yields the best performance by using small in-memory data structures and by taking advantage of unevenly sized inputs.
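
As an illustration of the range-point formulation mentioned in the abstract, the following is a minimal sketch, not one of the three algorithms proposed in the paper. It assumes that each element is encoded as a hypothetical `(order, size)` pair and that an element `a` is an ancestor of `d` when `order(a) < order(d) <= order(a) + size(a)`; the abstract does not state this condition, so both the encoding and the function names are assumptions made for the example. Each ancestor then contributes a range, each descendant contributes a point, and the join reports every point that falls inside a range.

```python
from bisect import bisect_right
from typing import List, Tuple

# Hypothetical encoding: an element is an (order, size) pair.
Element = Tuple[int, int]


def is_ancestor(anc: Element, desc: Element) -> bool:
    """Assumed containment test: desc's order lies in (order, order + size] of anc."""
    a_order, a_size = anc
    d_order, _ = desc
    return a_order < d_order <= a_order + a_size


def range_point_join(ancestors: List[Element],
                     descendants: List[Element]) -> List[Tuple[int, int]]:
    """Return (ancestor order, descendant order) pairs satisfying the containment test.

    Descendants are treated as points and sorted once; each ancestor range
    is then resolved with two binary searches over the sorted points.
    """
    points = sorted(d_order for d_order, _ in descendants)
    result = []
    for a_order, a_size in ancestors:
        lo = bisect_right(points, a_order)           # first point > a_order
        hi = bisect_right(points, a_order + a_size)  # one past last point <= upper bound
        for p in points[lo:hi]:
            result.append((a_order, p))
    return result
```

For example, calling `range_point_join([(1, 10), (2, 3)], [(3, 0), (5, 0), (8, 0), (20, 0)])` pairs the first ancestor with the points 3, 5 and 8 and the second with 3 and 5, while the point 20 falls outside both ranges. This in-memory version sidesteps the buffer management issues that the partition based algorithms are designed to address.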

This work was sponsored in part by the National Science Foundation CAREER Award (IIS-9876037), NSF Grant No. IIS-0100436, and NSF Research Infrastructure program EIA-0080123. The authors assume all responsibility for the contents of the paper.

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Li, Q., Moon, B. (2003). Partition Based Path Join Algorithms for XML Data. In: Mařík, V., Retschitzegger, W., Štěpánková, O. (eds) Database and Expert Systems Applications. DEXA 2003. Lecture Notes in Computer Science, vol 2736. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45227-0_17

  • DOI: https://doi.org/10.1007/978-3-540-45227-0_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40806-2

  • Online ISBN: 978-3-540-45227-0

  • eBook Packages: Springer Book Archive
