Skip to main content

Using Prefix-Trees for Efficiently Computing Set Joins

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 3453))

Abstract

Joins on set-valued attributes (set joins) have numerous database applications. In this paper we propose PRETTI (PREfix Tree based seT joIn) – a suite of set join algorithms for containment, overlap and equality join predicates. Our algorithms use prefix trees and inverted indices. These structures are constructed on-the-fly if they are not already precomputed. This feature makes our algorithms usable for relations without indices and when joining intermediate results during join queries with more than two relations. Another feature of our algorithms is that results are output continuously during their execution and not just at the end. Experiments on real life datasets show that the total execution time of our algorithms is significantly less than that of previous approaches, even when the indices required by our algorithms are not precomputed.

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   84.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   109.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Agrawal, R., Srikant, R.: Fast algorithms for mining association rules. In: Proc. of Intl. Conf. on Very Large Databases (VLDB) (September 1994)

    Google Scholar 

  2. Cai, J., Chakaravarthy, V.T., Kaushik, R., Naughton, J.F.: On the complexity of join predicates. In: ACM SIGMOD-SIGACT-SIGART Symp. on Principles of Database Systems (2001)

    Google Scholar 

  3. Grahne, G., Zhu, J.: Efficiently using prefix-trees in mining frequent itemsets. In: IEEE ICDM Workshop on Frequent Itemset Mining Implementations (FIMI) (2003)

    Google Scholar 

  4. Han, J., Pei, J., Yin, Y.: Mining frequent patterns without candidate generation. In: Proc. of ACM SIGMOD Intl. Conf. on Management of Data (2000)

    Google Scholar 

  5. Helmer, S., Moerkotte, G.: Evaluation of main memory join algorithms for joins with set comparison join predicates. In: Proc. of Intl. Conf. on Very Large Databases (VLDB) (1997)

    Google Scholar 

  6. Helmer, S., Moerkotte, G.: A study of four index structures for set-valued attributes of low cardinality. Technical report, University of Mannheim (1999)

    Google Scholar 

  7. Mamoulis, N.: Efficient processing of joins on set-valued attributes. In: Proc. of ACM SIGMOD Intl. Conf. on Management of Data (2003)

    Google Scholar 

  8. Melnik, S.: Set containment joins: Testbed, http://www-db.stanford.edu/~melnik/scj

  9. Melnik, S., Garcia-Molina, H.: Divide-and-conquer algorithm for computing set containment joins. In: Jensen, C.S., Jeffery, K., Pokorný, J., Šaltenis, S., Bertino, E., Böhm, K., Jarke, M. (eds.) EDBT 2002. LNCS, vol. 2287, p. 427. Springer, Heidelberg (2002)

    Chapter  Google Scholar 

  10. Melnik, S., Garcia-Molina, H.: Adaptive algorithms for set containment joins. ACM Transactions on Database Systems (TODS) 28(2) (2003)

    Google Scholar 

  11. Ramasamy, K., Patel, J.M., Naughton, J.F., Kaushik, R.: Set containment joins: the good, the bad and the ugly. In: Proc. of Intl. Conf. on Very Large Databases (VLDB) (2000)

    Google Scholar 

  12. Rantzau, R.: Processing frequent itemset discovery queries by division and set containment join operators. In: 8th ACM SIGMOD DMKD Workshop (2003)

    Google Scholar 

  13. Sarawagi, S., Kirpal, A.: Efficient set joins on similarity predicates. In: Proc. of ACM SIGMOD Intl. Conf. on Management of Data (2004)

    Google Scholar 

  14. Stonebraker, M.: Object-relational DBMS: The Next Great Wave. Morgan Kaufmann, San Francisco (1996)

    MATH  Google Scholar 

  15. Zheng, Z., Kohavi, R., Mason, L.: Real world performance of association rule algorithms. In: Intl. Conf. on Knowledge Discovery and Data Mining (KDD) (2001)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jampani, R., Pudi, V. (2005). Using Prefix-Trees for Efficiently Computing Set Joins. In: Zhou, L., Ooi, B.C., Meng, X. (eds) Database Systems for Advanced Applications. DASFAA 2005. Lecture Notes in Computer Science, vol 3453. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11408079_69

Download citation

  • DOI: https://doi.org/10.1007/11408079_69

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25334-1

  • Online ISBN: 978-3-540-32005-0

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics