Using Prefix-Trees for Efficiently Computing Set Joins

Jampani, Ravindranath; Pudi, Vikram

doi:10.1007/11408079_69


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3453)

Included in the following conference series:

International Conference on Database Systems for Advanced Applications


Abstract

Joins on set-valued attributes (set joins) have numerous database applications. In this paper we propose PRETTI (PREfix Tree based seT joIn), a suite of set join algorithms for containment, overlap and equality join predicates. Our algorithms use prefix trees and inverted indices. These structures are constructed on the fly if they have not been precomputed. This makes our algorithms usable for relations without indices and for joining intermediate results in join queries over more than two relations. Another feature of our algorithms is that results are output continuously during execution and not just at the end. Experiments on real-life datasets show that the total execution time of our algorithms is significantly less than that of previous approaches, even when the indices required by our algorithms are not precomputed.
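To make the combination of a prefix tree and an inverted index concrete, the following is a minimal sketch of a set containment join. It is an illustration under stated assumptions, not the paper's implementation. All names (`TrieNode`, `build_prefix_tree`, `build_inverted_index`, `containment_join`) are hypothetical. The sketch assumes both relations fit in memory as dictionaries mapping a tuple id to a set of sortable elements. It reports every pair (r, s) where the set of r is contained in the set of s.

```python
from collections import defaultdict


class TrieNode:
    def __init__(self):
        self.children = {}   # element -> child TrieNode
        self.tuple_ids = []  # ids of R tuples whose sorted set ends at this node


def build_prefix_tree(r_sets):
    """Insert each R set, with elements in sorted order, into a prefix tree."""
    root = TrieNode()
    for rid, elems in r_sets.items():
        node = root
        for e in sorted(elems):
            node = node.children.setdefault(e, TrieNode())
        node.tuple_ids.append(rid)
    return root


def build_inverted_index(s_sets):
    """Map each element to the ids of the S tuples whose set contains it."""
    index = defaultdict(set)
    for sid, elems in s_sets.items():
        for e in elems:
            index[e].add(sid)
    return index


def containment_join(r_sets, s_sets):
    """Yield (rid, sid) pairs such that r_sets[rid] is a subset of s_sets[sid].

    Both structures are built on the fly here (an assumption of this sketch).
    Pairs are yielded as soon as they are found, not collected at the end.
    """
    root = build_prefix_tree(r_sets)
    index = build_inverted_index(s_sets)
    # Each stack entry pairs a trie node with the S ids that contain every
    # element on the path from the root to that node.
    stack = [(root, set(s_sets))]
    while stack:
        node, candidates = stack.pop()
        for rid in node.tuple_ids:
            for sid in candidates:
                yield rid, sid
        for elem, child in node.children.items():
            narrowed = candidates & index.get(elem, set())
            if narrowed:  # prune subtrees that can no longer match anything
                stack.append((child, narrowed))


r = {"r1": {1, 2}, "r2": {1, 2, 3}, "r3": {4}}
s = {"s1": {1, 2, 3}, "s2": {1, 2, 5}, "s3": {4, 5}}
result = sorted(containment_join(r, s))
print(result)
```

Because R sets that share a prefix share a path in the tree, the candidate intersection for that prefix is computed once and reused for every set below it. Writing the join as a generator mirrors the abstract's point that results can be emitted continuously during execution.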



Author information

Authors and Affiliations

Center for Data Engineering, International Institute of Information Technology, Hyderabad, India
Ravindranath Jampani & Vikram Pudi


Editor information

Editors and Affiliations

Research Institute of Information Technology, Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, 100084, Beijing, China
Lizhu Zhou
National University of Singapore, Singapore
Beng Chin Ooi
School of Information, Renmin University of China
Xiaofeng Meng


About this paper

Cite this paper

Jampani, R., Pudi, V. (2005). Using Prefix-Trees for Efficiently Computing Set Joins. In: Zhou, L., Ooi, B.C., Meng, X. (eds) Database Systems for Advanced Applications. DASFAA 2005. Lecture Notes in Computer Science, vol 3453. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11408079_69


DOI: https://doi.org/10.1007/11408079_69
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25334-1
Online ISBN: 978-3-540-32005-0
eBook Packages: Computer Science, Computer Science (R0)
