Efficient Join Processing Using Partial Precomputation

  • Regular Paper
Knowledge and Information Systems

Abstract

In this paper, we generalize conventional join indexes to a cluster-based join index, in which objects are grouped into clusters based on proximity. Each record of our join index represents a pair of clusters for which the join condition is satisfied by at least some of their members. This strategy is especially useful for spatial and high-dimensional databases because of their typically large data volumes and complex operations. Our approach leverages the structure of R-trees: the internal nodes of an R-tree determine the precomputed clusters that are used in our join index. By varying the cluster size, we can fine-tune the join index to balance update cost against retrieval cost to suit individual applications. Different implementations of the join index are examined to determine how it can be maintained efficiently. We also conduct a number of experiments on intersection joins and window queries, and the results confirm that semi-precomputation of join results is a robust and cost-effective approach to join processing.
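
The idea of a cluster-level join index can be illustrated with a minimal sketch. It is not the paper's implementation: the rectangle representation, the dictionary layout, and the names `intersects`, `build_cluster_join_index` and `intersection_join` are all assumptions made for illustration, and the clusters are given by hand rather than taken from the internal nodes of an R-tree.

```python
from itertools import product
from typing import Dict, List, Set, Tuple

# Hypothetical representation: a rectangle is (xmin, ymin, xmax, ymax),
# and a cluster maps object ids to rectangles.
Rect = Tuple[float, float, float, float]
Clusters = Dict[str, Dict[str, Rect]]


def intersects(a: Rect, b: Rect) -> bool:
    """True when two axis-aligned rectangles share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def build_cluster_join_index(r: Clusters, s: Clusters) -> Set[Tuple[str, str]]:
    """Record a cluster pair when some member pair satisfies the join condition."""
    index: Set[Tuple[str, str]] = set()
    for (rid, r_objs), (sid, s_objs) in product(r.items(), s.items()):
        if any(intersects(a, b) for a in r_objs.values() for b in s_objs.values()):
            index.add((rid, sid))
    return index


def intersection_join(index: Set[Tuple[str, str]],
                      r: Clusters, s: Clusters) -> List[Tuple[str, str]]:
    """Answer the join by refining only the cluster pairs stored in the index."""
    result: List[Tuple[str, str]] = []
    for rid, sid in index:
        for (r_id, a), (s_id, b) in product(r[rid].items(), s[sid].items()):
            if intersects(a, b):
                result.append((r_id, s_id))
    return sorted(result)


# Toy data (invented for this sketch).
r_clusters: Clusters = {
    "R1": {"r1": (0, 0, 2, 2), "r2": (1, 1, 3, 3)},
    "R2": {"r3": (10, 10, 12, 12)},
}
s_clusters: Clusters = {
    "S1": {"s1": (2.5, 2.5, 4, 4)},
    "S2": {"s2": (11, 11, 13, 13), "s3": (20, 20, 21, 21)},
}

index = build_cluster_join_index(r_clusters, s_clusters)
pairs = intersection_join(index, r_clusters, s_clusters)
```

In this toy setting the index holds only two of the four possible cluster pairs, so the join refines two cluster pairs instead of comparing every object pair. Larger clusters would give a smaller index that changes less often under updates, at the price of more member comparisons at query time, which is the trade-off the abstract describes.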

Author information

Corresponding author

Correspondence to Kian-Lee Tan.

Additional information

Deceased

About this article

Cite this article

Tan, KL., Goh, C.H., Lee, M.L. et al. Efficient Join Processing Using Partial Precomputation. Knowledge and Information Systems 1, 481–514 (1999). https://doi.org/10.1007/BF03325111

  • DOI: https://doi.org/10.1007/BF03325111