Abstract
In this paper, we generalize conventional join indexes to a cluster-based join index, in which objects are grouped into clusters based on proximity. Each record of our join index represents a pair of clusters for which the join condition is satisfied by some members of the two clusters. This strategy is especially useful for spatial and high-dimensional databases because of their typically large data volume and complex operations. Our approach leverages the structure of R-trees, using the internal nodes of an R-tree to determine the precomputed clusters used in the join index. By varying the size of the cluster, we are able to fine-tune the join index to achieve a balance between update cost and retrieval cost to suit individual applications. Different implementations of the join index are examined to determine how the join index can be efficiently maintained. To this end, we conduct a number of experiments on intersection joins and window queries, and the results confirm that semi-precomputation of join results is a robust and cost-effective approach to join processing.
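The idea of a cluster-based join index can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper: the rectangle representation, the function names, and in particular `make_clusters` (which simply sorts by `xmin` and cuts the list into fixed-size runs) are assumptions made for illustration, whereas the paper derives its clusters from the internal nodes of an R-tree.

```python
from itertools import product

# A rectangle is a tuple (xmin, ymin, xmax, ymax). All names below are
# hypothetical and chosen only for this illustration.

def intersects(a, b):
    """True if two axis-aligned rectangles overlap (touching edges count)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def make_clusters(rects, size):
    """Stand-in clustering: sort by xmin and cut into runs of `size`.
    Assumption: the paper takes clusters from R-tree internal nodes instead."""
    ordered = sorted(rects)
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]

def build_cluster_join_index(clusters_r, clusters_s):
    """Record (i, j) when some member of R-cluster i intersects
    some member of S-cluster j."""
    index = set()
    pairs = product(enumerate(clusters_r), enumerate(clusters_s))
    for (i, cr), (j, cs) in pairs:
        if any(intersects(a, b) for a in cr for b in cs):
            index.add((i, j))
    return index

def join_with_index(clusters_r, clusters_s, index):
    """Answer an intersection join by refining only the indexed cluster pairs."""
    result = []
    for i, j in sorted(index):
        for a in clusters_r[i]:
            for b in clusters_s[j]:
                if intersects(a, b):
                    result.append((a, b))
    return result

R = [(0, 0, 2, 2), (4, 4, 5, 5), (10, 10, 11, 11), (12, 12, 13, 13)]
S = [(1, 1, 3, 3), (20, 20, 21, 21)]

clusters_r = make_clusters(R, 2)
clusters_s = make_clusters(S, 2)
index = build_cluster_join_index(clusters_r, clusters_s)
joined = join_with_index(clusters_r, clusters_s, index)
```

With a cluster size of 1, every cluster holds a single object and the index reduces to a conventional join index of object pairs. Larger clusters give fewer index records to maintain but leave more member pairs to check at query time, which is the update-versus-retrieval trade-off the abstract describes.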
Cite this article
Tan, KL., Goh, C.H., Lee, M.L. et al. Efficient Join Processing Using Partial Precomputation. Knowledge and Information Systems 1, 481–514 (1999). https://doi.org/10.1007/BF03325111