Abstract
The cost of spatial join processing can be very high because of the large sizes of spatial objects and the computation-intensive spatial operations. While parallel processing seems a natural solution to this problem, it is not clear how spatial data can be partitioned for this purpose. This paper examines various spatial data partitioning methods. For parallel spatial join processing, we propose a framework that combines the data-partitioning techniques used by most parallel join algorithms in relational databases with the filter-and-refine strategy for spatial operation processing. Object duplication caused by multi-assignment in spatial data partitioning can result in extra CPU cost as well as extra communication cost. We find that the key to overcoming this problem is to preserve spatial locality in task decomposition. We show that a near-optimal speedup can be achieved for parallel spatial join processing using our new algorithms.
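To make the terms in the abstract concrete, the following is a minimal sketch of grid partitioning with multi-assignment followed by a per-cell filter step on bounding rectangles. It is not the algorithm proposed in the paper. The function names, the unit-square data space, the uniform grid, and the sample rectangles are all assumptions made for illustration. Each cell stands in for one parallel task, and the refinement step on exact geometry is omitted.

```python
from collections import defaultdict
from itertools import product

# A rectangle (MBR) is a tuple (xmin, ymin, xmax, ymax) inside the unit square (assumed).

def intersects(a, b):
    """Filter-step test: do two bounding rectangles overlap or touch?"""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def cells_for(mbr, n):
    """All cells of an n x n grid that the rectangle overlaps."""
    def clamp(v):
        return min(n - 1, max(0, int(v * n)))
    return product(range(clamp(mbr[0]), clamp(mbr[2]) + 1),
                   range(clamp(mbr[1]), clamp(mbr[3]) + 1))

def partition(mbrs, n):
    """Multi-assignment: an object is copied into every cell it overlaps."""
    buckets = defaultdict(list)
    for oid, mbr in mbrs.items():
        for cell in cells_for(mbr, n):
            buckets[cell].append(oid)
    return buckets

def filter_join(r, s, n):
    """Join each cell independently; a set removes pairs found in several cells."""
    r_parts, s_parts = partition(r, n), partition(s, n)
    candidates = set()
    for cell in r_parts.keys() & s_parts.keys():
        for i in r_parts[cell]:
            for j in s_parts[cell]:
                if intersects(r[i], s[j]):
                    candidates.add((i, j))
    return candidates

# Hypothetical sample data.
r = {"r1": (0.1, 0.1, 0.6, 0.6), "r2": (0.8, 0.8, 0.9, 0.9)}
s = {"s1": (0.4, 0.4, 0.7, 0.7), "s2": (0.0, 0.8, 0.1, 0.9)}

result = filter_join(r, s, 2)
copies = sum(len(v) for v in partition(r, 2).values())
```

In this sample, `r1` spans all four cells of the 2 x 2 grid, so `partition` stores five object copies for two input objects, and the pair `("r1", "s1")` is tested in more than one cell. This is the duplication overhead the abstract refers to; a decomposition that keeps neighbouring objects in the same task would reduce both the copies and the repeated tests.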
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhou, X., Abel, D.J., Truffet, D. (1997). Data partitioning for parallel spatial join processing. In: Scholl, M., Voisard, A. (eds) Advances in Spatial Databases. SSD 1997. Lecture Notes in Computer Science, vol 1262. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63238-7_30
DOI: https://doi.org/10.1007/3-540-63238-7_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63238-2
Online ISBN: 978-3-540-69240-9
eBook Packages: Springer Book Archive