Data Partitioning for Parallel Spatial Join Processing

Zhou, Xiaofang; Abel, David J.; Truffet, David

doi:10.1023/A:1009755931056

Published: June 1998

Volume 2, pages 175–204, (1998)

GeoInformatica

Xiaofang Zhou¹,
David J. Abel¹ &
David Truffet¹

Abstract

The cost of spatial join processing can be very high because of the large sizes of spatial objects and the computation-intensive spatial operations. While parallel processing seems a natural solution to this problem, it is not clear how spatial data can be partitioned for this purpose. Various spatial data partitioning methods are examined in this paper. A framework is proposed for parallel spatial join processing that combines the data-partitioning techniques used by most parallel join algorithms in relational databases with the filter-and-refine strategy for spatial operation processing. Object duplication caused by multi-assignment in spatial data partitioning can result in extra CPU cost as well as extra communication cost. We find that the key to overcoming this problem is to preserve spatial locality in task decomposition. In this paper we show that a near-optimal speedup can be achieved for parallel spatial join processing using our new algorithms.
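The ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a fixed square grid, objects reduced to minimum bounding rectangles (MBRs), and hypothetical helper names (`overlaps`, `tiles_for`, `partition`, `filter_step`). An object whose MBR spans several tiles is assigned to all of them (multi-assignment), so the same candidate pair can be reported by more than one tile and has to be deduplicated. Only the filter step is shown; a refine step would test the exact geometries of each candidate pair.

```python
from itertools import product

# An MBR is a tuple (xmin, ymin, xmax, ymax). All names here are illustrative.

def overlaps(a, b):
    """True if two MBRs intersect (touching edges count as intersecting)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def tiles_for(mbr, tile_size):
    """Grid tiles touched by an MBR; more than one tile means multi-assignment."""
    x0, y0 = int(mbr[0] // tile_size), int(mbr[1] // tile_size)
    x1, y1 = int(mbr[2] // tile_size), int(mbr[3] // tile_size)
    return product(range(x0, x1 + 1), range(y0, y1 + 1))

def partition(objects, tile_size):
    """Map each tile to the ids of the objects assigned to it."""
    parts = {}
    for oid, mbr in objects.items():
        for tile in tiles_for(mbr, tile_size):
            parts.setdefault(tile, []).append(oid)
    return parts

def filter_step(r, s, tile_size):
    """Candidate pairs with overlapping MBRs; each tile is an independent task."""
    r_parts, s_parts = partition(r, tile_size), partition(s, tile_size)
    candidates = set()  # a set drops pairs reported by more than one tile
    for tile, r_ids in r_parts.items():
        for rid in r_ids:
            for sid in s_parts.get(tile, []):
                if overlaps(r[rid], s[sid]):
                    candidates.add((rid, sid))
    return candidates

R = {"r1": (0, 0, 4, 4), "r2": (9, 9, 12, 12)}
S = {"s1": (3, 3, 11, 11), "s2": (20, 20, 21, 21)}
candidates = filter_step(R, S, tile_size=10)
```

In this toy input, `r2` and `s1` each straddle four tiles, so the pair `(r2, s1)` is found four times before deduplication. That repeated work is the kind of extra CPU cost the abstract attributes to object duplication.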

Author information

Authors and Affiliations

CSIRO Mathematical and Information Sciences, GPO Box 664, Canberra, ACT, 2601, Australia
Xiaofang Zhou, David J. Abel & David Truffet

About this article

Cite this article

Zhou, X., Abel, D.J. & Truffet, D. Data Partitioning for Parallel Spatial Join Processing. GeoInformatica 2, 175–204 (1998). https://doi.org/10.1023/A:1009755931056

Issue Date: June 1998
DOI: https://doi.org/10.1023/A:1009755931056
