Data partitioning for parallel spatial join processing

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1262)

Abstract

Spatial join processing can be very expensive because spatial objects are large and spatial operations are computation-intensive. Parallel processing is a natural solution to this problem, but it is not clear how spatial data should be partitioned for this purpose. This paper examines various spatial data partitioning methods. For parallel spatial join processing, we propose a framework that combines the data-partitioning techniques used by most parallel join algorithms in relational databases with the filter-and-refine strategy for spatial operation processing. Object duplication caused by multi-assignment in spatial data partitioning can incur extra CPU cost as well as extra communication cost. We find that the key to overcoming this problem is to preserve spatial locality in task decomposition. We show that our new algorithms achieve near-optimal speedup for parallel spatial join processing.
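
To make the terms in the abstract concrete, here is a minimal Python sketch of the filter step of a partitioned spatial join. It is not the paper's algorithm, which this page does not reproduce. It assumes a fixed uniform grid, objects represented only by minimum bounding rectangles (MBRs), and sequential execution standing in for parallel tasks. All names (`mbr_intersects`, `cells_for`, `partition`, `filter_join`) and the `cell_size` parameter are hypothetical, chosen for illustration.

```python
from collections import defaultdict
from itertools import product

# An MBR is a tuple (xmin, ymin, xmax, ymax). All names are illustrative.

def mbr_intersects(a, b):
    """Filter-step test: do two axis-aligned rectangles overlap (touching counts)?"""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def cells_for(mbr, cell_size):
    """Return every grid cell (col, row) that the MBR overlaps."""
    xmin, ymin, xmax, ymax = mbr
    cols = range(int(xmin // cell_size), int(xmax // cell_size) + 1)
    rows = range(int(ymin // cell_size), int(ymax // cell_size) + 1)
    return product(cols, rows)

def partition(objects, cell_size):
    """Multi-assignment: an object spanning several cells is copied into each."""
    buckets = defaultdict(list)
    for oid, mbr in objects.items():
        for cell in cells_for(mbr, cell_size):
            buckets[cell].append(oid)
    return buckets

def filter_join(r, s, cell_size):
    """Return candidate (r_id, s_id) pairs whose MBRs intersect.

    Each cell is an independent task that could be given to its own
    processor; in this sketch the tasks simply run one after another.
    """
    r_buckets = partition(r, cell_size)
    s_buckets = partition(s, cell_size)
    candidates = set()  # the set drops pairs reported by more than one cell
    for cell, r_ids in r_buckets.items():
        for rid in r_ids:
            for sid in s_buckets.get(cell, ()):
                if mbr_intersects(r[rid], s[sid]):
                    candidates.add((rid, sid))
    return candidates

if __name__ == "__main__":
    R = {"r1": (0, 0, 3, 3), "r2": (5, 5, 6, 6)}
    S = {"s1": (2, 2, 4, 4), "s2": (8, 8, 9, 9)}
    print(filter_join(R, S, cell_size=2))
```

In this sketch a rectangle that overlaps several cells is stored once per cell, so the same candidate pair can be tested in more than one task. That repeated work is the duplication cost the abstract refers to. The surviving candidates would still need a refinement step on the exact geometries, which is omitted here.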

Editor information

Michel Scholl, Agnès Voisard

Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhou, X., Abel, D.J., Truffet, D. (1997). Data partitioning for parallel spatial join processing. In: Scholl, M., Voisard, A. (eds) Advances in Spatial Databases. SSD 1997. Lecture Notes in Computer Science, vol 1262. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63238-7_30

  • DOI: https://doi.org/10.1007/3-540-63238-7_30

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-63238-2

  • Online ISBN: 978-3-540-69240-9

  • eBook Packages: Springer Book Archive
