Merging R-Trees: Efficient Strategies for Local Bulk Insertion

GeoInformatica

Abstract

Much recent work has focused on bulk loading data into multidimensional index structures in order to construct such structures efficiently for large datasets. In this paper, we address this problem with a particular focus on R-trees, an important class of index structures widely used in commercial database systems. We propose a new technique that, as opposed to the current technique of inserting data items one by one, bulk inserts entire new datasets into an active R-tree. This technique, called STLT (for small-tree-large-tree), treats the new dataset as an R-tree itself (the small tree), identifies and prepares a suitable location in the original R-tree (the large tree) for insertion, and finally inserts the small tree into the large tree. Besides an analytical cost model of STLT, we report extensive experimental studies on both synthetic and real GIS datasets. These experiments not only compare STLT against the conventional technique, but also evaluate the suitability and limitations of STLT under different conditions, such as varying buffer sizes, the ratio between existing and new data sizes, and the skewness of the new data with respect to the whole spatial region. We find that, in terms of insertion time, STLT does much better (on average, about 65%) than the existing technique for skewed datasets as well as for large sizes of both the large tree and the small tree, while keeping comparable query tree quality. In all other circumstances, STLT consistently outperforms the alternative technique in terms of bulk insertion time, by up to 2,000% in cases where the area of the new dataset covers up to 4% of the global region covered by the existing index tree; this gain, however, comes at the cost of deteriorating quality of the resulting tree.
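The small-tree-large-tree idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Node` class, the helper names, and the least-area-enlargement rule for choosing a subtree are assumptions made for illustration, and the sketch omits the preparation of the insertion location and any handling of node overflow. It only shows the core step of grafting an already built small tree under a node of the large tree at the level where subtree heights match.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Axis-aligned rectangle: (xmin, ymin, xmax, ymax).
Rect = Tuple[float, float, float, float]


def union(a: Rect, b: Rect) -> Rect:
    """Smallest rectangle enclosing both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def area(r: Rect) -> float:
    return (r[2] - r[0]) * (r[3] - r[1])


@dataclass
class Node:
    """Hypothetical R-tree node; data entries inside leaves are not modeled."""
    mbr: Rect                      # minimum bounding rectangle of the subtree
    height: int                    # 0 for a leaf node
    children: List["Node"] = field(default_factory=list)


def insert_small_tree(large: Node, small: Node) -> None:
    """Graft the root of `small` into `large` as a single entry.

    Assumption: the target is found by descending along the child whose
    MBR needs the least area enlargement, stopping at a node whose
    children have the same height as `small`. Overflow is not handled.
    """
    if small.height >= large.height:
        raise ValueError("the small tree must be shorter than the large tree")

    node = large
    node.mbr = union(node.mbr, small.mbr)
    while node.height > small.height + 1:
        # Pick the child whose MBR grows least when extended to cover `small`.
        node = min(
            node.children,
            key=lambda c: area(union(c.mbr, small.mbr)) - area(c.mbr),
        )
        node.mbr = union(node.mbr, small.mbr)

    # `node` now holds children of the same height as `small`.
    node.children.append(small)
```

In this sketch a single descent of the large tree replaces one descent per data item, which is the intuition behind the insertion-time savings the abstract reports. Because the small tree is attached as a whole, its bounding rectangle may overlap neighboring subtrees, which is consistent with the trade-off in tree quality that the abstract mentions.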

Cite this article

Chen, L., Choubey, R. & Rundensteiner, E.A. Merging R-Trees: Efficient Strategies for Local Bulk Insertion. GeoInformatica 6, 7–34 (2002). https://doi.org/10.1023/A:1013764014000

  • DOI: https://doi.org/10.1023/A:1013764014000
